Attending ACL in Vienna!

We were at ACL to host the TRL workshop and present our paper on LLMs’ tabular reasoning capabilities!

We hosted the fourth Table Representation Learning (TRL) workshop at ACL, which featured exciting keynotes by Dan Roth, Tao Yu, and others, alongside excellent research papers. Check out the full overview of talks and papers at: https://table-representation-learning.github.io/ACL2025/#accepted-papers.

Cornelius also presented our analysis of the reasoning capabilities of LLMs over multi-table inputs at varying levels of complexity. In this paper, we empirically show that lexical metrics such as BLEU are unreliable for tabular QA, and we use an LLM-as-a-judge to show a large discrepancy in performance assessment compared to existing benchmarks (TQA-Bench). LLMs aren’t as good as we think at multi-table analytical question answering! We also surface numerous weaknesses of LLMs in the presence of data issues (e.g., duplicates and missing values). Read more in our paper here: https://arxiv.org/abs/2505.07453.
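
To give a feel for why lexical overlap can mislead on tabular QA, here is a minimal sketch. It is not the evaluation code from the paper and it does not compute BLEU; it uses a toy unigram-overlap F1 (the function name `unigram_f1` and the example strings are made up for illustration) as a stand-in for lexical metrics in general.

```python
from collections import Counter


def unigram_f1(prediction: str, reference: str) -> float:
    """Toy lexical metric: F1 over overlapping whitespace tokens (hypothetical helper)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Illustrative strings only, not taken from any benchmark.
reference = "The total revenue in 2023 was 4200 dollars"
wrong_answer = "The total revenue in 2023 was 2400 dollars"  # wrong number, similar wording
right_answer = "4200"                                        # correct value, terse wording

wrong_score = unigram_f1(wrong_answer, reference)
right_score = unigram_f1(right_answer, reference)
print(f"wrong answer: {wrong_score:.3f}, right answer: {right_score:.3f}")
```

In this toy case the numerically wrong answer scores higher than the correct one, simply because it shares more words with the reference. A judge that checks the actual value, such as an LLM-as-a-judge, is not fooled by surface wording in the same way.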