ALOT - Amsterdam Lunch on Table

Description

Amsterdam Lunch on Table (ALOT) is a reading group focused on table representation learning and, more generally, neural models for structured data. It takes place every first and third Wednesday of the month over lunch (at 12:00). Our objective is to foster a collaborative environment where researchers from the Amsterdam region can discuss and explore the intersection of AI and structured data. Each session is designed to be interactive, encouraging participants to engage in discussions that deepen their understanding of the latest research and methodologies. Through these sessions, we aim to inspire research ideas, support growth as researchers, and facilitate networking opportunities within the community.


Where & when? Please respond to our message on Discord if you are joining so we can pick you up at the CWI entrance. In the future, we plan to move the location to the UvA campus at Science Park to make it more accessible for everyone. We will keep you updated on this!

How it works We discuss one paper in each session. The paper is selected by the group and is announced at least a week in advance. One person is responsible for chairing the session and preparing a short introduction to the paper. The session chair is also responsible for facilitating the discussion and ensuring that everyone has a chance to contribute. We expect participants to read the paper in advance and send some questions or discussion points to the session chair to enable a more comprehensive and engaging discussion.
We meet over lunch, and we encourage people to eat while we discuss the paper. Lunch is catered, so you don't have to bring your own. Please respond to the session announcement on Discord if you are coming so we can order enough food for everyone!

Want to join the reading group? Then join the ALOT Discord channel. We manage the reading group via Discord and will announce the papers and sessions there.


Next Session

The next session of the ALOT reading group will take place on Wednesday, June 18, 2025 at 12:00 and we will discuss:

Paper: AOP: Automated and Interactive LLM Pipeline Orchestration for Answering Complex Queries
Authors: J. Wang, G. Li
Venue: CIDR (2025)
Session Chair: Daniel Gomm

Abstract: Current data lakes are limited to basic put/get functions on unstructured data and analytical queries on structured data. They fall short in handling complex queries that require multi-hop semantic retrieval and linking, multi-step logical reasoning, and multi-stage semantic analytics across unstructured, semi-structured, and structured data in data lakes. The introduction of large language models (LLMs) has significantly transformed the landscape of traditional data search and analytics across different fields due to their semantic comprehension and reasoning skills. Utilizing LLMs opens up new opportunities to efficiently handle these complex queries for data search and analytics, spanning structured, semi-structured, and unstructured data types in data lakes. However, LLMs struggle with complex queries that require complex task decomposition, pipeline orchestration, pipeline optimization, interactive execution, and self-reflection. In this work, we propose AOP, the first systematic system for automated pipeline orchestration in LLMs for answering complex queries on data lakes. AOP pre-defines standard semantic operators crucial for building execution workflows, such as semantic retrieval, filtering, aggregation, and validation. Then given an online query, AOP extracts relevant operators and uses these operators to automatically and interactively compose optimized pipelines with the assistance of LLMs. This enables AOP to adaptively and accurately address diverse and complex queries on data lakes. To further improve efficiency, we introduce query optimization techniques, including prefetching and parallel execution, to enhance overall efficiency without sacrificing accuracy. Through extensive experiments on real-world datasets, we demonstrate that AOP significantly improves the accuracy for answering complex queries. For instance, on a challenging test set, AOP increases answer accuracy by 45%.

Previous Sessions

2025
Wednesday, April 16, 2025 - Large Language Models (LLMs) on Tabular Data: Prediction, Generation, and Understanding - A Survey

Abstract: Recent breakthroughs in large language modeling have facilitated rigorous exploration of their application in diverse tasks related to tabular data modeling, such as prediction, tabular data synthesis, question answering, and table understanding. Each task presents unique challenges and opportunities. However, there is currently a lack of comprehensive review that summarizes and compares the key techniques, metrics, datasets, models, and optimization approaches in this research domain. This survey aims to address this gap by consolidating recent progress in these areas, offering a thorough survey and taxonomy of the datasets, metrics, and methodologies utilized. It identifies strengths, limitations, unexplored territories, and gaps in the existing literature, while providing some insights for future research directions in this vital and rapidly evolving field. It also provides relevant code and datasets references. Through this comprehensive review, we hope to provide interested readers with pertinent references and insightful perspectives, empowering them with the necessary tools and knowledge to effectively navigate and address the prevailing challenges in the field.

Notes: We read this survey paper in our inaugural session to get a good overview of the field.

Synopsis of Reading Group Session

Wednesday, May 21, 2025 - TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second

Abstract: We present TabPFN, a trained Transformer that can do supervised classification for small tabular datasets in less than a second, needs no hyperparameter tuning and is competitive with state-of-the-art classification methods. TabPFN performs in-context learning (ICL), it learns to make predictions using sequences of labeled examples (x, f(x)) given in the input, without requiring further parameter updates. TabPFN is fully entailed in the weights of our network, which accepts training and test samples as a set-valued input and yields predictions for the entire test set in a single forward pass. TabPFN is a Prior-Data Fitted Network (PFN) and is trained offline once, to approximate Bayesian inference on synthetic datasets drawn from our prior. This prior incorporates ideas from causal reasoning: It entails a large space of structural causal models with a preference for simple structures. On the 18 datasets in the OpenML-CC18 suite that contain up to 1 000 training data points, up to 100 purely numerical features without missing values, and up to 10 classes, we show that our method clearly outperforms boosted trees and performs on par with complex state-of-the-art AutoML systems with up to 230× speedup. This increases to a 5 700× speedup when using a GPU. We also validate these results on an additional 67 small numerical datasets from OpenML. We provide all our code, the trained TabPFN, an interactive browser demo and a Colab notebook at [this https URL](https://github.com/PriorLabs/TabPFN).

Notes: We used the [Nature publication](https://www.nature.com/articles/s41586-024-08328-6) as supplementary material.

Wednesday, June 04, 2025 - TableGPT2: A Large Multimodal Model with Tabular Data Integration

Abstract: The emergence of models like GPTs, Claude, LLaMA, and Qwen has reshaped AI applications, presenting vast new opportunities across industries. Yet, the integration of tabular data remains notably underdeveloped, despite its foundational role in numerous real-world domains. This gap is critical for three main reasons. First, database or data warehouse data integration is essential for advanced applications; second, the vast and largely untapped resource of tabular data offers immense potential for analysis; and third, the business intelligence domain specifically demands adaptable, precise solutions that many current LLMs may struggle to provide. In response, we introduce TableGPT2, a model rigorously pre-trained and fine-tuned with over 593.8K tables and 2.36M high-quality query-table-output tuples, a scale of table-related data unprecedented in prior research. This extensive training enables TableGPT2 to excel in table-centric tasks while maintaining strong general language and coding abilities. One of TableGPT2's key innovations is its novel table encoder, specifically designed to capture schema-level and cell-level information. This encoder strengthens the model's ability to handle ambiguous queries, missing column names, and irregular tables commonly encountered in real-world applications. Similar to visual language models, this pioneering approach integrates with the decoder to form a robust large multimodal model. We believe the results are compelling: over 23 benchmarking metrics, TableGPT2 achieves an average performance improvement of 35.20% in the 7B model and 49.32% in the 72B model over prior benchmark-neutral LLMs, with robust general-purpose capabilities intact.