ALOT - Amsterdam Lunch on Table
Description
Amsterdam Lunch on Table (ALOT) is a reading group focused on table representation learning and, more generally, neural models for structured data. It takes place every first and third Wednesday of the month over lunch (at 12:00). Our objective is to foster a collaborative environment where researchers from the Amsterdam region can discuss and explore the intersection of AI and structured data. Each session is designed to be interactive, encouraging participants to engage in discussions that deepen their understanding of the latest research and methodologies. Through these sessions, we aim to inspire research ideas, support growth as researchers, and facilitate networking opportunities within the community.
Where & when?
- When: Every first and third Wednesday of the month at 12:00 (over lunch)
- Where: For now, Centrum Wiskunde & Informatica (CWI), room L015 (Science Park 123, 1098 XG Amsterdam)
How it works
We discuss one paper in each session. The paper is selected by the group and announced at least a week in advance. One person is responsible for chairing the session and preparing a short introduction to the paper. The session chair is also responsible for facilitating the discussion and ensuring that everyone has a chance to contribute. We expect participants to read the paper in advance and send some questions or discussion points to the session chair to enable a more comprehensive and engaging discussion.

Since we meet over lunch, we encourage people to eat while we discuss the paper. Lunch is catered, so you don't have to bring your own. Please indicate on the announcement on Discord whether you are coming to the session so we can order enough food for everyone!
Want to join the reading group? Then join the ALOT Discord channel. We manage the reading group via Discord and will announce the papers and sessions there.
Next Session
The next session of the ALOT reading group will take place on Wednesday, May 21, 2025 at 12:00 and we will discuss:
Paper: TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
Authors: N. Hollmann et al.
Venue: ICLR (2023)
Session Chair: Zeyu Zhang
Abstract
We present TabPFN, a trained Transformer that can do supervised classification for small tabular datasets in less than a second, needs no hyperparameter tuning and is competitive with state-of-the-art classification methods. TabPFN performs in-context learning (ICL), it learns to make predictions using sequences of labeled examples (x, f(x)) given in the input, without requiring further parameter updates. TabPFN is fully entailed in the weights of our network, which accepts training and test samples as a set-valued input and yields predictions for the entire test set in a single forward pass. TabPFN is a Prior-Data Fitted Network (PFN) and is trained offline once, to approximate Bayesian inference on synthetic datasets drawn from our prior. This prior incorporates ideas from causal reasoning: It entails a large space of structural causal models with a preference for simple structures. On the 18 datasets in the OpenML-CC18 suite that contain up to 1,000 training data points, up to 100 purely numerical features without missing values, and up to 10 classes, we show that our method clearly outperforms boosted trees and performs on par with complex state-of-the-art AutoML systems with up to 230× speedup. This increases to a 5,700× speedup when using a GPU. We also validate these results on an additional 67 small numerical datasets from OpenML. We provide all our code, the trained TabPFN, an interactive browser demo and a Colab notebook at [github.com/PriorLabs/TabPFN](https://github.com/PriorLabs/TabPFN).

Note: We will use the Nature publication as supplementary material.
Previous Sessions
2025
Wednesday, April 16, 2025 - Large Language Models (LLMs) on Tabular Data: Prediction, Generation, and Understanding - A Survey
- Title: Large Language Models (LLMs) on Tabular Data: Prediction, Generation, and Understanding - A Survey
- Authors: X. Fang et al.
- Venue: Transactions on Machine Learning Research (2024)
- Session Chair: Daniel Gomm
Abstract: Recent breakthroughs in large language modeling have facilitated rigorous exploration of their application in diverse tasks related to tabular data modeling, such as prediction, tabular data synthesis, question answering, and table understanding. Each task presents unique challenges and opportunities. However, there is currently a lack of a comprehensive review that summarizes and compares the key techniques, metrics, datasets, models, and optimization approaches in this research domain. This survey aims to address this gap by consolidating recent progress in these areas, offering a thorough survey and taxonomy of the datasets, metrics, and methodologies utilized. It identifies strengths, limitations, unexplored territories, and gaps in the existing literature, while providing some insights for future research directions in this vital and rapidly evolving field. It also provides relevant code and dataset references. Through this comprehensive review, we hope to provide interested readers with pertinent references and insightful perspectives, empowering them with the necessary tools and knowledge to effectively navigate and address the prevailing challenges in the field.
Notes: We are reading this survey paper in our inaugural session to get a good overview of the field.