TRL Seminar
Info
The Table Representation Learning (TRL) Seminar hosts talks on recent research in representation learning and generative models for structured data. Topics range from fundamental mechanisms for modeling structured data, retrieval from tabular data sources, and multi-modal tabular learning, to applications in data management and reasoning and prediction over tabular data.
Organization
The TRL Seminar is an initiative of the TRL Lab, affiliated with the Table Representation Learning Research Theme of the ELLIS unit Amsterdam, and is organized by Madelon Hulsebos (CWI).
Logistics
- When: every second Friday of the month, 4-5pm, with drinks afterwards.
- Where: room L3.36, LAB42 (University of Amsterdam), Science Park.
- How: Talks are held in person and are streamed and recorded through Zoom (link TBC).
Upcoming talks
Past talks
Marine Le Morvan, Inria
Friday 11 April 4-5pm, L3.36 at LAB42, Amsterdam Science Park, in-person talk and streamed through Zoom
Bio
Marine Le Morvan is an Inria research scientist in the SODA team in Paris-Saclay. Her research lies at the intersection of statistical learning and trustworthy AI, with a focus on:
- Tabular foundation models, which unlock new possibilities through large-scale pretraining.
- Model auditing, to enhance the trustworthiness and reliability of machine learning systems.
- Learning from incomplete data, a challenge pervasive in fields like healthcare and social sciences.
“TabICL: A Tabular Foundation Model for In-Context Learning on Large Data”
Abstract: The long-standing dominance of gradient-boosted decision trees on tabular data is currently challenged by tabular foundation models using In-Context Learning (ICL): setting the training data as context for the test data and predicting in a single forward pass without parameter updates. While the very recent TabPFNv2 foundation model (2025) excels on tables with up to 10K samples, its alternating column- and row-wise attentions make handling large training sets computationally prohibitive. So, can ICL be effectively scaled and deliver a benefit for larger tables? We introduce TabICL, a tabular foundation model pre-trained on datasets with up to 60K samples and handling 500K samples on affordable resources. This is enabled by a novel two-stage architecture: a column-then-row attention mechanism to build fixed-dimensional embeddings of rows, followed by a transformer for efficient ICL. On the TALENT benchmark with 200 datasets, TabICL is on par with TabPFNv2 while being systematically faster (up to 10 times), and significantly outperforms all other approaches. On the 56 datasets with over 10K samples, TabICL surpasses both TabPFNv2 and CatBoost, demonstrating the potential of ICL for large data.
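To make the in-context learning setup concrete, here is a minimal PyTorch sketch of the two-stage idea the abstract describes: column-then-row attention compresses each row into a fixed-dimensional embedding, and a second transformer attends from unlabeled test rows to labeled training rows in a single forward pass, with no parameter updates at prediction time. All class names, layer sizes, and pooling choices below are illustrative assumptions, not TabICL's actual implementation.

```python
import torch
import torch.nn as nn

class RowEmbedder(nn.Module):
    """Stage 1 (sketch): attention over columns, then over rows,
    yielding one fixed-dimensional vector per table row."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.cell_proj = nn.Linear(1, d_model)  # embed each scalar cell
        self.col_attn = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.row_attn = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)

    def forward(self, X):                        # X: (n_rows, n_cols)
        cells = self.cell_proj(X.unsqueeze(-1))  # (n_rows, n_cols, d)
        cells = self.col_attn(cells)             # attention across columns
        rows = cells.mean(dim=1)                 # pool columns -> (n_rows, d)
        return self.row_attn(rows.unsqueeze(0)).squeeze(0)  # attention across rows

class ICLHead(nn.Module):
    """Stage 2 (sketch): a transformer over [train rows + test rows];
    train rows carry their labels as an additive embedding, test rows
    get an 'unknown label' embedding."""
    def __init__(self, d_model=64, n_heads=4, n_classes=2):
        super().__init__()
        self.label_emb = nn.Embedding(n_classes + 1, d_model)  # last index = unknown
        self.encoder = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.out = nn.Linear(d_model, n_classes)

    def forward(self, train_rows, y_train, test_rows):
        unk = torch.full((test_rows.shape[0],), self.label_emb.num_embeddings - 1)
        ctx = torch.cat([train_rows + self.label_emb(y_train),
                         test_rows + self.label_emb(unk)], dim=0)
        h = self.encoder(ctx.unsqueeze(0)).squeeze(0)
        return self.out(h[train_rows.shape[0]:])  # logits for test rows only

emb, head = RowEmbedder(), ICLHead()
X_train, y_train = torch.randn(100, 8), torch.randint(0, 2, (100,))
X_test = torch.randn(20, 8)
rows = emb(torch.cat([X_train, X_test]))        # one joint pass over all rows
logits = head(rows[:100], y_train, rows[100:])  # predictions without parameter updates
```

The point of the structure is the one made in the abstract: because columns are pooled into a fixed-size row embedding before the second stage, the expensive attention runs over rows only, which is what makes scaling to large training sets feasible.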
Join the seminar remotely via Zoom: https://cwi-nl-zoom.zoom.us/j/86928893058?pwd=0tFURmzfFWXtWyN4xqkx15urhoui7b.1
Vaishali Pal, University of Amsterdam
Thursday 22 May 4-5pm, L3.33 at LAB42, Amsterdam Science Park, live through Zoom
Bio
Vaishali is a final-year PhD candidate at the Information Retrieval Lab at the University of Amsterdam. Her research interests are in natural language processing and information retrieval, with a focus on semi-structured tables.
“Table Question Answering”
Abstract: In this talk, I discuss my research on question answering over semi-structured tables. Semi-structured tables are fact-heavy and pose significant challenges to language models aiming to effectively meet a user's information needs. To illustrate these challenges, I discuss tasks such as question answering and summarization over multiple tabular contexts, as well as low-resource table question answering. Finally, I briefly discuss information retrieval over tables.
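For readers unfamiliar with the task, here is a toy, self-contained illustration (my own construction, not from the talk) of what a table question answering instance looks like, together with a deliberately trivial heuristic baseline; real systems use language models and retrieval rather than the hard-coded rule shown here.

```python
# A toy table QA instance: a semi-structured table plus a natural language
# question; the answer must be grounded in the table's cells.
table = {
    "header": ["City", "Country", "Population (M)"],
    "rows": [
        ["Amsterdam", "Netherlands", 0.92],
        ["Paris", "France", 2.10],
        ["Berlin", "Germany", 3.65],
    ],
}
question = "Which city has the largest population?"

def answer_superlative(table, column, largest=True):
    """Trivial baseline for superlative questions: pick the row that
    maximizes (or minimizes) a numeric column and return its first cell."""
    idx = table["header"].index(column)
    pick = max if largest else min
    return pick(table["rows"], key=lambda row: row[idx])[0]

print(answer_superlative(table, "Population (M)"))  # -> Berlin
```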
Join the seminar remotely via Zoom: https://cwi-nl-zoom.zoom.us/j/86928893058?pwd=0tFURmzfFWXtWyN4xqkx15urhoui7b.1