Stanford Tabular and Relational Project

Tabular and relational data require foundation models of their own.

Foundation models have transformed natural language, vision, and code, and a recent wave of tabular foundation models is extending this paradigm to individual tables. We develop foundation models for structured data across this spectrum, from a single table to the many interconnected tables of a relational database, where each new schema has traditionally required a model trained from scratch.

We release the synthetic and real-world datasets used to train these models, and make all papers, code, and weights openly available.

Preprint · 2026

RT-J: Large-Scale Pretraining of Relational Transformers for Label-Efficient Predictions

Rishabh Ranjan, Vignesh Kothapalli, Harshvardhan Agarwal, Charilaos Kanatsoulis, Roshan Reddy Upendra, Tom Palczewski, Carlos Guestrin, Jure Leskovec

A Relational Transformer pretrained on THE JOIN (6,255 forecasting tasks across 650 real-world databases) that makes state-of-the-art few-shot predictions from only hundreds of in-context labels, matching strong in-context pipelines with 23–32× fewer examples.

PluRel: Synthetic Data unlocks Scaling Laws for Relational Foundation Models

Vignesh Kothapalli, Rishabh Ranjan, Valter Hudovernik, Vijay Prakash Dwivedi, Johannes Hoffart, Carlos Guestrin, Jure Leskovec

A framework for synthesizing multi-table relational databases. The resulting synthetic data yields power-law scaling during pretraining and improves performance on real-world RelBench tasks.

Paper ↗ Code ↗ Read more →

ICLR 2026 · arXiv:2510.06377

Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data

Rishabh Ranjan, Valter Hudovernik, Mark Znidar, Charilaos Kanatsoulis, Roshan Upendra, Mahmoud Mohammadi, Joe Meyer, Tom Palczewski, Carlos Guestrin, Jure Leskovec

An architecture that transfers across relational databases without fine-tuning. A 22M-parameter model attains 93% of fully-supervised AUROC in a single forward pass, exceeding a 27B-parameter language model.

Paper ↗ Code ↗ Read more →

ICLR 2026 · arXiv:2505.10960

Relational Graph Transformer

Vijay Prakash Dwivedi, Sri Jaladi, Yangyi Shen, Federico López, Charilaos I. Kanatsoulis, Rishi Puri, Matthias Fey, Jure Leskovec

A graph transformer architecture designed for the structure of relational entity graphs.

Paper ↗ Code ↗

ICLR 2026 DATA-FM Workshop · arXiv:2602.12606

RelBench v2: A Large-Scale Benchmark and Repository for Relational Data

Justin Gu, Rishabh Ranjan, Charilaos Kanatsoulis, Haiming Tang, Martin Jurkovic, Valter Hudovernik, Mark Znidar, Pranshu Chaturvedi, Parth Shroff, Fengyu Li, Jure Leskovec

A large-scale benchmark and repository for relational data, providing standardized datasets and tasks for evaluation.

Paper ↗ Read more →

KDD 2025 · arXiv:2506.16654

Relational Deep Learning: Challenges, Foundations and Next-Generation Architectures

Vijay Prakash Dwivedi, Charilaos Kanatsoulis, Shenyang Huang, Jure Leskovec

A comprehensive review of relational deep learning, surveying its challenges, foundations, and next-generation architectures.

Paper ↗

ACL 2025 · arXiv:2506.05725

Large Language Models are Good Relational Learners

Fang Wu, Vijay Prakash Dwivedi, Jure Leskovec

Rel-LLM, an architecture that pairs a graph neural network encoder with a large language model through retrieval-augmented generation, bringing the reasoning of LLMs to relational databases.

Paper ↗ Code ↗

ICML 2025 · arXiv:2502.06784

RelGNN: Composite Message Passing for Relational Deep Learning

Tianlang Chen, Charilaos Kanatsoulis, Jure Leskovec

A composite message-passing scheme for relational deep learning that addresses many-to-many relationships, achieving state-of-the-art results on RelBench with improvements of up to 25%.

Paper ↗ Code ↗

NeurIPS 2024 · arXiv:2407.20060

RelBench: A Benchmark for Deep Learning on Relational Databases

Joshua Robinson, Rishabh Ranjan, Weihua Hu, Kexin Huang, Jiaqi Han, Alejandro Dobles, Matthias Fey, Jan E. Lenssen, Yiwen Yuan, Zecheng Zhang, Xinwei He, Jure Leskovec

A benchmark for deep learning on relational databases, comprising seven databases and thirty predictive tasks across diverse domains.

Paper ↗ Read more →

ICML 2024 · PMLR

Position: Relational Deep Learning, Graph Representation Learning on Relational Databases

Matthias Fey, Weihua Hu, Kexin Huang, Jan Eric Lenssen, Rishabh Ranjan, Joshua Robinson, Rex Ying, Jiaxuan You, Jure Leskovec

A position paper introducing relational deep learning, which represents relational databases as graphs to enable end-to-end learning without manual feature engineering.

Paper ↗

Dataset	Description	Type	Link
relbench	7 real-world relational databases with diverse predictive tasks.	benchmark	open ↗
relbench-v2-extra	Additional real-world databases extending the RelBench v2 collection.	benchmark	open ↗
plurel	2,000 synthetic relational databases for scaling-law pretraining.	synthetic	open ↗
the-join	650 real-world relational databases for large-scale pretraining.	corpus	open ↗
redelex	71 databases ported from the CTU Prague Relational Learning Repository.	corpus	open ↗
tgb	The Temporal Graph Benchmark, 12 dynamic graphs for relational learning.	temporal	open ↗
dbinfer	7 databases from the 4DBInfer benchmark for graph-centric predictive modeling.	tasks	open ↗
