Research Scientist, Foundational Data Science
Prior Labs
Seniority: Midweight
Model: In-Office
Sector:
Salary: Undisclosed
Contract: Full-Time
This role is foundational data science: laying the groundwork for tabular foundation models (TFMs) so that a single model can solve data-science problems across the board. Roughly half the work is inventing new frontier tools for TFMs; the other half is building the dataset and benchmark bedrock they stand on.
What you'll do
- Invent and build the frontier tools that extend TabPFN, including its thinking, scaling, and agentic capabilities, and the new methods that let one model generalize across the full landscape of data-science problems.
- Set the research direction: decide which model capabilities and benchmarks are worth pursuing, rather than optimizing a score someone else set.
- Bring in external research and real customer needs to shape new model and tooling directions, and publish frontier results that move the field forward.
- Build trustworthy benchmarks from the structured data behind real, high-impact problems, so the team optimizes for real-world performance rather than one leaderboard.
- Faithfully implement the baselines and competitor models that set the gold standard of applied data science, giving the team a read on where TabPFN leads and where there is room to improve.
- Build an automated, agentic pipeline with a human in the loop so this data and benchmark foundation scales to far larger volumes without losing rigor.
What you'll need
- You have solved data-science problems across many domains and datasets to a high standard, optimizing for strong performance across a whole suite of tasks rather than the single best score on one.
- You work undogmatically across the ML toolbox, including getting strong results with gradient-boosted trees (such as XGBoost) and not only with deep learning.
- You understand the common categories of dataset defects (leakage, label noise, distribution shift, duplication, mislabeled targets, and similar) and why each corrupts a training or benchmark signal.
- You are energized by foundational work, valuing the dataset and benchmark bedrock as much as the frontier tooling, and you have taken on hard problems others passed over.
- You thrive as a senior individual contributor in an ambiguous, early-stage, low-process environment. You hold strong opinions on data-science best practice and make sound judgement calls on how to approach complex problems.
Nice to have
- Experience building or extending evaluation harnesses, benchmark suites, or experiment frameworks that others rely on.
- Experience building LLM- or agent-assisted pipelines with a human in the loop to scale a previously manual workflow.
- Experience acting as the link between external research or customer needs and an internal model or product roadmap.
- Prior work on tabular, structured-data, or foundation-model problems, or helping shape an emerging research subfield through community work.
What they offer
- Work with world-class researchers and builders on one of the hardest problems in AI.
- Based in Berlin, Freiburg, or New York, with frequent travel to offices and regular company offsites.
