Senior Data Scientist
Forto
Seniority
Senior
Model
In-Office
Sector
Salary
Undisclosed
Contract
Full-Time
About the role
As a Data Scientist in the Data Science team at Forto, you will take ownership of production ML systems that extract structured intelligence from unstructured logistics data. You will work closely with the Engineering Manager and Product Manager across document data extraction, vocabulary mapping, and rate sheet parsing, using a combination of LLMs, custom models, and rule-based postprocessing.
What you'll do
- Design, build, and maintain end-to-end ML pipelines for document extraction, classification, and data enrichment in production.
- Develop and improve LLM-based extraction systems for complex logistics documents (packing lists, booking confirmations, invoices, rate sheets).
- Build prompt evaluation frameworks and feedback-based optimization loops to systematically improve extraction accuracy.
- Train custom in-house models using human-in-the-loop (HITL) data to move from assisted to fully automated extraction.
- Build and maintain semantic similarity models that map free text to standardized TMS vocabulary across ports, terminals, container types, legal entities, and line items.
- Improve pipeline reliability through redesign, testing, monitoring, and alerting for non-deterministic ML systems.
- Evaluate and introduce disruptive approaches to achieve step-change accuracy improvements when incremental optimization plateaus.
- Scope and build out next-generation DS workstreams: demand forecasting, churn prediction, route optimization, and predictive analytics for logistics operations.
What you'll need
- 3+ years of professional experience in data science or machine learning engineering.
- Ability to design, deploy, and maintain ML systems in production, including pipeline architecture, monitoring, reliability, and handling non-deterministic outputs at scale.
- Strong proficiency in Python.
- Hands-on experience with LLMs (prompting, fine-tuning, evaluation) and understanding of their limitations in production environments.
- Strong foundation in classical data science and statistics: regression, classification, time series analysis, data leakage, experimental design, and hypothesis testing.
- Strong analytical, problem-solving, and stakeholder management skills.
Nice to have
- Experience in logistics, supply chain, or freight forwarding domains.
- Experience with semantic similarity and entity resolution techniques.
- Experience with human-in-the-loop (HITL) workflows and designing feedback loops for model improvement.
- Experience with demand forecasting, time series modeling, or churn prediction in a business context.

