Research Intern, Small Language Models (Mitacs)

You will work on the end-to-end pipeline that turns one expert’s private corpus into a small language model that is faithful to them, not just in voice but in what it knows, what it cites, and the judgment calls it makes. The expert grades it.

Onix is the Personal Intelligence platform. Each onix is a small language model trained exclusively on a single expert’s private corpus: clinical notes, unpublished research, proprietary methods that have never been on the internet and never will be. Fully isolated, never bleeding across experts.

Privacy is an engineering surface, not a compliance line. Per-expert isolation, on-device inference paths, federated update strategies, and grounding guarantees are open research areas you will work on.

Most of the work is SLM training: per-expert fine-tuning, distillation, preference learning, and the Expert Fidelity evals that gate every release, scoring each model on accuracy, groundedness, persona, and judgment. You will also work on the data layer that feeds it: ingesting and chunking a private corpus, and the agentic RAG that grounds the model. Training is the core. The data and retrieval layer is where that training meets the product.

Every session generates refinement signal. Every expert validates outputs in the loop. This is exclusive, expert-graded data that Big AI cannot scrape, replicate, or buy.

20+ founding experts are live in App Store early access, including Dave Rabin, Mark Sisson, Ashley Koff, William Li, and Elissa Epel. NYT bestsellers and category-defining voices. The data moat compounds with every conversation.

This is a Mitacs Accelerate research internship. It is project-scoped, four or six months to start, with the option to extend. You work in person from our office in Old Montreal, collaborating closely with our R&D and AI team, our CTO, and your academic supervisor. We are recruiting primarily from Mila, and open to any grad student or postdoc eligible for Mitacs.

The next great AI lab will prove that human expertise is a moat, not a training set. That is what we are proving.

What You Will Do

Train and fine-tune per-expert SLMs alongside the R&D team: distillation, preference learning (RLHF / RLVR), and the Expert Fidelity eval loop that gates them. This is the bulk of the internship.
Work across the SLM training stack with the team: corpus, training, evals, deployment.
Help sharpen the Expert Fidelity evals, and push the model to clear the bar.
Work on the data layer that feeds the model: ingest and chunk a private corpus, and the agentic RAG and grounding around it. Training is the core; this data layer is where your work touches the product.
Ship to production during the internship. Real experts. Real corpora. Real users. Real consequences if we get it wrong.

What You Will Work With

Training stack: modern PyTorch and HuggingFace tooling for accelerated per-expert fine-tuning. Distillation pipelines that generate training data without exposing a private corpus. Expert Fidelity evals as the gating bar.
Preference learning at the per-expert level: RLHF where each expert is the literal human in the loop, and RLVR grounded in our Expert Fidelity reward surface. The data and the experts are exclusive.
Inference stack: per-expert endpoints. You work on quantization, caching, and the per-expert latency budget.
On-device and edge deployment for iOS. Latency and battery are first-class constraints, not afterthoughts.
The data layer that feeds training: corpus ingestion, document chunking, agentic RAG, and grounding over a single expert’s corpus. The internship is mostly SLM training, but your work here has direct product impact.
Per-expert isolation infrastructure on real users from day one. No data crosses expert boundaries. Privacy is architecture.

Who You Are

You can read a paper, prototype the model, and ship it to production in the same week. You are a current grad student or postdoc, primarily from Mila, with substantive work in small language models, efficient training, distillation, retrieval, RAG, grounding, or on-device inference. You want your next system to ship to real users instead of a benchmark.

You came here for the mission. Technology should amplify human genius, not replace it.

We publish on our timeline, not a journal’s.

You have:

ML technical chops across training, eval, and retrieval.
Substantive shipped or published work in SLMs, distillation, RAG, retrieval, grounding, or on-device inference.
A track record of defending technical positions on open research questions with evidence, not vibes.
Comfort working across the whole data-to-eval path, not just the model.
Eligibility for a Mitacs internship: registered as a grad student or postdoc at a Canadian institution.

You are:

A research engineer first. Engineers here do research. Researchers here do engineering.
Opinionated. You can articulate a position on three open research questions in our space within an hour of joining.
Direct. You tell teammates they are wrong when they are, and accept the same in return.
In Old Montreal in person for the internship.

We do not care which lab you came from. We care what you have shipped, what you have measured, and what you would publish next.

What Success Looks Like

A per-expert SLM you helped train is in production and passes an Expert Fidelity bar an expert signs off on.
You moved a Fidelity eval metric the team cares about.
The data and retrieval layer you worked on measurably improves grounding for a live expert.
You publish or present what you built with the team.

How to Apply

Download Onix from the App Store using invite code OnixCareers. Use it. Try multiple onixes. Push them until they break.

Then submit:

What you noticed at the model layer. Fidelity wins, fidelity failures, refusal patterns, persona drift, latency tells, places the answer was clearly ungrounded. The first thing you would eval if you joined Monday.
Pick one of these technical debates and take a position in under 300 words: per-expert SLM versus shared base with per-expert adapters; agentic RAG over a per-expert corpus versus baking the corpus into the weights; fine-grained document chunking versus long-context retrieval; synthetic corpus augmentation for thin experts versus strict corpus-only training; open-weight base (Llama, Mistral, Qwen) versus custom pretrain for per-expert SLMs.

That is the application. The work tells us what we need to know.