Member of Technical Staff, Small Language Models
You will co-own the SLM training stack and the Expert Fidelity evals that already beat frontier models on the dimensions our experts and users care about.
Onix is the Personal Intelligence platform. Each onix is a small language model trained exclusively on a single expert’s private corpus: clinical notes, unpublished research, proprietary methods that have never been on the internet and never will be. Fully isolated, never bleeding across experts.
Privacy is an engineering surface, not a compliance line. Per-expert isolation, on-device inference paths, federated update strategies, and grounding guarantees are open research areas you will own.
Every session generates refinement signal. Every expert validates outputs in the loop. This is exclusive, expert-graded data that Big AI cannot scrape, replicate, or buy.
Our SLMs run at orders of magnitude lower cost and lower latency than frontier models. We own the inference stack. We are not a wrapper. Those economics are what make a profitable consumer subscription business possible, and what make the category structurally impossible for Big AI to follow us into without cannibalizing their core.
20+ founding experts are live in App Store early access, including Dave Rabin, Mark Sisson, Ashley Koff, William Li, and Elissa Epel. NYT bestsellers and category-defining voices. They pull in their peers unprompted. The data moat compounds with every conversation.
Small senior team in Old Montreal. In person. You report to our CTO and partner closely with engineering.
The next great AI lab will prove that human expertise is a moat, not a training set. That is what we are proving.
What You Will Do
- Co-own the SLM training stack: corpus, training, evals, deploy, monitor. Every layer.
- Push per-expert grounding, preference learning (RLHF / RLVR), and Fidelity research that makes our privacy-as-architecture story technically undeniable.
- Co-own Expert Fidelity evals. Make the bar harder. Make us pass it.
- Ship to production every sprint. Real users. Real experts. Real corpora. Real consequences if we get it wrong.
- Sit on calls with our experts directly when their voice, fidelity, or persona needs ML-level attention.
What You Will Work With
- Training stack: modern PyTorch and Hugging Face tooling for accelerated per-expert fine-tuning at scale. Distillation pipelines that generate training data without exposing a private corpus. Expert Fidelity evals as the gating bar.
- Inference stack: autoscaled per-expert endpoints. You own quantization, caching, and the per-expert latency budget.
- Preference learning at the per-expert level: RLHF where each expert is the literal human in the loop, and RLVR grounded in our Expert Fidelity reward surface. The data and the experts are exclusive. The technique stack is yours to push.
- On-device and edge deployment for iOS. Latency and battery are first-class constraints, not afterthoughts.
- Per-expert isolation infrastructure. No data crosses expert boundaries. Privacy is architecture, and you own the surface.
- Production scale on real users from day one. Real corpora. Real consequences if you get it wrong.
Who You Are
You can read a paper, prototype the model, and ship it to production in the same week. You have substantive work in small language models, efficient training, distillation, on-device inference, federated learning, retrieval, or grounding. You came from Mila, Vector, Cohere, or a frontier lab (Anthropic, OpenAI, DeepMind, Hugging Face, Mistral). Or you are finishing a PhD and want your next system to ship to real users instead of a benchmark.
You came here for the mission. Technology should amplify human genius, not replace it.
We publish on our timeline, not a journal’s.
You have:
- Deep ML technical chops across training, eval, and deploy.
- Substantive shipped work in SLMs, distillation, on-device inference, federated learning, retrieval, or grounding.
- A track record of defending technical positions on open research questions with evidence, not vibes.
- Comfort owning infra alongside modeling. The deploy is your problem too.
You are:
- A research engineer first. Engineers here do research. Researchers here do engineering.
- Opinionated. You can articulate a position on three open research questions in our space within an hour of joining.
- Direct. You tell teammates they are wrong when they are, and accept the same in return.
- In Old Montreal in person, or willing to relocate.
We do not care which lab you came from. We care what you have shipped, what you have measured, and what you would publish next.
What Success Looks Like
- You own SLM and Expert Fidelity end to end. You decide what we train next, what we deprecate, what we publish.
- Our published technical narrative is undeniable. Experts and serious researchers read it and say “they have a real moat.”
- The ML team rallies to your technical direction.
- Big AI watches what we publish and copies us.
How to Apply
Download Onix from the App Store using invite code OnixCareers. Use it. Try multiple onixes. Push them until they break.
Then submit:
- What you noticed at the model layer. Fidelity wins, fidelity failures, refusal patterns, persona drift, latency tells. The first thing you would eval if you joined Monday.
- Pick one of these technical debates and take a position in under 300 words: per-expert SLM versus shared base with per-expert adapters; on-device inference versus edge inference; synthetic corpus augmentation for thin experts versus strict corpus-only training; open-weight base (Llama, Mistral, Qwen) versus custom pretrain for per-expert SLMs.
That is the application. The work tells us what we need to know.



