AI Researcher (Toronto, ON, CA, M5C 3G7) at Deloitte Canada

Job Type: Permanent
Work Model: Hybrid
Reference code: 130069
Primary Location: Toronto, ON
All Available Locations: Toronto, ON

Our Purpose

At Deloitte, our Purpose is to make an impact that matters. We exist to inspire and help our people, organizations, communities, and countries to thrive by building a better future. Our work underpins a prosperous society where people can find meaning and opportunity. It builds consumer and business confidence, empowers organizations to find imaginative ways of deploying capital, enables fair, trusted, and functioning social and economic institutions, and allows our friends, families, and communities to enjoy the quality of life that comes with a sustainable future. And as the largest 100% Canadian-owned and operated professional services firm in our country, we are proud to work alongside our clients to make a positive impact for all Canadians.

By living our Purpose, we will make an impact that matters.

Have many careers in one Firm.
Enjoy flexible, proactive, and practical benefits that foster a culture of well-being and connectedness.
Learn from deep subject matter experts through mentoring and on-the-job coaching.

--

We are looking for a passionate AI Researcher to join our team. You will work at the intersection of cutting-edge AI research and product engineering—designing, evaluating, and deploying generative AI (GenAI) systems that are both reliable and impactful. This role blends fundamental research, model evaluation, and practical software engineering to push forward the next generation of intelligent applications.

What will your typical day look like?

Responsibilities
• Collaborate with product managers, engineers, and stakeholders to design AI-driven solutions that meet technical and business requirements.
• Research, prototype, and develop generative AI applications by combining non-deterministic LLMs with deterministic software engineering techniques.
• Build evaluation frameworks and benchmarks to measure model quality, reliability, and business impact.
• Generate regular reports on model accuracy, drift, and performance.
• Debug, optimize, and enhance GenAI applications using prompt engineering, reinforcement learning, fine-tuning, and software engineering best practices.
• Train and fine-tune large language models using Hugging Face Transformers.
• Apply reinforcement learning fine-tuning techniques using Hugging Face TRL (Transformer Reinforcement Learning).
• Manage training workflows with experiment tracking tools and distributed training accelerators (DeepSpeed, Accelerate, FSDP).
• Run and optimize multi-GPU training and inference, leveraging vLLM for high-throughput, low-latency serving.
• Contribute to the design of scalable MLOps/DevOps pipelines for model deployment, monitoring, and continuous training.
• Ensure compliance with data privacy, security, and responsible AI guidelines when handling training or test datasets.
• Stay current with emerging research in LLMs, RLHF/RLAIF, multimodal AI, and generative models; apply findings to improve our systems.
• Author technical documentation and contribute to publications, patents, or open-source projects where applicable.

About the team

Preferred Tech Stack
Candidates with experience in the following tools and frameworks will be strongly preferred:
• Transformers (Hugging Face) for model training, fine-tuning, and inference
• TRL (Transformer Reinforcement Learning) for RL-based fine-tuning (PPO, DPO, GRPO, RLAIF)
• DeepSpeed, Accelerate, or FSDP for multi-GPU and distributed training
• vLLM for optimized inference and serving of large models
• Weights & Biases (W&B) or MLflow for experiment tracking and reproducibility
• LangChain, AutoGen (A2A), or MCP for GenAI application development
• PyTorch as the primary deep learning framework

Enough about us, let’s talk about you

Qualifications
• 3+ years of experience in machine learning engineering, data engineering, or applied research (industry or academic).
• Strong programming skills in Python and experience with frameworks such as PyTorch, TensorFlow, or JAX.
• Hands-on experience with Hugging Face Transformers for pretraining, fine-tuning, or inference.
• Experience with Hugging Face TRL for reinforcement learning fine-tuning (e.g., PPO, DPO, GRPO, RLAIF).
• Practical experience managing multi-GPU training and distributed training at scale using DeepSpeed, Accelerate, or FSDP.
• Experience running inference on large models using vLLM or similar optimized serving frameworks.
• Familiarity with experiment tracking and reproducibility tools (e.g., W&B, MLflow).
• Knowledge of MLOps practices including continuous training, continuous monitoring, and model lifecycle management.
• Experience with GenAI frameworks such as LangChain, AutoGen (A2A), or MCP.
• Demonstrated ability to write clean, maintainable, production-ready code.
• Experience building or supporting cloud-based AI systems (GCP, AWS, or Azure; certifications preferred).
• Strong grasp of reinforcement learning, NLP, and/or generative modeling (transformers, diffusion, RAG, etc.).
• Track record of research contributions (papers, patents, open-source projects) is a plus.

Nice-to-Have
• Experience with reinforcement learning from human/AI feedback (RLHF/RLAIF).
• Contributions to open-source AI frameworks.
• Familiarity with scaling laws, evaluation metrics, and benchmarking large models.
• Interest in pushing the boundaries of trustworthy, explainable, and safe AI.

Total Rewards

The salary range for this position is $72,000 - $138,000, and individuals may be eligible to participate in our bonus program. Deloitte is fair and competitive when it comes to the salaries of our people. We regularly benchmark across a variety of positions, industries, sectors, targets, and levels. Our approach is grounded on recognizing people's unique strengths and contributions and rewarding the value that they deliver.

Our Total Rewards Package extends well beyond traditional compensation and benefit programs and is designed to recognize employee contributions, encourage personal wellness, and support firm growth. Along with a competitive base salary and variable pay opportunities, we offer a wide array of initiatives that differentiate us as a people-first organization. On top of our regular paid vacation days, some examples include: $4,000 per year for mental health support benefits, a $1,300 flexible benefit spending account, firm-wide closures known as "Deloitte Days", dedicated days off for learning (known as Development and Innovation Days), flexible work arrangements, and a hybrid work structure.

Our promise to our people: Deloitte is where potential comes to life.

Be yourself, and more.

We are a group of talented people who want to learn, gain experience, and develop skills. Wherever you are in your career, we want you to advance.

You shape how we make impact.

Diverse perspectives and life experiences make us better. Whoever you are and wherever you’re from, we want you to feel like you belong here. We provide flexible working options to support you and how you can contribute.

Be the leader you want to be.

Some guide teams, some change culture, some build essential expertise. We offer opportunities and experiences that support your continuing growth as a leader.

Have as many careers as you want.

We are uniquely able to offer you new challenges and roles – and prepare you for them. We bring together people with unique experiences and talents, and we are the place to develop a lasting network of friends, peers, and mentors.

The next step is yours

At Deloitte, we are all about doing business inclusively – that starts with having diverse colleagues of all abilities. Deloitte encourages applications from all qualified candidates who represent the full diversity of communities across Canada. This includes, but is not limited to, people with disabilities, candidates from Indigenous communities, and candidates from the Black community, in support of living our values, creating a culture of Diversity, Equity, and Inclusion, and fulfilling our commitment to our AccessAbility Action Plan, Reconciliation Action Plan, and the BlackNorth Initiative.

We encourage you to connect with us at accessiblecareers@deloitte.ca if you require an accommodation for the recruitment process (including alternate formats of materials, accessible meeting rooms or other accommodations) or indigenouscareers@deloitte.ca for any questions relating to careers for Indigenous peoples at Deloitte (First Nations, Inuit, Métis).

By applying to this job, you will be assessed against the Deloitte Global Talent Standards. We’ve designed these standards to provide our clients with a consistent and exceptional Deloitte experience globally.

Deloitte Canada has 20 offices with representation across most of the country. We acknowledge that Deloitte offices stand on traditional, treaty, and unceded territories in what is now known as Canada. We recognize that Indigenous Peoples have been the caretakers of this land since time immemorial, nurturing its resources and preserving its natural beauty. We acknowledge this land is still home to many First Nations, Inuit, and Métis Peoples, who continue to maintain their deep connection to the land and its sacred teachings. We humbly acknowledge that we are all Treaty people, and we commit to fostering a relationship of respect, collaboration, and stewardship with Indigenous communities in our shared goal of reconciliation and environmental sustainability.