Meta is seeking a Research Scientist to advance the field of multi-modal understanding. This role focuses on developing models and systems that can reason across multiple modalities including text, images, video, and audio. You will work on cutting-edge research to enable AI systems to perceive, interpret, and generate content across diverse data types, contributing to products that impact billions of users worldwide.
Responsibilities
- Conduct research on multi-modal learning, including vision-language models, audio-visual understanding, and cross-modal reasoning
- Develop novel architectures and training methodologies for models that integrate and reason across multiple modalities
- Design and execute experiments to evaluate multi-modal model capabilities and identify areas for improvement
- Publish research findings at top-tier conferences and contribute to Meta's research community
- Collaborate with cross-functional teams to translate research innovations into product applications
- Mentor and guide other researchers on multi-modal AI projects
Minimum Qualifications
- Bachelor's degree in Computer Science, Computer Engineering, a relevant technical field, or equivalent practical experience
- PhD in Computer Science, Machine Learning, Artificial Intelligence, or a related field
- Experience with multi-modal learning, vision-language models, or cross-modal representation learning demonstrated through publications or projects
- Experience programming in Python and using deep learning frameworks such as PyTorch
- Experience with large-scale model training and distributed computing
Preferred Qualifications
- Experience building end-to-end multi-modal systems from research to production
- Experience with video understanding or audio-visual learning
- Publications at venues such as NeurIPS, ICML, ICLR, CVPR, ACL, or EMNLP focused on multi-modal learning
- Experience with large language models, vision transformers, or foundation models
Compensation
- $154,000 to $217,000 per year (US); bonus eligible; equity eligible
About Meta
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today—beyond the constraints of screens, the limits of distance, and even the rules of physics.
California Notice
If you live in California, or expect to work from California if hired for this position, please see the California notice for additional information.
Equal Opportunity
Meta is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics. You may view our Equal Employment Opportunity notice here.
Accommodations
Meta is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, fill out the Accommodations request form.