Full-Time
Machine Learning Engineer, Multi Modality
To apply for this job please visit job-boards.greenhouse.io.
Expiration Date:
March 11, 2026
What You Will Contribute To Altos
As part of our team, you will help accelerate and optimize our progress in developing multi-modal generative foundation models for multiscale biology. As an ML Engineer, you will be an integral part of our multidisciplinary teams building the computational platforms that will enable Altos to achieve its mission. The successful candidate will thrive in a fast-paced environment that stresses teamwork, transparency, scientific excellence, originality, and integrity.
Responsibilities
- Apply software engineering skills to develop reliable, scalable, performant distributed systems in a cloud environment.
- Develop efficient data-loading strategies and performance tracking to train large models with distributed training across multiple nodes.
- Pre-train and fine-tune large-scale machine learning systems using multimodal biological data and natural language inputs.
- Partner with other machine learning scientists and engineers to establish automated, robust, and efficient analytical pipelines for reproducible research.
- Stay up to date on the latest developments in deep learning, particularly in the areas of Transformers, NLP, Reinforcement Learning, and denoising diffusion probabilistic models (DDPMs).
- Contribute to seminars and other scientific initiatives within Altos and the broader scientific community.
Who You Are
Minimum Qualifications
- MS in Computer Science, Statistics, Machine Learning, Artificial Intelligence, or a related discipline.
- 0-5 years of relevant work experience in either an academic or industry setting.
- Very strong programming skills, including experience with Python and deep learning libraries (PyTorch, Hugging Face Transformers, Hugging Face Datasets, Hugging Face Accelerate).
- Ideally, experience with a distributed training framework such as DDP, FSDP, DeepSpeed, Megatron, Hugging Face Accelerate, or Ray.
- Expertise in a subset of the following: transformers, natural language processing, multi-modality in language and/or in biology, explainability, diffusion models.
- A highly collaborative mindset and strong self-motivation.
- Ability to communicate and explain the design, results, conclusions and the impact of findings to both scientific and nonscientific staff.
Preferred Qualifications
- Familiarity with multimodal data integration, including early and/or late fusion strategies.
- Track record of ML applied to NGS data (e.g. RNA-seq, ATAC-seq, ChIP-seq, DNA methylation), biological imaging modalities (e.g. microscopy, H&E, IF), and/or spatial transcriptomics.