What You Will Contribute To Altos

As part of our team, you will help accelerate and optimize our progress in developing multimodal generative foundation models for multiscale biology. As an ML Engineer, you will be an integral part of our multidisciplinary teams building the computational platforms that will enable Altos to achieve its mission. The successful candidate will thrive in a fast-paced environment that stresses teamwork, transparency, scientific excellence, originality, and integrity.

Responsibilities

  • Apply software engineering skills to develop reliable, scalable, performant distributed systems in a cloud environment.
  • Develop efficient data-loading strategies and performance tracking for distributed training of large models across multiple nodes.
  • Pre-train and fine-tune large-scale machine learning systems using multimodal biological data and natural language inputs.
  • Partner with other machine learning scientists and engineers to establish automated, robust, and efficient analytical pipelines for reproducible research.
  • Stay up to date on the latest developments in deep learning, particularly in the areas of Transformers, NLP, reinforcement learning, and denoising diffusion probabilistic models (DDPMs).
  • Contribute to seminars and other scientific initiatives within Altos and the broader scientific community.

Who You Are

Minimum Qualifications

  • MS in Computer Science, Statistics, Machine Learning, Artificial Intelligence, or a related discipline, with 0-5 years of relevant work experience in either an academic or industry setting.
  • Very strong programming skills, including experience with Python and deep learning libraries (PyTorch, Hugging Face Transformers, Hugging Face Datasets, Hugging Face Accelerate).
  • Ideally, experience with a distributed training framework such as DDP, FSDP, DeepSpeed, Megatron, Hugging Face Accelerate, or Ray.
  • Expertise in a subset of the following: transformers, natural language processing, multi-modality in language and/or in biology, explainability, diffusion models.
  • A highly collaborative mindset and strong self-motivation.
  • Ability to explain the design, results, conclusions, and impact of findings to both scientific and nonscientific staff.

Preferred Qualifications

  • Familiarity with multimodal data integration, including early and/or late fusion strategies.
  • Track record of applying ML to NGS data (e.g. RNA-seq, ATAC-seq, ChIP-seq, DNA methylation), biological imaging modalities (e.g. microscopy, H&E, IF), and/or spatial transcriptomics.
