Senior Data Engineer


Job Description

Senior Data Engineer – AI & Neuroscience

Location: London (Hybrid, King’s Cross) or San Francisco (Onsite/Hybrid)

Employment: Full-time


About the Company

We are partnered with a pioneering biotech building the world’s first brain foundation models: large-scale AI systems designed to deeply understand, protect, and enhance the human brain. By generating its own data, developing novel machine learning approaches, and working closely with world-leading neuroscientists, the company aims to accelerate discoveries that transform how neurological diseases are treated and prevented.

This is a rare opportunity to join at an early stage, with direct influence on the technical systems that will underpin scientific and clinical breakthroughs. The work is ambitious, interdisciplinary, and sits at the cutting edge of AI, biology, and data engineering.


The Role

We are seeking a Senior Data Engineer to lead the design and scaling of the company’s core data infrastructure. You will be responsible for building robust, production-grade systems that handle multi-omic, neuroscience, and clinical datasets at scale. Your work will provide the backbone for training next-generation AI models, enabling researchers to extract real-world insights from raw biological data.


Key Responsibilities

  • Design and implement distributed data pipelines for multi-omic, neuroscience, and clinical datasets
  • Build a unified feature store to serve ML training and downstream biological analysis
  • Develop scalable storage, ingestion, and validation systems with a focus on robustness, observability, and versioning
  • Collaborate with ML researchers and biologists to translate raw data into actionable insights and high-quality training data
  • Scale distributed systems using Kubernetes, Terraform, and orchestration tools such as Airflow, Flyte, or Temporal
  • Write clean, extensible, and well-tested code to ensure long-term maintainability and collaboration across teams


About You

We are looking for data and platform engineers who are motivated by applying their skills at the intersection of AI and neuroscience. You will be comfortable operating in an early-stage, fast-moving environment, with a strong focus on engineering excellence and scientific impact.


Ideal Experience

  • 4+ years of experience building data infrastructure or data platforms, with proven ability to solve complex distributed systems challenges independently
  • Expertise in large-scale data processing pipelines (batch and streaming) using technologies such as Spark, Kafka, Flink, or Beam
  • Experience designing and implementing large-scale data storage systems (feature stores, time-series databases, warehouses, or object stores)
  • Strong distributed systems and infrastructure skills (Kubernetes, Terraform, orchestration frameworks such as Airflow/Flyte/Temporal)
  • Hands-on cloud engineering experience (AWS, GCP, or Azure)
  • Strong software engineering fundamentals, with a track record of writing maintainable, testable, and extensible code
  • Familiarity with ML infrastructure and prior experience supporting ML teams in production environments
  • Exposure to bioinformatics or biological data (preferred, not required); willingness to learn is essential
  • Excellent communication skills and ability to collaborate across interdisciplinary teams


Why Join

  • Build infrastructure that directly supports breakthroughs in brain health and neurological disease
  • Work with first-of-its-kind, multi-modal datasets and cutting-edge AI approaches
  • Join a world-class interdisciplinary team at the ground floor, with significant scope for ownership and growth
  • Competitive compensation package including equity

Location: City of London
Job Type: Full-time
Category: Technology
