Staff Software Engineer in City of London

Energy Jobline is the largest and fastest growing global Energy Job Board and Energy Hub. We have an audience reach of over 7 million energy professionals, 400,000+ monthly advertised global energy and engineering jobs, and work with the leading energy companies worldwide.
We focus on the Oil & Gas, Renewables, Engineering, Power, and Nuclear markets as well as emerging technologies in EV, Battery, and Fusion. We are committed to ensuring that we offer the most exciting career opportunities from around the world for our jobseekers.
Job Description
Senior Software Engineer - AI Infrastructure
We’re working with a hyper-growth company.
They are building GPU infrastructure for the best AI labs and the biggest enterprise companies.
Their solution lets researchers focus on their models while drawing on the scale and reliability of the world’s best AI cloud platform.
The engineering team is small, ambitious, and deeply technical, building the orchestration systems that keep thousands of GPUs running at peak performance across global data centres.
This role sits at the heart of it, designing and scaling the systems that make AI at exascale possible.
What You’ll Focus On
You’ll help shape the orchestration layer for one of the most advanced AI compute environments in the world. Your work will involve:
Designing core platform services for cluster provisioning, workload orchestration, and resource management APIs.
Building integrations with schedulers (Kubernetes, Slurm) and container runtimes for reliable, high-performance GPU workloads.
Developing automation for deployment, imaging, and multi-tenant resource allocation.
Optimising scheduler performance and resource utilisation across diverse workloads.
Building lifecycle management and automated remediation systems for large-scale clusters.
Creating Infrastructure-as-Code modules to support rapid, repeatable deployments across varied environments.
About You
You’re a pragmatic systems builder who thrives in complexity, enjoys autonomy, and understands what it means to own production at scale. You’ll likely bring:
5+ years’ experience building distributed systems in Go within cloud environments.
Deep hands-on experience with Kubernetes and container orchestration.
A strong grasp of Infrastructure-as-Code (Terraform) and configuration management tools (Ansible, Puppet, or similar).
Experience deploying and operating large-scale GPU clusters or HPC systems.
Working knowledge of ML infrastructure and familiarity with GPU drivers, CUDA, and container runtimes.
A low-ego, collaborative approach and a clear, proactive communication style.
In short: This is a role for engineers who like big systems, hard problems, and meaningful ownership. You’ll be joining a team operating at the intersection of software, hardware, and AI.
If you are interested in applying for this job please press the Apply Button and follow the application process. Energy Jobline wishes you the very best of luck in your next career move.
Location:
City Of London
Job Type:
Full-time
Category:
Engineer, Software Engineer, Staff, Engineering, Software
