Senior Site Reliability Engineer (SRE) in London
Energy Jobline is the largest and fastest-growing global Energy Job Board and Energy Hub. We have an audience reach of over 7 million energy professionals, advertise 400,000+ global energy and engineering jobs each month, and work with the leading energy companies worldwide.
We focus on the Oil & Gas, Renewables, Engineering, Power, and Nuclear markets as well as emerging technologies in EV, Battery, and Fusion. We are committed to ensuring that we offer the most exciting career opportunities from around the world for our jobseekers.
Senior Site Reliability Engineer (SRE)
Remote
12-month contract (high chance of extension)
Job Description
Join a global pioneer in the video game industry and own the reliability of high-traffic, revenue-critical platforms used by millions worldwide. As a Senior SRE, you'll shape the architecture, improve platform-wide resiliency, and ensure services stay performant, scalable, and secure. This isn't just about maintaining a single system; you'll influence reliability across multiple services, driving improvements that touch the entire ecosystem.
Key Responsibilities
Lead incident response and troubleshooting for production systems, resolving high-severity issues and driving post-incident improvements.
Influence architecture to improve platform-wide reliability, resiliency, and operational efficiency, ensuring services remain available under heavy load.
Drive containerisation best practices and manage Kubernetes-based workloads at scale.
Build and maintain event-driven architectures that scale globally while ensuring fault-tolerance and high availability.
Automate infrastructure provisioning, deployment, and monitoring using Infrastructure as Code (Terraform, CloudFormation, Ansible, CDK).
Collaborate with engineering, product, and security teams to define SLOs, SLIs, and error budgets across services.
Provide mentorship, advocate SRE best practices, and ensure teams are empowered to deliver resilient, reliable systems.
Experience / Must-Have Skills
Extensive experience in AWS and AWS-managed services (EC2, Lambda, S3, VPC, CloudWatch, CloudTrail, IAM, EKS, Service Catalog, multi-account environments).
Strong Kubernetes / container orchestration experience, including EKS, OpenShift, Docker, and service mesh.
Deep understanding of networking fundamentals: DNS, VPCs, routing, load balancing, TCP/IP, firewall policies.
Proven track record in incident response and troubleshooting at scale.
Hands-on experience with infrastructure automation and CI/CD pipelines.
Experience designing event-driven architectures and resilient systems.
High level of autonomy, able to influence platform-wide decisions and architect for reliability across services.
Ability and desire to mentor junior staff.
Bonus: experience in gaming, interactive entertainment, or other high-traffic, global-scale platforms.
If you are interested in this role, please feel free to submit your CV.
If you are interested in applying for this job please press the Apply Button and follow the application process. Energy Jobline wishes you the very best of luck in your next career move.
- Location: London
- Job Type: Full-time
- Category: Engineer, Reliability Engineer, Reliability, Senior, Engineering, Site