Site Reliability Engineer, Cheltenham

This job position is no longer available.

We encourage you to browse other open positions on our website.

Thank you for your interest!

Site Reliability Engineer

Site Reliability Engineer (SRE)
£65,000 base salary

Role Overview

Our National Security business in Gloucester is expanding, creating vital opportunities to support National Security clients through innovative technical solutions. We are looking for a Site Reliability Engineer to join a growing team that prioritizes both client delivery and community engagement, helping to build tech and cyber skills within the region.

As an SRE, you will bridge the gap between software engineering and systems operations. You will use your engineering expertise to replace manual tasks with automation, ensuring that traditional operational work (incidents, on-call, etc.) never exceeds 50% of your team's capacity.

Core Accountabilities

Service Excellence: Support and maintain essential services for core mission applications, proactively enhancing availability, performance, and stability.
Automation First: Replace repetitive manual labor with innovative automated solutions.
Consultative Engineering: Work alongside product teams to advise on best practices for system design and resilience.
Observability: Instrument applications to improve monitoring and use data-driven insights to demonstrate daily system improvements.
Systems Architecture: Leverage your understanding of the relationship between software and infrastructure to build scalable, failure-resilient systems.
Community Engagement: Actively participate in the wider internal DevOps and SRE communities.

Candidate Background & Experience

We are looking for candidates with experience in the following areas:

Development: Software development in Java and web technologies (JavaScript, HTML).
Data & Infrastructure: Familiarity with database technologies (Elastic, Mongo) and cloud platforms (AWS, Azure, or OpenStack).
Scripting & OS: Proficiency in Linux and Windows command lines (Bash, PowerShell).
Configuration & Deployment: Hands-on experience with tools like Chef, Puppet, and Docker (container management/micro-services).
Monitoring: Expertise in monitoring large-scale systems using technologies such as ELK.
Problem Solving: Strong diagnostic skills across all levels of the tech stack and experience troubleshooting service outages.
Agile Methodology: Experience working within an Agile Scrum team and using supporting tools like Jira.
Testing & Open Source: Familiarity with automation frameworks (Selenium) and a track record of improving Open Source Software.

Location: Cheltenham
Job Type: Full-time