Site Reliability Engineer
New Today
Enso Recruitment is working on behalf of our client, a leading organisation in the financial services sector, to find an experienced Site Reliability Engineer. This role offers a great opportunity to contribute to the performance and stability of high-availability trading platforms while supporting a transition from on-premise to cloud infrastructure.

This position will work closely with global teams to ensure the resilience and observability of critical systems, playing a key role in automation, incident response, and continuous improvement of monitoring solutions.

Key Responsibilities:
- Lead oversight and coordination with an offshore operations team, ensuring timely and accurate issue escalation.
- Develop and maintain robust monitoring solutions, following clear standards and documentation practices.
- Actively participate in incident management, offering real-time diagnostics and support during major incidents.
- Assist with post-incident analysis to improve system observability and close any identified monitoring gaps.
- Support the integration of new systems into the observability framework, developing bespoke tools when needed.
- Work alongside infrastructure and DevOps teams to review changes from a monitoring and stability perspective.
- Contribute to automation initiatives, especially those aimed at reducing manual effort through post-deployment validation and smoke testing.
- Provide occasional weekend support during critical system upgrades or releases.

Skills & Experience:
- At least 5 years of experience in a Site Reliability, SysOps, or NOC role, ideally within a financial services environment.
- Strong scripting and automation skills (e.g., Python, Perl, PowerShell, Bash).
- Experience supporting Unix/Linux and Windows Server platforms.
- Solid understanding of networking fundamentals, including firewall and routing troubleshooting.
- Proven capability with monitoring tools in a complex technical landscape.
- Exposure to cloud platforms (preferably AWS), with relevant certification a plus.
- Proactive approach to issue resolution, strong communication skills, and ability to manage competing priorities.

Desirable:
- Familiarity with observability tools and standards such as Prometheus, ITRS, OTEL, and StatsD.
- Experience with messaging platforms (e.g., TIBCO, MQ, Solace).
- Understanding of DevOps pipelines and collaborative development environments.
- Knowledge of database systems (MSSQL, Oracle, Sybase).
- ITIL Foundation certification or practical experience in ITIL-based environments.
- Basic knowledge of the FIX protocol.

Skills: Site Reliability Engineering, NOC, Cloud, AWS, CI/CD
- Location: Belfast
- Category: IT | Infrastructure