Site Reliability Engineer
About us GoCardless is a global bank payment company. Over 100,000 businesses, from start-ups to household names, use GoCardless to collect…
Site Reliability Engineer (SRE) | AWS | Kubernetes
Fully Remote (UK)
24/7 Shift Pattern (28-day rota including days & nights)
£ Competitive + Bonus + Excellent Benefits
Build resilient cloud platforms that support critical national services. We're recruiting Site Reliability Engineers to join a global leader in AI-powered customer experience and cloud technology. Following the award of a major government programme, they're expanding their engineering teams to build and support highly secure, cloud-native platforms that deliver sensitive communication services.
This is an opportunity to join an organisation investing heavily in modern cloud engineering, automation and reliability. Working as part of a collaborative SRE team, you'll help ensure large-scale production environments remain secure, available and resilient, whilst continuously improving the way they're operated through automation and engineering best practice.
If you enjoy solving production challenges, improving reliability and automating away operational toil, we'd love to hear from you.
What you'll be doing
Monitoring and maintaining highly available production platforms running in AWS
Responding to and managing production incidents across a 24/7 service
Investigating complex technical issues and restoring services quickly and effectively
Developing automation to reduce manual operational tasks and improve platform resilience
Building and improving monitoring, alerting and observability across cloud environments
Working alongside Software, Platform, Cloud and Security Engineers to improve reliability and operational excellence
Contributing to post-incident reviews and driving continuous service improvements
Supporting containerised workloads using Kubernetes and Docker
What we're looking for
You'll ideally have experience in a Site Reliability Engineering, Production Engineering, Cloud Operations or NOC environment, with exposure to:
Linux systems administration
AWS cloud infrastructure
Kubernetes and Docker
Production support and incident management
Python, Bash or Go scripting
Monitoring and observability platforms such as Grafana, Prometheus, Datadog, Splunk or CloudWatch
Networking fundamentals including DNS, TCP/IP and load balancing
A passion for automation, continuous improvement and operational excellence
Experience with Infrastructure as Code (Terraform), SRE principles (SLIs, SLOs), or regulated environments would be beneficial but isn't essential.
Why join?
This is far more than a traditional NOC role.
You'll be joining an engineering-led organisation where reliability, automation and continuous improvement sit at the heart of the platform. Rather than simply responding to incidents, you'll work to prevent them by improving systems, automating operational processes and helping shape the future of highly resilient cloud services.
If you're passionate about building reliable cloud platforms and enjoy solving complex technical problems in large-scale production environments, we'd love to hear from you.
Apply today or contact Dave Carlisle at Spectrum IT Recruitment for a confidential discussion.
Spectrum IT Recruitment (South) Limited is acting as an Employment Agency in relation to this vacancy.
£45,000 – £60,000 (Glassdoor, Levels.fyi, 2025)