Site Reliability Engineer, Cloud Cost Utilization
Job Description
GitLab is the intelligent orchestration platform for DevSecOps. GitLab enables organizations to increase developer productivity, improve operational efficiency, reduce security and compliance risk, and accelerate digital transformation. More than 50 million registered users and more than 50% of the Fortune 100* trust GitLab to ship better, more secure software faster.
The same principles built into our products are reflected in how our team works: we embrace AI as a core productivity multiplier, with all team members expected to incorporate AI into their daily workflows to drive efficiency, innovation, and impact. GitLab is where careers accelerate, innovation flourishes, and every voice is valued. Our high-performance culture is driven by our values and continuous knowledge exchange, enabling our team members to reach their full potential while collaborating with industry leaders to solve complex problems. Co-create the future with us as we build technology that transforms how the world develops software.
*Fortune 500® is a registered trademark of Fortune Media IP Limited, used under license. Claim based on GitLab data. Fortune 100 refers to the top 20% ranked companies in the 2025 Fortune 500 list, published in June 2025. Fortune and Fortune Media IP Limited are not affiliated with, and do not endorse products or services of GitLab.
An overview of this role
As a Cloud Cost Utilization SRE at GitLab, you will make cloud spending visible, understandable, and actionable across our infrastructure. This role sits at the intersection of engineering and financial accountability, where you will partner with Engineering, Finance, and Product to improve how cloud usage is tracked, attributed, and optimized across GitLab.
You will build and improve the systems, standards, and workflows that help teams understand the real cost of the services they run. That includes developing resource tagging and labeling approaches, improving billing data quality, and creating tooling that supports better decisions across AWS and GCP.
In this role, you will work through technical and organizational ambiguity, connect infrastructure data with business context, and help teams act on cost signals with confidence. This is a strong fit for someone who enjoys systems thinking, cross-functional collaboration, and building practical solutions in GitLab's all-remote, asynchronous, and values-driven environment.
Some examples of our projects
Building cloud billing data pipelines that normalize multi-cloud cost data using the FinOps Open Cost and Usage Specification (FOCUS)
Improving cloud resource tagging and labeling standards so teams can understand spend by service, environment, and ownership
Developing cost anomaly detection, forecasting, and alerting workflows that give teams timely insight into infrastructure usage
Extending observability systems so cost signals can be reviewed alongside reliability and operational data
What you'll do
Design and maintain cloud resource tagging and labeling strategies across GCP and AWS to support accurate cost attribution
Develop tooling and pipelines to ingest, normalize, and report on cloud billing data using the FOCUS specification
Automate cost anomaly detection, forecasting, and alerting so engineering teams can respond quickly to changes in infrastructure spend
Contribute to GitLab's observability and monitoring stacks, including Prometheus, LGTM (Loki, Grafana, Tempo, and Mimir), and ELK, with a focus on surfacing cost efficiency signals
Partner with Finance and Engineering leadership to support cloud cost forecasting for planning and budget discussions
Act as a subject matter expert for cloud cost attribution, tagging strategy, and FOCUS adoption across GitLab Infrastructure
Collaborate with Finance and Compliance teams on audits, certifications, and financial reporting needs related to cloud infrastructure usage
Contribute to infrastructure-as-code efforts, including Terraform and Ansible, so