Network Reliability Engineer

Cloudflare
Hybrid
Full-time, Infrastructure

Skills & Technologies

Python, C++, C, Go, Rust, Software Development, Site Reliability, Site Reliability Engineering, SOLID, Ansible, Cloudflare, Linux, Prometheus, Grafana, Airflow, LLM, Roadmap, Deployment, Make, AI

Job Description

About Us

At Cloudflare, we are on a mission to help build a better Internet. Today the company runs one of the world’s largest networks that powers millions of websites and other Internet properties for customers ranging from individual bloggers to SMBs to Fortune 500 companies. Cloudflare protects and accelerates any Internet application online without adding hardware, installing software, or changing a line of code. Internet properties powered by Cloudflare all have web traffic routed through its intelligent global network, which gets smarter with every request. As a result, they see significant improvement in performance and a decrease in spam and other attacks. Cloudflare was named to Entrepreneur Magazine’s Top Company Cultures list and ranked among the World’s Most Innovative Companies by Fast Company.

At Cloudflare, we’re not looking for people who wait for a polished roadmap; we’re looking for the builders who see the cracks in the Internet that everyone else has simply learned to live with. We value candidates who have the instinct to spot a "normalized" problem and the AI-native curiosity to create a solution using the latest tools. Our culture is built on iteration: we leverage AI to ship faster today and make it better tomorrow, while ensuring that every improvement, no matter how small, is shared across the team to lift everyone up. If you’re the type of person who values curiosity over bureaucracy and sees AI as a partner in solving tough problems to keep the Internet moving forward, you’ll fit right in.

Available Locations: Austin, Atlanta, Denver, Seattle, Washington D.C. (Hybrid)

About the Role

Cloudflare operates a large global network spanning hundreds of cities (data centers). You will join a team of talented network engineers who are building software solutions to improve network resilience and reduce operational toil.

This position will be responsible for the technical operation and engineering of Cloudflare's core data center network, including the planning, installation and management of the hardware and software as well as the day-to-day operations of the network. The core network supports our critical internal needs such as databases, high-volume logging, and internal application clusters. This is an opportunity to be part of the team that is building a high-performance network that is accessible to any web property online.

You will build tools to automate operational tasks, streamline deployment processes and provide a platform for other engineering teams to build upon. You will nurture a passion for an “automate everything” approach that makes systems failure-resistant and ready-to-scale. Furthermore, you will be required to play a key role in system design and demonstrate the ability to bring an idea from design all the way to production.

Examples of desirable skills, knowledge and experience

3 years of relevant Network/Site Reliability Engineering experience

BA/BS in Computer Science or equivalent experience

Solid foundation in configuration management frameworks: SaltStack, Ansible, Chef

Experience with NX-OS, Junos, EOS, Cumulus, or SONiC network operating systems

AI-native: able to leverage LLMs to:

build agentic deployment and troubleshooting tools on top of the Cloudflare stack

automate configurations (SaltStack + Temporal)

parse complex log files and streamline documentation

Solid Linux systems administration experience

Linux networking - iproute2, Traffic Control, Devlink, etc.

Strong software development skills in Go and Python

Bonus Points

Deep knowledge of BGP and other routing protocols

Workflow management (Airflow, Temporal)

Open source routing daemons (FRR, BIRD, GoBGP)

Experience with bare metal switching

Experience with network programming in C, C++ or Rust

Experience with the Linux kernel and Linux software packaging

Strong tooling and automation development experience

Time series databases (Prometheus, Grafana, Thanos, C
