Site Reliability Engineer 5 – Core

Posted 2026-05-05
Remote, USA · Full-time · Immediate Start
    Job Description:
  • Design, implement, and maintain scalable and reliable infrastructure to support the Netflix Streaming Suite.
  • Collaborate with engineering and product teams to integrate observability, reliability, and security considerations into the entire software development lifecycle.
  • Develop and implement automation tools for monitoring, deployment, and incident response to ensure efficient and reliable operations.
  • Participate in on-call rotations to ensure the 24/7 health of the Netflix Streaming Suite, and contribute to incident response, diagnosis, and resolution.
  • Implement and maintain a robust incident response framework, including blame-aware incident reviews to learn from operational surprises.
  • Proactively identify sources of instability in distributed systems and analyze how complex systems fail from a reliability and resilience perspective.
  • Champion and embed a culture of reliability across the Ads organization.
  • Act as a force multiplier by creating clear documentation, developing best-practice guides, and building tooling to roll out reliability enhancements automatically.
    Requirements:
  • 5+ years of experience as a Site Reliability Engineer (SRE), Production Engineer, or similar role supporting business-critical, high-traffic services.
  • Write code to solve problems.
  • Proficient in one or more languages like Python, Go, or Java.
  • Fluent in modern cloud infrastructure.
  • Hands-on experience with cloud providers such as AWS/Azure/GCP.
  • Experience with Infrastructure as Code such as Terraform.
  • Experience with container orchestration systems like Kubernetes.
  • Understand large-scale distributed systems, including their common failure modes and edge cases.
  • Excellent communication skills and a proven ability to build relationships with engineering partners.
  • Experience with incident management and response.
  • Calmly navigate complex production issues, identify root causes, and implement effective, lasting solutions.
  • Possess a growth mindset: be relentlessly curious and committed to continuous improvement.
    Benefits:
  • Health Plans
  • Mental Health support
  • 401(k) Retirement Plan with employer match
  • Stock Option Program
  • Disability Programs
  • Health Savings and Flexible Spending Accounts
  • Family-forming benefits
  • Life and Serious Injury Benefits
  • Paid leave of absence programs
  • Full-time hourly employees accrue 35 days of paid time off annually, to be used for vacation, holidays, and sick time.
  • Full-time salaried employees are immediately entitled to flexible time off.
