Senior SRE

Remote, USA

Every day we tackle new and exciting challenges to empower developers to build responsive and flexible cloud, mobile, and edge computing applications that scale effortlessly. Couchbase delivers unmatched versatility, performance, scalability, and financial value across cloud, on-premises, hybrid, distributed cloud, and edge computing deployments. The database market is one of the largest undisturbed markets for enterprise software. The main catalyst for this is the need for digital transformation. Join Couchbase to be a part of a greater change. Here you’ll have the opportunity to learn and grow with some of the most innovative, passionate, and humble individuals in the database industry.

The SRE Leader will join Couchbase’s Cloud team to lead the Cloud Operations teams and help build the function within the Cloud organization.

The SRE team is responsible for the availability and support of the service to customers by performing full-stack observability, level one alerting, reliability engineering, and incident response for Couchbase’s cloud organization.

In this role, you will set the strategy and operational KPIs for the SRE organization and the applications supported by the cloud organization.

Partnering with our engineering leadership and cloud leadership, you will work to build our service level indicators (SLIs), service level objectives (SLOs), service level agreements (SLAs), and error budgets for our services.

As part of this role, you will lead customer escalations and will build a close relationship with our engineering and product organizations.


  • Own the end-to-end availability (SLO/SLA), reliability, and performance of Couchbase’s Cloud offerings
  • Develop automation, processes and metrics to ensure maximum reliability and uptime for our customers
  • Establish an on-call cadence with the team and ensure adequate coverage areas
  • Foster a healthy and collaborative culture, in line with Couchbase’s core values
  • Participate in 24x7 Site Reliability rotations and escalation workflows
  • Serve as a change board approver and incident manager
  • Serve as project manager or scrum master for major initiatives and train the team to be the first line of support
  • Present quarterly operations reviews in addition to other, more routine reporting obligations
  • Represent Couchbase in customer meetings and serve as a customer advocate in influencing product roadmap and improvements
  • Take ownership of many controls, processes, and risks required to maintain our compliance portfolio (SOC 2, PCI-DSS, GDPR, and HIPAA, among others)
  • Collaborate with the Cloud Engineering team to understand deployment practices and processes and work towards iteratively improving the release pipeline to ensure a highly resilient deployment strategy, ideally with zero downtime


  • At least 5 years of work experience in Site Reliability/Infrastructure Engineering for a team operating in public cloud
  • A passion for SRE/DevOps and running highly resilient/automated systems
  • Proficiency with Terraform, configuration management tools, and version control systems (Git), and with integrating CI/CD platforms and toolchains such as CircleCI and GitHub
  • Deep working experience with cloud platforms such as Amazon Web Services, open source software such as Kubernetes and Prometheus, and monitoring platforms such as Datadog
  • Experience developing or integrating Chaos Engineering tool chains or methodologies
  • Manage on-call rotations across continents, using a follow-the-sun model, and handle incident response to ensure high availability
  • Regularly report on availability and incidents to senior management
  • Build a team culture to aim for high service availability, scalability and observability goals
  • Bias towards data-driven decisions and ensuring key metrics are agreed on, visible, and actionable
  • BS/BE/Masters in Computer Science

The anticipated starting base pay range for this role is $200,000 - $238,000 per year. Base salary is not the only component of our competitive total rewards package - you will also be eligible for bonus, equity, and other benefits as described below. Actual compensation is influenced by a wide array of factors including but not limited to skill set, level of experience, licenses and certifications, and specific work location.

Why Couchbase?

Couchbase is named one of DBTA’s top 100 companies that matter in data. At Couchbase, we believe data is at the heart of the enterprise. We empower developers and architects to build, deploy, and run their most mission-critical applications. Couchbase delivers end-to-end technical solutions for all our customers with high-performance, flexible, and scalable modern databases that run across data centers and any cloud. Many of the world’s largest enterprises rely on Couchbase to power the core applications their businesses depend on. See our recent awards to learn what makes Couchbase such a great company to work at.

We are honored that in 2022 Couchbase was recognized with a best workplace award in the Bay Area. Couchbase offers a total rewards approach to benefits for the value you create here, so that you in turn may best serve yourself and your family. Some benefits include:

- Unlimited time off (DTO)

- Matching 401K contributions


- Medical, Dental & Vision

- Monthly credit towards a lifestyle spending account

- An ergonomic and comfortable in-office setup, with food and supporting technology, plus assistance in setting up an efficient WFH and office environment

- And much more!

