As an SRE in Vehicle Software, you will keep Wayve’s autonomous driving fleet reliable, observable, and safe while it operates on public roads. You will work at the boundary of software, hardware, and operations, turning real-world incidents and performance bottlenecks into lasting engineering improvements. This role offers a direct line of sight from what you build to safer deployments, faster iteration, and greater fleet scale.

§ 02 KEY RESPONSIBILITIES

Own and improve the reliability, availability, and performance of vehicle software systems used across the dev fleet.
Take part in a team on-call rotation, providing out-of-hours support for live systems when required.
Build and operate monitoring, logging, alerting, and on-call tooling that enables fast detection, diagnosis, and recovery.
Drive incident response and post-incident learning, translating root causes into durable fixes and preventative controls.
Design and deliver automation for fleet operations, deployments, and repetitive workflows to reduce manual intervention.
Partner closely with Vehicle SW, operations, and platform teams to define SLOs, reliability metrics, and release readiness.
Continuously harden the production environment through capacity planning, change management, and reliability-focused reviews.

§ 03 ABOUT YOU

To set you up for success as a Site Reliability Engineer at Wayve, we’re looking for the following skills and experience.

§ 04 ESSENTIAL SKILLS

Proven experience in an SRE, production reliability, or platform operations role for complex distributed systems.
Strong Linux fundamentals and hands-on experience with CI/CD, containers (Docker), and orchestration (Kubernetes).
Proficiency in at least one systems or scripting language (Python, C++, or Rust) with a bias for automation.
Deep troubleshooting skills across networking, distributed systems, and databases, including performance and availability issues.
Experience designing observability stacks and using tools such as Datadog, Prometheus, Grafana, OpenTelemetry, Splunk, or Humio.
Clear communication skills, including incident leadership, writing postmortems, and influencing engineering priorities.

§ 05 DESIRABLE SKILLS

Cloud platform experience (AWS, GCP, or Azure), including infrastructure-as-code and secure production operations.
Experience with real-time or safety-critical systems, hardware-in-the-loop, or embedded/robotics environments.
Familiarity with fleet operations, telemetry pipelines, and operating software on edge devices at scale.
Experience defining and running SLOs/SLIs and reliability programs across multiple teams.

This is a full-time role based in our office in Baden-Württemberg, Germany (hybrid, minimum 3 days a week). At Wayve we want the best of all worlds, so we operate a hybrid working policy that combines time together in our offices and workshops, which fuels innovation, culture, relationships, and learning, with time spent working from home.

Founded	2017
HQ	London, UK
Total Funding	$1.3B
Last Round	Series C · $1.1B
Round Date	May 2024
Open Roles	109
Status	Hiring