Infrastructure Product Engineer - AI Infrastructure

1666626 Posted: 11/02/2026

$300,000 to $350,000 gross per year
San Francisco, California
Permanent
Artificial Intelligence
AI Network
AI Software

Join a stealth-mode startup building a next-generation AI and cloud platform powered by thousands of H100s, H200s, and B200s, designed for rapid experimentation, full-scale model training, and production inference. As a Senior Infrastructure Product Engineer, you’ll sit at the intersection of platform architecture, product thinking, and large-scale systems engineering, shaping how AI infrastructure is exposed, consumed, and scaled.

This role goes beyond keeping systems running. You’ll architect the underlying primitives that power new infrastructure products, defining how compute, networking, scheduling, and observability come together as a coherent platform. You’ll work closely with product, ML, and hardware teams to turn raw GPU capacity into reliable, developer-friendly capabilities.

If you want to architect infrastructure as a product, define the building blocks behind frontier AI platforms, and influence how thousands of GPUs are consumed at scale, this is a rare chance to do it from first principles.

Get in touch and apply today!

Responsibilities:

Architect and evolve large-scale GPU platforms (H100/H200/B200) to support training, inference, and emerging AI workloads.
Design infrastructure abstractions and platform primitives that enable new AI and cloud products.
Build scalable automation frameworks for provisioning, scheduling, and lifecycle management across Slurm, Kubernetes, and bare-metal environments.
Partner with product and ML teams to translate user requirements into infrastructure architecture and platform capabilities.
Define reliability, scalability, and performance standards as architectural constraints rather than reactive fixes.
Develop observability and capacity models that inform platform design, roadmap decisions, and customer-facing SLAs.
Identify systemic bottlenecks across compute, network, and storage layers and drive architectural improvements.

Skills/Must have:

7+ years of experience in Infrastructure Engineering, Platform Engineering, SRE, or Systems Architecture roles.
Proven experience designing and operating large-scale GPU or HPC platforms.
Deep hands-on expertise with Kubernetes and Slurm, including scheduler behaviour and workload optimisation.
Strong Linux systems and networking fundamentals in high-performance environments.
Proficiency in Python, Go, or Bash for building platform tooling and automation.
Experience treating infrastructure as a product, with a focus on usability, interfaces, and scalability.
Familiarity with observability platforms (Prometheus, Grafana, Loki) and performance analysis at scale.

Benefits:

Equity

Salary:

$300,000 to $350,000 gross per year

Ben Davies, Director Global AI Infrastructure

