Network Automation Engineer - Hosting

1677776
  • $200,000 base salary
  • San Francisco, California, United States
  • Permanent
  • Artificial Intelligence
  • AI Network


Ready to architect the high-speed networks powering the AI era?

Join a trailblazing leader in GPU-accelerated computing, designing and operating the network fabric that supports some of the most demanding computational workloads on the planet.

The company is looking for a Network Automation Engineer to architect a high-speed, low-latency fabric in a massive-scale environment. The engineer will pioneer the "Network-as-Code" movement, working with 400G and 800G connectivity as the baseline. Collaborating with an elite engineering team, they will build and optimize the networks that underpin the world’s most advanced large language models and AI infrastructure.

Gain exposure to cutting-edge network automation and help shape the future of AI infrastructure. Apply now!


Responsibilities:

  • Driving the Automation Roadmap: You will lead the transition from manual network management to a fully declarative, GitOps-driven deployment model.
  • Optimising AI Clusters: You will write sophisticated code to manage and monitor InfiniBand and RoCEv2 fabrics, ensuring zero packet loss for massive GPU training jobs.
  • Building the Source of Truth: You will integrate NetBox or Nautobot into the heart of the deployment pipeline so that the network stays synchronised with the intended state.
  • Proactive Monitoring: You will develop custom telemetry tools and streaming pipelines that provide real-time visibility into the performance of thousands of interconnected nodes.


Skills/Must have:

  • The Automation Toolbox: Expert-level Python (not just scripting, but building tools) and deep experience with Ansible, Terraform, or SaltStack.
  • High-Scale Networking: A strong foundation in BGP, EVPN-VXLAN, and specific experience with InfiniBand or RDMA (RoCEv2).
  • CI/CD Pipeline Expertise: Experience building and maintaining pipelines in GitLab CI or GitHub Actions, specifically for network infrastructure.
  • Infrastructure-as-Code (IaC): A proven track record of managing data center networks through code and version control rather than the console.
  • The "Problem Solver" DNA: A background in large-scale data center environments (hyperscalers, FinTech, or HPC) where performance is the primary metric.


Benefits:

  • Equity
  • Company bonus


Salary:

  • $200,000 base salary

Ben Davies, Director, Global AI Infrastructure
