Network Automation Engineer - Hosting

1677777
  • $200,000 base salary
  • Albany, New York, United States
  • Permanent
  • Artificial Intelligence
  • AI Network


Ready to architect the high-speed networks powering the AI era?

Join a trailblazing leader in GPU-accelerated computing, designing and operating the network fabric that supports some of the most demanding computational workloads on the planet. 

They are looking for a Network Automation Engineer to pioneer "Network-as-Code" approaches in a massive-scale environment where 400G and 800G connectivity is the baseline. You will collaborate with an elite engineering team to build and optimize ultra-low-latency, high-speed networks that enable the next generation of large language models and AI infrastructure.

Gain exposure to cutting-edge network automation and shape the future of AI infrastructure. Apply now!


Responsibilities:

  • Driving the Automation Roadmap: You will lead the transition from manual network management to a fully declarative, GitOps-driven deployment model.
  • Optimising AI Clusters: You will write code to manage and monitor InfiniBand and RoCEv2 fabrics, ensuring zero packet loss for massive GPU training jobs.
  • Building the Source of Truth: You will integrate NetBox or Nautobot into the heart of the deployment pipeline so that the network stays synchronised with the intended state.
  • Proactive Monitoring: You will develop custom telemetry tools and streaming pipelines that provide real-time visibility into the performance of thousands of interconnected nodes.


Skills/Must have:

  • The Automation Toolbox: Expert-level Python (not just scripting, but building tools) and deep experience with Ansible, Terraform, or SaltStack.
  • High-Scale Networking: A strong foundation in BGP, EVPN-VXLAN, and specific experience with InfiniBand or RDMA (RoCEv2).
  • CI/CD Pipeline Expertise: Experience building and maintaining pipelines in GitLab CI or GitHub Actions, specifically for network infrastructure.
  • Infrastructure-as-Code (IaC): A proven track record of managing datacenter networks through code and version control rather than the console.
  • The "Problem Solver" DNA: A background in large-scale datacenter environments (hyperscalers, FinTech, or HPC) where performance is the primary metric.


Benefits:

  • Equity
  • Company bonus 


Salary:

  • $200,000 base salary 
Ben Davies, Director, Global AI Infrastructure
