Senior ML Infrastructure Engineer (Research Initiatives) - Systems Integrator

1668855
  • Super Competitive (Base + Bonus + High-Upside Equity)
  • Remote, UK
  • Permanent
  • Artificial Intelligence
  • AI Software


Looking to architect next-generation AI infrastructure that transforms engineering simulations?

Join a world-class AI research laboratory, currently at Series B and moving toward Series C, that is redefining engineering with Large Physical Models, which replace traditional numerical simulations with high-speed AI inference. The role involves building and managing the infrastructure required to train complex simulation models on hundreds of GPUs, leading the orchestration of massive-scale training environments, and contributing to pioneering Physical AI architectures that go far beyond standard generative models. You will work in a high-growth, NVIDIA-backed environment with exposure to cutting-edge AI hardware and sovereign cloud technologies.

Ready to design and scale infrastructure that powers the future of AI-driven engineering? Apply now.


Responsibilities:

  • Design and scale training environments using PyTorch Distributed, JAX, or NVIDIA NeMo across multi-node/multi-GPU clusters.
  • Manage a mixed-compute strategy spanning public clouds (AWS/Azure) and sovereign industrial clouds for sensitive data.
  • Implement optimization techniques such as FSDP and custom kernels to maximize FLOPS utilization on irregular mesh and 3D geometric data.
  • Build high-performance pipelines for ingesting CAE/CFD/FEA engineering data, keeping I/O from becoming a training bottleneck.
  • Integrate traditional physics solvers (OpenFOAM/Simcenter) into ML pipelines for active learning and model refinement.
  • Set up "physics-aware" CI/CD and experiment tracking (Kubeflow/MLflow) that validates models against physical consistency laws.


Skills/Must have:

  • Orchestration: Expert-level Kubernetes (AKS/EKS) is essential.
  • ML Frameworks: Strong proficiency in PyTorch, JAX, or NVIDIA NeMo.
  • HPC/Data: Solid experience with Python, Go, and distributed data tools (Dask/Spark).
  • Background: Experience in AI research labs (e.g., DeepMind, OpenAI) or neocloud environments.


Benefits:

  • Competitive Equity: Significant stock option packages in a fast-scaling Series B/C firm.
  • Flexible "London-Plus" Setup: Remote-first within Europe/UK, with roughly 1 week per month in London (all travel/accommodation fully paid).
  • High-Impact Culture: Work alongside world-renowned physicists, mathematicians, and Formula 1 simulation veterans.
  • Sponsorship: Full visa sponsorship available for top-tier global talent.


Salary:

  • Super Competitive (Base + Bonus + High-Upside Equity) 
  • Tailored to attract the best in the industry.

Jamie Maher, Senior Network Consultant