Technical Operations Manager (AI Infra) - Hosting
- $200,000 base per year
- Miami, Florida, United States
- Permanent
- Artificial Intelligence
- AI Data Center
- AI Network
- AI Software
Looking to advance your career in AI and high-performance computing while working with next-generation GPU infrastructure?
Join a technology team that provides scalable GPU computing solutions and global infrastructure for AI and compute-intensive workloads. The organization is seeking a high-impact leader to build a world-class support engine for a Founders Fund-backed NVIDIA cloud partner. This is not a typical SaaS position: it sits at the intersection of physical hardware and digital platforms, and it carries accountability when GPU clusters go down or enterprise SLAs are on the line. You will work closely with experienced professionals and gain hands-on experience deploying, managing, and optimizing high-performance systems across multiple environments. The role offers exposure to cloud and bare-metal platforms and the opportunity to support advanced AI workloads powering foundation models worldwide.
Apply now to grow your expertise and contribute to the future of GPU infrastructure and AI computing!
Responsibilities:
- Orchestrate Multi-Tier Support: Manage complex operations for demand-side developers, enterprise clients, and supply-side datacenter operators, each with unique escalation paths.
- Bridge Engineering & Operations: Partner with infra engineers to troubleshoot deep technical issues, driving resolutions and documenting postmortems so problems never repeat.
- Drive Vendor Accountability: Act as the "quarterback" when hardware fails or deliveries are delayed, tracking performance, escalating to suppliers, and negotiating remediation.
- Master SLA Compliance: Define and enforce service standards across all tiers, coordinating incident responses and managing high-stakes customer communications during breaches.
- Architect the Infrastructure: Implement the ticketing systems, on-call rotations, and knowledge bases required to sustain rapid growth.
- Build the Culture: Hire and mentor a team of support professionals, instilling a culture of craftsmanship and proactive customer obsession.
Skills/Must-have:
- Technical B2B Foundation: 4+ years in customer support or operations at a technical company, specifically where uptime and reliability are mission-critical.
- Infrastructure Literacy: You are comfortable discussing APIs, server infrastructure, networking basics, and cloud platforms.
- The "Builder" Mindset: A proven track record of creating support processes from the ground up and scaling them through hyper-growth.
- Strategic Communication: You can translate complex technical failures into clear, actionable updates for non-technical stakeholders.
- Systems Thinking: You don't just "close tickets"; you identify root causes and build permanent solutions.
- Bias Toward Action: You see a gap in a process or a failure in the stack and you take immediate ownership of the fix.
Benefits:
- Bonus 10%
- Stock options
- Greenfield Opportunity
- Work at the Epicentre of AI
- High-Stakes Ownership
Salary:
- $200,000 base per year