Job Description
Our team runs the GPU fleet that serves the models backing ChatGPT and the API. We build automation to provision and manage one of the largest cutting-edge GPU inference fleets in the world, exposing it as a single platform on which other OpenAI teams can seamlessly run production applied AI workloads.
We seek to learn from deployment and distribute the benefits of AI, while ensuring that this powerful tool is used responsibly and safely. Safety is more important to us than unfettered growth.
About the Role
We are looking for an experienced engineering manager to help lead our Fleet Clusters team. You’ll be responsible for building, scaling, and operating the massive GPU fleet clusters that power AI inference and general-purpose training at OpenAI. This role focuses on designing and managing large-scale, high-availability GPU clusters across multiple environments, ensuring reliability, scalability, and efficiency. You will partner closely with product, research, and infrastructure teams to rapidly ship and support advanced AI products at global scale.
In this role, you will:
Manage and build a diverse team of high-performing infrastructure engineers
Guide the roadmap for automation for a fleet that can grow an order of magnitude in size or more
Build a world-class, secure compute fleet that serves users at scale
Set technical direction on evolving our compute and abstractions to support a growing business
Collaborate closely with a broad set of stakeholders, including product engineering, inference, security, research and finance
Work with external partners to unlock bleeding-edge compute and make it available as a turnkey resource for scheduling workloads
Coach and nurture engineers to accelerate their growth and learning
You might thrive in this role if you have:
10+ years of experience in infrastructure software engineering, including 5+ years in engineering management.
Proven track record of building high-performance computing infrastructure teams at scale.
Hands-on experience provisioning bare-metal server data centers interconnected across WANs.
Experience designing and operating hybrid-cloud platforms.
Strong commitment to diversity, equity, and inclusion, with a history of building inclusive teams.
Ownership mentality: willing to pick up new skills and knowledge to solve problems end-to-end. Comfortable being hands-on when needed to help debug systems and support the team.
Ability to operate effectively in fast-paced environments with loosely defined priorities and competing deadlines.
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.
We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.
For US-based candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.
We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.
At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.