Senior Technical Program Manager
Company: Microsoft
Location: San Jose
Posted on: May 24, 2025
Job Description:
Overview:
We are on a mission to create the leading AI experience
using the world's most capable AI frontier models, enabling
human-level intelligence and beyond. Building this experience will
require us to have a seamless platform that will span training and
inference across one of the world's foremost GPU clusters,
pushing the boundaries of scale, performance, and reliability.
The AI Product Acceleration team at Microsoft AI needs to closely track
all aspects of infrastructure, including cluster acquisition,
capacity deployment, large-scale GPU buildout, high-speed fabric
buildout and validation, benchmarking, and performance
optimizations to support and accelerate our model pre-training,
post-training fine-tuning, and inferencing operations. We are an
interdisciplinary team of engineers and scientists, learning from
each other, and collaborating to create the best models, methods
and products. We work closely with the teams that transform
pre-trained models into the models that power the consumer Copilot
experience.
We are looking for an outstanding Member of Technical
Staff, AI Product Acceleration Technical Program Manager who will
be the driving force behind all the activities mentioned above,
tracking and managing the priorities across the various engineering
teams delivering both hardware capacity and software components for
pretraining. The individual should be excited and proud about
contributing to the next generation of systems that will transform
the field. We are looking for candidates who:
- Are passionate about managing high-stakes, time-sensitive,
large-scale programs.
- Are keenly aware of, and able to manage, the timelines and schedules
of mission-critical cluster hardware, software, and services.
- Have experience navigating the timelines of datacenter
capacity deployments involving thousands of GPUs to enable
large-scale AI model training or inference clusters.
- Will thrive in a highly collaborative, fast-paced
environment.
- Have a high degree of craftsmanship and pay close attention to
details.
- Demonstrate a proactive attitude and enthusiasm for exploring
new methods and technologies.
- Effectively manage multiple responsibilities and can adjust to
shifting priorities.
Microsoft's mission is to empower every person
and every organization on the planet to achieve more. As employees
we come together with a growth mindset, innovate to empower others,
and collaborate to realize our shared goals. Each day we build on
our values of respect, integrity, and accountability to create a
culture of inclusion where everyone can thrive at work and
beyond.
By applying to this U.S. Mountain View, CA OR Redmond, WA
position, you are required to be local to the San Francisco OR
Seattle area and in office 3 days a week.
Responsibilities:
- Deeply understand, track and manage the timelines of datacenter
construction and bring-up.
- Deeply understand, track and manage node, rack and cluster
validation processes.
- Hold execution-focused meetings with various stakeholders to
accelerate GPU delivery timelines.
- Track and manage the capacity deployment, validation and
benchmarking of AI supercomputers.
- Collaborate with the product team and other engineers and
researchers across Microsoft and other vendors to identify gaps and
drive timelines towards resolutions and mitigations.
- Work with a cross-disciplined crew across design, research,
engineering, and data analysis to deliver a high-quality product
and evaluate success towards business goals.
- Embody our and .
Qualifications:
Required Qualifications:
- Bachelor's Degree AND 4+ years experience in engineering,
product/technical program management, data analysis, or product
development.
- OR equivalent experience.
- 2+ years of experience managing cross-functional and/or
cross-team projects.
Preferred Qualifications:
- Bachelor's Degree AND 6+ years experience in engineering,
product/technical program management, data analysis, or product
development.
- OR equivalent experience.
- 6+ years experience managing cross-functional and/or cross-team
projects.
- 1+ year(s) experience reading and/or writing code (e.g., sample
documentation, product demos).
- 2+ years of tracking and managing data center bring-up.
- 1+ years of tracking and managing capacity deployment,
validation and benchmarking of GPU clusters for AI
training.
Technical Program Management IC4 - The typical base pay
range for this role across the U.S. is USD $117,200 - $229,200 per
year. There is a different range applicable to specific work
locations, within the San Francisco Bay area and New York City
metropolitan area, and the base pay range for this role in those
locations is USD $153,600 - $250,200 per year.
Certain roles may be eligible for benefits and other compensation.
Find additional benefits and pay information .
Microsoft will accept applications and process offers for these
roles on an ongoing basis.
#Copilot
#MicrosoftAI