Principal Manager, Product Management

Microsoft
$163,000.00 - $296,400.00 / yr
United States, Washington, Redmond
Apr 02, 2026
Overview

Microsoft is building the world's most trusted AI platform. To do that at global scale, we need first-class observability for AI systems: deep, real-time insight into model and agent behavior, safety, quality, performance, cost, and customer experience across Copilot, M365, Azure AI Foundry, and first-party services. We are hiring a Principal Manager, Product Management who will define and lead the product strategy for Microsoft's AI Foundry Observability platform - the telemetry, evaluations, and controls that make AI reliable and accountable by default.

Scope & impact

  • Own the vision, long-term roadmap, and cross-company execution for AI observability, spanning data/telemetry, model & prompt evaluations, distributed tracing, SLOs, incident response, and cost/efficiency analytics across Microsoft's AI surfaces.
  • Orchestrate company-wide adoption: unify patterns and platforms so product teams instrument once and get metrics/logs/traces, online/offline evals, safety dashboards, and alerting everywhere.
  • Drive safety and reliability outcomes by fusing model telemetry with red-team results, policy violations, drift, and regressions; partner with Safety, Responsible AI, Security, and SRE to make issues visible and actionable.
  • Represent the product with executives and external partners; influence standards (e.g., OpenTelemetry-based LLM traces), regulatory readiness, and customer trust for Microsoft AI.
  • Partner closely with Responsible AI (RAI) teams to embed ethical, safety, and compliance standards into observability tooling and telemetry pipelines. Ensure that observability platforms support policy enforcement, risk detection, and regulatory readiness across all AI surfaces.
  • Collaborate with management plane partners to integrate observability controls, configuration, and governance at scale. Drive unified management experiences for telemetry, evaluation, and incident response, enabling seamless operations and compliance across customers' AI solutions.


Responsibilities

Define & deliver the platform

  • Define and refine the north star for Foundry's Observability Platform. Create clear success metrics and deliver on a grounded plan to achieve industry-wide adoption.
  • Lead a high-impact PM team, attracting and growing top talent in AI observability.
  • Model leadership expectations: set vision, drive cross-company clarity, and create a culture of accountability and inclusion.
  • Amplify impact by developing frameworks and standards that scale across Microsoft engineering teams.
  • Ship unified pipelines for metrics, logs, traces, and events across training, fine-tuning, inference, and retrieval; enable real-time tracing from user signal → prompt → tool calls → model responses → post-processing.
  • Build, and collaborate with teams that build, evaluation systems for quality & safety (hallucination, groundedness, toxicity, bias, jailbreak attempts) and online guardrails with policy-aware alerting.
  • Provide experience-level SLOs (latency, reliability, quality) and cost/perf analytics for model and prompt variants; support A/B and interleaving experiments.

Drive adoption & standards

  • Publish SDKs/instrumentation patterns and default dashboards for engineers, data scientists, applied researchers, and on-call responders.
  • Develop frameworks and standards for integrating agentic workflows into CI/CD pipelines, empowering teams to leverage AI-driven automation for testing, evaluation, and remediation.
  • Land design reviews and program increments with M365, Windows, Azure AI, Gaming, and Security to ensure consistency and measurable impact.

Operate at hyperscale

  • Partner with SRE to automate anomaly detection, causal analysis, and incident response playbooks.
  • Integrate with security observability for end-to-end risk posture and compliance reporting.
  • Foster a culture of continuous improvement by leveraging observability insights to guide agentic interventions, optimize developer experience, and drive measurable business outcomes.

Voice of customer & business

  • Engage flagship customers and ISVs to validate scenarios (regulated, sovereign, enterprise); define SLAs/SLOs and report health in exec forums.


Qualifications

Required/Minimum Qualifications:

  • Bachelor's Degree AND 10+ years of experience in product/service/program management or software development
    • OR equivalent experience.
  • 3+ years people management experience.

Other Requirements:

The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred/Additional Qualifications:

  • 10+ years building platform or infrastructure products in cloud/AI, including observability, reliability, or safety domains; proven success driving org-wide adoption and measurable reliability/quality outcomes at scale.
  • 8+ years managing a team at a high-scale technology company.
  • Demonstrated ability to set strategy and ship platforms that serve thousands of engineers, with API/SDK intuition and data platform literacy (streaming, time-series, OLAP).
  • Expertise across metrics/logs/traces, LLM/ML evaluation, A/B testing, drift detection, and incident management.
  • Fluency in AI system architecture (RAG, tool use, fine-tuning, inference optimization) and governance/safety considerations.
  • Executive presence: can align CVPs/Partners and lead cross-org programs; exceptional written narratives and crisp data storytelling.
  • Experience with OpenTelemetry, distributed tracing for LLM calls, safety signal taxonomies, and privacy/security controls in telemetry.
  • Built/led platforms used by multiple product lines (consumer + enterprise) with strict SLO/SLA contracts.
  • Familiarity with Azure AI stack and Microsoft's engineering rhythms.
  • Background partnering with GTM/Customer Success to close the loop from telemetry to product improvements.


Product Management M6 - The typical base pay range for this role across the U.S. is USD $163,000 - $296,400 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $220,800 - $331,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
https://careers.microsoft.com/us/en/us-corporate-pay

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.
