Data Quality & Infrastructure for Copilot
Software Engineer @ Microsoft · 2021–Present
The Problem
Millions of people use Microsoft Copilot every day. Behind every AI-generated response is a pipeline of signals that needs to be accurate, fast, and reliable — because when the data is wrong, users get wrong answers. As Copilot scaled rapidly, there was no systematic way to validate the billions of daily signals flowing into evaluation pipelines. Regressions could silently degrade the experience for end users, and detection was manual and slow.
What I Did
I became the primary engineer for the data quality layer of the Copilot data ecosystem. This wasn't a single project — it was the intersection of three problems: building reliable distributed data infrastructure, creating automated validation frameworks for LLM signals, and designing scalable APIs that other teams could build on.
- Designed and built data validation systems processing billions of daily signals, using C# (.NET) and U-SQL to catch quality regressions before they could reach model evaluation
- Architected distributed infrastructure across Azure services, ensuring petabyte-scale data durability and consistency
- Built automated regression detection frameworks that replaced manual checks — reducing detection time by 60%
- Designed scalable APIs for internal platform services, including high-security identity management with complex permission hierarchies and high-concurrency access patterns
- Created reusable tooling and patterns that were adopted as the standard across multiple teams in the Copilot ecosystem
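The regression-detection pattern above can be sketched roughly as follows: compare each day's aggregate signal statistics against a baseline and flag anomalies automatically instead of eyeballing dashboards. This is an illustrative Python sketch (the production systems were built in C# and U-SQL), and every name and threshold here (`SignalStats`, `detect_regressions`, the 30% volume drop) is hypothetical, not the actual implementation.

```python
from dataclasses import dataclass

@dataclass
class SignalStats:
    # Hypothetical daily aggregate for one signal stream.
    name: str
    volume: int          # events observed that day
    null_rate: float     # fraction of events missing required fields

def detect_regressions(baseline: dict[str, SignalStats],
                       today: dict[str, SignalStats],
                       volume_drop_pct: float = 0.3,
                       null_rate_delta: float = 0.05) -> list[str]:
    """Flag signals whose volume or null rate moved past a threshold
    relative to the baseline day. Thresholds are illustrative."""
    alerts = []
    for name, base in baseline.items():
        cur = today.get(name)
        if cur is None:
            alerts.append(f"{name}: signal missing entirely")
            continue
        if base.volume and (base.volume - cur.volume) / base.volume > volume_drop_pct:
            alerts.append(f"{name}: volume dropped {base.volume} -> {cur.volume}")
        if cur.null_rate - base.null_rate > null_rate_delta:
            alerts.append(f"{name}: null rate rose {base.null_rate:.2%} -> {cur.null_rate:.2%}")
    return alerts
```

The key design point the sketch captures: detection runs on every batch with no human in the loop, which is what turns a manual audit into an automated gate.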
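The identity-management API mentioned above involved resolving permissions over nested scopes. One common way to model that — shown here purely as an assumption, not the actual design — is to let a grant at a parent scope authorize all child scopes, then walk from the resource up to the root at check time. A minimal Python sketch with hypothetical names:

```python
def is_allowed(grants: dict[str, set[str]], principal: str, resource: str) -> bool:
    """Walk from the resource up toward the root; a grant at any ancestor
    scope authorizes the principal for everything beneath it.
    `grants` maps a scope path to the set of principals granted there."""
    parts = resource.strip("/").split("/")
    for depth in range(len(parts), 0, -1):
        scope = "/" + "/".join(parts[:depth])
        if principal in grants.get(scope, set()):
            return True
    # Fall back to a root-level grant.
    return principal in grants.get("/", set())
```

Under this model a check is O(depth) lookups, which keeps hot-path authorization cheap even under high concurrency — the property the bullet above is pointing at.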
The Hard Parts
The interesting challenge wasn't any single technical problem — it was that data quality at this scale is fundamentally a user-facing problem disguised as a systems design problem. A silent regression in signal validation doesn't just show up in a dashboard — it shows up as worse answers for real people using Copilot. You can't just add validation checks; you need to design pipelines where quality is observable by default, regressions are caught before they reach users, and the whole system degrades gracefully when upstream signals change unexpectedly.
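One concrete form of "degrades gracefully" is partitioning instead of failing: when upstream records change shape, quarantine the malformed ones with a reason attached and keep the run alive, rather than aborting the whole pipeline. A minimal Python sketch, with hypothetical field names:

```python
def partition_batch(records, required_fields=("id", "timestamp", "score")):
    """Split a batch into (valid, quarantined) instead of failing the
    whole run when upstream records change shape unexpectedly.
    Field names are illustrative."""
    valid, quarantined = [], []
    for rec in records:
        missing = [f for f in required_fields if f not in rec]
        if missing:
            # Keep the record and the reason so the regression is observable,
            # not silently dropped.
            quarantined.append({"record": rec, "missing": missing})
        else:
            valid.append(rec)
    return valid, quarantined
```

The quarantine side feeds the same detection machinery as everything else, so a sudden spike in quarantined records is itself an alertable quality signal.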
Impact
- Reduced regression detection time by 60% through automated validation
- Built the data quality standard adopted across the Copilot ecosystem
- Infrastructure serves billions of daily events with petabyte-scale durability
- Directly improved the reliability of AI outputs for millions of Copilot users