Ajay Yadav · Open to AI PM roles

I ship AI products people actually use

Not demos. Real products, real users, real outcomes.

4 yrs · Innovaccer · Shopify · Freshworks

AI feature types shipped

12+ hospital networks

11% retention lift · Shopify

AI work

Every layer of the AI stack — shipped

Not just roadmaps. Production features with real adoption metrics.

Predictive ML

Risk stratification models

Defined model objectives, success metrics, and eval criteria for patient risk models used across 12+ hospital networks. Built the clinician feedback loop that turned model output into trusted, actionable recommendations.

Risk modeling · Eval criteria · Clinical adoption

NLP

Clinical notes and claims extraction

Wrote specs for NLP pipelines processing unstructured clinical text and claims data. Defined entity extraction requirements and precision targets in a HIPAA-regulated environment.

Clinical NLP · Entity extraction · HIPAA

AI UX

Alerting and recommendation design

Designed the recommendation UX so clinicians understood why the AI flagged a patient. Reduced alert fatigue while increasing care team follow-through on high-risk cases.

Alert design · Trust & adoption · AI UX

Generative AI

LLM summaries and copilot features

Shipped LLM-powered clinical summaries — owned the hallucination policy, eval rubric, and phased rollout across regulated hospital environments. First LLM feature in production at Innovaccer.

LLM · Hallucination guardrails · Prompt design

Case studies

How I think, decide, and ship

The process behind the product, including what I got wrong.

Innovaccer · Product Manager · 2022–present

How do you get clinicians to trust an AI recommendation enough to act on it?

The situation

Led the 0-to-1 launch of a population health analytics module for enterprise hospital networks. The module surfaced AI-generated patient risk scores — but early pilots showed clinicians were seeing alerts and ignoring them.

What I discovered

Interviews with care coordinators revealed the issue was not model accuracy — it was explainability. Clinicians could not trust a score they could not interrogate. The UI had no answer to: why is this patient high risk?

The decision

Prioritized an explainability layer over building more alert types. Each risk score now showed contributing factors ranked by weight. Drove cross-functional delivery across engineering, clinical SMEs, and GTM while maintaining HIPAA compliance and enterprise scalability.
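Ranking contributing factors by weight can be illustrated with a short Python sketch. This is only an illustration of the idea, not the production logic: the function name `top_factors`, the factor names, and the contribution values are all hypothetical, and the sketch assumes the model already exposes a signed per-factor contribution for each patient.

```python
def top_factors(contributions, k=3):
    """Return the k factors with the largest absolute contribution.

    `contributions` maps a factor name to its signed contribution to the
    risk score (hypothetical input; the real model output may differ).
    """
    ranked = sorted(
        contributions.items(),
        key=lambda item: abs(item[1]),  # rank by magnitude, keep the sign
        reverse=True,
    )
    return ranked[:k]


# Hypothetical example values, for illustration only.
example = {
    "recent_admissions": 0.42,
    "age": 0.10,
    "missed_appointments": 0.27,
    "medication_adherence": -0.31,
}
for name, weight in top_factors(example):
    print(f"{name}: {weight:+.2f}")
```

Keeping the sign lets the UI distinguish factors that raise risk from factors that lower it, which is part of answering "why is this patient high risk?".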

The outcome

Improved care-coordination efficiency by 13%. Accelerated compliant feature releases by 9%. Module became the anchor product in Innovaccer's enterprise GTM motion, deployed across 12+ hospital networks.

AI explainability · 0-to-1 · Enterprise GTM · Clinical adoption · HIPAA

Shopify · Product Manager · 2020–2023

72% of failed payment authorizations were not being recovered. How do you fix a silent revenue leak merchants do not even know exists?

The situation

Owned the subscriptions and recurring payments roadmap for SMB merchants. On the surface, merchants wanted easier subscription setup. Digging into the data revealed a much bigger problem hiding underneath.

What I discovered

Behavioral analysis showed 72% of failed authorization attempts were never retried. Merchants had no visibility into this — they were losing subscribers silently with no way to diagnose or act on it.

The decision

Redesigned retry logic and billing rules before touching onboarding. Counterintuitive call — merchants asked for setup improvements, but the data said the real money was in recovery. Validated via A/B tests using Amplitude and LaunchDarkly.

The outcome

Increased merchant retention by 11% within two quarters. Reduced time-to-first-transaction by 18% for newly onboarded merchants. Drove an 8% increase in merchant expansion revenue through pricing tier experimentation.

Payments · Retention · A/B testing · Data-driven · SMB

CompeteIQ · Builder + PM · 2024

Can you build a competitive intelligence tool that updates itself — so teams stop finding out about competitor moves from customers?

The situation

Built at OpenClaw Hackathon at AWS Builder Loft SF. Most competitive intel tools are expensive or require manual updates. Web scraping plus LLM synthesis could automate 80% of the work — if you solved the signal vs noise problem.

What I discovered

The real problem was not data collection — it was the diff. Raw scrape data was overwhelming. The value was in what changed since last week, not everything that exists.

The decision

Built a before/after diff engine as the core feature. Used Apify for scraping, Redis for caching previous states, and the Claude API for synthesizing what each change meant in plain English. Shipped a working demo in one day.
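The core of a before/after diff fits in a few lines of Python. This is an illustrative sketch, not the hackathon code: a plain dict stands in for the Redis cache, the page text is passed in directly rather than fetched through Apify, and the name `diff_snapshot` is hypothetical. The changed lines it returns are what would then be handed to an LLM for a plain-English summary.

```python
import difflib


def diff_snapshot(cache, url, new_text):
    """Return lines added and removed since the last snapshot of `url`.

    `cache` is any dict-like store of previous page text
    (a stand-in for Redis in this sketch).
    """
    old_lines = cache.get(url, "").splitlines()
    new_lines = new_text.splitlines()

    added, removed = [], []
    for line in difflib.ndiff(old_lines, new_lines):
        if line.startswith("+ "):
            added.append(line[2:])
        elif line.startswith("- "):
            removed.append(line[2:])
        # Unchanged lines and "? " hint lines are ignored.

    cache[url] = new_text  # store the new state for the next run
    return {"added": added, "removed": removed}


cache = {}
diff_snapshot(cache, "https://example.com/pricing", "Starter: $10\nPro: $49")
changes = diff_snapshot(cache, "https://example.com/pricing", "Starter: $10\nPro: $59")
print(changes)
```

Only the delta is passed downstream, which is the point of the design: the summary step sees what changed since the last run, not the whole page.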

The outcome

Presented live at AWS Builder Loft SF. Real-time change detection across 10 competitor sites. Post-hackathon: iterating on a full dashboard UI, Slack alert integration, and automated weekly digests.

Claude API · Apify · Redis · LLM synthesis · 0-to-1

Deep Dives

Product problems I've gone deep on

From raw data to working prototype — scoping, analyzing, and building toward a real product decision.

Outcome Health · Voice AI · Observability

1,048 voice AI calls. 0.1% goal completion rate. The operator had no visibility into why.

The situation

Voice AI platforms give operators rich conversation logs but no outcome-level observability. An AI agent can run thousands of calls, sound coherent in every one, and still fail its primary goal — with no dashboard to surface that failure.

What I discovered

The biggest gap wasn't conversation quality — it was outcome visibility. Of 97 callers who showed explicit booking intent across 1,048 calls, only 1 confirmed demo was booked. The booking action was never wired. No existing surface showed this.
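The funnel arithmetic behind those figures can be checked in a few lines of Python. The three counts come from the analysis above; the function and key names are illustrative only.

```python
def funnel_rates(total_calls, intent_calls, booked):
    """Compute stage-to-stage rates for a simple three-step goal funnel."""
    return {
        "intent_rate": intent_calls / total_calls,    # callers showing booking intent
        "completion_rate": booked / total_calls,      # goal completion across all calls
        "intent_to_booking": booked / intent_calls,   # conversion among intent callers
    }


rates = funnel_rates(total_calls=1048, intent_calls=97, booked=1)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
# intent_rate: 9.3%
# completion_rate: 0.1%
# intent_to_booking: 1.0%
```

The drop from 9.3% intent to 0.1% completion is the gap a goal funnel panel is meant to make visible.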

The decision

Designed 'Outcome Health' — a Conversation Intelligence dashboard with 5 panels: Goal Funnel, Qualification Scorecard, Risk Monitor, Failure Mode Analysis, and Suggested Next 3 Fixes. Built a working HTML prototype to demonstrate the product end-to-end.

The outcome

Five failure categories surfaced: goal completion gap, BANT qualification at 2.46/7, hallucination rate at 21.5%, template variable errors including one confirmed lost prospect, and 125 wasted calls. Phase 1 roadmap: 4 weeks, zero new ML.

Python · pandas · Data analysis · Product strategy · HTML prototype

Selected work

Things I've built

Side projects, hackathon builds, and academic work.

CompeteIQ

AI competitive intelligence, built in a day

AI Tool

CineMatch

Movie discovery with taste-based matching

Consumer

CArbi AI

Carb counting for Arabic-speaking diabetics

Healthcare AI

Aidly

Healthcare equity for underserved communities

Social Impact

Spin the Meal

AI meal planning to kill decision fatigue

Consumer AI

How I think

My AI PM principles

Things I have learned the hard way shipping AI in production.

Trust is the product

In AI, model accuracy matters less than whether a user trusts and acts on the output. I design the feedback loop before I design the feature.

Eval criteria before engineering

I write the success rubric — precision, recall, latency, hallucination rate — before a single model is trained. If you cannot measure it, you cannot ship it responsibly.
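As an illustration of what writing the rubric first can look like, here is a minimal Python sketch that computes precision and recall from labeled outcomes and checks them against targets. The target values and the names `RUBRIC` and `evaluate` are hypothetical, chosen for the example; they are not the thresholds used in any product described here.

```python
RUBRIC = {"precision": 0.90, "recall": 0.80}  # hypothetical targets


def evaluate(predicted, actual, rubric=RUBRIC):
    """Score binary predictions against labels and check them against a rubric."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)       # flagged and truly positive
    fp = sum(1 for p, a in pairs if p and not a)   # flagged but negative
    fn = sum(1 for p, a in pairs if not p and a)   # missed positives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    scores = {"precision": precision, "recall": recall}

    # The release gate: every metric named in the rubric must meet its target.
    passed = all(scores[name] >= target for name, target in rubric.items())
    return scores, passed


scores, passed = evaluate(predicted=[1, 1, 1, 0, 0], actual=[1, 1, 0, 1, 0])
print(scores, passed)
```

Latency and hallucination rate would need their own measurements; the same pattern applies, with each metric compared to a target agreed before training starts.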

Adoption is a product problem

Low AI adoption almost always traces back to UX, not model quality. I instrument the gap between recommendation and action — that gap is the real product surface.

Get in touch

Let's build something

Open to AI PM roles across enterprise, consumer, and developer tools. Based in the Bay Area.
