Question
Your product-search service embeds the query with model M and does ANN retrieval against a vector index of item embeddings. Overnight you upgraded the embedding model from M to M2 and rebuilt the item index with M2. This morning relevance is terrible: searches return near-random items, recall@10 cratered, though the service is fast and returns 200s. Dashboards: the query-embedding service was NOT redeployed and still encodes queries with M, while the item index now holds M2 vectors; the index version tag reads 'M2' but the query path's model version reads 'M'. No latency or error anomaly. How do you triage and fix?
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.