title: Runbook: Downstream Dependency Failure description: Diagnose and resolve failures talking to OpenSearch or Backend API (hydration). icon: material/server-network-off
Downstream Dependency Failure¶
Impact varies by dependency
- OpenSearch down →
/retrieve,/retrieve_packfail (critical). - Backend API down →
/retrieve_packdegrades (no hydration),/retrieveis fine.
Triage (≤5 minutes)¶
-
Scan logs for connection errors
-
Manual connectivity from container
-
Resolve name/DNS issues
-
Validate env
Remediation¶
- Fix OS service first (check its runbook). AI-Box cannot serve search without it.
- Verify
OPENSEARCH_URLpoints at the right host/port.
- AI-Box should still serve;
/retrieve_packreturns null metadata fields. - Verify
API_REST_BASEand token; consider turning off hydration temporarily.
Post-incident¶
- Add quick dep probes to
/health(cached) to surface degraded state. - Implement a circuit breaker for hydration to reduce noise when API is down.