Runbook: NLP Lab Sidecar
Use during incidents
The checklists below focus on stabilising the NLP Lab sidecar, which currently hosts the SinaTools SDK models and related datasets. Copy/paste commands as-is. Reference the linked sections for deeper analysis.
First 5 Minutes
# Health probe
curl -fsS "${NLP_LAB_URL:-http://localhost:8000}/health" | jq .
# Inspect logs
docker logs --tail=100 sinatools
# Quick functional test (NER)
curl -fsS "${NLP_LAB_URL:-http://localhost:8000}/ner" \
  -H 'Content-Type: application/json' \
  -d '{"text": "اختبار", "mode": "flat"}' | jq .
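After a restart it can be easier to poll the health probe than to re-run it by hand. The following is a minimal sketch, assuming `/health` answers with a 2xx status once the sidecar is ready; the function name `wait_healthy` and its defaults (10 attempts, 3 seconds apart) are illustrative and not part of the service.

```shell
# Poll the health endpoint until it answers, or give up.
# wait_healthy, the attempt count and the delay are illustrative choices.
wait_healthy() {
  attempts="${1:-10}"   # how many probes to send
  delay="${2:-3}"       # seconds to wait between probes
  url="${NLP_LAB_URL:-http://localhost:8000}/health"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors such as 5xx
    if curl -fsS -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "still unhealthy after $attempts attempt(s)" >&2
  return 1
}
```

Typical use would be `wait_healthy 20 2` straight after a restart; a non-zero exit suggests moving on to the logs.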
Decision Tree (High-Level)
flowchart TD
A[API errors or 5xx] --> B{Health endpoint ok?}
B -- no --> C[Restart sinatools container]:::act
B -- yes --> D{Specific feature failing?}
D -- NER/WSD --> E[Inspect SDK model availability]:::act
D -- Dialect --> F[Verify /app/Nabra mount and CSV integrity]:::act
D -- Morph/Glossary --> G[Check include_* parameters and logs]:::act
classDef act fill:#e6ffed,stroke:#4CAF50;
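The same branches can be written as a small lookup for use in scripts or chat-ops. This is only a sketch: `next_action` and its argument values (`ok`/`down`, lowercase feature names) are hypothetical, and the action strings are copied from the flowchart.

```shell
# Map the two questions in the flowchart to the first action to take.
# next_action and its argument values are illustrative, not an existing tool.
next_action() {
  health="$1"        # "ok" if the health endpoint responds, anything else otherwise
  feature="${2:-}"   # ner, wsd, dialect, morph or glossary
  if [ "$health" != "ok" ]; then
    echo "Restart sinatools container"
    return 0
  fi
  case "$feature" in
    ner|wsd)        echo "Inspect SDK model availability" ;;
    dialect)        echo "Verify /app/Nabra mount and CSV integrity" ;;
    morph|glossary) echo "Check include_* parameters and logs" ;;
    *)              echo "No matching branch in the decision tree" >&2
                    return 1 ;;
  esac
}
```

For example, `next_action ok dialect` prints the Nabra mount check, while `next_action down` always prints the restart step.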
Degradation Patterns
- Model warmup latency: When SINA_WARM=all, initial requests may hang if weights are missing; confirm the SinaTools SDK bundle is present and reinstall if necessary.
- Missing gloss data: /wsd?include_gloss=true returning null gloss values indicates glosses_dic failed to load; rebuild image or re-run SDK download script.
- Dialect lookup failures: 404 with suggestion indicates the Nabra dataset cannot find a close match. If this happens for known sentences, verify /app/Nabra CSVs and inspect logs for NabraDataMissingError.
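For the dialect case, the CSV check can be scripted. A minimal sketch, assuming the dataset files carry a `.csv` extension directly under the mount; `check_csv_dir` is an illustrative helper, and `/app/Nabra` is only the default taken from the text above.

```shell
# Report any CSV under a directory that is unreadable or empty.
# check_csv_dir is an illustrative helper, not part of the SDK.
check_csv_dir() {
  dir="${1:-/app/Nabra}"
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir"
    return 1
  fi
  rc=0
  found=0
  for f in "$dir"/*.csv; do
    [ -e "$f" ] || continue          # the glob matched nothing
    found=1
    if [ ! -r "$f" ]; then
      echo "unreadable: $f"; rc=1
    elif [ ! -s "$f" ]; then
      echo "empty: $f"; rc=1
    else
      echo "ok: $f ($(wc -l < "$f" | tr -d ' ') lines)"
    fi
  done
  if [ "$found" -eq 0 ]; then
    echo "no CSV files in $dir"
    rc=1
  fi
  return "$rc"
}
```

The function has to run where the mount is visible. Inside the container that would mean pasting it into a `docker exec -it sinatools sh` session, assuming the image ships a shell.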
Operational Commands
Restart Service
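A minimal sketch of a restart, assuming the container is named `sinatools` as in the `docker logs` and `docker exec` commands elsewhere in this runbook. `restart_sidecar` and the `SIDECAR_CONTAINER` override are hypothetical; a Compose-managed deployment may prefer `docker compose restart sinatools`.

```shell
# Restart the sidecar container by name.
# Assumption: the container is called "sinatools"; SIDECAR_CONTAINER is a
# hypothetical override for deployments that name it differently.
restart_sidecar() {
  docker restart "${SIDECAR_CONTAINER:-sinatools}"
}
```

Follow the restart with the health probe from the first-five-minutes checklist to confirm the service came back.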
Tail Logs
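One way to follow the logs live, sketched under the same assumption that the container is named `sinatools`. `tail_sidecar_logs` and `SIDECAR_CONTAINER` are hypothetical names; `--follow` and `--tail` are standard `docker logs` options.

```shell
# Follow the sidecar logs, starting from the most recent lines.
# Assumption: container name "sinatools"; tail_sidecar_logs is an illustrative wrapper.
tail_sidecar_logs() {
  lines="${1:-100}"   # how much history to print before following
  docker logs --follow --tail="$lines" "${SIDECAR_CONTAINER:-sinatools}"
}
```

`tail_sidecar_logs 500` prints more history before following. Press Ctrl-C to stop.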
Run Tests In-Container
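A sketch of an in-container test run. It assumes pytest is installed in the image and that the tests are discoverable from the container's working directory; neither is confirmed by this runbook. `run_sidecar_tests` and `SIDECAR_CONTAINER` are hypothetical names.

```shell
# Run the test suite inside the running container.
# Assumptions: pytest is available in the image and tests are discoverable
# from the container's working directory. Extra arguments go to pytest.
run_sidecar_tests() {
  docker exec "${SIDECAR_CONTAINER:-sinatools}" python3 -m pytest -q "$@"
}
```

Extra arguments are passed through, so `run_sidecar_tests -k ner` would select only tests whose names match `ner`.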
Regenerate OpenAPI
docker exec sinatools python3 -c "from src.app import create_app; import json, pathlib; pathlib.Path('/tmp/openapi.json').write_text(json.dumps(create_app().openapi()))"
docker cp sinatools:/tmp/openapi.json sinatools/openapi.json
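Before committing the regenerated file, a quick sanity check can catch an empty or truncated copy. The sketch below assumes the spec is an OpenAPI 3.x document with top-level `openapi` and `paths` keys; `check_openapi` is an illustrative helper that relies on `jq`.

```shell
# Check that a regenerated spec is valid JSON with the top-level keys an
# OpenAPI document normally carries. check_openapi is an illustrative helper.
check_openapi() {
  spec="${1:-sinatools/openapi.json}"
  if [ ! -s "$spec" ]; then
    echo "missing or empty: $spec"
    return 1
  fi
  # jq -e exits non-zero when the expression is false or the input is not JSON
  if jq -e 'has("openapi") and has("paths")' "$spec" >/dev/null 2>&1; then
    echo "ok: $(jq -r '.paths | length' "$spec") path(s) in $spec"
  else
    echo "not a usable OpenAPI document: $spec"
    return 1
  fi
}
```

A sudden drop in the reported path count compared with the previous commit is worth a second look before pushing.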
Troubleshooting Playbooks
- 5xx Responses:
  - Check container logs for tracebacks.
  - Run health checks and a sample request.
  - If SDK import errors appear, rebuild the image (docker compose build sinatools).
- Glossary returning empty matches:
  - Confirm Nabra-dataset.csv exists and is readable.
  - Recreate the lexicon cache by bouncing the container.
  - Verify similarity_threshold is not set too high.
- Inconsistent morphology results:
  - Ensure lemma metadata was requested with include_lemma=true.
  - Check the logs for ValueError conversions; these were fixed in the modular build, so redeploy if an old image is in use.
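The log checks in these playbooks can be turned into a first-pass scan. This is a sketch only: `suggest_from_logs` is a hypothetical helper, and the patterns (Python's `ImportError`/`ModuleNotFoundError`, `NabraDataMissingError`, `ValueError`) are assumptions about what the sidecar actually writes to its logs.

```shell
# Read log text on stdin and print one suggested action per matching symptom.
# suggest_from_logs and its patterns are illustrative; tune them to the
# messages the sidecar really emits. Exits non-zero when nothing matched.
suggest_from_logs() {
  logs="$(cat)"
  hit=1
  if printf '%s\n' "$logs" | grep -Eq 'ImportError|ModuleNotFoundError'; then
    echo "SDK import error: rebuild with 'docker compose build sinatools'"
    hit=0
  fi
  if printf '%s\n' "$logs" | grep -q 'NabraDataMissingError'; then
    echo "Nabra data missing: verify /app/Nabra CSVs"
    hit=0
  fi
  if printf '%s\n' "$logs" | grep -q 'ValueError'; then
    echo "ValueError in logs: redeploy if an old image is in use"
    hit=0
  fi
  # Only flag a bare traceback when none of the known causes matched.
  if [ "$hit" -ne 0 ] && printf '%s\n' "$logs" | grep -q 'Traceback'; then
    echo "Unclassified traceback: escalate with logs"
    hit=0
  fi
  return "$hit"
}
```

A typical invocation would be `docker logs --tail=500 sinatools 2>&1 | suggest_from_logs`. The `2>&1` matters because `docker logs` sends the container's stderr to its own stderr.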
Escalation
- Primary: NLP Lab on-call (#nlp-lab Slack channel).
- Secondary: Platform SRE if container-level issues persist.
- Provide docker logs, failing requests, and current compose hash when escalating.
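Gathering these artifacts can be scripted so nothing is forgotten under pressure. The sketch below reads "compose hash" as a SHA-256 checksum of the compose file, which is an interpretation rather than something this runbook defines. `collect_escalation_bundle`, `COMPOSE_FILE_PATH` and `SIDECAR_CONTAINER` are hypothetical names, and failing requests still have to be saved by hand.

```shell
# Collect recent logs and a checksum of the compose file into one directory.
# Assumptions: "compose hash" means a SHA-256 of the compose file, and the
# file is ./docker-compose.yml unless COMPOSE_FILE_PATH says otherwise.
collect_escalation_bundle() {
  out="${1:-./escalation-$(date +%Y%m%dT%H%M%S)}"
  compose="${COMPOSE_FILE_PATH:-docker-compose.yml}"
  mkdir -p "$out" || return 1
  # Container stdout and stderr both end up in docker.log
  docker logs --tail=1000 "${SIDECAR_CONTAINER:-sinatools}" > "$out/docker.log" 2>&1
  if [ ! -r "$compose" ]; then
    echo "compose file not found: $compose" > "$out/compose.sha256"
  elif command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$compose" > "$out/compose.sha256"
  else
    shasum -a 256 "$compose" > "$out/compose.sha256"   # macOS fallback
  fi
  echo "$out"
}
```

The function prints the bundle directory, so `tar czf bundle.tgz "$(collect_escalation_bundle)"` yields a single file to attach in the channel.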