
Runbook — NLP Lab Sidecar

Use during incidents

The checklists below focus on stabilising the NLP Lab sidecar, which currently hosts the SinaTools SDK models and related datasets. Copy/paste commands as-is. Reference the linked sections for deeper analysis.

First 5 Minutes

# Health probe
curl -fsS "${NLP_LAB_URL:-http://localhost:8000}/health" | jq .

# Inspect logs
docker logs --tail=100 sinatools

# Quick functional test (NER)
curl -fsS "${NLP_LAB_URL:-http://localhost:8000}/ner" \
  -H 'Content-Type: application/json' \
  -d '{"text": "اختبار", "mode": "flat"}' | jq

Decision Tree (High-Level)

flowchart TD
  A[API errors or 5xx] --> B{Health endpoint ok?}
  B -- no --> C[Restart sinatools container]:::act
  B -- yes --> D{Specific feature failing?}
  D -- NER/WSD --> E[Inspect SDK model availability]:::act
  D -- Dialect --> F[Verify /app/Nabra mount and CSV integrity]:::act
  D -- Morph/Glossary --> G[Check include_* parameters and logs]:::act
classDef act fill:#e6ffed,stroke:#4CAF50;
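The same tree can be written as a small shell helper for use in triage scripts. This is a minimal sketch: the function name `next_action`, its argument values, and the fallback branch for an unrecognised feature are assumptions made for illustration and are not part of the sidecar.

```shell
# Map the decision tree to a suggested action.
# $1: "yes" if the health endpoint responds, anything else otherwise.
# $2: failing feature (ner, wsd, dialect, morph, glossary); hypothetical labels.
next_action() (
  health_ok=$1
  feature=${2:-}
  if [ "$health_ok" != "yes" ]; then
    echo "Restart sinatools container"
    return 0
  fi
  case "$feature" in
    ner|wsd)        echo "Inspect SDK model availability" ;;
    dialect)        echo "Verify /app/Nabra mount and CSV integrity" ;;
    morph|glossary) echo "Check include_* parameters and logs" ;;
    *)
      # Not covered by the tree; this fallback is an assumption.
      echo "No specific feature identified; check container logs"
      return 1
      ;;
  esac
)
```

For example, `next_action yes dialect` prints the dialect branch of the tree, and `next_action no` always recommends a restart regardless of the feature.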

Degradation Patterns

  • Model warmup latency: When SINA_WARM=all, initial requests may hang if weights are missing; confirm the SinaTools SDK bundle is present and reinstall if necessary.
  • Missing gloss data: If /wsd?include_gloss=true returns null gloss values, glosses_dic failed to load; rebuild the image or re-run the SDK download script.
  • Dialect lookup failures: A 404 with a suggestion means no close match was found in the Nabra dataset. If this happens for known sentences, verify the /app/Nabra CSVs and inspect the logs for NabraDataMissingError.
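Because warmup can make the first requests hang, it may help to bound each probe with a timeout and retry a few times before treating the service as down. The sketch below assumes the /health endpoint from the health probe above; the helper names `retry` and `probe_health`, the 5-second timeout, and the retry counts are illustrative choices, not values taken from the service.

```shell
# Run a command up to $1 times, sleeping $2 seconds between failed attempts.
retry() (
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  return 1
)

# One bounded health probe; --max-time keeps a hung warmup from blocking the shell.
probe_health() {
  curl -fsS --max-time 5 "${NLP_LAB_URL:-http://localhost:8000}/health" >/dev/null
}

# Usage (illustrative values): six attempts, ten seconds apart.
# retry 6 10 probe_health || echo "still unhealthy; check SDK weights and logs"
```

If the probe never succeeds within the window, missing weights are a likely cause and the SDK bundle check described above is the next step.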

Operational Commands

Restart Service

docker compose restart sinatools

Tail Logs

docker logs -f sinatools

Run Tests In-Container

docker exec sinatools pytest /app/tests/test_api.py

Regenerate OpenAPI

docker exec sinatools python3 -c "from src.app import create_app; import json, pathlib; pathlib.Path('/tmp/openapi.json').write_text(json.dumps(create_app().openapi()))"
docker cp sinatools:/tmp/openapi.json sinatools/openapi.json

Troubleshooting Playbooks

  • 5xx Responses:
      • Check container logs for tracebacks.
      • Run health checks and a sample request.
      • If SDK import errors appear, rebuild the image (docker compose build sinatools).

  • Glossary returning empty matches:
      • Confirm Nabra-dataset.csv exists and is readable.
      • Recreate the lexicon cache by restarting the container (docker compose restart sinatools).
      • Verify similarity_threshold is not set too high.

  • Inconsistent morphology results:
      • Ensure lemma metadata was requested with include_lemma=true.
      • Check the logs for ValueError conversions; these were fixed in the modular build, so redeploy if an old image is in use.
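The "exists and is readable" check for the dataset file can be scripted. The following is a minimal sketch; the function name `check_csv` is hypothetical, and the path in the usage line assumes that Nabra-dataset.csv sits directly under the /app/Nabra mount, which this runbook does not state explicitly.

```shell
# Report whether a CSV file is present, readable, and non-empty.
check_csv() (
  path=$1
  [ -f "$path" ] || { echo "missing: $path"; return 1; }
  [ -r "$path" ] || { echo "unreadable: $path"; return 1; }
  [ -s "$path" ] || { echo "empty: $path"; return 1; }
  echo "ok: $path ($(wc -l < "$path" | tr -d ' ') lines)"
)

# Usage inside the container (path is an assumption):
# check_csv /app/Nabra/Nabra-dataset.csv
```

A line count far below what is expected for the dataset would suggest a truncated copy even when the file passes the three checks.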

Escalation

  • Primary: NLP Lab on-call (#nlp-lab Slack channel).
  • Secondary: Platform SRE if container-level issues persist.
  • Provide docker logs, failing requests, and current compose hash when escalating.
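Collecting the escalation material can be scripted so that nothing is forgotten under pressure. This sketch makes several assumptions: "compose hash" is interpreted as a SHA-256 checksum of the compose file, the default file name docker-compose.yml and the function name `collect_escalation_bundle` are hypothetical, and `sha256sum` is available (on macOS, `shasum -a 256` is the usual substitute). Failing requests still need to be added to the bundle by hand.

```shell
# Gather recent container logs and a compose-file checksum into one directory.
# $1: output directory (default: timestamped), $2: compose file (assumed name).
collect_escalation_bundle() (
  out_dir=${1:-./escalation-$(date +%Y%m%dT%H%M%S)}
  compose_file=${2:-docker-compose.yml}
  mkdir -p "$out_dir" || return 1
  # Capture stdout and stderr, since container logs may use both streams.
  docker logs --tail=1000 sinatools > "$out_dir/sinatools.log" 2>&1
  if [ -f "$compose_file" ]; then
    sha256sum "$compose_file" > "$out_dir/compose.sha256"
  fi
  echo "$out_dir"
)

# Usage: bundle=$(collect_escalation_bundle) && tar -czf bundle.tgz "$bundle"
```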

References