AI-Box Service: Environment & Configuration

This document is the definitive reference for configuring the AI-Box service. It covers all environment variables used by the application, categorized by their function.

Source of Truth

The canonical list of all available environment variables is maintained in the ai-box/.env.example file.


1. Core Service Configuration

These variables control the fundamental behavior of the FastAPI application.

| Variable | Default | Required | Description |
| --- | --- | --- | --- |
| SERVICE_NAME | ai-box | No | The name of the service, returned in the /health endpoint. |
| SERVICE_VERSION | 0.1.0 | No | The version of the service, returned in the /health endpoint. |
| SERVICE_LANGS | ar,en | No | Comma-separated list of supported languages, in order of preference. |
| DEBUG | false | No | If true, backend error details are included in 502 responses. (Optional; wire in Settings before relying on this in prod.) |
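As a minimal sketch of how these values might be read, the snippet below parses the four core variables from an environment mapping and applies the defaults listed above. `CoreSettings` and `load_core_settings` are hypothetical names used for illustration; the service's actual Settings class may be structured differently.

```python
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class CoreSettings:  # hypothetical container, not the service's real Settings class
    service_name: str
    service_version: str
    service_langs: List[str]
    debug: bool


def load_core_settings(env: Mapping[str, str]) -> CoreSettings:
    """Read the core variables, falling back to the documented defaults."""
    langs_raw = env.get("SERVICE_LANGS", "ar,en")
    # Keep the original order: the list is documented as ordered by preference.
    langs = [lang.strip() for lang in langs_raw.split(",") if lang.strip()]
    # Assumption: only a case-insensitive "true" enables DEBUG.
    debug = env.get("DEBUG", "false").strip().lower() == "true"
    return CoreSettings(
        service_name=env.get("SERVICE_NAME", "ai-box"),
        service_version=env.get("SERVICE_VERSION", "0.1.0"),
        service_langs=langs,
        debug=debug,
    )
```

In a running process this would typically be called as `load_core_settings(os.environ)`.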

2. Search (OpenSearch)

Configuration for connecting to and interacting with the OpenSearch cluster.

| Variable | Default | Required | Description |
| --- | --- | --- | --- |
| OPENSEARCH_URL | http://opensearch:9200 | Yes | The base URL of the OpenSearch service. The host must be resolvable from inside the container (use the service name or host-gateway). |
| OS_INDEX | news_docs | Yes | The alias or concrete index to target for all search operations. |
| OS_READ_INDEX | null | No | Optional read alias overriding OS_INDEX. |
| VECTOR_FIELD | embedding | Yes | The name of the vector field in the OpenSearch index mapping. Must match a mapping field of type knn_vector. |
| EMBED_DIM | 768 | Yes | The dimension of the text embedding vectors. Must equal the dimension declared in the mapping. |
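Because VECTOR_FIELD and EMBED_DIM must agree with the index mapping, a startup check can catch mismatches early. The sketch below validates the `properties` object of an index mapping against the two settings. `check_vector_mapping` is a hypothetical helper, and fetching the mapping from the cluster is deliberately left out.

```python
from typing import Any, Mapping


def check_vector_mapping(
    properties: Mapping[str, Any],
    vector_field: str = "embedding",
    embed_dim: int = 768,
) -> None:
    """Raise ValueError if the mapping disagrees with VECTOR_FIELD / EMBED_DIM."""
    field = properties.get(vector_field)
    if field is None:
        raise ValueError(f"field {vector_field!r} is missing from the mapping")
    if field.get("type") != "knn_vector":
        raise ValueError(f"field {vector_field!r} is not of type knn_vector")
    if field.get("dimension") != embed_dim:
        raise ValueError(
            f"mapping dimension {field.get('dimension')} != EMBED_DIM {embed_dim}"
        )
```

Failing fast here is a design choice: a dimension mismatch would otherwise only surface as a query-time error on the vector search leg.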

3. Retrieval Defaults

Default parameters for the /retrieve and /retrieve_pack endpoints. These can be overridden in the request body.

| Variable | Default | Required | Description |
| --- | --- | --- | --- |
| DEFAULT_TOPK_BM25 | 20 | No | The default number of results to fetch from the BM25 (keyword) search leg. |
| DEFAULT_TOPK_KNN | 20 | No | The default number of results to fetch from the k-NN (vector) search leg. |
| DEFAULT_K_RRF | 60 | No | The ranking constant k for Reciprocal Rank Fusion (RRF). |
| ENABLE_RERANK | false | No | If true, enables the cross-encoder reranking step. |
| RERANK_MODEL_ID | oddadmix/arabic-reranker | No | Model id for the cross-encoder reranker (used when ENABLE_RERANK=true). Informative ID (placeholder). |
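For reference, Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over the ranked lists in which it appears, where k is the ranking constant (DEFAULT_K_RRF). The sketch below fuses a BM25 list and a k-NN list this way. `rrf_fuse` is an illustrative helper, not the service's actual implementation, and it assumes ranks start at 1.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]  # hypothetical ids from the keyword leg
knn_hits = ["b", "d", "a"]   # hypothetical ids from the vector leg
fused = rrf_fuse([bm25_hits, knn_hits], k=60)
```

A larger k flattens the difference between adjacent ranks, so documents that appear in both legs tend to dominate over documents that rank first in only one.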

4. Platform API Integration

Configuration for connecting to the main Labeeb API service for hydration and taxonomy lookup.

| Variable | Default | Required | Description |
| --- | --- | --- | --- |
| API_BASE_URL | http://api | No | API host for hydration and taxonomy lookup (optional). |
| API_TOKEN | "" | No | Bearer token for REST client (optional). |
| TAXONOMY_CACHE_TTL | 60 | No | Seconds to cache taxonomy version. |
| ENABLE_NORMALIZED_SEARCH | true | No | Mount normalized /search endpoint. |
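To illustrate how API_TOKEN and TAXONOMY_CACHE_TTL might be applied, the sketch below builds an Authorization header only when a token is set, and caches a single fetched value for a fixed number of seconds. `auth_headers` and `TTLCache` are hypothetical helpers; how the service actually fetches and caches the taxonomy version is not specified here.

```python
import time
from typing import Callable, Dict, Optional


def auth_headers(token: str) -> Dict[str, str]:
    """Return a Bearer header, or no header when API_TOKEN is empty."""
    return {"Authorization": f"Bearer {token}"} if token else {}


class TTLCache:
    """Cache one value for ttl seconds (hypothetical helper)."""

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Refetch on first use or once the TTL has elapsed.
        if self._value is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

The injectable `clock` argument exists only to make the expiry behavior testable without sleeping.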

5. S1 Check-Worthiness Model Configuration

Configuration for the sentence-level check-worthiness analysis using XLM-RoBERTa.

| Variable | Default | Required | Description |
| --- | --- | --- | --- |
| S1_MODE | model | No | model uses XLM-R; heuristic and off fall back to rules. |
| S1_MODEL_ID | /models/xlm-roberta-large-xnli | No | Hugging Face ID or local path; mock avoids downloads. |
| S1_THRESHOLD | 0.55 | No | Score cutoff for is_checkworthy. |
| S1_MAX_TOKENS | 256 | No | Truncation limit passed to the model. |
| S1_BATCH_SIZE | 8 | No | Sentences per forward pass. |
| S1_DEVICE | cpu | No | Inference device, e.g. cpu or cuda:0. |

Performance

On CPU, a batch of eight sentences scores in roughly 0.1s (p95) using the mock model; check /metrics for real values. Latency with the real model will be higher; set S1_MODE=heuristic to disable the model.

Updated keys (preferred)

The following environment variables are used by the new /s1/score endpoint (feature-flagged by ENABLE_AIB_15).

| Variable | Default | Description |
| --- | --- | --- |
| ENABLE_AIB_15 | true | Expose /s1/score; set false to disable the route quickly. |
| S1_BACKEND | dummy | Backend for S1 scorer: dummy (fast, deterministic) or hf (transformers). |
| S1_MODEL_ID | xlm-roberta-base | HuggingFace model id (used when S1_BACKEND=hf). |
| S1_THRESHOLD | 0.5 | Decision threshold; score >= threshold → label=true. |
| S1_DEVICE | cpu | Inference device (e.g., cpu, cuda). |
| S1_MAX_LEN | 256 | Maximum tokens; longer inputs are truncated. |
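The decision rule for S1_THRESHOLD is an inclusive comparison, so a score exactly equal to the threshold is labeled true. A minimal sketch, assuming the threshold is read from an environment mapping with the 0.5 default; `label_for_score` is a hypothetical name, not a function from the service.

```python
from typing import Mapping


def label_for_score(score: float, env: Mapping[str, str]) -> bool:
    """Return True when score >= S1_THRESHOLD (default 0.5)."""
    threshold = float(env.get("S1_THRESHOLD", "0.5"))
    return score >= threshold
```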

Deprecated keys (kept for backward-compat)

These variables were used by an earlier S1 design. Prefer the updated keys above.

| Variable | Default | Description |
| --- | --- | --- |
| S1_MODE | model | Legacy switch; superseded by ENABLE_AIB_15 + S1_BACKEND. |