Tutorial: Build an Async Enrichment Pipeline with IRBIS API

This tutorial shows a production-friendly pattern for integrating IRBIS into an AI-driven reporting platform:
  • Your platform creates an enrichment job
  • IRBIS returns an async request id
  • You poll until results are ready
  • You normalize output into your internal schema
  • You cache results to control cost/credits
  • You store an audit trail (job id ↔ IRBIS request id)

Who this is for

Decision makers and engineering teams building:

  • automated reporting
  • identity enrichment layers
  • trust & safety / risk workflows
  • “one-click investigation” products

Architecture (recommended)

Request path (sync)

  1. User submits identifier (phone/email/name)
  2. Your API creates enrichment_job record
  3. Your API enqueues background task
  4. Your API returns job_id immediately

Worker path (async)

  1. Worker calls IRBIS lookup endpoint (POST)
  2. IRBIS responds with numeric id + status: progress
  3. Worker polls api-usage/{id} until ready
  4. Worker normalizes and stores results
  5. Worker marks job completed (or failed)
  6. Your report generator uses normalized results

Data model (minimal)

Create a table/document like:

  • job_id (your UUID)
  • tenant_id (customer/org)
  • input_type (phone|email|name)
  • input_value (hashed + raw if needed)
  • lookup_id (IRBIS lookupId used)
  • irbis_request_id (numeric id returned by IRBIS)
  • status (queued|running|completed|failed)
  • result_raw (IRBIS JSON, optional)
  • result_normalized (your schema JSON)
  • created_at, updated_at
  • error (string)
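The table above can be sketched as a Python dataclass — a minimal, illustrative shape, not a required schema; field names mirror the list above and the defaults are assumptions you can adjust:

```python
from dataclasses import dataclass, field
from typing import Optional
import uuid

@dataclass
class EnrichmentJob:
    tenant_id: str
    input_type: str                           # "phone" | "email" | "name"
    input_value: str
    lookup_id: Optional[int] = None           # IRBIS lookupId used
    irbis_request_id: Optional[int] = None    # numeric id returned by IRBIS
    status: str = "queued"                    # queued | running | completed | failed
    result_raw: Optional[dict] = None         # IRBIS JSON, optional
    result_normalized: Optional[dict] = None  # your schema JSON
    error: Optional[str] = None
    job_id: str = field(default_factory=lambda: str(uuid.uuid4()))
```

Keeping `result_raw` optional lets you trade storage cost against debuggability per tenant.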

Step 0 — Get the right lookupId (cache it)

IRBIS requires a lookupId that matches what your subscription enables.

Call once per tenant (or daily) and cache it:

GET https://irbis.espysys.com/api/request-monitor/lookupid-list?key={API_KEY}

Store a mapping like:

  • combined_phone → lookupId
  • combined_email → lookupId
  • combined_name → lookupId

This avoids “wrong lookupId” errors and makes your system self-healing if packages change.
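A fetch-and-cache sketch of this step, using only the standard library. The response field names (`"lookupType"`, `"id"`) are assumptions — match them to what your key actually returns from lookupid-list:

```python
import json
import time
from urllib.request import urlopen

BASE = "https://irbis.espysys.com/api"
TTL = 24 * 3600        # refresh the mapping daily
_cache = {}            # {tenant: (fetched_at, {lookup_type: lookupId})}

def index_lookup_ids(items):
    """Map lookupid-list items to {lookup_type: lookupId}.
    Item keys here are placeholders for the real response fields."""
    return {item["lookupType"]: item["id"] for item in items}

def get_lookup_id(api_key, tenant, lookup_type):
    """Return the cached lookupId for e.g. "combined_phone", refetching daily."""
    cached = _cache.get(tenant)
    if cached is None or time.time() - cached[0] > TTL:
        with urlopen(f"{BASE}/request-monitor/lookupid-list?key={api_key}") as r:
            _cache[tenant] = (time.time(), index_lookup_ids(json.load(r)))
    return _cache[tenant][1][lookup_type]
```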

Step 1 — Submit a lookup request (creates async IRBIS request)

Phone example

POST

https://irbis.espysys.com/api/developer/combined_phone

Body

  • key: your API key
  • value: phone number
  • lookupId: cached lookupId for combined_phone

cURL

curl -X 'POST' \
  'https://irbis.espysys.com/api/developer/combined_phone' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "key": "<API_Key>",
    "value": "+79017007397",
    "lookupId": <LOOKUPID_VALUE>
  }'

Expected

You receive a numeric id and status: "progress".

Save that numeric id as irbis_request_id.

Do the same pattern for:

  • Email: POST /api/developer/combined_email
  • Name: POST /api/developer/combined_name
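The cURL call above translates to a small submit helper. This is a sketch: the response is assumed to carry `"id"` and `"status"` keys as described under “Expected”, and the same function covers all three `combined_*` endpoints:

```python
import json
from urllib.request import Request, urlopen

def parse_submit_response(data):
    """Extract (irbis_request_id, status) from the submit response."""
    return data["id"], data["status"]

def submit_lookup(api_key, endpoint, value, lookup_id):
    """POST to /api/developer/{endpoint}, e.g. endpoint="combined_phone".
    Returns (irbis_request_id, status) — save the id on the job record."""
    body = json.dumps({"key": api_key, "value": value,
                       "lookupId": lookup_id}).encode()
    req = Request(f"https://irbis.espysys.com/api/developer/{endpoint}",
                  data=body,
                  headers={"accept": "application/json",
                           "Content-Type": "application/json"})
    with urlopen(req) as r:
        return parse_submit_response(json.load(r))
```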

Step 2 — Poll results until ready (api-usage/{id})

GET

https://irbis.espysys.com/api/request-monitor/api-usage/{id}?key={API_KEY}

cURL

curl -X 'GET' \
  'https://irbis.espysys.com/api/request-monitor/api-usage/<RESPONSE_ID>?key=<API_Key>' \
  -H 'accept: application/json'

Polling strategy (simple + safe)

Use a backoff so you don’t hammer the API:

  • attempt 1: wait 2s
  • attempt 2: wait 5s
  • attempt 3+: wait 10s (cap)
  • max attempts: 12–18 (2–3 minutes total)

Stop when:

  • response indicates data is ready (your integration can treat “not progress anymore” as ready)
  • or you hit max attempts → mark job failed with “timeout retrieving results”
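The polling strategy above can be sketched like this. The 2s/5s/10s schedule and the “not progress anymore means ready” rule come straight from the text; the `"status"` field name in the poll response is an assumption:

```python
import json
import time
from urllib.request import urlopen

MAX_ATTEMPTS = 15   # within the 12-18 range suggested above

def backoff_delay(attempt):
    """2s after attempt 1, 5s after attempt 2, then a 10s cap."""
    return {1: 2, 2: 5}.get(attempt, 10)

def poll_result(api_key, irbis_request_id):
    """Poll api-usage/{id} until the request leaves "progress" or we time out."""
    url = (f"https://irbis.espysys.com/api/request-monitor/"
           f"api-usage/{irbis_request_id}?key={api_key}")
    for attempt in range(1, MAX_ATTEMPTS + 1):
        with urlopen(url) as r:
            data = json.load(r)
        if data.get("status") != "progress":   # treat "not progress" as ready
            return data
        time.sleep(backoff_delay(attempt))
    raise TimeoutError("timeout retrieving results")
```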

Step 3 — Normalize IRBIS output into your internal schema

Your platform should not depend on provider-specific JSON forever. Normalize it into a stable schema like:

{
  "provider": "irbis",
  "input": { "type": "phone", "value": "+79017007397" },
  "status": "completed",
  "signals": [
    { "type": "identity", "name": "..." },
    { "type": "exposure", "label": "..." },
    { "type": "footprint", "label": "..." }
  ],
  "raw_ref": { "irbis_request_id": 1486 }
}

Rules of thumb

  • Keep raw JSON stored (optional) for debugging/audit
  • Convert provider output into your stable signals[] list
  • Record provenance:
    • provider name
    • request id
    • timestamps
    • lookup type
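A normalization sketch that produces the stable schema shown above. The raw field names it reads (`"name"` here) are placeholders — map the actual IRBIS payload fields into your `signals[]` list:

```python
def normalize_irbis_result(raw, input_type, input_value, irbis_request_id):
    """Fold provider-specific JSON into the stable internal schema.
    Extend the signal extraction as you learn the real payload shape."""
    signals = []
    if raw.get("name"):   # placeholder field; adjust to the real payload
        signals.append({"type": "identity", "name": raw["name"]})
    return {
        "provider": "irbis",                   # provenance: provider name
        "input": {"type": input_type, "value": input_value},
        "status": "completed",
        "signals": signals,
        "raw_ref": {"irbis_request_id": irbis_request_id},  # provenance: request id
    }
```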

Step 4 — Cache results to control credits

Caching is how enrichment platforms win on margin.

Recommended caching key

tenant_id + input_type + normalized(input_value)

TTL suggestions

  • Phone/email: 7–30 days depending on your product
  • Name: shorter TTL (higher ambiguity), e.g., 1–7 days

Cache policy

  • If cached result exists and is fresh → return cached
  • If missing/stale → create new enrichment job
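The key and policy above, sketched with an in-process dict (swap in Redis or your store in production). Hashing the normalized input also avoids persisting raw identifiers in cache keys:

```python
import hashlib
import time

CACHE_TTL = {"phone": 30 * 86400, "email": 30 * 86400, "name": 7 * 86400}
_result_cache = {}   # {cache_key: (stored_at, normalized_result)}

def cache_key(tenant_id, input_type, input_value):
    """tenant_id + input_type + normalized(input_value), hashed."""
    norm = input_value.strip().lower()
    return hashlib.sha256(f"{tenant_id}:{input_type}:{norm}".encode()).hexdigest()

def get_cached(tenant_id, input_type, input_value):
    """Fresh hit -> return cached result; missing/stale -> None (create a job)."""
    hit = _result_cache.get(cache_key(tenant_id, input_type, input_value))
    if hit and time.time() - hit[0] < CACHE_TTL[input_type]:
        return hit[1]
    return None

def put_cached(tenant_id, input_type, input_value, result):
    key = cache_key(tenant_id, input_type, input_value)
    _result_cache[key] = (time.time(), result)
```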

Step 5 — Credits & guardrails

To show “credits remaining” in your admin UI (or to block heavy workflows), call:

GET https://irbis.espysys.com/api/request-monitor/credit-stat?key=YOUR_API_KEY

Also implement safety controls:

  • Per-tenant daily budget
  • Per-workflow budget (signup vs payout vs investigation)
  • Rate limit per identifier

Important limit: “Insufficient enrichment timeout”

IRBIS enforces a 30-second timeout between searches. If you call too fast, you may get:

  • “Insufficient enrichment timeout”

In production this means:

  • don’t fire repeated lookups for the same tenant in tight loops
  • use queueing + caching
  • add a simple “cooldown” per tenant/workflow if needed
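A minimal per-tenant/workflow cooldown guard matching the 30-second limit above — an in-memory sketch; in a multi-worker deployment you would back `_last_call` with a shared store:

```python
import time

COOLDOWN_SECONDS = 30      # IRBIS enforces 30s between searches
_last_call = {}            # {(tenant_id, workflow): last_call_timestamp}

def try_acquire(tenant_id, workflow, now=None):
    """Return True if a lookup may fire now; False while cooling down.
    The worker should requeue the job (with backoff) on False rather
    than hitting the "Insufficient enrichment timeout" error."""
    now = time.time() if now is None else now
    last = _last_call.get((tenant_id, workflow))
    if last is not None and now - last < COOLDOWN_SECONDS:
        return False
    _last_call[(tenant_id, workflow)] = now
    return True
```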

Example workflow: “Generate AI report”

Goal: user submits phone/email/name → your platform outputs a structured report.

  1. Create job_id
  2. Run enrichment job (IRBIS async)
  3. Normalize signals
  4. Pass normalized signals to your report generator (LLM or rules engine)
  5. Store:
    • report
    • job record
    • IRBIS request id
    • decision trace (what signals influenced what)
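The worker side of this workflow can be driven by one function. This sketch injects `submit`, `poll`, and `normalize` as callables (the hypothetical helpers from the earlier steps, or your own) so the orchestration is unit-testable without touching the network:

```python
def run_enrichment_job(job, submit, poll, normalize):
    """Drive one job dict through submit -> poll -> normalize.
    Marks the job completed, or failed with the error captured."""
    try:
        job["status"] = "running"
        job["irbis_request_id"], _ = submit(job["input_type"], job["input_value"])
        raw = poll(job["irbis_request_id"])
        job["result_normalized"] = normalize(raw)
        job["status"] = "completed"
    except Exception as exc:
        job["status"], job["error"] = "failed", str(exc)
    return job
```

Dependency injection here is deliberate: the same loop serves tests (fakes), staging, and production, and failures land on the job record instead of crashing the worker.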

Common mistakes (and fixes)

Mistake: Hardcoding lookupId

✅ Fix: call lookupid-list and cache mapping

Mistake: Blocking user request waiting for IRBIS

✅ Fix: return job_id immediately; use background worker

Mistake: Too many repeated requests (timeout error)

✅ Fix: caching + backoff + cooldown

Mistake: Storing only raw JSON

✅ Fix: store normalized schema for product stability