API quickstart

Use the v1 HTTP API with curl (or any HTTP client). Replace YOUR_HOST, YOUR_API_KEY, DATASET_ID, RUN_ID, and EXPORT_JOB_ID with real values from your environment.

Before you start: API keys and all /api/v1/* routes require a Pro workspace. Free workspaces can still read this guide and the OpenAPI spec. Create keys under Workspace settings → API access (owners and admins only).

Prerequisites

A Pro workspace, with billing and plan_tier set accordingly.
At least one API key with the scopes ingestion, analysis, and exports (the default when you create a key in the UI).

Authentication

Every request must include:

Authorization: Bearer YOUR_API_KEY
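
For clients written in code rather than curl, the header rule above might be captured in a small helper. This is a minimal sketch using only the Python standard library; the function name build_request and its parameters are illustrative and not part of the API.

```python
import urllib.request


def build_request(host: str, api_key: str, path: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request to https://{host}{path} with the Bearer header."""
    request = urllib.request.Request(f"https://{host}{path}", method=method)
    # The only header the guide requires on every call.
    request.add_header("Authorization", f"Bearer {api_key}")
    return request
```

Passing the returned object to urllib.request.urlopen sends it; keeping construction separate makes the header easy to check in a unit test.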

Error shape

Failures return JSON in this form:

{ "error": { "code": "string", "message": "string", "details": {} } }

Common errors

401 — invalid or missing key

{
  "error": {
    "code": "unauthorized",
    "message": "Invalid API key."
  }
}

403 — workspace not on Pro

{
  "error": {
    "code": "plan_upgrade_required",
    "message": "API access requires a Pro workspace."
  }
}
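
A client can turn the error envelope above into a structured value before deciding what to do. The sketch below assumes only the documented shape (error.code, error.message, optional error.details); the ApiError type, the parse_error name, and the "unknown" fallback code are hypothetical.

```python
import json
from typing import NamedTuple


class ApiError(NamedTuple):
    status: int
    code: str
    message: str
    details: dict


def parse_error(status: int, body: str) -> ApiError:
    """Extract code, message, and details from a failed response body."""
    try:
        error = json.loads(body).get("error")
    except (ValueError, AttributeError):
        error = None  # body was not JSON, or not a JSON object
    if not isinstance(error, dict):
        error = {}
    return ApiError(
        status=status,
        code=error.get("code", "unknown"),  # "unknown" is a local fallback, not an API code
        message=error.get("message", ""),
        details=error.get("details") or {},
    )
```

Branching on code (for example unauthorized versus plan_upgrade_required) is usually more robust than matching on the message text.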

Workflow

Typical flow: upload → queue run → process queue → poll status → Benford JSON (optional) → LTD JSON (optional) → export.

1. Upload a dataset

POST multipart/form-data with a required file field (CSV or XLSX). Optional fields: displayName, description, omitReference (true), referenceColumnName, referenceColumnIndex.

The first row of the file must be column headers; all following rows are data. Documented row limits apply to data rows only (the header is excluded). Published limits are maximums; very wide or high-cardinality datasets may process more slowly and can still exceed runtime thresholds.

curl -sS -X POST "https://YOUR_HOST/api/v1/uploads" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@./sample.csv"

A 200 response means ingest ran synchronously, and the body already includes row and column counts. When EDGE_WORKER_SECRET is set, a CSV upload may instead return 202, which means the raw file has been queued for the ingestion-worker Edge Function. XLSX uploads still return 200 and are processed in the app. If you get 202, save datasetId and poll until the dataset is ready:

curl -sS "https://YOUR_HOST/api/v1/datasets/DATASET_ID/ingestion" \
  -H "Authorization: Bearer YOUR_API_KEY"

Repeat until status is ready (or failed). Then use datasetId in the steps below.
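
The polling loop for a 202 upload could look roughly like this. It is a sketch that assumes the ingestion response is a JSON object with a top-level status field; the function name, the fetch_status callable (which would perform the GET shown above), and the interval and attempt limits are all illustrative choices.

```python
import time
from typing import Callable


def wait_for_ingestion(
    fetch_status: Callable[[], dict],
    interval_s: float = 2.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until status is ready or failed, then return the last payload."""
    for _ in range(max_attempts):
        payload = fetch_status()
        # Any other status value is treated as "still in progress".
        if payload.get("status") in ("ready", "failed"):
            return payload
        sleep(interval_s)
    raise TimeoutError(f"ingestion not finished after {max_attempts} polls")
```

The caller still has to check whether the returned status is ready or failed before moving on to step 2.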

2. Queue an analysis run

At least one column must be selected for the dataset (same rules as the web app). After upload, select columns in the UI once, or rely on columns auto-selected during ingest.

Without a body — uses the dataset’s saved LTD defaults (if any):

curl -sS -X POST "https://YOUR_HOST/api/v1/datasets/DATASET_ID/analysis-runs" \
  -H "Authorization: Bearer YOUR_API_KEY"

With LTD overrides (per run; does not change dataset defaults):

curl -sS -X POST "https://YOUR_HOST/api/v1/datasets/DATASET_ID/analysis-runs" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"ltdEnabled":true,"ltdDecimalPlaces":2,"ltdDataKind":"dollar"}'

Body fields (all optional):

ltdEnabled (boolean) — true adds last_two_digits; false runs Benford and repetition only.
ltdDecimalPlaces (integer 0–6) — scale amounts by 10^n before the LTD test.
ltdDataKind (dollar or count) — eligibility: dollar excludes under $10.00 at two decimal places (threshold scales with ltdDecimalPlaces); count excludes scaled whole integers under 1000.

Omit the body to inherit dataset defaults from the web app (Dataset details → Save LTD defaults). When enabling LTD via API, set ltdDataKind if the dataset has no saved kind yet.
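
When building the request body in code, it can help to validate the three fields locally before sending. The sketch below encodes only the constraints stated above (integer 0 to 6, dollar or count); the function name is hypothetical, and the server remains the authority on what it accepts.

```python
import json
from typing import Optional

LTD_DATA_KINDS = ("dollar", "count")


def build_run_body(
    ltd_enabled: Optional[bool] = None,
    ltd_decimal_places: Optional[int] = None,
    ltd_data_kind: Optional[str] = None,
) -> Optional[bytes]:
    """Return a JSON body with only the overrides given, or None to send no body."""
    body = {}
    if ltd_enabled is not None:
        body["ltdEnabled"] = bool(ltd_enabled)
    if ltd_decimal_places is not None:
        is_int = isinstance(ltd_decimal_places, int) and not isinstance(ltd_decimal_places, bool)
        if not is_int or not 0 <= ltd_decimal_places <= 6:
            raise ValueError("ltdDecimalPlaces must be an integer from 0 to 6")
        body["ltdDecimalPlaces"] = ltd_decimal_places
    if ltd_data_kind is not None:
        if ltd_data_kind not in LTD_DATA_KINDS:
            raise ValueError("ltdDataKind must be 'dollar' or 'count'")
        body["ltdDataKind"] = ltd_data_kind
    # No overrides: return None so the run inherits the dataset's saved defaults.
    return json.dumps(body).encode("utf-8") if body else None
```

Returning None for the no-override case mirrors the body-less curl call, so the dataset defaults apply unchanged.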

Save runId.

3. Process queued runs

This call runs the Benford engine on the dataset's queued jobs. It is the API counterpart of Process in the app.

curl -sS -X POST "https://YOUR_HOST/api/v1/datasets/DATASET_ID/process" \
  -H "Authorization: Bearer YOUR_API_KEY"

Call process again until the run finishes, or invoke it from your own worker.

4. Poll run status

curl -sS "https://YOUR_HOST/api/v1/analysis-runs/RUN_ID" \
  -H "Authorization: Bearer YOUR_API_KEY"

When status is completed, you can fetch chart-ready Benford and LTD data as JSON or create an export. The run record includes ltdConfig when Last Two Digits was enabled (decimal places, dollar vs count, exclusion threshold).
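
Steps 3 and 4 can be combined into one loop that triggers processing and then checks the run. This is a sketch under assumptions: process and get_run are caller-supplied callables wrapping the two curl calls above, the run payload is assumed to expose a top-level status field, and the interval and round limit are arbitrary.

```python
import time
from typing import Callable


def run_to_completion(
    process: Callable[[], object],
    get_run: Callable[[], dict],
    interval_s: float = 2.0,
    max_rounds: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Alternate POST .../process and GET .../analysis-runs/RUN_ID until the run ends."""
    for _ in range(max_rounds):
        process()        # ask the server to work on queued runs
        run = get_run()  # then read the run's current state
        if run.get("status") in ("completed", "failed"):
            return run
        sleep(interval_s)
    raise TimeoutError(f"run still not finished after {max_rounds} rounds")
```

If a separate worker already calls process, the same loop works with a no-op process callable.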

5. Get Benford results (JSON)

After the run is completed, GET /api/v1/analysis-runs/{runId}/benford-results returns benford_results rows (distributions, MAD, SSD, confidence bands, etc.) for building your own charts. The endpoint returns 404 if the run id is wrong or not in this workspace. It returns 409 with analysis_run_not_completed while the run is still queued or running, or with analysis_run_failed if the run failed; in both cases there are no partial rows.
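
The status handling described above could be reduced to a small decision function. The sketch assumes that success is a 200 and that the 409 codes arrive in error.code of the standard error shape; the function name and the returned labels are hypothetical.

```python
import json


def classify_results_response(status: int, body: str) -> str:
    """Map a benford-results response to one of a few client-side outcomes."""
    if status == 200:
        return "ready"
    if status == 404:
        return "not_found"  # wrong run id, or run belongs to another workspace
    if status == 409:
        try:
            code = json.loads(body)["error"]["code"]
        except (ValueError, KeyError, TypeError):
            code = None
        if code == "analysis_run_not_completed":
            return "retry_later"  # still queued or running
        if code == "analysis_run_failed":
            return "failed"  # retrying will not help
    return "unexpected"
```

Only retry_later is worth polling on; the other outcomes will not change by asking again.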

Optional query parameters (see OpenAPI):

testMode — first_digit, first_two_digits, or second_digit (must match stored enum values).
columnName — exact header string after trim; omit for every column.

curl -sS "https://YOUR_HOST/api/v1/analysis-runs/RUN_ID/benford-results" \
  -H "Authorization: Bearer YOUR_API_KEY"

Example with filters:

curl -sS "https://YOUR_HOST/api/v1/analysis-runs/RUN_ID/benford-results?testMode=first_digit&columnName=Amount" \
  -H "Authorization: Bearer YOUR_API_KEY"
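
Column headers often contain spaces or symbols, which must be percent-encoded in the query string. The following sketch builds the filtered URL with the standard library; the function name is illustrative, and trimming columnName on the client is an assumption made to match the "after trim" rule.

```python
from typing import Optional
from urllib.parse import quote, urlencode

TEST_MODES = ("first_digit", "first_two_digits", "second_digit")


def benford_results_url(
    host: str,
    run_id: str,
    test_mode: Optional[str] = None,
    column_name: Optional[str] = None,
) -> str:
    """Build the benford-results URL, adding only the filters that were given."""
    if test_mode is not None and test_mode not in TEST_MODES:
        raise ValueError(f"testMode must be one of {TEST_MODES}")
    params = {}
    if test_mode is not None:
        params["testMode"] = test_mode
    if column_name is not None:
        params["columnName"] = column_name.strip()
    url = f"https://{host}/api/v1/analysis-runs/{quote(run_id, safe='')}/benford-results"
    # quote_via=quote encodes spaces as %20 rather than "+".
    return url + ("?" + urlencode(params, quote_via=quote) if params else "")
```

For example, a header such as "Net Amount ($)" becomes columnName=Net%20Amount%20%28%24%29.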

Distributions use probability per bin on a 0–1 scale (multiply by 100 for a percent axis). Full units and formulas are documented on this operation in the OpenAPI spec.
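
For a percent axis, the conversion is a single multiplication per bin. The sketch below assumes a distribution has already been extracted as a mapping from bin label to probability; the actual field layout of each row is defined in the OpenAPI spec and is not reproduced here.

```python
from typing import Dict


def to_percent(distribution: Dict[str, float], digits: int = 4) -> Dict[str, float]:
    """Convert 0-1 probabilities to percentages, rounded to hide float noise."""
    return {bin_key: round(p * 100, digits) for bin_key, p in distribution.items()}
```

For instance, a first-digit probability of 0.301 plots as 30.1 on a percent axis.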

6. Get Last Two Digits (LTD) results (JSON)

When the run includes last_two_digits in testsEnabled (enabled by default in the web app via dataset LTD defaults), fetch the LTD distributions (100 bins, 00–99, each expected at a uniform 1%) and SSD:

curl -sS "https://YOUR_HOST/api/v1/analysis-runs/RUN_ID/ltd-results" \
  -H "Authorization: Bearer YOUR_API_KEY"

Optional query: columnName — exact header string; omit for all columns.

curl -sS "https://YOUR_HOST/api/v1/analysis-runs/RUN_ID/ltd-results?columnName=Amount" \
  -H "Authorization: Bearer YOUR_API_KEY"

The response includes ltdConfig (same snapshot as on GET /api/v1/analysis-runs/{runId}): decimalPlaces, dataKind (dollar | count), and minScaledValue for eligibility. Distributions are 0–1; bin keys are two-digit strings ("00"–"99"). There is no MAD and no Benford confidence band on LTD.
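
To make the scaling and bin keys concrete, here is a client-side sketch of how a single value might map to a bin. It is not the server's implementation: the rounding mode, the handling of negative values, and treating minScaledValue as an inclusive lower bound are all assumptions, and the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def ltd_bin(amount: str, decimal_places: int, min_scaled_value: int = 0) -> Optional[str]:
    """Scale by 10^decimal_places and return the last two digits as a "00"-"99" key.

    Returns None when the scaled value falls below min_scaled_value (assumed rule).
    """
    scaled = (Decimal(amount) * Decimal(10) ** decimal_places).to_integral_value(
        rounding=ROUND_HALF_UP  # assumption: the server may round differently
    )
    magnitude = abs(int(scaled))  # assumption: sign is ignored
    if magnitude < min_scaled_value:
        return None
    return f"{magnitude % 100:02d}"
```

With decimalPlaces 2, the amount 1234.56 scales to 123456 and lands in bin "56"; with a minScaledValue of 1000, an amount of 9.99 (scaled 999) would be skipped under these assumptions.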

API queue note: configure long-lived defaults in the UI (Dataset details), or pass ltdEnabled, ltdDecimalPlaces, and ltdDataKind on each POST …/analysis-runs request (see step 2).

7. Export PDF or CSV (async)

Create an export job, poll until status is completed, then download the file from the signed URL, which stays valid for about one hour.

curl -sS -X POST "https://YOUR_HOST/api/v1/exports/runs/RUN_ID" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"format":"pdf"}'

Save exportJobId from the JSON response, then poll:

curl -sS "https://YOUR_HOST/api/v1/export-jobs/EXPORT_JOB_ID" \
  -H "Authorization: Bearer YOUR_API_KEY"

When downloadUrl is present, fetch it with a normal GET (no API key on that URL).
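
In code, the download step differs from every other call in that it carries no Authorization header. The sketch below assumes the export job payload is a JSON object with a top-level downloadUrl field once the file is ready; the function name is illustrative.

```python
import urllib.request
from typing import Optional


def download_request(export_job: dict) -> Optional[urllib.request.Request]:
    """Return a plain GET for the signed URL, or None if the job has no URL yet."""
    url = export_job.get("downloadUrl")
    if not url:
        return None  # keep polling the export job
    # Deliberately no Authorization header: the signed URL carries its own access.
    return urllib.request.Request(url, method="GET")
```

Because the signed URL expires, it is safer to download right after the job completes than to store the URL for later.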

Synchronous GET /api/v1/exports/runs/{runId}/pdf and GET /api/v1/exports/runs/{runId}/csv still work for small scripts. Async jobs are retention-aware: generated files are kept for 30 days on Free and 90 days on Pro.

When the run has LTD result rows, PDF and CSV exports include a Last Two Digits section (100 bins per column, SSD, no confidence bands) in addition to Benford and value repetition.

Scopes

Keys created in the UI carry the ingestion, analysis, and exports scopes by default. Each route requires the matching scope.

Rate limits

Workspaces are limited to about 120 requests per minute on v1 (in-process limiter). When throttled you get 429, a Retry-After header, and error.code rate_limited.
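
A client can honor the throttle by waiting for the Retry-After interval before trying again. This is a sketch: send is a caller-supplied callable returning (status, headers, body), the header is assumed to hold a number of seconds, and the one-second fallback and retry count are arbitrary choices.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]


def retry_after_seconds(headers: Mapping[str, str], default_s: float = 1.0) -> float:
    """Read Retry-After as seconds; fall back to default_s if missing or not numeric."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return default_s  # e.g. an HTTP-date, which this sketch does not parse
    return default_s


def call_with_retry(
    send: Callable[[], Response],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(); on 429, wait as instructed and retry up to max_retries times."""
    for attempt in range(max_retries + 1):
        response = send()
        if response[0] != 429 or attempt == max_retries:
            return response
        sleep(retry_after_seconds(response[1]))
    raise ValueError("max_retries must be >= 0")
```

Polling loops are the most likely source of 429s, so a polling interval of a second or more keeps a single script well under the limit.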

Same origin and CORS

If your app and docs share the same host, browser “Try it” from the bundled API reference usually works without extra CORS configuration. Calls from other origins need explicit CORS on /api/v1/* if you want browser-based clients.