Aito — the predictive database, in a container

Pull, run, predict.

From docker pull to your first prediction in five minutes.

docker pull ghcr.io/aitohq/aito
docker run -p 9005:9005 ghcr.io/aitohq/aito

Aito returns predictions through the same SQL-ish query interface you'd use for ordinary lookups. No model training, no embeddings, no separate ML pipeline. Insert rows, query rows, predict rows.

From pull to prediction

Three requests: create a schema, insert rows, query a prediction. The example below predicts how an incoming purchase invoice should be coded — the same pattern behind predictive accounting and ERP automation. Locally the API needs no key, so every command works as-is.

1. Create the schema. One table of purchase invoices. The description column is analyzed as text, so Aito can learn from individual words:

curl -X PUT http://localhost:9005/api/v1/schema \
  -H 'content-type: application/json' \
  -d '{
  "schema": {
    "invoices": {
      "type": "table",
      "columns": {
        "vendor":      { "type": "String" },
        "description": { "type": "Text", "analyzer": "english" },
        "amount":      { "type": "Decimal" },
        "gl_account":  { "type": "String" },
        "approver":    { "type": "String" }
      }
    }
  }
}'

2. Insert rows. Twelve invoices that have already been coded — the kind of history every accounting system has:

curl -X POST http://localhost:9005/api/v1/data/invoices/batch \
  -H 'content-type: application/json' \
  -d '[
  { "vendor": "AWS", "description": "Monthly cloud hosting and compute", "amount": 2340.50, "gl_account": "6510 Cloud services", "approver": "sara" },
  { "vendor": "Google Cloud", "description": "Cloud compute and BigQuery usage", "amount": 1480.00, "gl_account": "6510 Cloud services", "approver": "sara" },
  { "vendor": "Slack", "description": "Team messaging subscription, 25 seats", "amount": 218.75, "gl_account": "6520 Software subscriptions", "approver": "sara" },
  { "vendor": "Atlassian", "description": "Jira and Confluence annual subscription", "amount": 1890.00, "gl_account": "6520 Software subscriptions", "approver": "sara" },
  { "vendor": "Finnair", "description": "Flights Helsinki to Berlin, customer visit", "amount": 420.30, "gl_account": "6310 Travel", "approver": "tom" },
  { "vendor": "Scandic Hotels", "description": "Hotel two nights, conference trip", "amount": 358.00, "gl_account": "6310 Travel", "approver": "tom" },
  { "vendor": "Lyreco", "description": "Office paper, pens and notebooks", "amount": 86.40, "gl_account": "6110 Office supplies", "approver": "tom" },
  { "vendor": "IKEA", "description": "Office chairs and desk lamps", "amount": 540.00, "gl_account": "6110 Office supplies", "approver": "tom" },
  { "vendor": "Google Ads", "description": "Search advertising campaign April", "amount": 1500.00, "gl_account": "6610 Marketing", "approver": "lisa" },
  { "vendor": "LinkedIn", "description": "Sponsored content campaign, lead gen", "amount": 980.00, "gl_account": "6610 Marketing", "approver": "lisa" },
  { "vendor": "Wolt", "description": "Team lunch, quarterly planning day", "amount": 156.80, "gl_account": "6210 Staff catering", "approver": "tom" },
  { "vendor": "Kespro", "description": "Office coffee, milk and snacks", "amount": 92.30, "gl_account": "6210 Staff catering", "approver": "tom" }
]'

3. Query a prediction. A new invoice arrives from Datadog — a vendor that appears nowhere in the data:

curl -X POST http://localhost:9005/api/v1/_predict \
  -H 'content-type: application/json' \
  -d '{
  "from": "invoices",
  "where": { "vendor": "Datadog", "description": "Monitoring subscription for production servers" },
  "predict": "gl_account",
  "limit": 3
}'

The response:

{
  "offset": 0,
  "total": 6,
  "hits": [
    { "$p": 0.568, "field": "gl_account", "feature": "6520 Software subscriptions" },
    { "$p": 0.086, "field": "gl_account", "feature": "6110 Office supplies" },
    { "$p": 0.086, "field": "gl_account", "feature": "6210 Staff catering" }
  ]
}

Aito has never seen Datadog, but the word "subscription" in the description carries the prediction to the right account. Change "predict": "gl_account" to "predict": "approver" and the same data tells you sara should approve it. Add "select": ["feature", "$p", "$why"] and Aito explains which words and values drove the probability.
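The two variations above can also be scripted. The following is a minimal Python sketch, assuming the container from the docker run step is listening on localhost:9005. The helper names build_predict_query and post_json are illustrative and not part of any Aito client library.

```python
import json
import urllib.request

AITO_URL = "http://localhost:9005"  # assumption: the local container started above


def build_predict_query(target, vendor, description, limit=3, explain=False):
    """Build a _predict request body for the invoices table."""
    query = {
        "from": "invoices",
        "where": {"vendor": vendor, "description": description},
        "predict": target,
        "limit": limit,
    }
    if explain:
        # Ask for the explanation alongside each candidate value.
        query["select"] = ["feature", "$p", "$why"]
    return query


def post_json(path, body):
    """POST a JSON body to the local Aito API and decode the JSON reply."""
    request = urllib.request.Request(
        AITO_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def main():
    """Predict the approver for the Datadog invoice and print each candidate."""
    query = build_predict_query(
        "approver",
        "Datadog",
        "Monitoring subscription for production servers",
        explain=True,
    )
    for hit in post_json("/api/v1/_predict", query)["hits"]:
        print(hit["feature"], hit["$p"])
```

Calling main() with the container running and the twelve rows inserted should print the candidate approvers with their probabilities. Passing "gl_account" as the target and explain=False builds the same request body as the curl command in step 3.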

There is no training step in any of this — every prediction is computed live from the rows you just inserted. Insert more rows and the next query already knows about them.
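Because new rows take effect immediately, topping up the table is mostly a matter of getting rows into the batch format from step 2. The sketch below assumes your coded invoices can be exported as CSV with a header matching the schema columns. The CSV export, the csv_to_batch helper, and the sample row are hypothetical and only there for illustration.

```python
import csv
import io
import json

# Column names taken from the invoices schema in step 1.
COLUMNS = ["vendor", "description", "amount", "gl_account", "approver"]


def csv_to_batch(csv_text):
    """Convert CSV text with a header row into a list of invoice rows."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = {name: record[name] for name in COLUMNS}
        # The schema declares amount as Decimal, so send a number, not a string.
        row["amount"] = float(row["amount"])
        rows.append(row)
    return rows


# Hypothetical export containing one newly coded invoice (made-up values).
sample = (
    "vendor,description,amount,gl_account,approver\n"
    'Datadog,"Monitoring subscription for production servers",310.00,'
    "6520 Software subscriptions,sara\n"
)

# JSON array in the same shape as the body of the batch insert in step 2.
payload = json.dumps(csv_to_batch(sample))
```

The resulting payload can be sent as the -d body of the same POST to /api/v1/data/invoices/batch used in step 2.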

To start over, remove the container (docker rm -f <name>, using the name or ID that docker ps reports) and run it again. State lives inside the container unless you mount a volume.

What else you can query

Query type    What it does
_search       Full-text + structured search across your tables
_predict      Predict a column value given other column values
_recommend    Recommend rows based on positive / negative examples
_relate       Find statistical relationships between two queries
_similarity   Rank rows by similarity to a target row
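As a rough sketch of how these query types might be addressed from code: the example below assumes every query type lives under the same /api/v1/ prefix as _predict, and that a _search body accepts the same "from", "where", and "limit" keys. Neither assumption is confirmed on this page, so check the API reference before relying on them. The helper names are illustrative.

```python
# Query types listed in the table above.
QUERY_TYPES = ("_search", "_predict", "_recommend", "_relate", "_similarity")

# Assumption: all query types share the prefix shown for _predict.
API_PREFIX = "/api/v1/"


def endpoint_for(query_type, base_url="http://localhost:9005"):
    """Return the URL for a query type, rejecting names not in the table."""
    if query_type not in QUERY_TYPES:
        raise ValueError(f"unknown query type: {query_type}")
    return base_url + API_PREFIX + query_type


def build_search_query(table, where, limit=10):
    """Build a structured search body (assumed shape, mirrors the _predict body)."""
    return {"from": table, "where": where, "limit": limit}


# Example: look up invoices already coded to the travel account.
travel_query = build_search_query("invoices", {"gl_account": "6310 Travel"})
travel_url = endpoint_for("_search")
```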

Full quickstart → · API reference →

Free for development. Licensed for production.

The free-tier image holds up to 10,000 rows per table and 50,000 rows total. Beyond that, inserts return HTTP 429. Reads keep working at any size — the cap only stops new data.
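A loader can check the caps before sending a batch and treat a 429 as "cap reached". The sketch below encodes only the two limits stated above. It assumes you track row counts yourself, and it says nothing about whether a batch that crosses the limit is rejected whole or in part. The function names are hypothetical.

```python
# Free-tier limits as stated above.
MAX_ROWS_PER_TABLE = 10_000
MAX_ROWS_TOTAL = 50_000


def fits_free_tier(table_rows, total_rows, new_rows):
    """Return True if inserting new_rows keeps both caps intact.

    table_rows: rows currently in the target table
    total_rows: rows currently in all tables combined
    """
    return (
        table_rows + new_rows <= MAX_ROWS_PER_TABLE
        and total_rows + new_rows <= MAX_ROWS_TOTAL
    )


def is_row_cap_response(status_code):
    """Return True for the HTTP status that inserts return beyond the cap."""
    return status_code == 429
```

For example, a table holding 9,988 rows still has room for the twelve-row batch from step 2, while one holding 9,989 does not.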

That covers most evaluation, CI, and prototyping work. For production deployments — anything that serves your customers or participates in a revenue workflow — you'll want a Production License.

What counts as "production"?

If your Aito instance is part of any of the following, it's production:

  • Serving data to your customers, end-users, or employees
  • A pipeline that generates revenue or supports a revenue process
  • An internal tool at a company with more than 250 employees or annual revenue above $10M
  • An internet-facing endpoint with real users

Not production: local dev, CI, side projects, personal use, academic research, and evaluation for up to 90 days. All free.

Get a Production License

Email sales@aito.ai with a one-line description of your use case. You'll get a license key back within a working day — no qualifying call, no SDR follow-up. Founder-led, by design.

Read the full license terms →

How it works in production

Once you have a Production License key, set it as an environment variable and the row caps lift:

docker run -d \
  -p 9005:9005 \
  -v aito-data:/io/state \
  -e AITO_LICENSE_KEY=ak_live_... \
  ghcr.io/aitohq/aito:latest

The image validates the key against console.aito.ai on startup and caches the response on the /io/state volume — so the container keeps running for up to 7 days without internet if you need offline operation.
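If you plan for air-gapped or intermittently connected hosts, it can help to work out when the offline window closes. The sketch below is planning arithmetic only, not Aito's implementation. It assumes the 7 days are counted from the last successful validation, which this page does not spell out, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Offline window stated above.
OFFLINE_GRACE = timedelta(days=7)


def offline_deadline(last_validated):
    """Return the moment the cached validation would stop covering the container."""
    return last_validated + OFFLINE_GRACE


def within_offline_grace(last_validated, now):
    """Return True while the cached validation is still inside the 7-day window."""
    return now <= offline_deadline(last_validated)


# Example: a container that last validated on 1 March (made-up date).
validated = datetime(2025, 3, 1, 9, 0, tzinfo=timezone.utc)
deadline = offline_deadline(validated)
```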

What we collect

Every Aito instance phones home on startup with:

  • A random instance UUID
  • Software version, row count, table count
  • Your IP address
  • Java/OS version

We do not collect: row contents, query text, schemas, table data. The telemetry is how we tell free-tier evaluation traffic from production traffic — and it's how sales knows to reach out about a license. Full details in our privacy policy under "Free-tier Docker image telemetry".

Disabling telemetry breaches §4(b) of the license terms.

Cloud-hosted Aito

Prefer a managed instance? console.aito.ai runs Aito for you with the same query interface, plus backups, auto-scaling, and the Aito UI on top. The Sandbox tier is free; Dev and Prod tiers are paid. The cloud service is governed by separate Terms of Service.

Mirrors

The image is also published to AWS ECR Public for users in AWS-native environments:

docker pull public.ecr.aws/aitoai/aito

Identical image, identical contents.


Questions: support@aito.ai · GitHub · License terms