Introduction

Relate

POST /api/v1/_relate

Relate provides statistical information about data relationships.

It calculates correlations between pairs of features, which can be used, for example, to find causation and correlation.

By default, the hits are ordered by the relation.mi field, which indicates how strong the correlation is.

Parameters

Name	Type	Description
body (required)	object	Relate query

Successful responses

Response	Type	Description
200 OK	object	Relate results

Request format

{

// From expression declares the examined table. Required.

"from": From,

// Proposition expression describes a fact, or a statement.

"where": Proposition,

// PropositionSet expression is used to describe a collection of propositions. Required.

"relate": PropositionSet,

// Declares the sorting order.

"orderBy": RelateOrderBy,

// Describes the fields and/or built-in attributes to return.

"select": Selection,

// The number of results to skip from the beginning. Default: 0

"offset": integer,

// The maximum number of results to retrieve. Default: 10

"limit": integer

}

Response format

{

"offset": integer,

"total": integer,

// Entries returned for a given query. Required.

"hits": Hits

}

What features of products affect purchasing

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In the example, we ask Aito to explain which product properties affect whether people purchase them. With $exists, we tell Aito to get all properties of the product (the impressions table links to the products table) and relate them to the condition { "purchase": true }.

The response may seem overwhelming but it contains a lot of useful information.

Looking at the second hit, the condition is { "product": { "$has": "2000818700008" } } and the "lift" value is high (compared to 1.0). It means that this product is ~2.2x more likely to be purchased than the average product (= base probability).

The lift is the probability of { "purchase": true } given the condition, divided by the base probability of { "purchase": true }. With the response field names, the formula is: ps.pOnCondition / ps.p.

In the response below, ps.p is ~0.052: people purchase about 5% of the products they see, which is the base probability.

curl -X POST \
  https://aito-demo.aito.app/api/v1/_relate \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "impressions",
    "where": { "$exists": "product" },
    "relate": [
      { "purchase": true }
    ],
    "limit": 2
  }'

Response

{
  "offset": 0,
  "total": 2,
  "hits": [
    {
      "related": {
        "purchase": { "$has": true }
      },
      "condition": {
        "product": { "$has": "6414880021620" }
      },
      "lift": 2.944562691897145,
      "fs": {
        "f": 4496,
        "fOnCondition": 264,
        "fOnNotCondition": 4232,
        "fCondition": 1732,
        "n": 87089
      },
      "ps": {
        "p": 0.05163564547427404,
        "pOnCondition": 0.15204439523557498,
        "pOnNotCondition": 0.04958775636552271,
        "pCondition": 0.019887742693174074
      },
      "info": {
        "h": 0.2933048451230419,
        "mi": 0.09998855952306565,
        "miTrue": 0.23689328541005905,
        "miFalse": -0.1369047258869934
      },
      "relation": {
        "n": 87089,
        "varFs": [1732, 4496],
        "stateFs": [81125, 1468, 4232, 264],
        "mi": 0.0020498727155220694
      }
    },
    {
      "related": {
        "purchase": { "$has": true }
      },
      "condition": {
        "product": { "$has": "2000818700008" }
      },
      "lift": 2.2270730293982597,
      "fs": {
        "f": 4496,
        "fOnCondition": 323,
        "fOnNotCondition": 4173,
        "fCondition": 2804,
        "n": 87089
      },
      "ps": {
        "p": 0.05163564547427404,
        "pOnCondition": 0.11499635339132602,
        "pOnNotCondition": 0.049517147816262916,
        "pCondition": 0.032196981106762584
      },
      "info": {
        "h": 0.2933048451230419,
        "mi": 0.04455167944756469,
        "miTrue": 0.132837907358048,
        "miFalse": -0.08828622791048332
      },
      "relation": {
        "n": 87089,
        "varFs": [2804, 4496],
        "stateFs": [80112, 2481, 4173, 323],
        "mi": 0.0014992597884251305
      }
    }
  ]
}
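
The lift formula can be checked against the Relate response above. The following is a minimal Python sketch; the numbers are copied from the two hits, and the variable names are only illustrative.

```python
# "lift" and "ps" values copied from the two hits in the example response.
hits = [
    {"lift": 2.944562691897145,
     "ps": {"p": 0.05163564547427404, "pOnCondition": 0.15204439523557498}},
    {"lift": 2.2270730293982597,
     "ps": {"p": 0.05163564547427404, "pOnCondition": 0.11499635339132602}},
]

# lift = P(purchase | condition) / P(purchase) = ps.pOnCondition / ps.p
recomputed = [h["ps"]["pOnCondition"] / h["ps"]["p"] for h in hits]

for h, lift in zip(hits, recomputed):
    print(f"reported {h['lift']:.4f}, recomputed {lift:.4f}")
```

Both recomputed values agree with the reported "lift" fields to within floating-point rounding.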

Generic query

POST /api/v1/_query

Generic query is a powerful expert interface.

It provides the functionality of every other query type in the API. Search, Similarity, Match, and Recommend can be seen as convenience APIs for the generic query.

The query format resembles the Search query, except that it supports a "get" statement. Since this endpoint provides the functionality of all other queries, "get": "product" replaces the "predict": "product", "recommend": "product", and "match": "product" counterparts.

The chapter Personalisation also explains a characteristic of the inference model.

Namespace shifting of "get"

The "get" operation changes the namespaces of "select" and "orderBy" operations. The namespace is changed from the "from" table to the linked table (specified with "get").

As an example, consider this query. The impressions table has a column called product, which links to a row in the products table. The price and title fields are columns of products.

{
  "from": "impressions",
  "where": {
    "query": "macbook air 2018"
  },
  "get": "product",
  "orderBy": ["price"],
  "select": ["title", "$highlight"]
}

When using "select" and "orderBy", we are already in the products table namespace, instead of having to use product.title or product.price.

Related information

The difference between $p and $lift

Parameters

Name	Type	Description
body (required)	object	Generic query

Successful responses

Response	Type	Description
200 OK	object	Query results

Request format

{

// From expression declares the examined table. Required.

"from": From,

// Proposition expression describes a fact, or a statement.

"where": Proposition,

// Get expression defines what items are returned as query results.

"get": Get,

// Declares the sorting order of the result by a field or by a user-defined score.

"orderBy": OrderBy,

// Defines the fields returned by the select statement.

"select": Projection,

// The number of results to skip from the beginning. Default: 0

"offset": integer,

// The maximum number of results to retrieve. Default: 10

"limit": integer

}

Response format

{

"offset": integer,

"total": integer,

// Entries returned for a given query. Required.

"hits": Hits

}

Search query

A simple search query made with the generic query.

curl -X POST \
  https://aito-demo.aito.app/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "products",
    "where": { "id": "6410402010318" }
  }'

Response

{
  "offset": 0,
  "total": 1,
  "hits": [
    {
      "category": "115",
      "googleClicks": 10,
      "googleImpressions": 100,
      "id": "6410402010318",
      "name": "Pirkka tuna fish pieces in oil 200g/150g",
      "price": 1.69,
      "tags": "meat food protein pirkka"
    }
  ]
}

Search query with highlighted results

A search query that returns related products ordered by similarity. The response also contains the highlighted words that matched the search term.

curl -X POST \
  https://aito-demo.aito.app/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "products",
    "where": {
      "name": { "$match": "coffee" }
    },
    "select": ["id", "name", "tags", "price", "$score", "$highlight"],
    "orderBy": "$similarity"
  }'

Response

{
  "offset": 0,
  "total": 4,
  "hits": [
    {
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "tags": "coffee",
      "price": 3.95,
      "$score": 2.1726635013471625,
      "$highlight": [
        {
          "score": 1.1194647495169912,
          "field": "name",
          "highlight": "Juhla Mokka <font color=\"green\">coffee</font> 500g sj"
        }
      ]
    },
    {
      "id": "6420101441542",
      "name": "Kulta Katriina filter coffee 500g",
      "tags": "coffee",
      "price": 3.45,
      "$score": 2.1726635013471625,
      "$highlight": [
        {
          "score": 1.1194647495169912,
          "field": "name",
          "highlight": "Kulta Katriina filter <font color=\"green\">coffee</font> 500g"
        }
      ]
    },
    {
      "id": "6411300164653",
      "name": "Juhla Mokka Dark Roast coffee 500g hj",
      "tags": "coffee",
      "price": 3.95,
      "$score": 2.1726635013471625,
      "$highlight": [
        {
          "score": 1.1194647495169912,
          "field": "name",
          "highlight": "Juhla Mokka Dark Roast <font color=\"green\">coffee</font> 500g hj"
        }
      ]
    },
    {
      "id": "6410405181190",
      "name": "Pirkka Costa Rica filter coffee 500g UTZ",
      "tags": "coffee pirkka",
      "price": 2.89,
      "$score": 2.1726635013471625,
      "$highlight": [
        {
          "score": 1.1194647495169912,
          "field": "name",
          "highlight": "Pirkka Costa Rica filter <font color=\"green\">coffee</font> 500g UTZ"
        }
      ]
    }
  ]
}

Generic similarity query

In the example we're finding similar products based on the given hypothetical new product name.

curl -X POST \
  https://aito-demo.aito.app/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "products",
    "orderBy": {
      "$similarity": { "name": "Atria bratwurst 175g" }
    },
    "limit": 2
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$score": 2.1455611831385513,
      "category": "102",
      "googleClicks": 9,
      "googleImpressions": 100,
      "id": "6407870070333",
      "name": "Atria lauantaimakkara bread sausage 225g",
      "price": 0.89,
      "tags": "meat sausage with-bread"
    },
    {
      "$score": 2.1455611831385513,
      "category": "102",
      "googleClicks": 8,
      "googleImpressions": 100,
      "id": "6407870071224",
      "name": "Atria Gotler ham sausage 300g",
      "price": 1.75,
      "tags": "meat sausage with-bread"
    }
  ]
}

Generic predict query

In the example we're predicting which tags a new hypothetical product could have.

curl -X POST \
  https://aito-demo.aito.app/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "products",
    "where": { "name": "Atria bratwurst 175g" },
    "get": "tags.$feature",
    "orderBy": "$p",
    "limit": 5
  }'

Response

{
  "offset": 0,
  "total": 25,
  "hits": [
    { "$p": 0.30473884347328356, "field": "tags", "feature": "meat" },
    { "$p": 0.2715004188138674, "field": "tags", "feature": "sausage" },
    { "$p": 0.05475553563545456, "field": "tags", "feature": "food" },
    { "$p": 0.03369571423720281, "field": "tags", "feature": "protein" },
    { "$p": 0.031178962371886616, "field": "tags", "feature": "pirkka" }
  ]
}

Recommend products which a customer would most likely purchase

In the example, we're finding the top 5 products that veronica (user id) would most likely purchase, based on her behavior history stored in the impressions table.

This example is the same as in the documentation of the Recommendation endpoint, but made with the generic query.

curl -X POST \
  https://aito-demo.aito.app/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "impressions",
    "where": { "context.user": "veronica" },
    "get": "product",
    "orderBy": {
      "$p": {
        "$context": { "purchase": true }
      }
    },
    "limit": 5
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$p": 0.1522789020513721,
      "category": "111",
      "googleClicks": 10,
      "googleImpressions": 100,
      "id": "6414880021620",
      "name": "Ilta Sanomat weekend news",
      "price": 2.3,
      "tags": "news"
    },
    {
      "$p": 0.11512887247223065,
      "category": "100",
      "googleClicks": 12,
      "googleImpressions": 100,
      "id": "2000818700008",
      "name": "Pirkka banana",
      "price": 0.166,
      "tags": "fresh fruit pirkka"
    },
    {
      "$p": 0.10812365117388067,
      "category": "111",
      "googleClicks": 9,
      "googleImpressions": 100,
      "id": "6410405207722",
      "name": "Pirkka paper towel 4 rl",
      "price": 1.95,
      "tags": "paper-towels pirkka"
    },
    {
      "$p": 0.10370802661668428,
      "category": "111",
      "googleClicks": 11,
      "googleImpressions": 100,
      "id": "6413200330206",
      "name": "Lotus Soft Embo 8 rll toilet paper",
      "price": 3.35,
      "tags": "toilet-paper"
    },
    {
      "$p": 0.09228451945163965,
      "category": "109",
      "googleClicks": 8,
      "googleImpressions": 100,
      "id": "6416453015234",
      "name": "Karl Fazer 200g Tyrkisk Peber chocolate",
      "price": 2.45,
      "tags": "candy"
    }
  ]
}

Query with custom scoring

In the example, we're finding the top products that veronica (user id) would most likely purchase, and in addition we're boosting products that have a higher price. This recommends products that are relevant for the user but also bring higher revenue to the shop. It demonstrates a situation where multiple factors should be considered in recommendations.

curl -X POST \
  https://aito-demo.aito.app/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "impressions",
    "where": { "context.user": "veronica" },
    "get": "product",
    "orderBy": {
      "$multiply": [
        {
          "$p": {
            "$context": { "purchase": true }
          }
        },
        "price"
      ]
    },
    "limit": 3
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$score": 0.3502414747181558,
      "category": "111",
      "googleClicks": 10,
      "googleImpressions": 100,
      "id": "6414880021620",
      "name": "Ilta Sanomat weekend news",
      "price": 2.3,
      "tags": "news"
    },
    {
      "$score": 0.34742188916589234,
      "category": "111",
      "googleClicks": 11,
      "googleImpressions": 100,
      "id": "6413200330206",
      "name": "Lotus Soft Embo 8 rll toilet paper",
      "price": 3.35,
      "tags": "toilet-paper"
    },
    {
      "$score": 0.22609707265651716,
      "category": "109",
      "googleClicks": 8,
      "googleImpressions": 100,
      "id": "6416453015234",
      "name": "Karl Fazer 200g Tyrkisk Peber chocolate",
      "price": 2.45,
      "tags": "candy"
    }
  ]
}
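
The $score values above can be reproduced from the earlier recommendation example by multiplying each hit's $p by its price. The following Python sketch does that for the five hits shown there. Note the limitation: the real query scores every product in the table, while this sketch only re-ranks the five hits that happen to be listed in this document.

```python
# id, $p and price copied from the five hits of the recommendation example.
recommended = [
    {"id": "6414880021620", "$p": 0.1522789020513721, "price": 2.3},
    {"id": "2000818700008", "$p": 0.11512887247223065, "price": 0.166},
    {"id": "6410405207722", "$p": 0.10812365117388067, "price": 1.95},
    {"id": "6413200330206", "$p": 0.10370802661668428, "price": 3.35},
    {"id": "6416453015234", "$p": 0.09228451945163965, "price": 2.45},
]

# Local mirror of "$multiply": [{"$p": ...}, "price"].
for hit in recommended:
    hit["score"] = hit["$p"] * hit["price"]

# Highest score first, keep three results like "limit": 3.
top3 = sorted(recommended, key=lambda h: h["score"], reverse=True)[:3]
for hit in top3:
    print(hit["id"], hit["score"])
```

The cheap banana drops out of the top three even though its $p is the second highest, which is the intended effect of boosting by price.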

Batch

POST /api/v1/_batch

Batch query operation.

Allows you to send multiple queries in a single request. Batch query takes an array of queries and returns an array of results. Each query can be one of the following types:

Generic query
Similarity query
Predict query
Search query
Recommend query
Relate query

Batch query can be used, for example, to request predictions for multiple fields in one go and to request similar items for reference.

Parameters

Name	Type	Description
body (required)	array	Batch query

Successful responses

Response	Type	Description
200 OK	object	Batch results

Request format

[AnyQuery, ...]

Response format

[DecoratedHits, ...]

Predict category and tags, and also fetch similar products for reference

You can copy-paste the example curl command to your terminal.

curl -X POST \
  https://aito-demo.aito.app/api/v1/_batch \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  [
    {
      "from": "products",
      "where": { "name": "rye bread" },
      "predict": "category"
    },
    {
      "from": "products",
      "where": { "name": "rye bread" },
      "predict": "tags",
      "exclusiveness": false
    },
    {
      "from": "products",
      "similarity": { "name": "rye bread" }
    }
  ]'

Response

[
  {
    "offset": 0,
    "total": 11,
    "hits": [
      { "$p": 0.8272154843087481, "field": "category", "feature": "101" },
      { "$p": 0.021987233173235084, "field": "category", "feature": "100" },
      { "$p": 0.021987233173235084, "field": "category", "feature": "102" },
      { "$p": 0.021987233173235084, "field": "category", "feature": "103" },
      { "$p": 0.021987233173235084, "field": "category", "feature": "104" },
      { "$p": 0.021987233173235084, "field": "category", "feature": "108" },
      { "$p": 0.018399942828573393, "field": "category", "feature": "109" },
      { "$p": 0.014779115894842236, "field": "category", "feature": "111" },
      { "$p": 0.014779115894842236, "field": "category", "feature": "115" },
      { "$p": 0.007445087603409161, "field": "category", "feature": "106" }
    ]
  },
  {
    "offset": 0,
    "total": 25,
    "hits": [
      { "$p": 0.8392856253737974, "field": "tags", "feature": "gluten" },
      { "$p": 0.8376068362594504, "field": "tags", "feature": "bread" },
      { "$p": 0.12633864926055016, "field": "tags", "feature": "pirkka" },
      { "$p": 0.11962185955575856, "field": "tags", "feature": "food" },
      { "$p": 0.11151547978856242, "field": "tags", "feature": "meat" },
      { "$p": 0.09564066656611062, "field": "tags", "feature": "protein" },
      { "$p": 0.08898199227460421, "field": "tags", "feature": "drink" },
      { "$p": 0.08898199227460421, "field": "tags", "feature": "lactose" },
      { "$p": 0.08141226436859195, "field": "tags", "feature": "fresh" },
      { "$p": 0.07273797470048293, "field": "tags", "feature": "candy" }
    ]
  },
  {
    "offset": 0,
    "total": 42,
    "hits": [
      {
        "$score": 2.967681932293201,
        "category": "101",
        "googleClicks": 12,
        "googleImpressions": 100,
        "id": "6437002001454",
        "name": "VAASAN Ruispalat 660g 12 pcs fullcorn rye bread",
        "price": 1.69,
        "tags": "gluten bread"
      },
      {
        "$score": 2.967681932293201,
        "category": "101",
        "googleClicks": 11,
        "googleImpressions": 100,
        "id": "6411402202208",
        "name": "Fazer Puikula fullcorn rye bread 9 pcs/500g",
        "price": 1.85,
        "tags": "gluten bread"
      },
      {
        "$score": 2.967681932293201,
        "category": "101",
        "googleClicks": 10,
        "googleImpressions": 100,
        "id": "6408180733260",
        "name": "Vaasan Ruispalat thin sliced rye bread 6pcs/195g",
        "price": 1.35,
        "tags": "gluten bread"
      },
      {
        "$score": 2.967681932293201,
        "category": "101",
        "googleClicks": 9,
        "googleImpressions": 100,
        "id": "6413467282508",
        "name": "Fazer Puikula fullcorn rye bread 330g",
        "price": 1.29,
        "tags": "gluten bread"
      },
      {
        "$score": 2.967681932293201,
        "category": "101",
        "googleClicks": 8,
        "googleImpressions": 100,
        "id": "6413466080204",
        "name": "Oululainen reissumies dark rye bread 4pcs/280g",
        "price": 0.99,
        "tags": "gluten bread"
      },
      {
        "$score": 1.6441207746609814,
        "category": "102",
        "googleClicks": 9,
        "googleImpressions": 100,
        "id": "6407870070333",
        "name": "Atria lauantaimakkara bread sausage 225g",
        "price": 0.89,
        "tags": "meat sausage with-bread"
      },
      {
        "$score": 1.6441207746609814,
        "category": "103",
        "googleClicks": 8,
        "googleImpressions": 100,
        "id": "6409100024628",
        "name": "Breadded chicke nuggets 200g",
        "price": 1.49,
        "tags": "meat food protein"
      },
      {
        "$score": 1,
        "category": "100",
        "googleClicks": 12,
        "googleImpressions": 100,
        "id": "2000818700008",
        "name": "Pirkka banana",
        "price": 0.166,
        "tags": "fresh fruit pirkka"
      },
      {
        "$score": 1,
        "category": "100",
        "googleClicks": 11,
        "googleImpressions": 100,
        "id": "2000604700007",
        "name": "Cucumber Finland",
        "price": 0.9765,
        "tags": "fresh vegetable"
      },
      {
        "$score": 1,
        "category": "100",
        "googleClicks": 10,
        "googleImpressions": 100,
        "id": "6410405060457",
        "name": "Pirkka bio cherry tomatoes 250g international 1lk",
        "price": 1.29,
        "tags": "fresh vegetable pirkka tomato"
      }
    ]
  }
]
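
A batch response is an array, and in the example above the results appear in the same order as the queries. The Python sketch below unpacks a trimmed copy of the first two results. The 0.5 cut-off for tags is an illustrative assumption, not an Aito default.

```python
# Trimmed copy of the first two results from the example batch response.
batch_response = [
    {"offset": 0, "total": 11, "hits": [
        {"$p": 0.8272154843087481, "field": "category", "feature": "101"},
        {"$p": 0.021987233173235084, "field": "category", "feature": "100"},
    ]},
    {"offset": 0, "total": 25, "hits": [
        {"$p": 0.8392856253737974, "field": "tags", "feature": "gluten"},
        {"$p": 0.8376068362594504, "field": "tags", "feature": "bread"},
        {"$p": 0.12633864926055016, "field": "tags", "feature": "pirkka"},
    ]},
]

# Results are unpacked in the order the queries were sent.
category_result, tags_result = batch_response

# A product has one category, so take the most probable one.
top_category = max(category_result["hits"], key=lambda h: h["$p"])["feature"]

# A product can have several tags, so keep every tag above a threshold.
TAG_THRESHOLD = 0.5  # hypothetical cut-off chosen for this sketch
likely_tags = [h["feature"] for h in tags_result["hits"] if h["$p"] >= TAG_THRESHOLD]

print(top_category, likely_tags)
```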

Aggregate

POST /api/v1/_aggregate

Aggregate operation.

The Aggregate API is used to access aggregated values, such as averages or sums. For example, the following operation counts all clicks by customer 4 and calculates the customer's mean click rate (CTR):

{
  "from": "impressions",
  "where": {
    "customer": 4
  },
  "aggregate": ["product.$sum", "product.$mean"]
}

The aggregate field can also calculate sums and averages of column scores, as shown in the following example:

{
  "from": "products",
  "where": {
    "$knn": {
      "k": 4,
      "near": {
        "name": "Pirkka bread"
      }
    }
  },
  "aggregate": {
    "$mean": {
      "$freqP": {
        "f": "googleClicks",
        "n": "googleImpressions"
      }
    }
  }
}

The operation estimates the click-through rate from the click and impression numbers using $freqP, and then calculates the average of the estimates.

Parameters

Name	Type	Description
body (required)	object	The request body

Successful responses

Response	Type	Description
200 OK	object	Aggregate results

Request format

{

// From expression declares the examined table. Required.

"from": From,

// Proposition expression describes a fact, or a statement.

"where": Proposition,

// Describes the aggregation operation. Required.

"aggregate": AggregateProjection,

// Describes the fields and/or built-in attributes to return.

"select": Selection

}

Response format

{

"offset": integer,

"total": integer,

// Entries returned for a given query. Required.

"hits": Hits

}

Count amount of purchases and CTR of a product

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

You can copy-paste the example curl command to your terminal.

curl -X POST \
  https://aito-demo.aito.app/api/v1/_aggregate \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "impressions",
    "where": { "product.id": "6408180733260" },
    "aggregate": ["purchase.$sum", "purchase.$mean"]
  }'

Response

{
  "mean": 0.04515327257663629,
  "mean.samples": 2414,
  "mean.variance": 0.04311445455225627,
  "mean.standardDeviation": 0.20764020456611063,
  "mean.standardError": 0.004226129639322252,
  "sum": 109,
  "sum.samples": 2414
}
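
The statistics in the response above can be reproduced from "sum" and "sum.samples" alone. The Python sketch below assumes that purchase is a boolean counted as 0 or 1 and that the variance is the population variance. Both assumptions are inferred from the numbers shown, not stated by the API reference.

```python
import math

# "sum" and "sum.samples" copied from the example response.
purchases = 109
samples = 2414

mean = purchases / samples
variance = mean * (1 - mean)            # population variance of a 0/1 variable
std_dev = math.sqrt(variance)
std_err = std_dev / math.sqrt(samples)  # standard error of the mean

print(mean, variance, std_dev, std_err)
```

The four values agree with "mean", "mean.variance", "mean.standardDeviation", and "mean.standardError" in the response to within floating-point rounding.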

Count more statistics for product 6408180733260 with custom names

In the example, the entry count is named 'impressions' and the purchase mean 'conversion', and the counts for searches, pre-fills, and recommendations are given more descriptive names.

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

You can copy-paste the example curl command to your terminal.

curl -X POST \
  https://aito-demo.aito.app/api/v1/_aggregate \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "impressions",
    "where": { "product.id": "6408180733260" },
    "aggregate": {
      "impressions": "$f",
      "purchases": "purchase.$sum",
      "conversion": "purchase.$mean",
      "searches": {
        "$f": { "context.type": "search" }
      },
      "prefills": {
        "$f": { "context.type": "prefill" }
      },
      "recommendations": {
        "$f": { "context.type": "recommendation" }
      }
    }
  }'

Response

{
  "recommendations": 0,
  "prefills": 9,
  "searches": 279,
  "conversion": 0.04515327257663629,
  "conversion.samples": 2414,
  "conversion.variance": 0.04311445455225627,
  "conversion.standardDeviation": 0.20764020456611063,
  "conversion.standardError": 0.004226129639322252,
  "purchases": 109,
  "purchases.samples": 2414,
  "impressions": 2414
}

Create jobs

POST /api/v1/jobs/{query}

Create a job for queries that last longer than 30 seconds. The regular endpoints reach a timeout after 30 seconds.

You can make a job request from the Predict, Match, Similarity, Generic, and Evaluate query endpoints. The query is the same as you would use for the regular endpoint.

The API also supports running some of the more time-consuming database operations as jobs. For these operations, the jobs API is the recommended way to call the API, due to the query timeout limit. The available operations are the Batch Data Insert, Data Delete, and Optimize endpoints. The payload format is identical to the regular operations.

Parameters

Name	Type	Description
query (required)	string	Any of the Aito query endpoints

Successful responses

Response	Type	Description
200 OK	object	Job info

Response format

{

// JobID.

"id": string,

// Empty for the query API endpoints.

"parameters": object,

// The query path used.

"path": string,

// When the job was started.

"startedAt": string

}

Example request

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

The example query is exactly the same as it would be when using the regular _evaluate endpoint.

Aito iterates through each product in the test data, and tests how accurate the prediction of tags for a given product name was.

curl -X POST \
  https://aito-demo.aito.app/api/v1/jobs/_evaluate \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "test": {
      "$index": {
        "$mod": [4, 0]
      }
    },
    "evaluate": {
      "from": "products",
      "where": {
        "name": { "$get": "name" }
      },
      "predict": "tags"
    }
  }'

Response

{
  "id": "56d7b654-4bf6-4dea-8733-3fb1b49fd621",
  "parameters": {  },
  "path": "_evaluate",
  "startedAt": "2025-06-08T12:49:51.629103Z"
}
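
The "test" selector in the request above can be illustrated with a small Python sketch. It assumes that { "$index": { "$mod": [4, 0] } } selects the rows whose zero-based index leaves remainder 0 when divided by 4, and that the remaining rows are used for training. The row count of 42 is the "total" reported for the products table in the earlier examples.

```python
total_products = 42  # "total" reported for the products table in earlier examples

# Assumption: "$mod": [4, 0] means index % 4 == 0 on a zero-based row index.
test_rows = [i for i in range(total_products) if i % 4 == 0]
train_rows = [i for i in range(total_products) if i % 4 != 0]

print(len(test_rows), len(train_rows))
```

Under these assumptions, the split sizes agree with "testSamples": 11 and "trainSamples": 31 in the example job result later in this document.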

Get status of all jobs

GET /api/v1/jobs/

List all jobs that exist currently.

Successful responses

Response	Type	Description
200 OK	object	Job statuses

Response format

{

// Job result will not be available after the date.

"expiresAt": string,

// Job finished running.

"finishedAt": string,

// JobID.

"id": string,

// Empty for the query API endpoints.

"parameters": object,

// The query path used.

"path": string,

// When the job was started.

"startedAt": string

}

Example request

curl -X GET \
  https://aito-demo.aito.app/api/v1/jobs/ \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi'

Response

[
  {
    "id": "56d7b654-4bf6-4dea-8733-3fb1b49fd621",
    "parameters": {  },
    "path": "_evaluate",
    "startedAt": "2025-06-08T12:49:51.629103Z"
  }
]

Get status of a job

GET /api/v1/jobs/{uuid}

If you have started a job for a query, this endpoint returns the status of the job by its ID.

Successful responses

Response	Type	Description
200 OK	object	Job status

Response format

{

// Job result will not be available after the date.

"expiresAt": string,

// Job finished running.

"finishedAt": string,

// JobID.

"id": string,

// Empty for the query API endpoints.

"parameters": object,

// The query path used.

"path": string,

// When the job was started.

"startedAt": string

}

Example request

curl -X GET \
  https://aito-demo.aito.app/api/v1/jobs/56d7b654-4bf6-4dea-8733-3fb1b49fd621 \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi'

Response

{
  "id": "56d7b654-4bf6-4dea-8733-3fb1b49fd621",
  "parameters": {  },
  "path": "_evaluate",
  "startedAt": "2025-06-08T12:49:51.629103Z"
}
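
A client typically polls this endpoint until the job is done. The Python sketch below assumes that "finishedAt" is absent from the status while the job is still running, as in the response above, and present once it has finished. The fetch_status callable is hypothetical: it stands for whatever code performs GET /api/v1/jobs/{uuid} and returns the parsed JSON.

```python
import time


def wait_for_job(fetch_status, job_id, poll_seconds=1.0, max_polls=60):
    """Poll the job status until it contains "finishedAt", then return it.

    fetch_status is a caller-supplied (hypothetical) function that performs
    GET /api/v1/jobs/{uuid} and returns the parsed JSON object.
    """
    for _ in range(max_polls):
        status = fetch_status(job_id)
        if status.get("finishedAt"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


# Offline demonstration with stubbed statuses instead of real HTTP calls.
# The job id and "finishedAt" value below are invented for the demonstration.
_stub_statuses = iter([
    {"id": "job-1", "path": "_evaluate", "startedAt": "2025-06-08T12:49:51.629103Z"},
    {"id": "job-1", "path": "_evaluate", "startedAt": "2025-06-08T12:49:51.629103Z",
     "finishedAt": "2025-06-08T12:49:52.000000Z"},
])
final = wait_for_job(lambda job_id: next(_stub_statuses), "job-1", poll_seconds=0)
print(final["finishedAt"])
```

Once the status contains "finishedAt", the result can be fetched from GET /api/v1/jobs/{uuid}/result before the "expiresAt" time.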

Get result of a job

GET /api/v1/jobs/{uuid}/result

Get the query result for a created job.

Successful responses

Response	Type	Description
200 OK	object	Evaluate job result

Response format

{

// The amount of samples used for testing. Required.

"n": integer,

// The amount of samples used for testing. Required.

"testSamples": integer,

// The average amount of samples used for training. Required.

"trainSamples": number,

// The average number of features. Required.

"features": number,

// Complement of `accuracy` (=`1 - accuracy`). Required.

"error": number,

// Complement of `baseAccuracy` (=`1 - baseAccuracy`). Required.

"baseError": number,

// The accuracy of predictions. Required.

"accuracy": number,

// The simulated accuracy of predictions based on taking the most frequent value. Required.

"baseAccuracy": number,

// How much better results Aito was able to provide compared to a naive prediction. Required.

"accuracyGain": number,

// Average rank of the best prediction. Required.

"meanRank": number,

"baseMeanRank": number,

// Improvement of meanRank upon baseMeanRank. Required.

"rankGain": number,

// A measurement which describes the quality of probabilities (=`h - mxe`). Required.

"informationGain": number,

// Mean cross entropy. Required.

"mxe": number,

// Entropy. Required.

"h": number,

// The mean geometric probability of the predictions. Required.

"geomMeanP": number,

// Base geometric mean probability. Required.

"baseGmp": number,

// Geometric mean lift. Required.

"geomMeanLift": number,

// The mean execution time of the queries in nanoseconds. Required.

"meanNs": number,

// The mean execution time of the queries in microseconds. Required.

"meanUs": number,

// The mean execution time of the queries in milliseconds. Required.

"meanMs": number,

// The median execution time of the queries in nanoseconds. Required.

"medianNs": number,

// The median execution time of the queries in microseconds. Required.

"medianUs": number,

// The median execution time of the queries in milliseconds. Required.

"medianMs": number,

// The time spent for warm-up of indexes and caches for the given query in milliseconds. Required.

"warmingMs": number

}
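
Several of the fields above are derived from one another. The Python sketch below checks some of those relationships using values copied from the example response of this endpoint. The formulas for accuracyGain, rankGain, and geomMeanLift are inferred from the field descriptions and the numbers, so treat them as assumptions. The accurateOffsets field is not listed in the format above but appears in the example response.

```python
# Values copied from the example response of this endpoint.
result = {
    "n": 11,
    "accuracy": 0.18181818181818182,
    "baseAccuracy": 0.09090909090909091,
    "error": 0.8181818181818181,
    "accuracyGain": 0.09090909090909091,
    "meanRank": 19.90909090909091,
    "baseMeanRank": 26,
    "rankGain": 6.09090909090909,
    "geomMeanP": 0.13859789989493534,
    "baseGmp": 0.04878048780487808,
    "geomMeanLift": 2.8412569478461727,
    "accurateOffsets": [2, 8],
}

# Recompute each derived field from the fields it appears to depend on.
checks = {
    "accuracy": len(result["accurateOffsets"]) / result["n"],
    "error": 1 - result["accuracy"],
    "accuracyGain": result["accuracy"] - result["baseAccuracy"],
    "rankGain": result["baseMeanRank"] - result["meanRank"],
    "geomMeanLift": result["geomMeanP"] / result["baseGmp"],
}

for name, value in checks.items():
    print(f"{name}: recomputed {value:.6f}, reported {result[name]:.6f}")
```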

Example request

curl -X GET \
  https://aito-demo.aito.app/api/v1/jobs/56d7b654-4bf6-4dea-8733-3fb1b49fd621/result \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi'

Response

{
  "n": 11,
  "testSamples": 11,
  "trainSamples": 31,
  "features": 210,
  "error": 0.8181818181818181,
  "baseError": 0.9090909090909091,
  "accuracy": 0.18181818181818182,
  "baseAccuracy": 0.09090909090909091,
  "accuracyGain": 0.09090909090909091,
  "meanRank": 19.90909090909091,
  "baseMeanRank": 26,
  "rankGain": 6.09090909090909,
  "informationGain": -0.672246695378206,
  "mxe": 2.8510226976872475,
  "h": 4.357552004618083,
  "geomMeanP": 0.13859789989493534,
  "baseGmp": 0.04878048780487808,
  "geomMeanLift": 2.8412569478461727,
  "meanNs": 19132551.09090909,
  "meanUs": 19132.551090909088,
  "meanMs": 19.13255109090909,
  "medianNs": 17273205,
  "medianUs": 17273.205,
  "medianMs": 17.273205,
  "allNs": [
    12511444,
    20616753,
    19494757,
    41568073,
    14575025,
    18503253,
    19810702,
    17273205,
    12208070,
    17081632,
    16815148
  ],
  "allUs": [
    12511,
    20616,
    19494,
    41568,
    14575,
    18503,
    19810,
    17273,
    12208,
    17081,
    16815
  ],
  "allMs": [12, 20, 19, 41, 14, 18, 19, 17, 12, 17, 16],
  "warmingMs": 0,
  "accurateOffsets": [2, 8],
  "errorOffsets": [0, 1, 3, 4, 5, 6, 7, 9, 10],
  "cases": [
    {
      "offset": 0,
      "testCase": {
        "category": "100",
        "googleClicks": 12,
        "googleImpressions": 100,
        "id": "2000818700008",
        "name": "Pirkka banana",
        "price": 0.166,
        "tags": "fresh fruit pirkka"
      },
      "accurate": false,
      "top": {
        "$p": 0.2725867712045053,
        "$value": "fresh vegetable pirkka tomato"
      }
    },
    {
      "offset": 1,
      "testCase": {
        "category": "100",
        "googleClicks": 9,
        "googleImpressions": 100,
        "id": "6410405093677",
        "name": "Pirkka iceberg salad Finland 100g 1st class",
        "price": 1.29,
        "tags": "fresh vegetable pirkka salad"
      },
      "accurate": false,
      "top": {
        "$p": 0.16203312421914376,
        "$value": "lactose-free drink pirkka"
      }
    },
    {
      "offset": 2,
      "testCase": {
        "category": "101",
        "googleClicks": 9,
        "googleImpressions": 100,
        "id": "6413467282508",
        "name": "Fazer Puikula fullcorn rye bread 330g",
        "price": 1.29,
        "tags": "gluten bread"
      },
      "accurate": true,
      "top": { "$p": 0.22527597111196349, "$value": "gluten bread" },
      "correct": { "$p": 0.22527597111196349, "$value": "gluten bread" }
    },
    {
      "offset": 3,
      "testCase": {
        "category": "102",
        "googleClicks": 10,
        "googleImpressions": 100,
        "id": "6410405205483",
        "name": "Pirkka Finnish beef-pork minced meat 20% 400g",
        "price": 2.79,
        "tags": "meat food protein pirkka"
      },
      "accurate": false,
      "top": { "$p": 0.1800781656874408, "$value": "meat food pirkka" }
    },
    {
      "offset": 4,
      "testCase": {
        "category": "103",
        "googleClicks": 11,
        "googleImpressions": 100,
        "id": "6412000030026",
        "name": "Saarioinen Maksalaatikko liver casserole 400g",
        "price": 1.99,
        "tags": "meat food"
      },
      "accurate": false,
      "top": { "$p": 0.12869832052160496, "$value": "food carbohydrate pirkka" }
    },
    {
      "offset": 5,
      "testCase": {
        "category": "104",
        "googleClicks": 12,
        "googleImpressions": 100,
        "id": "6410405082657",
        "name": "Pirkka Finnish semi-skimmed milk 1l",
        "price": 0.81,
        "tags": "lactose drink pirkka"
      },
      "accurate": false,
      "top": {
        "$p": 0.15946712539297916,
        "$value": "lactose-free drink pirkka"
      },
      "correct": { "$p": 0.15894666120082465, "$value": "lactose drink pirkka" }
    },
    {
      "offset": 6,
      "testCase": {
        "category": "104",
        "googleClicks": 8,
        "googleImpressions": 100,
        "id": "6408430000258",
        "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
        "price": 1.95,
        "tags": "lactose-free drink"
      },
      "accurate": false,
      "top": {
        "$p": 0.24843983175406179,
        "$value": "lactose-free drink pirkka"
      }
    },
    {
      "offset": 7,
      "testCase": {
        "category": "108",
        "googleClicks": 9,
        "googleImpressions": 100,
        "id": "6420101441542",
        "name": "Kulta Katriina filter coffee 500g",
        "price": 3.45,
        "tags": "coffee"
      },
      "accurate": false,
      "top": { "$p": 0.7816059135211525, "$value": "coffee pirkka" },
      "correct": { "$p": 0.044609716250265476, "$value": "coffee" }
    },
    {
      "offset": 8,
      "testCase": {
        "category": "109",
        "googleClicks": 10,
        "googleImpressions": 100,
        "id": "6411401015090",
        "name": "Fazer Sininen milk chocolate slab 200g",
        "price": 2.19,
        "tags": "candy lactose"
      },
      "accurate": true,
      "top": { "$p": 0.2310100412834967, "$value": "candy lactose" },
      "correct": { "$p": 0.2310100412834967, "$value": "candy lactose" }
    },
    {
      "offset": 9,
      "testCase": {
        "category": "111",
        "googleClicks": 11,
        "googleImpressions": 100,
        "id": "6413200330206",
        "name": "Lotus Soft Embo 8 rll toilet paper",
        "price": 3.35,
        "tags": "toilet-paper"
      },
      "accurate": false,
      "top": { "$p": 0.4072071464754265, "$value": "drink soda" }
    },
    {
      "offset": 10,
      "testCase": {
        "category": "115",
        "googleClicks": 10,
        "googleImpressions": 100,
        "id": "6410402010318",
        "name": "Pirkka tuna fish pieces in oil 200g/150g",
        "price": 1.69,
        "tags": "meat food protein pirkka"
      },
      "accurate": false,
      "top": {
        "$p": 0.1875474618544392,
        "$value": "fresh vegetable pirkka tomato"
      }
    }
  ],
  "accurateCases": [
    {
      "offset": 2,
      "testCase": {
        "category": "101",
        "googleClicks": 9,
        "googleImpressions": 100,
        "id": "6413467282508",
        "name": "Fazer Puikula fullcorn rye bread 330g",
        "price": 1.29,
        "tags": "gluten bread"
      },
      "accurate": true,
      "top": { "$p": 0.22527597111196349, "$value": "gluten bread" },
      "correct": { "$p": 0.22527597111196349, "$value": "gluten bread" }
    },
    {
      "offset": 8,
      "testCase": {
        "category": "109",
        "googleClicks": 10,
        "googleImpressions": 100,
        "id": "6411401015090",
        "name": "Fazer Sininen milk chocolate slab 200g",
        "price": 2.19,
        "tags": "candy lactose"
      },
      "accurate": true,
      "top": { "$p": 0.2310100412834967, "$value": "candy lactose" },
      "correct": { "$p": 0.2310100412834967, "$value": "candy lactose" }
    }
  ],
  "errorCases": [
    {
      "offset": 0,
      "testCase": {
        "category": "100",
        "googleClicks": 12,
        "googleImpressions": 100,
        "id": "2000818700008",
        "name": "Pirkka banana",
        "price": 0.166,
        "tags": "fresh fruit pirkka"
      },
      "accurate": false,
      "top": {
        "$p": 0.2725867712045053,
        "$value": "fresh vegetable pirkka tomato"
      }
    },
    {
      "offset": 1,
      "testCase": {
        "category": "100",
        "googleClicks": 9,
        "googleImpressions": 100,
        "id": "6410405093677",
        "name": "Pirkka iceberg salad Finland 100g 1st class",
        "price": 1.29,
        "tags": "fresh vegetable pirkka salad"
      },
      "accurate": false,
      "top": {
        "$p": 0.16203312421914376,
        "$value": "lactose-free drink pirkka"
      }
    },
    {
      "offset": 3,
      "testCase": {
        "category": "102",
        "googleClicks": 10,
        "googleImpressions": 100,
        "id": "6410405205483",
        "name": "Pirkka Finnish beef-pork minced meat 20% 400g",
        "price": 2.79,
        "tags": "meat food protein pirkka"
      },
      "accurate": false,
      "top": { "$p": 0.1800781656874408, "$value": "meat food pirkka" }
    },
    {
      "offset": 4,
      "testCase": {
        "category": "103",
        "googleClicks": 11,
        "googleImpressions": 100,
        "id": "6412000030026",
        "name": "Saarioinen Maksalaatikko liver casserole 400g",
        "price": 1.99,
        "tags": "meat food"
      },
      "accurate": false,
      "top": { "$p": 0.12869832052160496, "$value": "food carbohydrate pirkka" }
    },
    {
      "offset": 5,
      "testCase": {
        "category": "104",
        "googleClicks": 12,
        "googleImpressions": 100,
        "id": "6410405082657",
        "name": "Pirkka Finnish semi-skimmed milk 1l",
        "price": 0.81,
        "tags": "lactose drink pirkka"
      },
      "accurate": false,
      "top": {
        "$p": 0.15946712539297916,
        "$value": "lactose-free drink pirkka"
      },
      "correct": { "$p": 0.15894666120082465, "$value": "lactose drink pirkka" }
    },
    {
      "offset": 6,
      "testCase": {
        "category": "104",
        "googleClicks": 8,
        "googleImpressions": 100,
        "id": "6408430000258",
        "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
        "price": 1.95,
        "tags": "lactose-free drink"
      },
      "accurate": false,
      "top": {
        "$p": 0.24843983175406179,
        "$value": "lactose-free drink pirkka"
      }
    },
    {
      "offset": 7,
      "testCase": {
        "category": "108",
        "googleClicks": 9,
        "googleImpressions": 100,
        "id": "6420101441542",
        "name": "Kulta Katriina filter coffee 500g",
        "price": 3.45,
        "tags": "coffee"
      },
      "accurate": false,
      "top": { "$p": 0.7816059135211525, "$value": "coffee pirkka" },
      "correct": { "$p": 0.044609716250265476, "$value": "coffee" }
    },
    {
      "offset": 9,
      "testCase": {
        "category": "111",
        "googleClicks": 11,
        "googleImpressions": 100,
        "id": "6413200330206",
        "name": "Lotus Soft Embo 8 rll toilet paper",
        "price": 3.35,
        "tags": "toilet-paper"
      },
      "accurate": false,
      "top": { "$p": 0.4072071464754265, "$value": "drink soda" }
    },
    {
      "offset": 10,
      "testCase": {
        "category": "115",
        "googleClicks": 10,
        "googleImpressions": 100,
        "id": "6410402010318",
        "name": "Pirkka tuna fish pieces in oil 200g/150g",
        "price": 1.69,
        "tags": "meat food protein pirkka"
      },
      "accurate": false,
      "top": {
        "$p": 0.1875474618544392,
        "$value": "fresh vegetable pirkka tomato"
      }
    }
  ],
  "alpha_binByTopScore": [
    {
      "meanScore": 0.15756918395529218,
      "maxScore": 0.1800781656874408,
      "minScore": 0.12869832052160496,
      "accuracy": 0,
      "n": 4,
      "accurateOffsets": [],
      "errorOffsets": [4, 5, 1, 3]
    },
    {
      "meanScore": 0.2230683265009903,
      "maxScore": 0.24843983175406179,
      "minScore": 0.1875474618544392,
      "accuracy": 0.5,
      "n": 4,
      "accurateOffsets": [2, 8],
      "errorOffsets": [10, 6]
    },
    {
      "meanScore": 0.4871332770670281,
      "maxScore": 0.7816059135211525,
      "minScore": 0.2725867712045053,
      "accuracy": 0,
      "n": 3,
      "accurateOffsets": [],
      "errorOffsets": [0, 9, 7]
    }
  ]
}
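Several summary figures in the sample response above appear to be derived from one another. The sketch below recomputes a few of them from values copied out of that response. The relationships it uses (for example, accuracyGain as accuracy minus baseAccuracy, or geomMeanLift as geomMeanP divided by baseGmp) are inferred from these numbers rather than quoted from the field descriptions, so treat them as assumptions.

```python
import math
import statistics

# Values copied from the sample response above.
response = {
    "n": 11,
    "accuracy": 0.18181818181818182,
    "baseAccuracy": 0.09090909090909091,
    "accuracyGain": 0.09090909090909091,
    "error": 0.8181818181818181,
    "meanRank": 19.90909090909091,
    "baseMeanRank": 26,
    "rankGain": 6.09090909090909,
    "geomMeanP": 0.13859789989493534,
    "baseGmp": 0.04878048780487808,
    "geomMeanLift": 2.8412569478461727,
    "meanNs": 19132551.09090909,
    "medianNs": 17273205,
    "accurateOffsets": [2, 8],
    "allNs": [
        12511444, 20616753, 19494757, 41568073, 14575025, 18503253,
        19810702, 17273205, 12208070, 17081632, 16815148,
    ],
}


def derived_metrics(r):
    """Recompute summary metrics from raw fields (assumed relationships)."""
    accuracy = len(r["accurateOffsets"]) / r["n"]
    return {
        "accuracy": accuracy,
        "error": 1 - accuracy,
        "accuracyGain": accuracy - r["baseAccuracy"],
        "rankGain": r["baseMeanRank"] - r["meanRank"],
        "geomMeanLift": r["geomMeanP"] / r["baseGmp"],
        "meanNs": statistics.mean(r["allNs"]),
        "medianNs": statistics.median(r["allNs"]),
    }


# Compare each recomputed value with the value reported in the response.
for name, value in derived_metrics(response).items():
    matches = math.isclose(value, response[name], rel_tol=1e-9)
    print(f"{name}: recomputed={value!r} reported={response[name]!r} match={matches}")
```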

Database API

Operations which manipulate the Aito database.

Get database schema

GET /api/v1/schema

Get the schema for the database.

Successful responses

Response	Type	Description
200 OK	object	The current active schema

Response format

{

// Database tables. Required.

"schema": {

// Any schema which is a valid Aito table schema.

"<yourTableName>": UserDefinedTableSchema

}

}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.aito.app/api/v1/schema \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "schema": {
    "products": {
      "columns": {
        "description": {
          "analyzer": "english",
          "nullable": false,
          "type": "Text"
        },
        "id": { "nullable": false, "type": "Int" },
        "name": { "nullable": false, "type": "String" },
        "price": { "nullable": false, "type": "Decimal" }
      },
      "type": "table"
    }
  }
}

Create database schema

PUT /api/v1/schema

Create or update the schema for the entire database.

Note:

An existing table that is not included in the updated schema will not be deleted.
An existing table that is included in the updated schema will be updated if the table has no data.
The new table names must be valid. See Valid Table Names section for more information.

Parameters

Name	Type	Description
bodyrequired	object	The Aito schema definition

Successful responses

Response	Type	Description
200 OK	object	The current active schema

Request format

{

// Database tables. Required.

"schema": {

// Any schema which is a valid Aito table schema.

"<yourTableName>": UserDefinedTableSchema

}

}

Response format

{

// Database tables. Required.

"schema": {

// Any schema which is a valid Aito table schema.

"<yourTableName>": UserDefinedTableSchema

}

}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X PUT \
  https://your-env-name.aito.app/api/v1/schema \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "schema": {
      "products": {
        "type": "table",
        "columns": {
          "id": { "type": "Int" },
          "name": { "type": "String" },
          "price": { "type": "Decimal" },
          "description": { "type": "Text", "analyzer": "English" }
        }
      }
    }
  }'

Response

{
  "schema": {
    "products": {
      "columns": {
        "description": { "analyzer": "english", "type": "Text" },
        "id": { "type": "Int" },
        "name": { "type": "String" },
        "price": { "type": "Decimal" }
      },
      "type": "table"
    }
  }
}
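The same PUT request can be assembled from Python's standard library. This is a minimal sketch, not an official client: the function name `build_schema_request` is hypothetical, and the environment URL and API key are the placeholders used in the curl example. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_schema_request(base_url, api_key, schema):
    """Build (but do not send) a PUT /api/v1/schema request."""
    body = json.dumps({"schema": schema}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/schema",
        data=body,
        method="PUT",
        headers={"content-type": "application/json", "x-api-key": api_key},
    )


products_table = {
    "type": "table",
    "columns": {
        "id": {"type": "Int"},
        "name": {"type": "String"},
        "price": {"type": "Decimal"},
        "description": {"type": "Text", "analyzer": "English"},
    },
}

request = build_schema_request(
    "https://your-env-name.aito.app",
    "YOUR_READ_WRITE_API_KEY",
    {"products": products_table},
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it; that step needs a real environment and key.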

Delete database

DELETE /api/v1/schema

Delete the entire database schema.

The operation deletes all data and contents of the database! The action is irreversible.

Successful responses

Response	Type	Description
200 OK	object	The summary of deletion

Response format

{

// Array of table names deleted. Required.

"deleted": [string, ...]

}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X DELETE \
  https://your-env-name.aito.app/api/v1/schema \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "deleted": ["products"]
}

Get table schema

GET /api/v1/schema/{table}

Get the schema of the specified table.

Parameters

Name	Type	Description
tablerequired	string	The name of the table

Successful responses

Response	Type	Description
200 OK	object	The current schema of the table

Response format

{

// Type of the database schema item.

"type": string,

// Table columns.

"columns": {

// Type of the column.

"<yourColumnName>": ColumnType

}

}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.aito.app/api/v1/schema/products \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "columns": {
    "description": { "analyzer": "english", "nullable": false, "type": "Text" },
    "id": { "nullable": false, "type": "Int" },
    "name": { "nullable": false, "type": "String" },
    "price": { "nullable": false, "type": "Decimal" }
  },
  "type": "table"
}

Create table schema

PUT /api/v1/schema/{table}

Create or update the schema of the specified table.

Note:

The table schema cannot be updated if it contains data.
The new table name must be valid. See Valid Table Names section for more information.

Parameters

Name	Type	Description
tablerequired	string	The name of the table
bodyrequired	object	The new schema of the table

Successful responses

Response	Type	Description
200 OK	object	The current schema of the table

Request format

{

// Type of the database schema item.

"type": string,

// Table columns.

"columns": {

// Type of the column.

"<yourColumnName>": ColumnType

}

}

Response format

{

// Type of the database schema item.

"type": string,

// Table columns.

"columns": {

// Type of the column.

"<yourColumnName>": ColumnType

}

}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X PUT \
  https://your-env-name.aito.app/api/v1/schema/products \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "type": "table",
    "columns": {
      "id": { "type": "Int" },
      "name": { "type": "String" },
      "price": { "type": "Decimal" },
      "description": { "type": "Text", "analyzer": "English" }
    }
  }'

Response

{
  "columns": {
    "description": { "analyzer": "english", "type": "Text" },
    "id": { "type": "Int" },
    "name": { "type": "String" },
    "price": { "type": "Decimal" }
  },
  "type": "table"
}

Delete table

DELETE /api/v1/schema/{table}

Delete a single table in the schema.

The operation deletes all data and contents of the table! The action is irreversible.

Note: The delete operation fails if it would leave the database schema in a broken state.

For example, given the following schema:

{
  "schema": {
    "users": {
      "type": "table",
      "columns": {
        "username": { "type": "String" }
      }
    },
    "sessions" : {
      "type": "table",
      "columns": {
         "id"     : { "type" : "String" },
         "user"   : { "type" : "String", "link": "users.username" }
      }
    }
  }
}

The users table cannot be deleted before changing the sessions table first so that sessions.user is not linked to the users table.
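Before deleting a table, a client can scan the schema for links that point into it. The following is a minimal sketch under the assumption that a link always has the form `table.column`, as in the example above; the function name `links_into_table` is hypothetical and the check only approximates whatever validation the server performs.

```python
def links_into_table(schema, table):
    """Return (table, column, link) triples for columns in other tables
    whose "link" points into `table`."""
    blocking = []
    for other_name, other in schema["schema"].items():
        if other_name == table:
            continue  # links inside the table itself disappear with it
        for column_name, column in other.get("columns", {}).items():
            link = column.get("link")
            if link and link.split(".", 1)[0] == table:
                blocking.append((other_name, column_name, link))
    return blocking


example_schema = {
    "schema": {
        "users": {
            "type": "table",
            "columns": {"username": {"type": "String"}},
        },
        "sessions": {
            "type": "table",
            "columns": {
                "id": {"type": "String"},
                "user": {"type": "String", "link": "users.username"},
            },
        },
    }
}

print(links_into_table(example_schema, "users"))
print(links_into_table(example_schema, "sessions"))
```

An empty result suggests that no other table references the one being deleted.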

Parameters

Name	Type	Description
tablerequired	string	The name of the table

Successful responses

Response	Type	Description
200 OK	object	The summary of deletion

Response format

{

// Array of table names deleted. Required.

"deleted": [string, ...]

}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X DELETE \
  https://your-env-name.aito.app/api/v1/schema/products \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "deleted": ["products"]
}

Get column schema

GET /api/v1/schema/{table}/{column}

Get the schema of a column.

Parameters

Name	Type	Description
tablerequired	string	The name of the table
columnrequired	string	The name of the column

Successful responses

Response	Type	Description
200 OK	object	The current schema of the column

Response format

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.aito.app/api/v1/schema/products/name \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "nullable": false,
  "type": "String"
}

Add or replace column

PUT /api/v1/schema/{table}/{column}

Add or replace a column of a table.

If a column with the same name already exists, the operation deletes all data and contents of that column! The action is irreversible.

Parameters

Name	Type	Description
tablerequired	string	The name of the table
columnrequired	string	The name of the column
bodyrequired	object	The schema of the column

Successful responses

Response	Type	Description
200 OK	object	The schema of the column

Request format

{

// Value that existing rows get.

"value": integer | number | boolean | null | string

}

Response format

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X PUT \
  https://your-env-name.aito.app/api/v1/schema/products/quantity \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "type": "Int",
    "nullable": false,
    "value": 0
  }'

Response

{
  "nullable": false,
  "type": "Int",
  "value": 0
}

Delete column

DELETE /api/v1/schema/{table}/{column}

Delete a column from a table.

The operation deletes all data and contents of the column! The action is irreversible.

Note: The delete operation fails if it would leave the database schema in a broken state.

For example, given the following schema:

{
  "schema": {
    "users": {
      "type": "table",
      "columns": {
        "username": { "type": "String" },
        "name": { "type": "String" }
      }
    },
    "sessions" : {
      "type": "table",
      "columns": {
         "id"     : { "type" : "String" },
         "user"   : { "type" : "String", "link": "users.username" }
      }
    }
  }
}

The column username of the users table cannot be deleted before changing the sessions table first so that sessions.user is not linked to users.username.

Parameters

Name	Type	Description
tablerequired	string	The name of the table
columnrequired	string	The name of the column

Successful responses

Response	Type	Description
200 OK	object	The summary of deletion

Response format

{

// Array of column names deleted. Required.

"deleted": [string, ...]

}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X DELETE \
  https://your-env-name.aito.app/api/v1/schema/products/description \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {}'

Response

{
  "deleted": ["description"]
}

Rename a table

POST /api/v1/schema/_rename

Rename a table to the specified name.

Rename the table given in the 'from' field to the name given in the 'rename' field. Set 'replace' to true if you want to replace an existing table that already has the target name.

The new table name must be valid. See Valid Table Names section for more information.

Parameters

Name	Type	Description
bodyrequired	object	The request body

Successful responses

Response	Type	Description
200 OK	object	Rename Table results

Request format

{

// The table to rename. Required.

"from": FromTablemodify,

// The name of the renamed table. Required.

"rename": FromTablemodify,

// If replace is true, operation will overwrite any existing // table. // Default: false

"replace": boolean

}

Response format

{}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/schema/_rename \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "from": "products",
    "rename": "renamed_products"
  }'

Response

{}

Copy a table

POST /api/v1/schema/_copy

Copy a table. This operation creates a copy of the table with the given name. The operation can be very fast, because the copying is done by copying the reference to the underlying immutable data structure.

The 'from' field must contain the name of the table to copy. The 'copy' field must contain the name of the new copy. Set the 'replace' field to true if you want to replace any existing table with the target name.

The new table name must be valid. See Valid Table Names section for more information.

Parameters

Name	Type	Description
bodyrequired	object	The request body

Successful responses

Response	Type	Description
200 OK	object	Copy Table results

Request format

{

// The existing table to copy. Required.

"from": FromTablemodify,

// The target name of the new copy. Required.

"copy": FromTablemodify,

// If replace is true, operation will overwrite any existing // table. // Default: false

"replace": boolean

}

Response format

{}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/schema/_copy \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "from": "products",
    "copy": "old_products"
  }'

Response

{}
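One possible use of copy and rename together is to keep a backup of a live table and then move a freshly loaded table into its place. The sketch below only builds the request bodies for such a sequence. The helper names and the `products_staging` table are hypothetical, and the two-step pattern itself is an illustration rather than a documented workflow.

```python
def copy_payload(source, target, replace=False):
    """Body for POST /api/v1/schema/_copy."""
    return {"from": source, "copy": target, "replace": replace}


def rename_payload(source, target, replace=False):
    """Body for POST /api/v1/schema/_rename."""
    return {"from": source, "rename": target, "replace": replace}


def swap_plan(live, staging, backup):
    """Return (path, body) pairs: back up `live`, then move `staging` over it."""
    return [
        ("/api/v1/schema/_copy", copy_payload(live, backup, replace=True)),
        ("/api/v1/schema/_rename", rename_payload(staging, live, replace=True)),
    ]


for path, body in swap_plan("products", "products_staging", "old_products"):
    print(path, body)
```

Because 'replace' is true in both steps, any existing `old_products` table and the current `products` table would be overwritten.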

Insert entry

POST /api/v1/data/{table}

Insert an entry into a table.

Parameters

Name	Type	Description
tablerequired	string	The name of the table to add data to
bodyrequired	object	Any object which is valid according to the provisioned schema

Successful responses

Response	Type	Description
200 OK	object	The inserted entry

Request format

UserDefinedObject

Response format

UserDefinedObject

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/data/products \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "id": 1,
    "name": "Apple iPhone 8 64 Gt, spacegray",
    "price": 648.9,
    "description": "A11 processor and wireless charging."
  }'

Response

{
  "description": "A11 processor and wireless charging.",
  "id": 1,
  "name": "Apple iPhone 8 64 Gt, spacegray",
  "price": 648.9
}
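Since an entry must be valid according to the table schema, a client may want to check it before sending. The sketch below is a client-side approximation only and does not reproduce Aito's own validation. It covers just the four column types that appear in this section's examples, and it treats a column without a "nullable" key as non-nullable, which is an assumption. The name `entry_problems` is hypothetical.

```python
# Python types accepted for each column type used in this section's examples.
# Other column types are deliberately not covered by this sketch.
PYTHON_TYPES = {
    "Int": (int,),
    "Decimal": (int, float),
    "String": (str,),
    "Text": (str,),
}


def entry_problems(entry, table_schema):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    columns = table_schema["columns"]
    for name in entry:
        if name not in columns:
            problems.append(f"unknown column: {name}")
    for name, column in columns.items():
        value = entry.get(name)
        if value is None:
            if not column.get("nullable", False):
                problems.append(f"missing value for non-nullable column: {name}")
            continue
        expected = PYTHON_TYPES.get(column["type"])
        if expected is None:
            continue  # column type not covered by this sketch
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                f"{name}: expected {column['type']}, got {type(value).__name__}"
            )
    return problems


products_schema = {
    "type": "table",
    "columns": {
        "description": {"analyzer": "english", "nullable": False, "type": "Text"},
        "id": {"nullable": False, "type": "Int"},
        "name": {"nullable": False, "type": "String"},
        "price": {"nullable": False, "type": "Decimal"},
    },
}

entry = {
    "id": 1,
    "name": "Apple iPhone 8 64 Gt, spacegray",
    "price": 648.9,
    "description": "A11 processor and wireless charging.",
}
print(entry_problems(entry, products_schema))
```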

Insert multiple entries

POST /api/v1/data/{table}/batch

Import multiple entries into the database.

The batch import can be used to upload multiple entries to a single table. The payload needs to be a valid JSON array (instead of ndjson).

The batch import can run as a job. The path for running batch as a job is

/api/v1/jobs/data/<TABLE>/batch

Note: The batch API supports payloads of at most 10MB.

Parameters

Name	Type	Description
tablerequired	string	The name of the table to add data to
bodyrequired	array	An array of objects which are valid according to the provisioned schema

Successful responses

Response	Type	Description
200 OK	object	Summary of the inserted entries

Request format

[UserDefinedObject, ...]

Response format

{

// How many entries were inserted. Required.

"entries": integer,

// Status text. Required.

"status": string

}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/data/products/batch \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  [
    {
      "id": 1,
      "name": "Apple iPhone 8 64 Gt, spacegray",
      "price": 648.9,
      "description": "A11 processor and wireless charging."
    },
    {
      "id": 2,
      "name": "Apple iPhone X 32 GB, space gray",
      "price": 1048.9,
      "description": "All‑screen design. Longest battery life ever in an iPhone."
    },
    {
      "id": 3,
      "name": "Samsung Galaxy S9",
      "price": 698.2,
      "description": "The Camera. Reimagined."
    }
  ]'

Response

{
  "entries": 3,
  "status": "ok"
}
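Because the batch endpoint limits the payload size, a large list of entries has to be split into several requests. The following is a minimal sketch of one way to do that. It assumes "10MB" means 10,000,000 bytes (the more conservative reading) and that the body is sent as compact UTF-8 JSON; the function names are hypothetical.

```python
import json

MAX_BATCH_BYTES = 10 * 1000 * 1000  # assumed meaning of "10MB"


def split_into_batches(entries, max_bytes=MAX_BATCH_BYTES):
    """Yield lists of entries whose compact JSON array encoding fits max_bytes."""
    batch, batch_size = [], 2  # 2 bytes for the surrounding "[" and "]"
    for entry in entries:
        encoded = len(json.dumps(entry, separators=(",", ":")).encode("utf-8"))
        if encoded + 2 > max_bytes:
            raise ValueError("a single entry exceeds the payload limit")
        separator = 1 if batch else 0  # one comma between entries
        if batch_size + separator + encoded > max_bytes:
            yield batch
            batch, batch_size, separator = [], 2, 0
        batch.append(entry)
        batch_size += separator + encoded
    if batch:
        yield batch


def encode_batch(batch):
    """Encode one batch exactly as its size was counted above."""
    return json.dumps(batch, separators=(",", ":")).encode("utf-8")


entries = [{"id": i, "name": "x" * 10} for i in range(50)]
batches = list(split_into_batches(entries, max_bytes=100))
print(len(batches), max(len(encode_batch(b)) for b in batches))
```

Each yielded batch would become the body of one POST to the batch endpoint.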

Delete entries

POST /api/v1/data/_delete

Delete entries with a Search-like interface.

You can describe the target table and the filters that select which entries to delete. The delete operation must walk over each entry in the table, and can thus be expensive. Delete can also be run as a job, which avoids timeout errors. The path for running delete as a job is

/api/v1/jobs/data/<TABLE_NAME>/_delete

An empty proposition will match and delete everything!

Parameters

Name	Type	Description
bodyrequired	object	The delete query (target table and filter)

Successful responses

Response	Type	Description
200 OK	object	Delete results

Request format

{

// The modified table. Required.

"from": FromTablemodify,

// The entries to delete. Required.

"where": Proposition

}

Response format

{

// The number of rows that were deleted. Required.

"total": integer

}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/data/_delete \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "from": "products",
    "where": { "id": 1 }
  }'

Response

{
  "total": 1
}
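Since an empty proposition matches every entry, a client may want to guard against building such a query by accident. A minimal sketch, with the hypothetical helper name `delete_payload`:

```python
def delete_payload(table, where):
    """Build the body for POST /api/v1/data/_delete, refusing an empty filter."""
    if not where:
        # An empty "where" would match and delete every entry in the table.
        raise ValueError("refusing to build a delete query with an empty 'where'")
    return {"from": table, "where": where}


print(delete_payload("products", {"id": 1}))
```

A caller that really intends to empty a table would have to bypass this helper explicitly, which makes the intent visible in the code.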

Initiate file upload

POST /api/v1/data/{table}/file

Initiate a file upload session.

The file API circumvents the batch upload API payload size limit and allows uploading large data sets. The file API accepts data in gzip-compressed ndjson format, stored in a file.

The file must be gzip-compressed ndjson; plain JSON arrays are not accepted.

The data file is uploaded to AWS S3 and processed asynchronously. The file must be compressed with gzip before uploading to reduce the size of the transferred data.

The file API is not a single API, but requires a minimum of three calls (per table). The sequence is as follows:

Initiate the file upload process
Upload compressed ndjson file to S3, using the signed URL
Trigger file processing
(Optional) Poll the file processing status

You can find the bash implementation of the flow at our tools repository. See the upload-file.sh script.

Parameters

Name	Type	Description
tablerequired	string	The name of the table to add data to

Successful responses

Response	Type	Description
200 OK	object	The details to execute the S3 upload and the job's id

Response format

{

// The uuid of the file upload session. Required.

"id": string,

// The presigned S3 url where to push the data. Required.

"url": string,

// The http method used for uploading to S3. Required.

"method": string,

// Defines when the presigned upload link expires. Required.

"expires": string

}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/data/products/file \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "expires": "2025-06-08T13:09:55",
  "id": "efe2bfde-9b87-4d63-b109-3c2bd8319128",
  "method": "PUT",
  "url": "https://aitoai-customer-uploads.s3.eu-west-1.amazonaws.com/localhost/products/efe2bfde-9b87-4d63-b109-3c2bd8319128?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20250608T124955Z&X-Amz-SignedHeaders=host&X-Amz-Expires=1199&X-Amz-Credential=AKIA42C7USIXZFCKEUJT%2F20250608%2Feu-west-1%2Fs3%2Faws4_request&X-Amz-Signature=b40c63127a36da0d9225238b3ca925234a6cb11c9a968632bc562ca442d8b221"
}
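The file format this endpoint expects, gzip-compressed ndjson with one JSON object per line, can be produced with Python's standard library. This is a minimal sketch; the function names are hypothetical, and the decoding helper exists only to check the round trip locally.

```python
import gzip
import json


def to_ndjson_gz(entries):
    """Encode entries as newline-delimited JSON and gzip the result."""
    lines = (json.dumps(entry, ensure_ascii=False) for entry in entries)
    ndjson = "\n".join(lines) + "\n"
    return gzip.compress(ndjson.encode("utf-8"))


def from_ndjson_gz(payload):
    """Inverse of to_ndjson_gz, used here only to verify the encoding."""
    text = gzip.decompress(payload).decode("utf-8")
    # json.dumps escapes newlines inside strings, so "\n" only separates entries.
    return [json.loads(line) for line in text.split("\n") if line]


products = [
    {"id": 1, "name": "Apple iPhone 8 64 Gt, spacegray", "price": 648.9},
    {"id": 2, "name": "Apple iPhone X 32 GB, space gray", "price": 1048.9},
    {"id": 3, "name": "Samsung Galaxy S9", "price": 698.2},
]
payload = to_ndjson_gz(products)
print(len(payload), from_ndjson_gz(payload) == products)
```

The resulting bytes are what would be sent to the presigned URL returned by this endpoint.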

Trigger file processing

POST /api/v1/data/{table}/file/{uuid}

Start the processing of a previously uploaded file.

Note: This operation is part of the file upload sequence. If you want to read how to execute a full file upload flow, see Initiate file upload documentation.

Parameters

Name	Type	Description
tablerequired	string	The name of the table to add data to
uuidrequired	string	The assigned id of the operation

Successful responses

Response	Type	Description
200 OK	object	Processing started status

Response format

{

// The id of the operation. Required.

"id": string,

// Textual description of the job. Required.

"status": string

}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/data/products/file/c8d6b2e9-7cc1-424f-abab-31d2c92cbf23 \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "id": "c8d6b2e9-7cc1-424f-abab-31d2c92cbf23",
  "status": "started"
}

Get file processing status

GET /api/v1/data/{table}/file/{uuid}

Get the file upload progress.

The response is probabilistic and might not contain the very last result, since the status update is asynchronous, and the upload happens in multiple parallel streams. The response, however, will give an idea of approximate progress.

Note: This operation is part of the file upload sequence. If you want to read how to execute a full file upload flow, see Initiate file upload documentation.

Parameters

Name	Type	Description
tablerequired	string	The name of the table to add data to
uuidrequired	string	The assigned id of the operation

Successful responses

Response	Type	Description
200 OK	object	The file processing status

Response format

{

"status": {

// Total duration of the file processing elapsed. Required.

"totalDurationMs": number,

// Total duration of the file processing elapsed as human // readable units. Required.

"totalDuration": string,

// Throughput of lines in human readable units.

"throughput": string,

// When the file processing was started. Required.

"startedAt": string,

// When the file processing was finished. Required.

"finishedAt": string,

// Is the job finished or not. Required.

"finished": boolean,

// The number of lines completed so far. Required.

"completedCount": integer,

// Any object which is valid according to the database schema. // Required.

"lastSuccessfulElement": UserDefinedObject

},

"errors": {

// Human consumable description. Required.

"message": string,

// Array of failing elements.

"rows": [UserDefinedObject, ...]

}

}

Request during processing

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.aito.app/api/v1/data/products/file/83ca7fdc-0226-4544-9787-7b0e18933fb1 \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

The example shows what the response looks like while data processing is still in progress.

{
  "errors": { "message": "Last 0 failing rows", "rows": null },
  "status": {
    "phase": "AitoDatabaseInsert",
    "finished": false,
    "completedCount": 3,
    "lastSuccessfulElement": {
      "description": "The Camera. Reimagined.",
      "id": 3,
      "name": "Samsung Galaxy S9",
      "price": 698.2
    },
    "startedAt": "20250608T124957.342Z",
    "throughput": "2.92/s"
  }
}

Request after processing

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.aito.app/api/v1/data/products/file/e8d3b747-ed2d-4411-9688-457e661eb7e5 \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

The example shows what the response looks like after data processing has finished successfully.

{
  "errors": { "message": "Last 0 failing rows", "rows": null },
  "status": {
    "totalDurationMs": 1031,
    "phase": "Finished",
    "finished": true,
    "completedCount": 3,
    "lastSuccessfulElement": {
      "description": "The Camera. Reimagined.",
      "id": 3,
      "name": "Samsung Galaxy S9",
      "price": 698.2
    },
    "totalDuration": "1 second and 31 milliseconds",
    "startedAt": "20250608T124958.758Z",
    "finishedAt": "20250608T124959.789Z",
    "throughput": "2.91/s"
  }
}
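The two responses above differ mainly in "status.finished", so a client can poll the endpoint until that flag turns true. The following is a minimal sketch of such a loop, not an official client: `wait_until_finished`, `fetch_status`, `poll_interval_s` and `max_polls` are hypothetical names, and `fetch_status` is assumed to be any callable that performs the GET request shown above and returns the parsed JSON body.

```python
import time
from typing import Any, Callable, Dict


def wait_until_finished(
    fetch_status: Callable[[], Dict[str, Any]],
    poll_interval_s: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until status.finished is true and return the last response body."""
    for _ in range(max_polls):
        body = fetch_status()
        status = body["status"]
        if status["finished"]:
            return body
        # Progress is approximate, so it is only suitable for logging.
        print(f"phase={status.get('phase')} completed={status['completedCount']}")
        sleep(poll_interval_s)
    raise TimeoutError("file processing did not finish within max_polls")
```

Passing the HTTP call in as a callable keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.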

Optimize the database

POST /api/v1/data/{table}/optimize

Optimizes the database for query performance.

Note: The recommended way to run optimize is as a job, because the optimize operation easily times out for any non-trivial database. The path for running optimize as a job is

/api/v1/jobs/data/<TABLE_NAME>/optimize

The Aito.ai database is implemented as a log-structured merge-tree. Because of this architecture, Aito's tables are implemented internally as a tree of table segments.

The complexity of the table tree has major implications for both query speed and write speed. The fewer segments Aito maintains in the tree, the faster the queries are, but the slower the writes are, because Aito needs to rewrite parts of the tree regularly. Similarly, the more segments are allowed, the slower the queries are, but the faster the writes become.

Aito seeks to maintain approximately O(log N) segments in the table tree to keep a reasonable compromise between query speed and write speed.

Still, there can be situations where it is beneficial to rewrite the entire database as a single segment to get the optimal query speed. The optimize operation does this.

It may take minutes or hours to optimize a big table. This means that optimize should be used to improve query performance only in situations where the database and the results need to be updated rarely, for example nightly.

Optimize maintains a write lock on the database for the entire operation. This means that you cannot add data while the optimize operation is running. Queries will still work normally. After the optimize has finished, the optimized table needs to be reloaded, which can induce significant latency for the following query.

Parameters

Name	Type	Description
bodyrequired	object	An empty object

Successful responses

Response	Type	Description
200 OK	object	An empty object

Request format

{}

Response format

{}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/data/products/optimize \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {}'

Response

{}

Database Schema

The Aito database requires a schema to operate. The schema defines:

The names of the tables
The name and the ColumnType of each column in each table
The Analyzer of a column if needed
The relationships between tables

Please refer to the Defining a database schema guide for more details.

UserDefinedTableSchema

Any schema which is a valid Aito table schema.

The table schema describes the structure of the table in a formal language. It describes all fields (or columns), the data types of the fields, and information that helps Aito preprocess your data, for example the language of a text field.

The contents of the schema depend on the data that will be inserted into the database.

Format

{

// Type of the database schema item.

"type": string,

// Table columns.

"columns": {

// Type of the column.

"<yourColumnName>": ColumnType

}

}

Example

{
  "type": "table",
  "columns": {
    "id": { "type": "Int", "nullable": false },
    "name": { "type": "String", "nullable": false },
    "price": { "type": "Decimal", "nullable": false },
    "description": { "type": "Text", "nullable": false, "analyzer": "English" }
  }
}
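To illustrate how the "type" and "nullable" settings in the example constrain the rows, here is a rough client-side check. It is only a sketch under stated assumptions and not Aito's own validation: `check_row` and the type mapping are hypothetical, type names are compared case-insensitively as a convenience of this helper, and a missing field is treated the same as null.

```python
from typing import Any, Dict, List

# Assumed mapping from the column types in this section to Python types.
_PYTHON_TYPES = {
    "boolean": (bool,),
    "int": (int,),
    "decimal": (int, float),
    "string": (str,),
    "text": (str,),
}


def check_row(schema: Dict[str, Any], row: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the row fits the schema."""
    problems = []
    for name, column in schema["columns"].items():
        value = row.get(name)
        if value is None:
            # "nullable" defaults to true according to the column type formats.
            if not column.get("nullable", True):
                problems.append(f"{name}: null is not allowed")
            continue
        type_name = column["type"].lower()
        if type_name == "json":
            continue  # any JSON value is accepted
        expected = _PYTHON_TYPES[type_name]
        # bool is a subclass of int in Python, so reject it for numeric columns.
        is_bool_mismatch = isinstance(value, bool) and bool not in expected
        if is_bool_mismatch or not isinstance(value, expected):
            problems.append(
                f"{name}: expected {column['type']}, got {type(value).__name__}"
            )
    return problems
```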

ColumnType

Type of the column.

Describes an individual field (or column), its type, and information that helps Aito preprocess your data, for example the language of a text field.

Format

Examples

{ "type": "int", "nullable": false }

{ "type": "string", "nullable": false }

{ "type": "decimal", "nullable": false }

{ "type": "text", "nullable": false, "analyzer": "english" }

{ "type": "json", "nullable": true }

BooleanType

Boolean column type.

When a column is a boolean, the only accepted values are true and false.

Format

{

// Type of the column. Required.

"type": string,

// When true, `null` values are allowed. // Default: true

"nullable": boolean,

// Path to a column of a linked row. // Default: null

"link": string

}

Example

{ "type": "boolean" }

DecimalType

Double-precision floating-point number.

Format

{

// Type of the column. Required.

"type": string,

// When true, `null` values are allowed. // Default: true

"nullable": boolean,

// Path to a column of a linked row. // Default: null

"link": string

}

Example

{ "type": "Decimal", "nullable": false }

IntType

Integer column type.

Format

{

// Type of the column. Required.

"type": string,

// When true, `null` values are allowed. // Default: true

"nullable": boolean,

// Path to a column of a linked row. // Default: null

"link": string

}

Examples

{ "type": "Int" }

{ "type": "Int", "link": "users.id" }

StringType

String column type.

The string data type is a primitive version of the Text type. The value is turned into a single feature. For example "lazy black cat" becomes 1 feature: "lazy black cat".

Format

{

// Type of the column. Required.

"type": string,

// When true, `null` values are allowed. // Default: true

"nullable": boolean,

// Path to a column of a linked row. // Default: null

"link": string

}

Examples

{ "type": "String", "nullable": false }

{ "type": "String", "link": "messages.id" }

TextType

Text column type.

The text data type enables smart textual analysis of strings. A text column has an analyzer which defines how the text can be split into words or tokens, which are used as features during inference.

Format

{

// Type of the column. Required.

"type": string,

// Aito analyzers break the Text type data // into features that can be used for inference.

"analyzer": Analyzer,

// When true, `null` values are allowed. // Default: true

"nullable": boolean,

// Path to a column of a linked row. // Default: null

"link": string

}

Example

{ "type": "Text", "analyzer": "English", "nullable": false }

JsonType

Json column type.

The json data type can hold an arbitrary json value. The value is turned into a single feature. For example {"a":[1, 2, 3], "b":true} becomes 1 feature: {"a":[1, 2, 3], "b":true}.

Format

{

// Type of the column. Required.

"type": string,

// When true, `null` values are allowed. // Default: true

"nullable": boolean,

// Path to a column of a linked row. // Default: null

"link": string

}

Example

{ "type": "Json", "nullable": true }

Analyzer

Aito analyzers break the Text type data into features that can be used for inference.

Let's take a look at an example of predicting the tags of a product from its description, using the following data:

description	tags
Brazilian organic orange	organic, fruit, imported
Local organic spinach	organic, vegetable, local
Lentil snack	snack

Given the description "organic tomatoes", we would like to predict the tags of this product.

If no analyzer is defined, the description is treated as a String type and the description "organic tomatoes" is turned into only 1 feature: "organic tomatoes". Since no entry in the given data has the description "organic tomatoes", Aito is not able to provide any meaningful prediction for the tags.

Using the default English analyzer, "organic tomatoes" is turned into 2 features: "organ" (the English stem of the word "organic") and "tomato". Likewise, "Brazilian organic orange" is turned into 3 features: "brazilian", "organ", and "orang". The other descriptions are turned into features in a similar fashion.

Aito can now find patterns between these features. For example, when the description has the feature "organ", the tag is likely "organic". Hence, using the analyzer, Aito can return a reasonable prediction for an unseen entry.

Format

Examples

"standard"

"whitespace"

"english"

"en"

{ "type": "delimiter", "delimiter": "," }

{ "type": "language", "language": "en" }

{ "type": "char-ngram", "minGram": 2, "maxGram": 3 }

{ "type": "token-ngram", "source": "english", "minGram": 1, "maxGram": 2 }

AliasAnalyzer

Aito has several built-in analyzers and they are selected by using their name in the "analyzer" field of a text column. For instance:

{ "analyzer": "english" }

The built-in analyzers include:

Standard Analyzer:
- Name: "standard"
- A good default analyzer that works well in most languages. The analyzer generates features based on the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. The standard analyzer filters English stop words that are normally not useful.
- E.g: "the cats are running" will be broken down into "cats" and "running".
Whitespace Analyzer:
- Name: "whitespace"
- The analyzer breaks the text into features whenever it encounters a whitespace character. Adjacent sequences of non-whitespace characters form tokens.
- E.g: "the cats are running" will be broken down into "the", "cats", "are", and "running".
Language Analyzer:
- Alias: the language name or the language's ISO 639-1 code (with some exceptions)
- A Language Analyzer with the default settings (no stop words or keywords).
- See Language Analyzer for the supported languages and their aliases.

Format

string

Examples

"standard"

"whitespace"

"english"

"en"

CharNGramAnalyzer

The Character N-gram Analyzer breaks text into n-gram features.

For example, the following n-gram analyzer:

{ "type": "char-ngram", "minGram": 3, "maxGram": 3 }

would break the text "the cats are running" into the following list of features:

["the", "he ", "e c", " ca", "cat", "ats", "ts ", "s a", " ar", "are", "re ", "e r", " ru", "run", "unn", "nni", "nin", "ing"]

The analyzer can be useful for languages that don’t use spaces or that have long compound words, like German.
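The sliding-window behavior described above can be sketched in a few lines of Python. This is an illustration rather than Aito's implementation: `char_ngrams` is a hypothetical helper, and the order in which n-grams of different lengths are emitted when minGram and maxGram differ is an assumption.

```python
from typing import List


def char_ngrams(text: str, min_gram: int, max_gram: int) -> List[str]:
    """Return every substring whose length is between min_gram and max_gram."""
    features = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            # Skip windows that would run past the end of the text.
            if start + size <= len(text):
                features.append(text[start:start + size])
    return features
```

With minGram and maxGram both set to 3, `char_ngrams("the cats are running", 3, 3)` yields the same 18 features as the list above.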

Format

{

// Type of the analyzer. Required.

"type": string,

// The minimum length of characters in a feature. Required.

"minGram": integer,

// The maximum length of characters in a feature. Required.

"maxGram": integer

}

Example

{ "type": "char-ngram", "minGram": 2, "maxGram": 3 }

DelimiterAnalyzer

The Delimiter Analyzer breaks text into features whenever it encounters the specified delimiter character.

With the trimWhitespace option, the analyzer trims the whitespace surrounding a feature.

For example, the following analyzer:

{
  "type": "delimiter",
  "delimiter": ",",
  "trimWhitespace": true
}

would break the text "the, cats,are, running" into 4 features:

["the", "cats", "are", "running"]
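As a rough model of the example above, the analyzer behaves like a string split followed by an optional trim. The sketch below uses a hypothetical `delimiter_features` helper; how Aito treats empty features (for example from two consecutive delimiters) is not covered here.

```python
from typing import List


def delimiter_features(text: str, delimiter: str, trim_whitespace: bool = True) -> List[str]:
    """Split the text at the delimiter and optionally trim each feature."""
    parts = text.split(delimiter)
    if trim_whitespace:
        parts = [part.strip() for part in parts]
    return parts
```

Calling `delimiter_features("the, cats,are, running", ",")` gives the four features shown above, while passing `trim_whitespace=False` keeps the leading spaces of " cats" and " running".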

Format

{

// Type of the analyzer. Required.

"type": string,

// The delimiter. Required.

"delimiter": string,

// Trims leading and trailing whitespace of the features. // Default: true

"trimWhitespace": boolean

}

Examples

{ "type": "delimiter", "delimiter": "," }

{ "type": "delimiter", "delimiter": "\n", "trimWhitespace": true }

LanguageAnalyzer

Language Analyzers aim to analyze text of a specific language.

When using a language analyzer, text is analyzed into lower-case word stem features. For example, using the following english analyzer:

{ "type": "language", "language": "english" }

the text "the cats are running" will be broken into 4 word stem features:

["the", "cat", "ar", "run"]

The value of the "language" parameter specifies which language will be used. The value can be the name or the ISO 639-1 code of the language. The full list is shown below:

Language	Name	ISO code
Arabic	arabic	ar
Armenian	armenian	hy
Basque	basque	eu
Brazilian Portuguese	brazilian	pt-br
Bulgarian	bulgarian	bg
Catalan	catalan	ca
Chinese, Japanese, Korean	cjk	cjk
Czech	czech	cs
Danish	danish	da
Dutch	dutch	nl
English	english	en
Finnish	finnish	fi
French	french	fr
Galician	galician	gl
German	german	de
Greek	greek	el
Hindi	hindi	hi
Hungarian	hungarian	hu
Indonesian	indonesian	id
Irish	irish	ga
Italian	italian	it
Latvian	latvian	lv
Norwegian	norwegian	no
Persian	persian	fa
Portuguese	portuguese	pt
Romanian	romanian	ro
Russian	russian	ru
Spanish	spanish	es
Swedish	swedish	sv
Thai	thai	th
Turkish	turkish	tr

The language analyzers support filtering stop words (common words that are normally not useful). Each language has a list of default stop words, and filtering with that list can be enabled through the "useDefaultStopWords" parameter. Some common English stop words are:

  "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", 
  "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", 
  "such", "that", "the", "their", "then", "there", "these", 
  "they", "this", "to", "was", "will", "with"

By default, "useDefaultStopWords" is set to false. The following analyzer:

{
  "type": "language",
  "language": "english",
  "useDefaultStopWords": true
}

would break the text "the cats are running" into 2 features:

["cat", "run"]

It is also possible to specify a set of words that would be filtered through the "customStopWords" parameter and a set of words that would not be analyzed through the "customKeyWords" parameter. The following analyzer:

{
  "type": "language",
  "language": "english",
  "useDefaultStopWords": false,
  "customStopWords": ["cats"],
  "customKeyWords": ["running"]
}

would break the text "the cats are running" into 3 features:

["the", "ar", "running"]
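The three examples above can be reproduced with a small sketch. It rests on assumptions: `language_features` is a hypothetical helper, `TOY_STEMS` is a stand-in lookup table that contains only the stems shown in this section (a real stemmer would replace it), and the order of operations (stop words are matched against the lower-cased surface form before stemming, and key words bypass stemming) is inferred from the examples rather than documented.

```python
from typing import Iterable, List

# The common English stop words listed above.
DEFAULT_ENGLISH_STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for",
    "if", "in", "into", "is", "it", "no", "not", "of", "on", "or",
    "such", "that", "the", "their", "then", "there", "these",
    "they", "this", "to", "was", "will", "with",
}

# Stand-in for a real English stemmer: only the stems used in the examples.
TOY_STEMS = {"the": "the", "cats": "cat", "are": "ar", "running": "run"}


def language_features(
    text: str,
    use_default_stop_words: bool = False,
    custom_stop_words: Iterable[str] = (),
    custom_key_words: Iterable[str] = (),
) -> List[str]:
    stop_words = set(custom_stop_words)
    if use_default_stop_words:
        stop_words |= DEFAULT_ENGLISH_STOP_WORDS
    key_words = set(custom_key_words)
    features = []
    for word in text.lower().split():
        if word in stop_words:
            continue  # filtered out entirely
        # Key words are kept as-is; everything else is stemmed.
        features.append(word if word in key_words else TOY_STEMS.get(word, word))
    return features
```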

Format

{

// Type of the analyzer. Required.

"type": string,

// Name or code of the language. Required.

"language": string,

// Use the language default stopwords. // Default: false

"useDefaultStopWords": boolean,

// List of words that will be filtered. // Default:

"customStopWords": [string, ...],

// List of words that will not be featurized. // Default:

"customKeyWords": [string, ...]

}

Examples

{ "type": "language", "language": "en" }

{
  "type": "language",
  "language": "english",
  "useDefaultStopWords": true,
  "customStopWords": ["flower"],
  "customKeyWords": ["animal"]
}

TokenNGramAnalyzer

The Token N-gram Analyzer breaks text into token n-grams (shingles) based on a source analyzer. In other words, it combines the features of the source analyzer into new features.

For example, the following Token N-gram Analyzer:

{
  "type": "token-ngram",
  "source": "english",
  "minGram": 1,
  "maxGram": 2,
  "tokenSeparator": "_"
}

would break the text "the cats are running" into the following list of features:

["the", "the_cat", "cat", "cat_ar", "ar", "ar_run", "run"]
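The combination step can be sketched as follows, assuming the source analyzer has already produced its features. `token_ngrams` is a hypothetical helper, and the emission order (all sizes for one start position before moving to the next) is chosen to match the example output above.

```python
from typing import List, Sequence


def token_ngrams(
    tokens: Sequence[str],
    min_gram: int,
    max_gram: int,
    token_separator: str = " ",
) -> List[str]:
    """Join runs of min_gram..max_gram consecutive source features."""
    features = []
    for start in range(len(tokens)):
        for size in range(min_gram, max_gram + 1):
            # Skip windows that would run past the last token.
            if start + size <= len(tokens):
                features.append(token_separator.join(tokens[start:start + size]))
    return features
```

For the English stems ["the", "cat", "ar", "run"], `token_ngrams(stems, 1, 2, "_")` produces the seven features listed above.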

Format

{

// Type of the analyzer. Required.

"type": string,

// Source analyzer to generate features before being combined // into n-grams. Required.

"source": Analyzer,

// The minimum number of features to be combined. Required.

"minGram": integer,

// The maximum number of features to be combined. Required.

"maxGram": integer,

// The string used to join the features of the source analyzer. // Default: " "

"tokenSeparator": string

}

Examples

{ "type": "token-ngram", "source": "english", "minGram": 1, "maxGram": 2 }

{
  "type": "token-ngram",
  "source": { "type": "delimiter", "delimiter": "," },
  "minGram": 1,
  "maxGram": 3,
  "tokenSeparator": "_"
}

Query language

The reference documentation for Aito query language.

Common concepts

Features

To analyze the data better, Aito splits fields into features under the hood. How the featurization is done depends on the field type. For example, the Text type supports an "analyzer" option which allows you to control how a text field is split into features.

Some queries, for example Relate, return the features instead of the actual values of the field.

Exclusiveness

Exclusiveness is an option in predictions. In summary, it describes whether the predicted field can have multiple values at the same time or not.

Understanding the concept is easiest through an example. If we were predicting tags for a product, we would want to set "exclusiveness": false, because a product can have multiple tags. A product could be described with the following tags:

Venn diagram of tags

However, if we were predicting the user who would most likely purchase a product, we would want to use "exclusiveness": true (the default behavior), because the value can only be one user at a time.

Venn diagram of users

$p vs $lift

If we were trying to find a customer, who is best characterized by a message, we'd need to understand the difference between $p and $lift. To make the difference clear, consider the following situation:

Alice messages often, but she doesn't mention iPhone often
Bob messages rarely, but only about iPhones

Querying users by $p quite likely finds Alice, because overall she may be the person more likely to mention "iPhone". Querying users by $lift, on the other hand, will almost certainly find Bob, because $lift describes how characteristic the feature "iPhone" is for the user.

A more mathematical and technical description of the phenomenon is the following:

The probability lift component describes how much more likely X is to be true in the specified context, compared to the average.

In Aito query syntax, $p stands for the p(X|context) component, while $lift stands for the lift(X|context) component.
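A small worked example may make the difference concrete. The message counts below are hypothetical and chosen only for illustration, and the sketch assumes the usual definition of lift as p(X|context) divided by the base probability p(X), which matches the description above; it uses plain frequency counting rather than Aito's inference.

```python
# Hypothetical message counts per user, for illustration only.
messages_per_user = {"alice": 900, "bob": 20, "others": 80}
iphone_messages_per_user = {"alice": 90, "bob": 20, "others": 0}

total_messages = sum(messages_per_user.values())
total_iphone_messages = sum(iphone_messages_per_user.values())


def p_given_iphone(user: str) -> float:
    """p(user | message mentions iPhone), the quantity behind $p."""
    return iphone_messages_per_user[user] / total_iphone_messages


def base_p(user: str) -> float:
    """p(user): how likely any message is from this user."""
    return messages_per_user[user] / total_messages


def lift(user: str) -> float:
    """lift(user | iPhone) = p(user | iPhone) / p(user)."""
    return p_given_iphone(user) / base_p(user)


for user in ("alice", "bob"):
    print(user, round(p_given_iphone(user), 3), round(lift(user), 2))
```

With these counts Alice has the higher probability (about 0.82 against 0.18 for Bob), while Bob has by far the higher lift (about 9.1 against 0.91 for Alice), so ordering by $p favors Alice and ordering by $lift favors Bob.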

Text operators

Useful for creating conditional queries with text fields.

$match

Operator to check if a textual field fuzzy matches a given string.

Case insensitive. The matched text is split into tokens with the analyzer specified for the field in the schema. For example, { "$match": "great programmers" } will match the strings "Bob is the greatest programmer!" and "Programmers are having great fun" if the field is properly analyzed with the English analyzer.

Format

{

"$match":

$toString

string

}

Examples

{ "$match": "coffee" }

{
  "from": "products",
  "where": {
    "name": { "$match": "coffee" }
  }
}

$startsWith

Operator to check if a textual field starts with a given string. Case sensitive.

Format

{

"$startsWith":

$toString

string

}

Examples

{ "$startsWith": "Cucumber" }

{
  "from": "products",
  "where": {
    "name": { "$startsWith": "Cucumber" }
  }
}

Comparison operators

Useful for creating conditional queries.

$gt

Operator to check if a field is greater than a given value.

Format

{

"$gt":

integer

number

null

$toString

boolean

string

Json

}

Examples

{ "$gt": 8 }

{ "$gt": 231.1 }

{ "$gt": "20150308" }

{
  "from": "products",
  "where": {
    "price": { "$gt": 2.14 }
  }
}

$gte

Operator to check if a field is greater than or equal to a given value.

Format

{

"$gte":

integer

number

null

$toString

boolean

string

Json

}

Examples

{ "$gte": -2 }

{ "$gte": 0 }

{ "$gte": "20180502" }

{
  "from": "products",
  "where": {
    "price": { "$gte": 2 }
  }
}

$lt

Operator to check if a field is less than a given value.

Format

{

"$lt":

integer

number

null

$toString

boolean

string

Json

}

Examples

{ "$lt": 4 }

{ "$lt": -12.1 }

{ "$lt": "20180502" }

{
  "from": "products",
  "where": {
    "price": { "$lt": 1.24 }
  }
}

$lte

Operator to check if a field is less than or equal to a given value.

Format

{

"$lte":

integer

number

null

$toString

boolean

string

Json

}

Examples

{ "$lte": 8 }

{ "$lte": 0 }

{ "$lte": "20180502" }

{
  "from": "products",
  "where": {
    "price": { "$lte": 1 }
  }
}

$has

Has operation checks whether the field has the specified feature.

$has is a low-level operation that operates at the feature level. The features can differ significantly from the original data, specifically in the case of text, when analyzers are used.

For example, if you have a field called content with the text "programmers and horses", the field would have the features 'programmer' and 'hors', which are the stems produced by the English analyzer.

Format

{

"$has":

integer

number

null

$toString

boolean

string

Json

}

Examples

{ "$has": "drink" }

{
  "from": "products",
  "where": {
    "tags": { "$has": "drink" }
  }
}

$defined

Operator to select rows based on whether a nullable field has been defined or not.

Format

{

"$defined": boolean

}

Example

{ "$defined": true }

$exists

An operator to get features of given field(s).

Format

{

// PropositionSet expression is used to describe a collection // of propositions.

"$exists": PropositionSet

}

Examples

{
  "$exists": ["query", "product.tags"]
}

{
  "from": "impressions",
  "where": {
    "$on": [
      {
        "$exists": ["query", "customer.tags"]
      },
      { "click": true }
    ]
  },
  "relate": ["product.title", "product.tags"]
}

Logical operators

Useful for combining multiple conditions in conditional queries.

$and

Performs a logical and operation on the given array containing two or more Propositions.

With non-inference queries (e.g., Search, Similarity), the $and operator guarantees that all propositions are met. For instance, the following search query:

{
  "from": "products",
  "where": {
    "$and": [
      { "description": "super slim laptop" },
      { "price": { "$gt" : 200 } }
    ]
  }
}

will only return products whose description is "super slim laptop" and whose price is greater than 200.

With inference queries (e.g., Predict, Match, Recommend), the $and operator does not guarantee that all propositions are met. For instance, with the following predict query:

{
  "from": "products",
  "where": {
    "$and": [
      { "description": "super slim laptop" },
      { "price": { "$gt" : 200 } }
    ]
  },
  "predict": "tags"
}

Aito might consider products with a price greater than 200 that do not match the description "super slim laptop", or products that match the description but do not meet the price condition. This is because there might not be enough data (e.g., not enough products in the price range) to make a reliable prediction.

To guarantee that all propositions are met in an inference query, refer to $atomic.

Format

{

"$and": [

Proposition

, ...]

}

Examples

{
  "$and": [
    { "$gt": 10 },
    { "$lt": 20 }
  ]
}

{
  "from": "products",
  "where": {
    "price": {
      "$and": [
        { "$gt": 1.5 },
        { "$lt": 2.1 }
      ]
    }
  }
}

$or

Performs a logical or operation on the given array containing two or more Propositions.

Format

{

"$or": [

Proposition

, ...]

}

Examples

{
  "$or": [
    { "tags": "cover" },
    { "tags": "laptop" }
  ]
}

{
  "from": "products",
  "where": {
    "price": {
      "$or": [
        { "$lt": 0.9 },
        { "$gt": 2.1 }
      ]
    }
  }
}

$not

Performs a logical not operation on the given Proposition.

Format

{

"$not": Proposition

}

Examples

{
  "$not": { "tags": "laptop" }
}

{
  "$not": { "$lt": 0 }
}

{
  "from": "products",
  "where": {
    "price": {
      "$not": { "$lt": 1.1 }
    }
  }
}
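For non-inference queries, where the logical operators act as exact filters, the field-level forms used in the examples of this section can be read as the following sketch. `matches` is a hypothetical helper that evaluates a proposition against a single field value; it covers only $gt, $gte, $lt, $lte, $and, $or and $not, and it does not model how inference queries weigh propositions statistically.

```python
import operator
from typing import Any, Dict

_COMPARISONS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(value: Any, proposition: Dict[str, Any]) -> bool:
    """Evaluate a field-level proposition against one value as an exact filter."""
    results = []
    for op, operand in proposition.items():
        if op in _COMPARISONS:
            results.append(_COMPARISONS[op](value, operand))
        elif op == "$and":
            results.append(all(matches(value, p) for p in operand))
        elif op == "$or":
            results.append(any(matches(value, p) for p in operand))
        elif op == "$not":
            results.append(not matches(value, operand))
        else:
            raise ValueError(f"unsupported operator: {op}")
    return all(results)
```

For instance, a price of 1.8 satisfies { "$and": [{ "$gt": 1.5 }, { "$lt": 2.1 }] }, and a price of 1.2 satisfies { "$not": { "$lt": 1.1 } }.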

Sort operators

Can be used in "orderBy" clause to declare the sorting order of the result.

$asc

Sort returned hits in ascending order (A-Z) based on the given attribute or custom scoring function.

Format

{

// Value expression resolves to a primitive like int or json, // score, probability or.

"$asc": Value

}

Examples

{ "$asc": "price" }

{ "$asc": "product.price" }

{
  "$asc": {
    "$multiply": ["product.price", "$p"]
  }
}

Referenced in

OrderBy

$asc(Relate)

Sort returned hits in ascending order (A-Z) based on the given attribute (or column).

Format

{

"$asc": string

}

Example

{ "$asc": "lift" }

Referenced in

RelateOrderBy

$desc

Sort returned hits in descending order (Z-A) based on the given attribute or custom scoring function.

Format

{

// Value expression resolves to a primitive like int or json, // score, probability or.

"$desc": Value

}

Examples

{ "$desc": "price" }

{ "$desc": "product.price" }

{
  "$desc": {
    "$multiply": ["product.price", "$p"]
  }
}

Referenced in

OrderBy

$desc(Relate)

Sort returned hits in descending (Z-A) order based on the given attribute (or column).

Format

{

"$desc": string

}

Example

{ "$desc": "info.miTrue" }

Referenced in

RelateOrderBy

Arithmetic operators

Can be used in conditional queries or scoring in "orderBy" clauses.

$mod

Operator to check if the value of a field divided by a divisor has the specified remainder.

In other words, it performs a modulo operation. The operator supports both an object form and an array form. Note that the field value is converted to an integer (effectively a math floor) before the modulo operation.
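The floor-then-modulo behavior can be sketched as follows. `mod_matches` is a hypothetical helper that accepts both the array form [divisor, remainder] and the object form; how Aito treats negative values is not specified here, so the sketch simply follows Python's semantics for them.

```python
import math
from typing import Any, Dict, Sequence, Union


def mod_matches(value: float, proposition: Union[Dict[str, Any], Sequence[int]]) -> bool:
    """Check whether floor(value) modulo the divisor equals the remainder."""
    if isinstance(proposition, dict):
        divisor, remainder = proposition["divisor"], proposition["remainder"]
    else:
        divisor, remainder = proposition
    # The value is floored to an integer before the modulo is taken.
    return math.floor(value) % divisor == remainder
```

A price of 4.7 therefore matches { "$mod": [2, 0] }, because it is floored to 4 first.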

Format

{

"$mod":

ModPropositionObject

ModPropositionArray

}

Examples

{
  "$mod": [2, 0]
}

{
  "$mod": { "divisor": 2, "remainder": 0 }
}

{
  "from": "products",
  "where": {
    "price": {
      "$mod": { "divisor": 2, "remainder": 0 }
    }
  }
}

$multiply

Multiplication operation of given items.

Format

{

"$multiply": [Score, ...]

}

Example

{
  "$multiply": ["price", 2]
}

$divide

Division operation.

Format

{

"$divide":

{

// Score expression resolves to a numeric score value or // probability.

"dividend": Score,

// Score expression resolves to a numeric score value or // probability.

"divisor": Score

}

[Score, Score]

}

Example

{
  "$divide": ["cost", 4]
}

$pow

Exponentiation operation. First item raised to the power of the second.

Format

{

"$pow":

ExponentPropositionObject

ExponentPropositionArray

}

Example

{
  "$pow": ["width", 2]
}

$sum

Calculates sum of given items.

Format

{

"$sum":

ContextValueQuery

[Score, ...]

}

Example

{
  "$sum": ["priceNet", "priceVat"]
}

$subtract

Subtraction operation.

Format

{

"$subtract":

{

// Score expression resolves to a numeric score value or // probability.

"minuend": Score,

// Score expression resolves to a numeric score value or // probability.

"subtrahend": Score

}

[Score, Score]

}

Example

{
  "$subtract": ["price", 2]
}
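To show how the arithmetic operators of this section compose, here is a small evaluator sketch. It is an illustration under assumptions, not Aito's scoring engine: `evaluate` is a hypothetical function, plain strings are treated as field names looked up in a row, built-in attributes such as "$p" are not handled, and $pow supports only the array form because the object form's keys are not listed above.

```python
import math
from typing import Any, Dict, Tuple, Union

Score = Union[int, float, str, Dict[str, Any]]


def _pair(args: Any, first: str, second: str) -> Tuple[Score, Score]:
    """Accept both the object form and the two-item array form."""
    if isinstance(args, dict):
        return args[first], args[second]
    left, right = args
    return left, right


def evaluate(score: Score, row: Dict[str, float]) -> float:
    if isinstance(score, (int, float)):
        return float(score)          # numeric literal
    if isinstance(score, str):
        return float(row[score])     # field name, looked up in the row
    (op, args), = score.items()      # exactly one operator per object
    if op == "$multiply":
        return math.prod(evaluate(a, row) for a in args)
    if op == "$sum":
        return sum(evaluate(a, row) for a in args)
    if op == "$divide":
        dividend, divisor = _pair(args, "dividend", "divisor")
        return evaluate(dividend, row) / evaluate(divisor, row)
    if op == "$subtract":
        minuend, subtrahend = _pair(args, "minuend", "subtrahend")
        return evaluate(minuend, row) - evaluate(subtrahend, row)
    if op == "$pow":
        base, exponent = args        # array form only
        return evaluate(base, row) ** evaluate(exponent, row)
    raise ValueError(f"unsupported operator: {op}")
```

Because every operand is itself a score expression, the operators nest: { "$multiply": [{ "$sum": ["priceNet", "priceVat"] }, 2] } is evaluated from the inside out.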

Advanced operators

More advanced operators which can improve query results in certain situations.

$atomic

Transforms a statement into a 'black box' proposition.

This prevents Aito from analyzing the proposition and using its parts separately in the statistical reasoning.

In practice, the difference between normal 'white box' expressions and $atomic's 'black box' expressions is that atomic expressions have a smaller bias but a higher measurement error.

Consider the following example:

{
  "tags": "pen",
  "price": { "$gte": 200 }
}

During the statistical reasoning, Aito may recognize that pens are often sold, and that purchases of products costing over 200€ are somewhat common. As a result, Aito might assume that a pen costing over 200€ is a popular product.

Now, consider the expression:

{
  "$atomic": {
    "tags": "pen",
    "price": { "$gte" : 200 }
  }
}

The results of this expression will depend on the amount of data. If there are no pens costing over 200€ in the data, Aito will make no assumptions about the proposition's effect. On the other hand, if the data does contain them, Aito will correctly recognize that pens costing over 200€ are bought extremely rarely.

Format

{

// Proposition expression describes a fact, or a statement.

"$atomic": Proposition

}

Examples

{
  "$atomic": {
    "tags": "pen",
    "price": { "$gte": 200 }
  }
}

{
  "from": "products",
  "where": {
    "$atomic": {
      "tags": "pen",
      "price": { "$gte": 200 }
    }
  }
}

$context

Provides ability to access the fields of the table specified in "from", instead of fields of the table in "get".

Format

{

// Proposition expression describes a fact, or a statement.

"$context": Proposition

}

Examples

{
  "$context": { "click": true }
}

{
  "from": "impressions",
  "where": { "customerEmail": "john.doe@aito.ai", "query": "laptop" },
  "get": "product",
  "orderBy": {
    "$p": {
      "$context": { "click": true }
    }
  }
}

$hit

Provides ability to access the fields of the hit.

Format

{

// Score expression resolves to a numeric score value or // probability.

"$hit": Score

}

Examples

{ "$hit": "price" }

{ "$hit": "$similarity" }

{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$multiply": [
      { "$hit": "$similarity" },
      { "$hit": "price" }
    ]
  }
}

$on

$on operator is used to define conditional propositions or hard filters.

This is useful when you have a limited amount of data and the condition helps to limit the context and provide better results. The operator takes a list of two items: the first object (or "prop") is the hypothesis and the second object (or "on") is the condition.

In Aito, the "where" clause contains propositions that aren't hard filters. Instead, Aito turns all the propositions into features (the user's ID, every word in a text field, etc.). There are many of these and they are not statistically independent. Aito picks the subset of these features that best predicts the field to be predicted. So what goes into the "where" clause is a description of the situation you're in, and Aito tells you what you should expect to find if you look in a field. The description is not taken at face value, however: Aito will ignore parts of it if they don't help the prediction.

When you need hard filtering instead, use the "$on" proposition. It is modeled after conditional probability and is divided into two parts: the normal "where" part and the conditional part (the "hard filters"). The "$on" parameters explained:

{
  "from": "...",
  "where": {
    "$on": [
      {
        "message": "hello, world",
        "something": true,
        // other things you put in your "where" clause
      },
      {
        // The subset of data that exactly matches these conditions
        "userId": 42,
        "day": "monday"
      }
    ]
  },
  "predict": "..."
}

The $on operator can also be combined with a normal query. If the $on condition is too strong, you can move parts of the filtering back to the "where" clause:

{
  "from": "...",
  "where": {
    "$on": [
      {
        "message": "hello, world",
        "something": true
        // plus any other things you put in your "where" clause
      },
      {
        // The subset of data that exactly matches these conditions
        "day": "monday"
      }
    ],
    "userId": 42
  },
  "predict": "..."
}

Format

{

"$on":

OnPropositionObject

OnPropositionArray

}

Examples

{
  "$on": {
    "prop": { "click": true },
    "on": { "user.tags": "nyc" }
  }
}

{
  "$on": [
    { "click": true },
    { "user.tags": "nyc" }
  ]
}
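Since $on is modeled after conditional probability, the examples above can be read as "the probability of click being true among the rows where user.tags is nyc". The sketch below illustrates that reading with plain frequency counting over hypothetical rows; `conditional_probability` is a made-up helper and this is not how Aito performs inference, it only shows the effect of restricting the data to the "on" subset first.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def _holds(row: Row, conditions: Row) -> bool:
    """True when the row matches every field value in conditions exactly."""
    return all(row.get(field) == value for field, value in conditions.items())


def conditional_probability(rows: List[Row], prop: Row, on: Row) -> Optional[float]:
    """Estimate p(prop | on) by counting within the rows that satisfy 'on'."""
    subset = [row for row in rows if _holds(row, on)]  # the hard filter
    if not subset:
        return None  # no data matches the condition
    return sum(_holds(row, prop) for row in subset) / len(subset)
```

A very strict "on" condition can leave few or no rows to count, which is the situation in which moving part of the condition back to the normal "where" clause helps.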

$knn

The $knn operator is an adaptation of the classic k-nearest neighbor algorithm.

Aito's $knn operator identifies the k rows most similar to the conditions defined in the 'near' parameter. The similarity metric is the same one used in the similarity query. The k nearest rows can then be used in inference.

The $knn operator can be useful in situations where there is no training data. Consider, for example, the following query:

{
  "from": "impressions",
  "where": {
    "product.name": "Columbian Coffee",
    "product.tags": "high quality coffee"
  },
  "predict": "purchase"
}

This query would not yield sensible results, since no such product exists in the current data. This can be improved by using the $knn operator:

{
  "from": "impressions",
  "where": {
    "$knn": {
      "k": 5,
      "near": {
        "product.name": "Columbian Coffee",
        "product.tags": "high quality coffee"
      }
    }
  },
  "predict": "purchase"
}

In the query above, Aito would look for the 5 entries that are most similar to the criteria given in "near" and use them for inference.
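
The idea of picking the k most similar rows can be sketched in Python as below. The similarity used here is a plain word-overlap (Jaccard) score chosen only for illustration; it is an assumption and not the metric Aito uses, and the product names and helper functions are made up.

```python
import heapq


def tokens(text: str) -> set[str]:
    """Lower-cased whitespace tokens of a text."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity in [0, 1]; a stand-in for a real metric."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def k_nearest(rows: list[str], near: str, k: int) -> list[str]:
    """The k rows with the highest similarity to `near`, best first."""
    return heapq.nlargest(k, rows, key=lambda row: jaccard(row, near))


# Hypothetical product names.
PRODUCTS = ["dark roast coffee", "instant coffee", "green tea", "rye bread"]
nearest = k_nearest(PRODUCTS, near="Columbian Coffee", k=2)
print(nearest)
```

Even though no product is called "Columbian Coffee", the rows that share the word "coffee" are selected and could serve as the basis for inference.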

Format

{

"$knn":

KnnPropositionObject

KnnPropositionArray

}

Examples

{
  "$knn": [
    4,
    { "tags": "laptop" }
  ]
}

{
  "$knn": {
    "k": 4,
    "near": { "tags": "laptop" }
  }
}

$nn

The $nn operator is similar to the classic k-nearest neighbor algorithm, except that it matches a dynamic number of entries that are roughly the same as the specified proposition.

Aito's $nn operator identifies all rows that are roughly the same as the conditions defined in the 'near' parameter. This group of rows can be used in inference. This rough sameness is based on the same score that $sameness uses, and you can inspect the score of the matching values with the following query:

{
  "from": "rfps",
  "where": {
    "$nn": [{
      "question": "Does your company comply to ISO 27001?"
    }]
  },
  "orderBy": "$sameness"
}

$nn also accepts a threshold parameter that can be used to make matching stricter or looser, like here:

{
  "from": "rfps",
  "where": {
    "$nn": [{
      "question": "Does your company comply to ISO 27001?"
    }, 0.5]
  },
  "orderBy": "$sameness"
}

The default threshold is 1.0.

One example of using $nn in inference is a question answering setting, where there is a desire to avoid the false positives present in classic $knn or the default Bayesian inference.

An example of using $nn for answering an RFP question is the following:

{
  "from": "rfps",
  "where": {
    "question": {
      "$nn": ["Does your company comply to ISO 27001?"]
    }
  },
  "predict": "answer"
}

This specific question will match similar questions in the database. Still, because it may also match questions like 'Does your company comply with ISO 9001?', it makes sense to also use the question's individual features (such as 'ISO 27001') as conditional features in the inference, like this:

{
  "from": "rfps",
  "where": {
    "question": {
      "$on": [
        "Does your company comply to ISO 27001?",
        {"$nn": ["Does your company comply to ISO 27001?"]}
      ],
      "$nn": ["Does your company comply to ISO 27001?"]
    }
  },
  "predict": "answer"
}

In this example, $nn identifies a group of similar questions and uses these in the inference. At the same time, the $on structure allows the inference to see e.g. ISO 27001 as a separate feature inside this group. In this way, the system can focus on similar questions while using individual features like ISO 27001 to infer the right answer.

Format

{

"$nn":

NnPropositionObject

NnPropositionArray

}

Examples

{
  "$nn": [
    { "tags": "laptop" }
  ]
}

{
  "$nn": {
    "near": { "tags": "laptop" }
  }
}

$numeric

Operator to check whether a numeric field fuzzily matches a given number.

By default, numbers are compared exactly against one another. The $numeric proposition signifies that comparisons should be inexact and that the target is somewhere close to the specified number. The size of the region depends on the spread and density of the data.

Format

{

"$numeric":

integer

number

null

}

Examples

{ "$numeric": 42 }

{ "$numeric": 3.14 }

$hash

$hash converts the field value into a hash integer.

The hash code can be used to split non-integer data pseudo-randomly in the evaluate query.
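
The splitting idea can be sketched in Python as follows. The hash function below (MD5 via hashlib) is only a stand-in, since the hash Aito uses is not specified here. The sketch also assumes that a pair such as [2, 1] means "divisor 2, remainder 1"; the helper names and sample values are hypothetical.

```python
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic integer hash of a string (illustrative stand-in)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def in_split(value: str, divisor: int, remainder: int) -> bool:
    """True when hash(value) mod divisor equals remainder."""
    return stable_hash(value) % divisor == remainder


# Hypothetical non-integer identifiers.
users = ["alice", "bob", "carol", "dave", "erin", "frank"]
test_rows = [u for u in users if in_split(u, 2, 1)]
train_rows = [u for u in users if not in_split(u, 2, 1)]
print(test_rows, train_rows)
```

Because the hash is deterministic, the same value always lands in the same split, which is what makes the split repeatable between runs.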

Format

{

"$hash":

}

Example

{
  "$hash": {
    "$mod": [2, 1]
  }
}

$toString

The $toString operator is used to convert a numeric value to a string.

This is useful when you want to use a numeric value as input for an operator or a field that requires text input. For example:

{
  "description": {
    "$match": {
      "$toString": { "$get": "id" } 
    }
  }
}

Format

{

"$toString":

integer

number

null

$toString

boolean

string

Json

}

Example

{ "$toString": 4 }

Scoring operators

Can be used in the "orderBy" clause to sort results or to build an advanced scoring algorithm.

$p

"$p" can be used in the "orderBy" clause of the Generic query to get the most probable values. When used this way, it is similar to the Match query.

In the grocery dataset, running the following query would yield the products with a name similar to "lactose" that have the highest probability of being purchased:

{
  "from": "impressions",
  "where": {
    "product.name": {"$match": "lactose"}
  },
  "get": "purchase",
  "orderBy": "$p"
}

Similar to the Match query, running the following query would yield the most likely product based on all the fields of the linked product table:

{
  "from": "impressions",
  "where": {
    "context.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": "$p"
}

Since the product field in the impressions table is linked to the products table, Aito finds all the statistical relations between what is declared inside the "where" clause and the features of every field of a product, that is, id, name, category, price and tag. In this case, the probability score is the normalized product of the lifts of each field's features. We can investigate this by opening up the explanation, which is done by adding the $why operator to the "select" clause (e.g. "select": ["$score", "$why"]):

"$why": {
  "type": "product",
  "factors": [
    {
      "type": "hitPropositionLift",
      "proposition": { "id" : 6410405093677 },
      "value": 1.9827806375460209,
      "factors": [
        {
          "type": "relatedPropositionLift",
          "proposition": { "purchase" : true },
          "value": 1.9827806375460209
        }
      ]
    },
    {
      "type": "hitPropositionLift",
      "proposition": { "$not" : { "name" : { "$has": "puikula" } } },
      "value": 1.0472308585357502,
      "factors": [
        {
          "type": "relatedPropositionLift",
          "proposition": { "purchase" : true },
          "value": 1.0472308585357502
        }
      ]
    }
    ...
  ]
}

We can see that the probability score is composed of the lift of an id feature, the lift of a name feature, and others.
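
To make the composition concrete, the Python sketch below multiplies the top-level factor values of a $why explanation shaped like the excerpt above. Only the two factors visible in the excerpt are included, so the result is a partial product rather than the full score, and the normalization step is left out because its details are not given here. The helper name `product_of_lifts` is hypothetical.

```python
import math

# The two top-level factors visible in the $why excerpt; the real
# explanation contains more factors, which are elided there.
why = {
    "type": "product",
    "factors": [
        {"type": "hitPropositionLift", "value": 1.9827806375460209},
        {"type": "hitPropositionLift", "value": 1.0472308585357502},
    ],
}


def product_of_lifts(explanation: dict) -> float:
    """Multiply the values of the top-level factors of a 'product' node."""
    if explanation["type"] != "product":
        raise ValueError("expected a 'product' explanation node")
    return math.prod(factor["value"] for factor in explanation["factors"])


partial_lift = product_of_lifts(why)
print(partial_lift)
```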

$f

"$f" can be used in the "orderBy" clause of the Generic query to get the frequency of a feature.

Format

string

Examples

"$f"

{ "from": "impressions", "get": "product", "orderBy": "$f" }

$lift

"$lift" can be used in the "orderBy" clause of the Generic query to get the most likely values based on lifts of features with regard to other features.

In the grocery dataset, running the following query would yield the products with a name similar to "lactose" that have the highest lift of being purchased:

{
  "from": "impressions",
  "where": {
    "product.name": {"$match": "lactose"}
  },
  "get": "purchase",
  "orderBy": "$lift"
}

Running the following query would yield the most likely product based on all the fields of the linked product table:

{
  "from": "impressions",
  "where": {
    "context.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": "$lift"
}

Since the product field in the impressions table is linked to the products table, Aito finds all the statistical relations between what is declared inside the "where" clause and the features of every field of a product, that is, id, name, category, price and tag. In this case, the lift score is the product of the lifts of each field's features. We can investigate this by opening up the explanation, which is done by adding the $why operator to the "select" clause (e.g. "select": ["$score", "$why"]):

"$why": {
  "type": "product",
  "factors": [
    {
      "type": "hitPropositionLift",
      "proposition": { "id" : 6410405093677 },
      "value": 1.9827806375460209,
      "factors": [
        {
          "type": "relatedPropositionLift",
          "proposition": { "purchase" : true },
          "value": 1.9827806375460209
        }
      ]
    },
    {
      "type": "hitPropositionLift",
      "proposition": { "$not": { "name" : {"$has": "puikula" } } },
      "value": 1.0472308585357502,
      "factors": [
        {
          "type": "relatedPropositionLift",
          "proposition" : { "purchase" : true },
          "value": 1.0472308585357502
        }
      ]
    }
    ...
  ]
}

We can see that the lift score is composed of the lift of an id feature, the lift of a name feature, and others.

$similarity

"$similarity" can be used in the Generic query to get the most similar rows based on the contents of the "where" clause.

Consider the following example. It returns all the products that contain 'iphone' in the title. It also sorts the results by their similarity to 'iphone' and highlights the 'iphone' term in the product title field.

{
  "from": "product",
  "where": { "title": { "$match": "iphone" } },
  "get": "message",
  "orderBy": "$similarity",
  "select": ["title", "$highlight"]
}

Format

string

Examples

"$similarity"

{
  "from": "product",
  "get": "message",
  "orderBy": "$similarity",
  "where": {
    "title": { "$match": "iphone" }
  }
}

$sameness

"$sameness" can be used in the Generic query to get the rows that are most nearly the same as the contents of the "where" clause.

Consider the following example. It returns all the questions that are roughly the same as 'How can I order a sim card?', ordered by how closely they match the question.

{
  "from": "questions",
  "where": { "title": { "$nn": ["How can I order a sim card?"] } },
  "get": "message",
  "orderBy": "$sameness"
}

Format

string

Examples

"$sameness"

{
  "from": "product",
  "orderBy": "$sameness",
  "where": {
    "title": { "$match": "iphone" }
  }
}

$hit

Provides the ability to access the fields of the hit.

Format

{

// Score expression resolves to a numeric score value or // probability.

"$hit": Score

}

Examples

{ "$hit": "price" }

{ "$hit": "$similarity" }

{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$multiply": [
      { "$hit": "$similarity" },
      { "$hit": "price" }
    ]
  }
}

$p object

Conceptually similar to the plain $p operator, but allows using a customized proposition for the probability score calculation.

This $p operator offers more options for customizing the probability score calculation, especially when getting the values of a linked table:

Narrow down the fields that are used to calculate the probability:

This is similar to the behavior of the "basedOn" clause of the Match query.

When calculating the probability of a linked field, Aito uses all the fields of the linked table (see $p for how the probability is calculated for a linked field).

If you would like to narrow down how the probability is calculated, you can add the field name after the $p. For example, to find the most likely product based only on the product name:

{
  "from": "impressions",
  "where": {
    "context.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": {
    "$p": "name"
  }
}

You can also calculate the probability based on multiple fields by using the array format. For instance:

{
  "$p": ["category", "tag"]
}

Calculate the probability based on a specific context:

This is similar to the behavior of the Recommend query.

By combining $p with the $context operator, the probability score can be defined as the probability of a context. For instance, to find the products with the highest probability of being purchased:

{
  "from": "impressions",
  "where": {
    "context.user": "bob"
  },
  "get": "product",
  "orderBy": {
    "$p": {"$context": {"purchase": true}}
  }
}

Format

{

// PropositionSet expression is used to describe a collection // of propositions.

"$p": PropositionSet

}

Examples

{ "$p": "tags" }

{
  "$p": ["tags", "title"]
}

{
  "$p": {
    "$context": { "click": true }
  }
}

{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": { "$p": "tags" }
}

{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$p": ["tags", "title"]
  }
}

{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$p": {
      "$context": { "click": true }
    }
  }
}

$probability

Declares the probability of the given proposition(s). This is a more configurable version of the $p operation.

Note: the basedOn field can be used to reduce noise in recommendations. basedOn contains the list of fields in the hit table that are used in the probability calculation. If basedOn is not defined, all fields are used, including the noisy ones.

Used in "orderBy" to sort the results by probability in descending order.

Format

{

"$probability": {

// PropositionSet expression is used to describe a collection // of propositions.

"of": PropositionSet,

// PropositionSet expression is used to describe a collection // of propositions.

"basedOn": PropositionSet

}

}

Examples

{
  "$probability": { "of": "tags" }
}

{
  "$probability": {
    "of": ["tags", "title"]
  }
}

{
  "$probability": {
    "of": {
      "$context": { "click": true }
    },
    "basedOn": ["title", "tags", "category"]
  }
}

{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$probability": { "of": "tags" }
  }
}

$lift object

Conceptually similar to the plain $lift operator, but allows using a customized proposition for the lift score calculation.

This $lift operator offers more options for customizing the lift score calculation, especially when getting the values of a linked table.

Narrow down the fields that are used to calculate the lift:

This is similar to the behavior of the "basedOn" clause of the Match query.

When calculating the lift of a linked field, Aito uses all the fields of the linked table (see $lift for how the lift is calculated for a linked field).

If you would like to narrow down how the lift is calculated, you can add the field name after the $lift. For example, to find the most likely product based only on the product name:

{
  "from": "impressions",
  "where": {
    "context.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": {
    "$lift": "name"
  }
}

You can also calculate the lift based on multiple fields by using the array format. For instance:

{
  "$lift": ["category", "tag"]
}

Calculate the lift based on a specific context:

This is similar to the behavior of the Recommend query.

By combining $lift with the $context operator, the lift score can be defined as the lift of a context. For instance, to find the products with the highest lift of being purchased:

{
  "from": "impressions",
  "where": {
    "context.user": "bob"
  },
  "get": "product",
  "orderBy": {
    "$lift": {"$context": {"purchase": true}}
  }
}

Format

{

// PropositionSet expression is used to describe a collection // of propositions.

"$lift": PropositionSet

}

Examples

{ "$lift": "tags" }

{
  "$lift": ["tags", "title"]
}

{
  "$lift": {
    "$context": { "click": true }
  }
}

{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": { "$lift": "tags" }
}

{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$lift": ["tags", "title"]
  }
}

{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$lift": {
      "$context": { "click": true }
    }
  }
}

$probabilityLift

Declares the lift of the given proposition(s). This is a more configurable version of $lift.

NOTE: This can be used if recommendations pick up noise from irrelevant fields. Overall, using only relevant fields often makes inference both more accurate and faster.

Used in "orderBy" to sort the results by lift in descending order.

Format

{

"$probabilityLift": {

// PropositionSet expression is used to describe a collection // of propositions.

"of": PropositionSet,

// PropositionSet expression is used to describe a collection // of propositions.

"basedOn": PropositionSet

}

}

Examples

{
  "$probabilityLift": { "of": "tags" }
}

{
  "$probabilityLift": {
    "of": ["tags", "title"]
  }
}

{
  "$probabilityLift": {
    "of": {
      "$context": { "click": true }
    },
    "basedOn": ["title", "tags", "category"]
  }
}

{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$probabilityLift": { "of": "tags" }
  }
}

$mean

The mean of the given value in the context.

Example 1:

    {
      "$mean": {
        "$context": "score"
      }
    }

Format

{

// This is used to refer to the fields inside the 'context' // table as contrast to referring fields.

"$mean": ContextValueQuery

}

Example

{
  "$mean": { "$context": "score" }
}

$freqP

An empirical frequency-based probability estimate.

$freqP allows you to specify two fields, called f (successes) and n (trials), and an additional field called p (prior probability).

$freqP calculates a probability estimate for each row based on the f (successes), n (trials) and p (prior probability) fields. The prior probability is the probability of success before any data is observed; it is 0.5 by default.

$freqP also comes with variance information, which means that it can be used together with the $decision operation to solve the multi-armed bandit problem.
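
How f, n and p can interact is sketched below in Python using a textbook pseudo-count estimate, in which the prior is treated as a small number of imaginary trials. This is only one common way to blend a prior with observed frequencies; whether Aito uses this formula is not stated here, and the function name `smoothed_rate` and the `prior_weight` parameter are assumptions made for the illustration.

```python
def smoothed_rate(f: int, n: int, p: float = 0.5, prior_weight: float = 2.0) -> float:
    """Success rate estimate that treats the prior as `prior_weight` pseudo-trials."""
    if n < 0 or not 0 <= f <= n:
        raise ValueError("expected 0 <= f <= n")
    return (f + p * prior_weight) / (n + prior_weight)


# With no data the estimate equals the prior; with more trials the
# observed frequency f / n dominates.
print(smoothed_rate(0, 0))
print(smoothed_rate(3, 10))
print(smoothed_rate(300, 1000))
```

In this sketch a row with very few trials stays close to the prior, which keeps rarely shown options from receiving extreme estimates.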

Format

{

"$freqP": {

// Value expression resolves to a primitive like int or json, // score, probability or.

"f": Value,

// Value expression resolves to a primitive like int or json, // score, probability or.

"n": Value,

// Value expression resolves to a primitive like int or json, // score, probability or.

"p": Value

}

}

Example

{
  "$freqP": { "f": "clickCount", "n": "impressionCount", "p": 0.5 }
}

$decision

The $decision feature exists to solve the multi-armed bandit problem, where:

There are decision options with uncertain reward probabilities
More data must be gathered from the options to better estimate the reward probabilities

This problem arises classically in advertising, where the click-through rate of the ads is uncertain. In such a case, one needs to find a trade-off between exploring new ads and exploiting the best ads.

The $decision operation returns a 'decision' score, which can be used to randomly order the options and select the first one(s). It implements a Bayesian solution in which the decision score is sampled from a normal distribution that uses the mean and standard error of the given score.

NOTE: $decision only works with scores that carry inherent variance information. Currently this variance is available only for numerical estimates calculated with $sum and $freqP. $decision is not yet supported for e.g. $p or basic probability calculation.
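
The sampling step described above can be sketched in Python as follows: each option gets a score drawn from a normal distribution with the option's mean and standard error, and the options are then ordered by the sampled score. The ad names, the numbers and the helper `decision_scores` are made up for illustration; this is a sketch of the general idea, not Aito's implementation.

```python
import random


def decision_scores(
    options: dict[str, tuple[float, float]], seed: int
) -> dict[str, float]:
    """Sample one score per option from Normal(mean, standard_error)."""
    rng = random.Random(seed)
    return {
        name: rng.gauss(mean, standard_error)
        for name, (mean, standard_error) in options.items()
    }


# Hypothetical ads as (mean click-through rate, standard error).
ADS = {
    "established_ad": (0.050, 0.002),  # much data, little uncertainty
    "new_ad": (0.040, 0.030),          # little data, large uncertainty
}

scores = decision_scores(ADS, seed=0)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because the uncertain option has a wide distribution, it is ranked first for some seeds and not for others, which is how exploration and exploitation get balanced. A fixed seed makes the ordering reproducible.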

Format

{

"$decision": {

// Score expression resolves to a numeric score value or // probability.

"score": Score,

"seed": integer

}

}

Example

{
  "$decision": {
    "score": {
      "$mean": { "$context": "clickThroughRate" }
    },
    "seed": 0
  }
}

$similarity object

Conceptually similar to the plain $similarity operator, but allows using a customized proposition for the similarity score calculation.

The plain $similarity operator calculates the similarity score based on the "where" clause contents, whereas this $similarity operator calculates the similarity score based on the given proposition.

These Generic Queries would yield the same results:

{
  "from": "products",
  "where": {
    "name": {"$match": "coffee"}
  },
  "orderBy": "$similarity"
}

{
  "from": "products",
  "orderBy": {
    "$similarity": {
      "name": "coffee"
    }
  }
}

This $similarity operation is useful for customizing scoring, as in the example below. See also the Generic query example with custom scoring.

{
  "from": "impressions",
  "where": {
    "context.user": "veronica"
  },
  "get": "product",
  "orderBy": {
    "$multiply": [
      {
        "$p": {
          "$context": {
            "purchase": true
          }
        }
      },
      {
        "$similarity": {
          "name": "coffee"
        }
      }
    ]
  }
}

Format

{

"$similarity":

}

Examples

{
  "$similarity": { "title": "apple iphone", "tags": "premium ios phone" }
}

{
  "from": "products",
  "orderBy": {
    "$similarity": { "title": "apple iphone", "tags": "premium ios phone" }
  }
}

$sameness object

This operator provides a score that reflects whether the returned entry is roughly the same as the given proposition / data.

E.g. the phrase "How to order a sim card?" should be judged roughly the same as "How can I order a sim card?". Values above 1 are considered roughly the same, and values under 1 are considered roughly distinct.

$sameness works in a similar way to $similarity, except that it matches more strictly. E.g. the query "sim card" gives a similarity score above 1 with "How can I order sim card?", but a sameness score significantly under 1. $sameness works better in situations where more restrictive scoring is needed to avoid false matches. An example is question answering, where questions must be matched more strictly in order to avoid false positives.

Format

{

"$sameness":

}

Examples

{
  "$sameness": { "title": "apple iphone", "tags": "premium ios phone" }
}

{
  "from": "products",
  "orderBy": {
    "$sameness": { "title": "apple iphone", "tags": "premium ios phone" }
  }
}

$analogy object

This operator provides a score that reflects whether the returned entry is analogous to the given value with respect to some other values.

E.g. the query word 'cheap' should have a high $analogy score with words like 'affordable' and 'inexpensive' with respect to the correct query result.

E.g. the text "Netflix has great series" should have a high $analogy score with a text like 'There are good movies' with respect to sentiment or feedback category.

Format

{

"$analogy": {

"with":

integer

number

null

$toString

boolean

string

Json

// PropositionSet expression is used to describe a collection // of propositions.

"basedOn": PropositionSet

}

}

Examples

{
  "$analogy": { "with": "veronica", "basedOn": "purchases" }
}

{
  "from": "products",
  "get": "title",
  "orderBy": {
    "$analogy": { "with": "ideapad", "basedOn": "tags" }
  }
}

$normalize

The $normalize operator can be used in the "orderBy" clause of the Generic query to make the scores sum to 1. For example, you can normalize the $lift or the $lift object to 1:

{
  "from": "impressions",
  "where": {
    "product.name": {"$match": "lactose"}
  },
  "get": "purchase",
  "orderBy": {
    "$normalize": "$lift"
  }
}

{
  "from": "impressions",
  "where": {
    "context.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": {
    "$normalize": {
      "$lift": { "$context": { "click": true } }
    }
  }
}
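
What "sum to 1" means can be sketched in Python as below, assuming that normalization is a plain division of each score by the total. That reading, the helper name `normalize` and the sample lift values are assumptions made for this illustration.

```python
def normalize(scores: dict[str, float]) -> dict[str, float]:
    """Scale non-negative scores so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {key: value / total for key, value in scores.items()}


# Hypothetical lift scores for the two values of "purchase".
lifts = {"true": 1.5, "false": 0.5}
normalized = normalize(lifts)
print(normalized)
```

After normalization the relative order of the scores is unchanged, but the values can be read as shares of a whole.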

Format

{

// Score expression resolves to a numeric score value or // probability.

"$normalize": Score

}

Examples

{ "$normalize": "$lift" }

{
  "$normalize": { "$lift": "name" }
}

$impact

Impact of the scored value (such as a product) on a value in the context (such as a star rating), relative to a base. If no base is given, the average of the value is used instead.

Format

{

"$impact": {

// This is used to refer to the fields inside the 'context' // table as contrast to referring fields.

"value": ContextValueQuery,

"base": number

}

}

Examples

{
  "$impact": {
    "value": { "$context": "score" },
    "base": 2.5
  }
}

{
  "from": "projects",
  "get": "customer.industry",
  "orderBy": {
    "$impact": {
      "value": { "$context": "profit" }
    }
  }
}

$f (context)

The frequency / count of the given proposition in context.

Format

{

"$f": ContextPropositionQuery

}

Examples

{
  "$f": {
    "$context": { "click": false }
  }
}

{
  "from": "impressions",
  "get": "product",
  "orderBy": {
    "$f": {
      "$context": { "click": true }
    }
  }
}

Aggregate operators

Can be used in the aggregate operation.

$mean

Mean or average. Can be used to calculate the average of a field. Similar to the SQL AVG function.

Format

{

// Score expression resolves to a numeric score value or // probability.

"$mean": Score

}

Examples

{ "$mean": "ctr" }

{
  "$mean": {
    "$freqP": { "f": "clicks", "n": "impressions" }
  }
}

$sum

Sum. Can be used to calculate the sum of the field.

Format

{

// Score expression resolves to a numeric score value or // probability.

"$sum": Score

}

Examples

{ "$sum": "visits" }

{ "$sum": "purchase" }

$f (aggregate)

Counts the number of cases in which the proposition / condition is true.

Format

{

// Proposition expression describes a fact, or a statement.

"$f": Proposition

}

Examples

{
  "$f": { "click": false }
}

{
  "$f": {
    "price": { "$gt": 100 }
  }
}

NamedAggregateProjection

Allows one to name the aggregate projection results

Format

object

Examples

{
  "aggregate": {
    "clicks": {
      "$f": { "click": true }
    },
    "ctr": { "$mean": "click" },
    "impressions": "$f"
  },
  "from": "impressions",
  "where": { "product.name": "MyProduct" }
}

{
  "failureCount": {
    "$f": { "success": false }
  }
}
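
To relate the named aggregates in the first example to plain counting, the Python sketch below computes the same three names over a few made-up rows. It assumes that "$f" with a proposition counts the matching rows, that plain "$f" counts all selected rows, and that $mean over a boolean field treats true as 1 and false as 0; the last point in particular is an assumption made for this illustration.

```python
# Made-up impression rows; not real data.
ROWS = [
    {"product.name": "MyProduct", "click": True},
    {"product.name": "MyProduct", "click": False},
    {"product.name": "MyProduct", "click": True},
    {"product.name": "OtherProduct", "click": True},
]

# "where": keep only the rows of the product of interest.
selected = [row for row in ROWS if row["product.name"] == "MyProduct"]
if not selected:
    raise ValueError("no rows match the filter")

aggregate = {
    # Count of selected rows where click is true.
    "clicks": sum(1 for row in selected if row["click"]),
    # Mean of click, counting True as 1 and False as 0.
    "ctr": sum(1 for row in selected if row["click"]) / len(selected),
    # Count of all selected rows.
    "impressions": len(selected),
}
print(aggregate)
```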

Projection operators

Can be used as operators in the "select" clause.

$index (projection)

$index selects the insertion index of a row. It can be used together with $mod to select parts of a table. This is useful, for example, in the Evaluate query for selecting training or test data.

Format

string

Example

"$index"

$why (object)

Configurable explanations. Allows providing explanation-specific highlights and matching.

Note: $why will also contain highlights / matches for the where clause.

Format

{

"$why": {

// # HighlightParams.

"highlight": HighlightParams,

"matches": object

}

}

Example

{
  "$why": {
    "highlight": {
      "posPreTag": "<b>",
      "posPostTag": "</b>",
      "negPreTag": "<em>",
      "negPostTag": "</em>",
      "encoder": "html",
      "posThreshold": 1.1,
      "negThreshold": 0.9
    },
    "matches": {  }
  }
}

$highlight

Sometimes there is a need to highlight fields, or individual words in fields, of the examined hit. This can happen e.g. when the user is searching for a specific term. Highlighting can also be used in inference situations. E.g. when providing personalized search results for a user with lactose intolerance, it can be useful to highlight both 'lactose-free' as a positive factor and 'lactose' as a negative one.

For each hit, $highlight provides the fields with HTML escaping and with positive and negative details highlighted.

  {
    "from" : "impressions",
    "where" : {
      "context.query" : "bread"
    },
    "recommend" : "product",
    "goal"      : {"purchase": true},
    "select"    : ["$p", "name", "$highlight"],
    "limit"     : 1
  }

This query produces the following result, which contains a $highlight field:

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$p": 0.5967894477818738,
      "name": "VAASAN Ruispalat 660g 12 pcs fullcorn rye bread",
      "$highlight": [
        {
          "score": 2.5780035230413474,
          "field": "name",
          "highlight": "VAASAN Ruispalat 660g 12 pcs fullcorn rye <font color=\"green\">bread</font>"
        },
        {
          "score": 2.3703213948674122,
          "field": "tags",
          "highlight": "<font color=\"green\">gluten</font> bread"
        },
        {
          "score": 4.9484160228241825,
          "field": "$context.context.query",
          "highlight": "<font color=\"green\">bread</font>"
        }
      ]
    }
  ]
}

The $highlight field contains a list of highlights. Each highlight gives its relative importance ("score"), the field that was relevant for the hit ("field"), and the field content, highlighted and HTML-encoded ("highlight"). Note that the content of the 'where' clause is also highlighted, under the $context. prefix.

Format

string

Example

"$highlight"

$highlight (object)

Parametric highlight makes it possible to highlight, in a more configurable way, the most important parts of the query that contributed to the score. This is useful for debugging and for understanding the scoring process.

The highlight is returned as a list of strings, where each string represents a part of the query that contributed to the score. The strings are ordered by importance, with the most important part first.

The highlight is returned only if the query contains at least one term that contributed to the score.

NOTE: highlight also highlights the fields in the 'where' clause.

Parametric highlight lets the user configure the HTML tags for positive and negative highlights. The highlight sensitivity can also be configured, i.e. how positive or negative the score has to be to trigger a highlight. Highlighted fields are HTML-encoded by default, but this can be disabled via a parameter.

Format

{

"$highlight": {

"posPreTag": string,

"posPostTag": string,

"negPreTag": string,

"negPostTag": string,

"posThreshold": number,

"negThreshold": number,

"encoder": string

}

}

Example

{
  "$highlight": {
    "posPreTag": "<b>",
    "posPostTag": "</b>",
    "negPreTag": "<em>",
    "negPostTag": "</em>",
    "encoder": "html"
  }
}

$matches

For each hit, $matches provides per-field information about how the field is related to the hit score. It is worth emphasizing that $matches is based on the hit score, which means that the matches contain only the features that have contributed to the hit score. So if the score is a probability, $matches contains the items that contributed to the probability. If the score is based on similarity, $matches contains the features specified inside the similarity query. This means that e.g. $match operations in the where clause are not included in $matches.

Consider the following recommend query:

  {
    "from" : "impressions",
    "where" : {
      "context.user": "larry"
    },
    "recommend" : "product",
    "goal"      : {"purchase": true},
    "select"    : ["$score", "name", "$matches"],
    "limit"     : 1
  }

This query produces the following result, which contains a $matches field:

{
    "offset" : 0,
    "total" : 42,
    "hits" : [ {
      "$score" : 0.1346800331844607,
      "name" : "Vaasan Ruispalat thin sliced rye bread 6pcs/195g",
      "$matches" : {
        "name" : [ {
          "begin" : 33,
          "end" : 38,
          "feature" : "bread",
          "text" : "bread",
          "why" : {
            "type" : "product",
            "factors" : [ {
              "type" : "hitPropositionLift",
              "proposition" : {
                "name" : {
                  "$has" : "bread"
                }
              },
              "value" : 2.658334404390319,
              "factors" : [ {
                "type" : "relatedPropositionLift",
                "proposition" : {
                  "context.user" : {
                    "$has" : "larry"
                  }
                },
                "value" : 2.658334404390319
              } ]
            } ]
          },
          "score" : 2.658334404390319
        } ],
        "tags" : [ {
          "begin" : 0,
          "end" : 6,
          "feature" : "gluten",
          "text" : "gluten",
          "why" : {
            "type" : "product",
            "factors" : [ {
              "type" : "hitPropositionLift",
              "proposition" : {
                "tags" : {
                  "$has" : "gluten"
                }
              },
              "value" : 2.184491633564294,
              "factors" : [ {
                "type" : "relatedPropositionLift",
                "proposition" : {
                  "context.user" : {
                    "$has" : "larry"
                  }
                },
                "value" : 2.184491633564294
              } ]
            } ]
          },
          "score" : 2.184491633564294
        } ]
      }
    } ]
  }

The $matches contains a separate match list for the 'name' and 'tags' fields. Both fields have one match. The 'name' field contains the match 'bread', because larry seems to purchase bread about 2.66 times more often than the average user. Both matches contain the begin and end positions, which identify the location of the match in the field's text. These, alongside the text field that contains the original matching text, can be used to highlight or bold the matching statement in a UI. The why field contains the explanation for the match, and it traces the score back to the user variable.

Feature field contains the matched feature, and it can be used inside $has proposition to find other content with the same feature.
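
As an illustration of how the begin and end positions could drive highlighting on the client side, here is a minimal Python sketch. It assumes that the positions are character offsets with an exclusive end, which is consistent with the 'bread' match above but is not stated explicitly in this reference. The function name highlight_field and the default tags are hypothetical and not part of the Aito API.

```python
from html import escape


def highlight_field(text, matches, pre_tag="<b>", post_tag="</b>"):
    """Wrap each matched span of `text` in tags and HTML-escape the text itself."""
    parts = []
    cursor = 0
    # Walk the matches from left to right; skip any match overlapping an earlier one.
    for match in sorted(matches, key=lambda m: m["begin"]):
        begin, end = match["begin"], match["end"]
        if begin < cursor:
            continue
        parts.append(escape(text[cursor:begin]))
        parts.append(pre_tag + escape(text[begin:end]) + post_tag)
        cursor = end
    parts.append(escape(text[cursor:]))
    return "".join(parts)


# Values copied from the 'name' match in the response above.
name = "Vaasan Ruispalat thin sliced rye bread 6pcs/195g"
name_matches = [{"begin": 33, "end": 38, "feature": "bread", "text": "bread"}]

highlighted = highlight_field(name, name_matches)
print(highlighted)
```

The text is escaped before the tags are added, so the tags themselves survive when the result is rendered as HTML.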

Format

string

Example

"$matches"

Referenced in

$matches (object)

A version of $matches that accepts parameters and could be configured, although it does not yet have any configuration parameters.

NOTE: The matches will also match the fields in the 'where' clause.

Format

{

"$matches": object

}

Example

{
  "$matches": {  }
}

Referenced in

NamedProjection

Named projection makes it possible to name the projected values.

This is useful for e.g. naming different averages with more descriptive names as in the following example:

{
  "ctr": {"$mean": { "$context": "click" } },
  "meanScore": {"$mean": { "$context": "score" } }
}
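
To make the meaning of such named averages concrete, the following Python sketch computes the same two values by hand over a list of context rows. It assumes that "$mean" averages the given field over the rows in the hit's context and that a boolean click counts as 1 or 0. This is an interpretation for illustration only, not a description of how Aito evaluates the projection. The function name named_means and the sample rows are hypothetical.

```python
def named_means(context_rows):
    """Return the averages under the same names as the projection above."""
    if not context_rows:
        raise ValueError("at least one context row is needed to compute a mean")
    count = len(context_rows)
    return {
        # A boolean click is averaged as 1.0 (true) or 0.0 (false).
        "ctr": sum(1.0 if row["click"] else 0.0 for row in context_rows) / count,
        "meanScore": sum(row["score"] for row in context_rows) / count,
    }


# Hypothetical context rows for a single hit.
rows = [
    {"click": True, "score": 2.0},
    {"click": False, "score": 4.0},
]
result = named_means(rows)
print(result)
```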

Format

object

Examples

{
  "from": "impressions",
  "get": "product",
  "select": {
    "clicks": {
      "$f": {
        "$context": { "click": true }
      }
    },
    "ctr": {
      "$mean": { "$context": "click" }
    },
    "impressions": "$f",
    "product": "name"
  }
}

{
  "failures": {
    "$f": {
      "$context": { "success": false }
    }
  }
}

{
  "id": "id",
  "mean": {
    "$mean": { "$context": "product.price" }
  }
}

{ "$value": "$value", "field": "field", "id": "id" }

Referenced in

Built-in attributes

Can be used in "select" clause as fields.

$index

$index is a built-in variable which indicates the insertion index of a row. It can be used together with $mod to select parts of a table. It's useful for example in Evaluate query for selecting training or test data.
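
The idea of splitting a table by insertion index and a modulus can be sketched in plain Python. The sketch below only illustrates the arithmetic and is not Aito query syntax. It assumes a zero-based index, which this reference does not confirm, and the function name split_by_index and its parameters are hypothetical.

```python
def split_by_index(rows, divisor=4, test_remainder=0):
    """Send every row whose index modulo `divisor` equals `test_remainder` to the test set."""
    train, test = [], []
    for index, row in enumerate(rows):
        if index % divisor == test_remainder:
            test.append(row)
        else:
            train.append(row)
    return train, test


train_rows, test_rows = split_by_index(["a", "b", "c", "d", "e", "f", "g", "h"])
print(train_rows, test_rows)
```

With a divisor of 4, one quarter of the rows ends up in the test set and the split stays stable as long as the insertion order does not change.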

Format

string

Example

"$index"

Referenced in

$sort

$sort is a built-in field that can be used to access the sort value used in the orderBy-clause.

Format

string

Example

"$sort"

Referenced in

$score

$score is a built-in field that can be used to access the sort value used in the orderBy-clause, when the sort-value is a numeric score like a probability.

Format

string

Example

"$score"

Referenced in

$p

$p is a built-in field that can be used to access the value used by orderBy-clause, when the sort-value is a probability. See $p for more information.

Format

string

Example

"$p"

Referenced in

$why

When $why is selected, Aito explains why a certain result was predicted. The explanation contains three different kinds of factors, which are described below.

The three different factors are for an estimate of form:

p(x_i | A, B, C)

`"baseP"`

The base probability.

p(X)

`"normalizer"`

Aito has two different normalizers:

exclusiveness normalizer
trueFalseExclusiveness normalizer

These normalizers are often grouped into a single 'product' component.

{
  "type" : "product",
  "factors" : [ {
    "type" : "normalizer",
    "name" : "exclusiveness",
    "value" : 1.0119918068684681
  }, {
    "type" : "normalizer",
    "name" : "trueFalseExclusiveness",
    "value" : 1.09917613448721
  } ]
}

The exclusiveness normalizer is only used, when exclusiveness is on. In this case, it is assumed that only one feature can be true at the same time, and that one feature will be true. In practice, exclusiveness enforces the probabilities of alternative features to sum to 1.0.

The normalizer is of form:

\dfrac{1}{p(X_0) + p(X_1) + \dots}

Aito makes a probability estimation for both X and ¬X in the background and uses the trueFalseExclusiveness normalizer to ensure that the probabilities P(X) and P(¬X) sum to 1.0.

The normalizer is of form:

\dfrac{1}{p(X) + p(\neg X)}
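
Both normalizer formulas can be written out in a few lines of Python. This is a minimal sketch of the arithmetic only. The function names and the raw probability values are made up for illustration, and the sketch does not claim to reproduce how Aito computes its estimates internally.

```python
def exclusiveness_normalizer(probabilities):
    """Return 1 / (p(X_0) + p(X_1) + ...) for the alternative features."""
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("the probabilities must sum to a positive value")
    return 1.0 / total


def true_false_normalizer(p_x, p_not_x):
    """Return 1 / (p(X) + p(not X))."""
    return exclusiveness_normalizer([p_x, p_not_x])


# Hypothetical raw estimates for three mutually exclusive alternatives.
raw = [0.2, 0.3, 0.1]
scale = exclusiveness_normalizer(raw)
normalized = [p * scale for p in raw]
print(scale, normalized)
```

Multiplying every raw estimate by the normalizer rescales the alternatives so that they sum to 1.0.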

`"relatedVariableLift"`

Probability lifts. For example: the lift may say a product is clicked with 2.3x likelihood (or 130% higher likelihood), when it has 5 stars.

A probability lift is of form:

\dfrac{p(A | X)}{p(A)}
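
The relationship between a lift and the "percent higher likelihood" reading can be checked with a short Python sketch. The two probabilities below are invented for illustration and the function names are hypothetical.

```python
def lift(p_a_given_x, p_a):
    """Return p(A | X) / p(A)."""
    if p_a <= 0:
        raise ValueError("p(A) must be positive")
    return p_a_given_x / p_a


def percent_change(lift_value):
    """Express a lift as a percentage change relative to the base probability."""
    return (lift_value - 1.0) * 100.0


# Hypothetical numbers: 10% of products are clicked, 23% of 5-star products are.
click_lift = lift(0.23, 0.10)
print(click_lift, percent_change(click_lift))
```

A lift above 1.0 means the condition makes the outcome more likely, and a lift below 1.0 means it makes the outcome less likely.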

Format

string

Example

"$why"

Referenced in

$value

$value is a built-in field which contains the value of the returned object. $value can be used to access the field value referred in the predict, match, recommend and get-clauses, when the returned item is either a field value or a field feature/proposition. $value is intended to replace the 'feature' field in the long term.

The $value field has been added to carry the information of the ‘feature’ field, so that for the query:

  {
    "from" : "products",
    "where" : {
      "title" : "apple iphone"
    },
    "predict": "tags",
    "select" : ["$p", "$value"],
    "limit":3
  }

The result will be the following:

  {
    "offset" : 0,
    "total" : 10,
    "hits" : [ {
      "$p" : 0.3656914544001758,
      "$value" : "premium"
    }, {
      "$p" : 0.1546922568903658,
      "$value" : "cover"
    }, {
      "$p" : 0.09493670104339776,
      "$value" : "macosx"
    } ]
  }

Value works similarly, when predicting the field value, using the generic query.

  {
    "from" : "products",
    "where" : {
      "title" : "apple iphone"
    },
    "get": "tags",
    "orderBy" : "$p",
    "select" : ["$p", "$value"],
    "limit":3
  }

Or when predicting the field features with the generic query:

  {
    "from" : "products",
    "where" : {
      "title" : "apple iphone"
    },
    "get": "tags.$feature",
    "orderBy" : "$p",
    "select" : ["$p", "$value"],
    "limit":3
  }

Format

string

Example

"$value"

Referenced in

$proposition

$proposition is a built-in field which contains the proposition object of the returned feature. The returned proposition is compatible with the proposition format and it can be used as such in the where clause.

Consider the following query:

  {
    "from": "products",
    "where": {
      "title": "Apple"
    },
    "predict": {
      "$on": [
        { "$exists": "tags" },
        { "$and": [
          { "tags": { "$match": "phone" } },
          { "$not": { "tags": { "$match": "laptop" } } }
        ] }
      ]
    },
    "select": ["$p", "$value", "$proposition"],
    "limit": 1
  }

This provides the following results:

  {
    "offset" : 0,
    "total" : 10,
    "hits" : [ {
      "$p" : 0.22622976807854914,
      "$value" : "phone",
      "$proposition" : {
        "$on" : [ {
          "tags" : {
            "$has" : "phone"
          }
        }, {
          "$and" : [ {
            "tags" : {
              "$has" : "phone"
            }
          }, {
            "$not" : {
              "tags" : {
                "$has" : "laptop"
              }
            }
          } ]
        } ]
      }
    } ]
  }
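
Since the returned proposition can be used as such in the where clause, a client could build a follow-up query directly from a hit. The Python sketch below only assembles the request body. The helper name where_query_from_hit is hypothetical, and the hit is a simplified, made-up example with a single $has proposition rather than the full response shown above.

```python
import json


def where_query_from_hit(table, hit, limit=10):
    """Build a query body that reuses the hit's $proposition as the where clause."""
    return {"from": table, "where": hit["$proposition"], "limit": limit}


# Simplified, hypothetical hit.
hit = {
    "$p": 0.22622976807854914,
    "$value": "phone",
    "$proposition": {"tags": {"$has": "phone"}},
}

query = where_query_from_hit("products", hit)
print(json.dumps(query, indent=2))
```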

Format

string

Example

"$proposition"

Referenced in

Explanation objects

Explanation object when using the "$why" operator.

SumExplanation

Explain how a summation was calculated.

The SumExplanation most commonly appears when opening up the explanation (e.g: using the $why operator) of a score calculated using the $sum operator.

Format

{

// The explanation type: sum. Required.

"type": string,

// The explanation of the summed score's terms. Required.

"terms": [ScoreExplanation, ...]

}

Example

{
  "terms": [
    { "field": "id", "type": "field", "value": 4 },
    { "field": "price", "type": "field", "value": 1500 }
  ],
  "type": "sum"
}

Referenced in

SubtractionExplanation

This explanation object explains how a subtraction result was calculated. It occurs in $why results, if $subtract operation is used.

Format

{

// The explanation type: subtraction. Required.

"type": string,

// The value to subtract from. Required.

"minuend": ScoreExplanation,

// The value to subtract. Required.

"subtrahend": ScoreExplanation

}

Example

{
  "minuend": { "field": "price", "type": "field", "value": 119.5 },
  "subtrahend": { "field": "cost", "type": "field", "value": 100.5 },
  "type": "subtraction"
}

Referenced in

ProductExplanation

This explanation object explains how a product score was calculated. It occurs in $why results, if $multiply operation is used.

The ProductExplanation most commonly appears when opening up the explanation (e.g: using the $why operator) of:

Aggregated score by product. For example, in a Match query:

{
  "from": "impressions",
  "where": {
    "context.user": "larry"
  },
  "match": "product",
  "select": ["$score", "name", "$why"]
}

The final score is a product of multiple score components:

{
  "type": "product",
  "factors": [
    {
      "type": "hitPropositionLift",
      "proposition": { "id" : 6410405216120 },
      "value": 599.5491890842981,
      "factors": [
        {
          "type": "baseLift",
          "value": 265.0
        },
        {
          "type": "relatedPropositionLift",
          "proposition": { "context.user" : "larry" },
          "value": 2.2624497701294266
        }
      ]
    },
    ...
  ]
}

A score calculated using the $multiply operator.

Format

{

// The explanation type: product. Required.

"type": string,

// The explanation of the product score's factors. Required.

"factors": [ScoreExplanation, ...]

}

Example

{
  "factors": [
    {
      "factors": [
        { "type": "baseLift", "value": 31 }
      ],
      "proposition": { "id": 3 },
      "type": "hitPropositionLift",
      "value": 31
    },
    { "field": "price", "type": "field", "value": 1500 }
  ],
  "type": "product"
}

Referenced in

BaseLiftExplanation

Conceptually similar to BaseProbabilityExplanation, but shows the prior lift instead of the prior probability.

See more: Probability vs. Lift

Format

{

// The explanation type: baseLift. Required.

"type": string,

// The prior lift. Required.

"value": number

}

Example

{ "type": "baseLift", "value": 31 }

Referenced in

DivisionExplanation

This explanation object explains how a division result was calculated. It occurs in $why results, if $divide operation is used.

Format

{

// The explanation type: division. Required.

"type": string,

// The divided value. Required.

"dividend": ScoreExplanation,

// The divider value. Required.

"divisor": ScoreExplanation

}

Example

{
  "dividend": { "field": "return", "type": "field", "value": 400000 },
  "divisor": { "field": "investment", "type": "field", "value": 250000 },
  "type": "division"
}

Referenced in

BaseProbabilityExplanation

Explain the initial weight of a feature. It can be understood as the prior probability $p(X)$ of a feature.

Let's take a look at an example of a Predict query:

{
  "from": "products",
  "where": {
    "name": "Columbian coffee"
  },
  "predict": "tags",
  "select": ["$p", "feature", "$why"]
}

When opening up the explanation with "$why" operator, a tag's feature "coffee" has a BaseProbabilityExplanation:

{
  "type": "baseP",
  "value": 0.16
}

This explanation tells that Aito gives the feature "coffee" a prior probability of 0.16.

Format

{

// The explanation type: baseP. Required.

"type": string,

// The prior probability. Required.

"value": number,

// The variable.

"proposition": PropositionExplanation

}

Example

{
  "proposition": { "click": true },
  "type": "baseP",
  "value": 0.5
}

Referenced in

RelatedPropositionLiftExplanation

Explain how a related variable's lift was calculated.

A related variable (feature) most commonly appears when doing inference with some conditions. The RelatedPropositionLiftExplanation explains how a variable of the conditions affects the lift of a hit's variable.

Let's take a look at an example of Match query:

{
  "from": "impressions",
  "where": {
    "context.user": "larry"
  },
  "match": "product",
  "select": ["$score", "name", "$why"]
}

When opening up the explanation with "$why" operator, the first hit has an explanation as follows:

{
  "type": "hitVariableLift",
  "variable": "id:6410405216120",
  "value": 599.5491890842981,
  "factors": [
    {
      "type": "baseLift",
      "value": 265.0
    },
    {
      "type": "relatedVariableLift",
      "variable": "context.user:larry",
      "value": 2.2624497701294266
    }
  ]
}

This explains that the feature "context.user:larry", extracted from the condition "where": { "context.user": "larry" }, increases the likelihood of the product with id 6410405216120 by a lift of 2.2624497701294266.

Format

{

// The explanation type: relatedPropositionLift. Required.

"type": string,

// The related proposition. Required.

"proposition": PropositionExplanation,

// The lift value of the related proposition. Required.

"value": number

}

Examples coming later.

Referenced in

HitLinkPropositionLiftExplanation

Explain how a proposition's lift was calculated.

HitLinkPropositionLiftExplanation explains the impact of the value that links to the table containing the returned hits.

Let's consider an example, where there is an impression table that has a numeric field 'product' that links to the product table. In such a case the HitLinkPropositionLift would explain the significance of the field 'product' in the impression table. E.g., if the product link's value is 4, the HitLinkPropositionLiftExplanation will explain the effect of the proposition { "product" : 4 }. If the value is 2.0, it means that product 4 is estimated to be twice as probable just based on the statistics of the linking column.

Format

{

// The explanation type: hitLinkPropositionLift. Required.

"type": string,

// The link proposition. Required.

"proposition": PropositionExplanation,

// The lift value of the linked proposition. Required.

"value": number

}

Example

{
  "proposition": { "product": 5 },
  "type": "hitLinkPropositionLift",
  "value": 2.32
}

Referenced in

DecoratedScoreExplanation

This format is essentially a normal score explanation, but it can contain additional $matches and $highlight fields. It is returned when ParametricWhy is used.

Format

object

Examples coming later.

Referenced in

HitPropositionLiftExplanation

Explain how a proposition's lift was calculated.

A hit score is calculated by aggregating the scores of its propositions (features). The HitPropositionLiftExplanation explains how the lift of each proposition was calculated.

A HitPropositionLift can be:

A similarity score

A hit's field can contain a word that matches the stem of the given similarity condition. That word would have a HitPropositionLift that is a similarity score. Let's take a look at an example of a Similarity query:

{
  "from": "products",
  "similarity": {
    "name": "Columbian coffee",
    "tags": "expansive coffee"
  },
  "select": ["$score", "name", "tags", "$why"]
}

When opening up the explanation with "$why" operator, we can see that a hit with name "Juhla Mokka coffee 500g sj" containing the word coffee has a HitPropositionLiftExplanation as follows:

{
  "type": "hitPropositionLift",
  "proposition": "name:coffe",
  "value": 2.1726635013471625,
  "factors": [
    {
      "type": "exponent",
      "value": 2.1726635013471625,
      "base": {
        "type": "idf",
        "value": 2.1726635013471625
      },
      "power": {
        "type": "tf",
        "value": 1.0
      }
    }
  ]
}

An aggregated score of BaseLift and RelatedPropositionLift

Let's take a look at an example of Match query:

{
  "from": "impressions",
  "where": {
    "context.user": "larry"
  },
  "match": "product",
  "select": ["$score", "name", "$why"]
}

When opening up the explanation with "$why" operator, the first hit has a HitPropositionLiftExplanation as follows:

{
  "type": "hitPropositionLift",
  "proposition": { "id" : { "$has" : 6410405216120 } },
  "value": 599.5491890842981,
  "factors": [
    {
      "type": "baseLift",
      "value": 265.0
    },
    {
      "type": "relatedPropositionLift",
      "proposition": { "context.user": { "$has" : "larry" } },
      "value": 2.2624497701294266
    }
  ]
}

This explains that the initial lift of the feature "id:6410405216120" is 265 and, when the user is larry, the relatedPropositionLift is 2.2624497701294266. Hence the aggregated lift is $265 * 2.2624497701294266 = 599.5491890842981$.
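
The multiplication can be verified on the client side with a few lines of Python. The sketch below multiplies the values of the factors and compares the result to the reported aggregated lift. The function name product_of_factors is hypothetical, and the dictionary repeats only the fields of the explanation above that are needed for the check.

```python
import math


def product_of_factors(explanation):
    """Multiply the values of all factors of a hitPropositionLift explanation."""
    result = 1.0
    for factor in explanation["factors"]:
        result *= factor["value"]
    return result


# Values copied from the HitPropositionLiftExplanation above.
explanation = {
    "type": "hitPropositionLift",
    "value": 599.5491890842981,
    "factors": [
        {"type": "baseLift", "value": 265.0},
        {"type": "relatedPropositionLift", "value": 2.2624497701294266},
    ],
}

aggregated = product_of_factors(explanation)
matches_reported = math.isclose(aggregated, explanation["value"], rel_tol=1e-9)
print(aggregated, matches_reported)
```

The comparison uses a relative tolerance because floating-point multiplication may differ from the reported value in the last digits.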

Format

{

// The explanation type: hitPropositionLift. Required.

"type": string,

// The proposition. Required.

"proposition": PropositionExplanation,

// The aggregated lift value. Required.

"value": number,

// The factors contributing to the aggregated lift and their // explanation. Required.

"factors": [ScoreExplanation, ...]

}

Example

{
  "factors": [
    { "type": "baseLift", "value": 31 }
  ],
  "proposition": { "field": 4 },
  "type": "hitPropositionLift",
  "value": 31
}

Referenced in

ConstantExplanation

Constant explanation describes a constant value, typically given by the user.

Format

{

// The explanation type. Required.

"type": string,

"value": number

}

Examples coming later.

Referenced in

DefaultValueExplanation

Default value explanation describes the default score for some operation.

For example TF-IDF scoring assigns default lift of 1.0 for all rows without matching terms.

Format

{

// The explanation type: default value. Required.

"type": string,

// The default value. Required.

"value": number

}

Example

{ "type": "default", "value": 1 }

Referenced in

NamedExplanation

Explain how a special named score was calculated.

The NamedExplanation currently appears only when calculating a score with exclusiveness. In this case, it explains the normalizer that forces the probabilities of the alternative features to sum to 1.0.

Format

{

// Normalizer. Required.

"type": string,

// Exclusiveness. Required.

"name": string,

// The value of the normalizer. Required.

"value": number

}

Example

{ "name": "exclusiveness", "type": "normalizer", "value": 0.2982788431762749 }

Referenced in

PredictExplanation

Explain how a probability was calculated.

The PredictExplanation most commonly appears when opening up the explanation (e.g: using the $why operator) of a Predict query. Let's take a look at an example of Predict query:

  {
    "from": "products",
    "where": {
      "name": "Columbian coffee"
    },
    "predict": "tags",
    "select": ["$p", "feature", "$why"],
    "limit": 22
  }

The first hit has an explanation of:

{
  "type": "product",
  "factors": [
    {
      "type": "baseP",
      "value": 0.16
    },
    {
      "type" : "product",
      "factors" : [ 
        {
          "type" : "normalizer",
          "name" : "exclusiveness",
          "value" : 1.0119918068684681
        }, 
        {
          "type" : "normalizer",
          "name" : "trueFalseExclusiveness",
          "value" : 1.09917613448721
        } 
      ]
    },
    {
      "type": "relatedVariableLift",
      "variable": "name:coffe",
      "value": 8.45603245079726
    }
  ]
}

Format

{

// The explanation type: product. Required.

"type": string,

// The explanation of the probability's factors. Required.

"factors": [ScoreExplanation, ...]

}

Example

{
  "factors": [
    { "type": "baseP", "value": 0.8048780487804879 },
    {
      "name": "exclusiveness",
      "type": "normalizer",
      "value": 0.04604801347746731
    }
  ],
  "type": "product"
}

Referenced in

ExponentExplanation

Explain how an exponent score was calculated.

The ExponentExplanation most commonly appears when opening up the explanation (e.g: using the $why operator) of an exponent score such as:

1. The tf-idf score used to calculate the similarity in the Similarity query.
2. The score of the $pow operator.

Format

{

// The explanation type: exponent. Required.

"type": string,

// The exponent score. Required.

"value": number,

// The explanation of the base score element. Required.

"base": ScoreExplanation,

// The explanation of the power score element. Required.

"power": ScoreExplanation

}

Example

{
  "base": { "type": "idf", "value": 1.7551720221592049 },
  "power": { "type": "tf", "value": 1 },
  "type": "exponent",
  "value": 1.7551720221592049
}

Referenced in

TermFrequencyExplanation

Explain the term frequency score.

The term frequency score is one component of the term frequency-inverse document frequency score which is used in Aito's similarity metrics.

Format

{

// The explanation type: tf. Required.

"type": string,

// The term frequency score. Required.

"value": number

}

Example

{ "type": "tf", "value": 1 }

Referenced in

InverseDocumentFrequencyExplanation

Explain the inverse document frequency score.

The inverse document frequency score is one component of the term frequency-inverse document frequency score which is used in Aito's similarity metrics.

Format

{

// The explanation type: idf. Required.

"type": string,

// The inverse document frequency score. Required.

"value": number

}

Example

{ "type": "idf", "value": 1.7551720221592049 }

Referenced in

FieldExplanation

Explain how a field score was calculated.

The field explanation most commonly appears when opening up the explanation (e.g: using the $why operator) of a score that was calculated using:

A field value

{
  "from" : "impressions",
  "where" : {
    "product.name":{"$match": "coffee"}
  },
  "get":"product",
  "orderBy" : {
    "$multiply": ["$p", "price"]
  },
  "select": ["$score", "$why"]
}

The explanation would contain the value of the "price" field that was used in the $multiply operator.

{
  "type": "field",
  "field": "price",
  "value": 3.95
}

A field feature (e.g: $f operator for frequency):

{
  "from" : "impressions",
  "where" : {
    "product.name":{"$match": "coffee"}
  },
  "get":"product",
  "orderBy" : "$f",
  "select": ["$score", "$why"]
}

The explanation would contain the frequency of the feature.

{
  "type": "field",
  "field": "$f",
  "value": 152.0
}

Format

{

// The explanation type: field. Required.

"type": string,

// The name or feature of the field. Required.

"field": string,

// The score value. Required.

"value": number

}

Example

{ "field": "price", "type": "field", "value": 1500 }

Referenced in

ScoreExplanation

Explain how a score was calculated.

Format

BaseProbabilityExplanation

SumExplanation

SubtractionExplanation

ProductExplanation

DivisionExplanation

RelatedPropositionLiftExplanation

HitLinkPropositionLiftExplanation

DecoratedScoreExplanation

HitPropositionLiftExplanation

ConstantExplanation

DefaultValueExplanation

TermFrequencyExplanation

InverseDocumentFrequencyExplanation

FieldExplanation

Example

{ "type": "baseP", "value": 0.28 }
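
Because score explanations nest, a client can walk the explanation tree and recompute a score from its parts. The Python sketch below does this for the arithmetic explanation types. It assumes the type names "sum", "product", "subtraction", "division" and "exponent" as they appear in the examples of this reference, and it treats every other type as a leaf whose "value" is used directly. The function name evaluate is hypothetical, and the sketch is an illustration rather than Aito's own evaluation logic.

```python
import math


def evaluate(explanation):
    """Recompute the score described by a nested explanation object."""
    kind = explanation["type"]
    if kind == "sum":
        return sum(evaluate(term) for term in explanation["terms"])
    if kind == "product":
        return math.prod(evaluate(factor) for factor in explanation["factors"])
    if kind == "subtraction":
        return evaluate(explanation["minuend"]) - evaluate(explanation["subtrahend"])
    if kind == "division":
        return evaluate(explanation["dividend"]) / evaluate(explanation["divisor"])
    if kind == "exponent":
        return evaluate(explanation["base"]) ** evaluate(explanation["power"])
    # Leaves such as baseP, baseLift, field, tf, idf or a lift carry their own value.
    return explanation["value"]


# Examples taken from the explanation sections of this reference.
sum_example = {
    "type": "sum",
    "terms": [
        {"field": "id", "type": "field", "value": 4},
        {"field": "price", "type": "field", "value": 1500},
    ],
}
division_example = {
    "type": "division",
    "dividend": {"field": "return", "type": "field", "value": 400000},
    "divisor": {"field": "investment", "type": "field", "value": 250000},
}

print(evaluate(sum_example), evaluate(division_example))
```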

Referenced in

DivisionExplanation

ExponentExplanation

HitPropositionLiftExplanation

PredictExplanation

ProductExplanation

ResponseHit

SubtractionExplanation

SumExplanation

Explanation proposition objects

Explanation proposition object when using the "$why" or "relate" operator.

FieldPropositionExplanation

FieldPropositionExplanation expresses a statement about a document field.

For example the expression

{
  "tags": { "$has" : "laptop" }
}

states that the tags field contains the "laptop" feature.

Format

{

// This is the format for all propositions used in the $why and // relate explanations. Required.

"fieldName": PropositionExplanation

}

Examples coming later.

Referenced in

HasExplanation

This format is used in the $why and relate explanations for the '$has'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$has":

integer

number

boolean

null

string

object

}

Examples coming later.

Referenced in

AndExplanation

This format is used in the $why and relate explanations for the '$and'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$and": [PropositionExplanation, ...]

}

Examples coming later.

Referenced in

OrExplanation

This format is used in the $why and relate explanations for the '$or'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$or": [PropositionExplanation, ...]

}

Examples coming later.

Referenced in

OnExplanation

This format is used in the $why and relate explanations for the '$on'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$on": [PropositionExplanation, ...]

}

Examples coming later.

Referenced in

NotExplanation

This format is used in the $why and relate explanations for the '$not'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

// This is the format for all propositions used in the $why and // relate explanations.

"$not": PropositionExplanation

}

Examples coming later.

Referenced in

StartsWithExplanation

This format is used in the $why and relate explanations for the '$startsWith'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$startsWith": string

}

Examples coming later.

Referenced in

GtExplanation

This format is used in the $why and relate explanations for the '$gt'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$gt":

integer

number

boolean

null

string

object

}

Examples coming later.

Referenced in

GteExplanation

This format is used in the $why and relate explanations for the '$gte'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$gte":

integer

number

boolean

null

string

object

}

Examples coming later.

Referenced in

LtExplanation

This format is used in the $why and relate explanations for the '$lt'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$lt":

integer

number

boolean

null

string

object

}

Examples coming later.

Referenced in

LteExplanation

This format is used in the $why and relate explanations for the '$lte'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$lte":

integer

number

boolean

null

string

object

}

Examples coming later.

Referenced in

DefinedExplanation

This format is used in the $why and relate explanations for the '$defined'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$defined": boolean

}

Examples coming later.

Referenced in

NumericExplanation

This format is used in the $why and relate explanations for the '$numeric'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$numeric": number

}

Examples coming later.

Referenced in

KnnExplanation

This format is used in the $why and relate explanations for the '$knn'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$knn": {

"k": integer,

"near": [PropositionExplanation, ...]

}

}

Examples coming later.

Referenced in

NnExplanation

This format is used in the $why and relate explanations for the '$nn'-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

{

"$nn": {

"near": [PropositionExplanation, ...]

}

}

Examples coming later.

Referenced in

IsPropositionExplanation

This format is used in the $why and relate explanations for the is-proposition. Explanations of this format follow the normal proposition syntax and can be reused in the where-clause.

Format

integer

number

boolean

null

string

object

Examples coming later.

Referenced in

PropositionExplanation

This is the format for all propositions used in the $why and relate explanations.

Format

FieldPropositionExplanation

StartsWithExplanation

IsPropositionExplanation

Examples coming later.

Referenced in

AndExplanation

AnyValue

BaseProbabilityExplanation

FieldPropositionExplanation

HitLinkPropositionLiftExplanation

HitPropositionLiftExplanation

RelatedPropositionLiftExplanation

ResponseHit

Other types

All other API types.

ColumnName

Name of a column in a table. Links are supported.

Format

string

Examples

"id"

"age"

"product.id"

Referenced in

Score

EvaluateGenericQuery

A Generic query to be evaluated in the Evaluate query

Format

{

// From expression declares the examined table. Required.

"from": From,

// Proposition expression describes a fact, or a statement.

"where": Proposition,

// Get expression defines what items are returned as query // results.

"get": Get,

// Declares the sorting order of the result by a field or by a // user-defined score.

"orderBy": OrderBy,

// Defines the fields returned by the select statement.

"select": Projection,

// The number of results to skip from the beginning. // Default: 0

"offset": integer,

// The maximum number of results to retrieve. // Default: 10

"limit": integer

}

Example

{ "from": "impressions", "get": "product", "limit": 20, "offset": 10 }

Referenced in

EvaluateGroupedOperation

Supported query to be evaluated in EvaluateGroupedQuery. Currently, only the Generic query and the Recommend query are supported.

Format

EvaluateRecommend

Examples

{
  "from": "impressions",
  "goal": { "purchase": "true" },
  "recommend": "product",
  "where": {
    "product.name": { "$get": "query" },
    "session.user": { "$get": "session.user" }
  }
}

{
  "from": "impressions",
  "get": "product",
  "orderBy": {
    "$p": { "purchase": true }
  },
  "where": {
    "product.name": { "$get": "query" },
    "session.user": { "$get": "session.user" }
  }
}

Referenced in

EvaluateGroupedQuery

The EvaluateGroupedQuery is similar to the EvaluateQuery, with an additional option to group multiple entries into a single test case.

For example, if there exists a "customerCohort" identifier in "impressions" table, we can evaluate by the customerCohort instead of the individual customer with the following EvaluateGroupedQuery:

{
  "evaluate": {
    "from": "impressions",
    "where": {
      "customer": { "$get": "customer" }
    },
    "recommend": "product",
    "goal": { "purchase": true }
  },
  "group": "customerCohort",
  "test": {
    "customerCohort": { "$gte": 5 }
  },
  "select": ["trainSamples", "testSamples", "meanRank"]
}
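
The effect of grouping can be pictured with a small Python sketch that collects rows by their group field and keeps only the groups selected for testing. This is an illustration of the idea under the assumption that each group forms one test case. It is not Aito's evaluation logic, and the function name grouped_test_cases and the sample rows are hypothetical.

```python
from collections import defaultdict


def grouped_test_cases(rows, group_field, is_test_group):
    """Group rows by `group_field` and keep the groups accepted by `is_test_group`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(row)
    return {key: members for key, members in groups.items() if is_test_group(key)}


# Hypothetical impressions with a customerCohort identifier.
impressions = [
    {"customerCohort": 3, "customer": "a", "purchase": True},
    {"customerCohort": 5, "customer": "b", "purchase": False},
    {"customerCohort": 5, "customer": "c", "purchase": True},
    {"customerCohort": 7, "customer": "d", "purchase": True},
]

# Mirrors the "test" proposition { "customerCohort": { "$gte": 5 } } above.
cases = grouped_test_cases(impressions, "customerCohort", lambda cohort: cohort >= 5)
print(sorted(cases))
```

Here four impressions become two test cases, one per cohort that satisfies the test condition.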

Format

{

// Proposition expression describes a fact, or a statement.

"train": Proposition,

// Proposition expression describes a fact, or a statement.

"test": Proposition,

// TestSource enables more options to choose the testing data // in the [Evaluate Query](#post-api-v1-evaluate).

"testSource": TestSource,

// Describes the fields and/or built-in attributes to return.

"select": Selection,

"group": string,

// Proposition expression describes a fact, or a statement.

"goal": Proposition,

// Supported query to be evaluated in // EvaluateGroupedQuery. // Required.

"evaluate": EvaluateGroupedOperation

}

Example

{
  "evaluate": {
    "from": "impressions",
    "goal": { "purchase": true },
    "recommend": "product",
    "where": {
      "product.name": { "$get": "query" },
      "session.user": { "$get": "session.user" }
    }
  },
  "group": "userGroup",
  "select": ["accuracy", "meanRank", "n"],
  "test": {
    "userGroup": { "$gte": 5 }
  }
}

Referenced in

Evaluate

EvaluateMatch

A Match query to be evaluated in the Evaluate query

Format

{

// From expression declares the examined table. Required.

"from": From,

// Proposition expression describes a fact, or a statement.

"where": Proposition,

// Defines the fields returned by the select statement.

"select": Projection,

// Get expression defines what items are returned as query // results. Required.

"match": Get,

// PropositionSet expression is used to describe a collection // of propositions.

"basedOn": PropositionSet,

// The number of results to skip from the beginning. // Default: 0

"offset": integer,

// The maximum number of results to retrieve. // Default: 10

"limit": integer

}

Example

{
  "from": "impressions",
  "limit": 2,
  "match": "prevProduct",
  "offset": 2,
  "select": ["title", "description", "price"],
  "where": { "customer": 4, "query": "laptop" }
}

Referenced in

EvaluateMultiGenericQuery

The Generic Query to be evaluated in an EvaluateGroupedQuery

Format

{

// From expression declares the examined table. Required.

"from": From,

// Proposition expression describes a fact, or a statement.

"where": Proposition,

// Get expression defines what items are returned as query // results. Required.

"get": Get,

// Declares the sorting order of the result by a field or by a // user-defined score.

"orderBy": OrderBy,

// Defines the fields returned by the select statement.

"select": Projection,

"offset": integer,

"limit": integer

}

Example

{
  "from": "impressions",
  "get": "product",
  "orderBy": {
    "$p": { "purchase": true }
  },
  "where": {
    "product.name": { "$get": "query" },
    "session.user": { "$get": "session.user" }
  }
}

Referenced in

EvaluateGroupedOperation

EvaluateOperation

Operation to be evaluated.

Format

Examples

{
  "from": "messages",
  "get": "product",
  "similarity": {
    "description": { "$get": "message" },
    "title": { "$get": "message" }
  }
}

{
  "from": "products",
  "get": "product",
  "orderBy": "$p",
  "where": {
    "name": { "$get": "name" }
  }
}

{
  "from": "products",
  "predict": "category",
  "where": {
    "name": { "$get": "name" }
  }
}

{
  "from": "messages",
  "match": "product",
  "where": {
    "message": { "$get": "message" }
  }
}

Referenced in

EvaluateQuery

EvaluatePredict

A Predict query to be evaluated in the Evaluate query

Format

{

// From expression declares the examined table. Required.

"from": From,

// Proposition expression describes a fact, or a statement.

"where": Proposition,

// PropositionSet expression is used to describe a collection // of propositions. Required.

"predict": PropositionSet,

// PropositionSet expression is used to describe a collection // of propositions.

"basedOn": PropositionSet,

"exclusiveness": boolean,

// Defines the fields returned by the select statement.

"select": Projection,

"offset": integer,

"limit": integer

}

Example

{
  "from": "products",
  "predict": "category",
  "where": {
    "name": { "$get": "name" }
  }
}

Referenced in

EvaluateQuery

A query to evaluate:

Match query
Predict query
Similarity query
Generic query

Format

{

// Proposition expression describes a fact, or a statement.

"train": Proposition,

// Proposition expression describes a fact, or a statement.

"test": Proposition,

// TestSource enables more options to choose the testing data // in the Evaluate Query.

"testSource": TestSource,

// Describes the fields and/or built-in attributes to return.

"select": Selection,

// Operation to be evaluated. Required.

"evaluate": EvaluateOperation

}

Example

{
  "evaluate": {
    "from": "products",
    "predict": "category",
    "where": {
      "name": { "$get": "name" }
    }
  },
  "select": ["accuracy", "meanRank", "n"],
  "test": {
    "$index": {
      "$mod": [10, 1]
    }
  }
}

Referenced in

Evaluate

EvaluateRecommend

A Recommend query to be evaluated in the Evaluate query

Format

{

// From expression declares the examined table. Required.

"from": From,

// Proposition expression describes a fact, or a statement.

"where": Proposition,

// Get expression defines what items are returned as query // results. Required.

"recommend": Get,

// Proposition expression describes a fact, or a statement. // Required.

"goal": Proposition,

// Defines the fields returned by the select statement.

"select": Projection,

"offset": integer,

"limit": integer

}

Example

{
  "from": "impressions",
  "goal": { "purchase": true },
  "recommend": "product",
  "where": {
    "product.name": { "$get": "query" },
    "session.user": { "$get": "session.user" }
  }
}

Referenced in

EvaluateGroupedOperation

EvaluateSimilarity

A Similarity query to be evaluated in the Evaluate query

Format

{

// From expression declares the examined table. Required.

"from": From,

// Proposition expression describes a fact, or a statement.

"where": Proposition,

// Get expression defines what items are returned as query // results. Required.

"get": Get,

"similarity":

// Defines the fields returned by the select statement.

"select": Projection,

"offset": integer,

"limit": integer

}

Example

{
  "from": "messages",
  "get": "product",
  "similarity": {
    "description": { "$get": "message" },
    "title": { "$get": "message" }
  }
}

Referenced in

ExponentPropositionArray

Define the base and the exponent of the $pow operator in the array format.

The first item of the array is the base and the second item of the array is the exponent.

Format

[Score, Score]

Example

["width", 2]

Referenced in

Pow

ExponentPropositionObject

Define the base and the exponent of the $pow operator in the object format.

Format

{

// Score expression resolves to a numeric score value or // probability. Required.

"base": Score,

// Score expression resolves to a numeric score value or // probability. Required.

"exponent": Score

}

Example

{ "base": "width", "exponent": 2 }
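Both $pow formats carry the same two values. The Python sketch below normalizes either format and evaluates it against a single row. It assumes, for illustration only, that a string score names a numeric field of the row and that a number is a constant; `resolve_score` and `eval_pow` are hypothetical helpers.

```python
def resolve_score(score, row):
    """Assumption for this sketch: a string names a numeric field, a number is a constant."""
    if isinstance(score, str):
        return row[score]
    return score


def eval_pow(arg, row):
    """Evaluate base ** exponent from the array or the object format."""
    if isinstance(arg, dict):
        base, exponent = arg["base"], arg["exponent"]
    else:
        base, exponent = arg  # [base, exponent]
    return resolve_score(base, row) ** resolve_score(exponent, row)


row = {"width": 3}  # hypothetical row
from_array = eval_pow(["width", 2], row)
from_object = eval_pow({"base": "width", "exponent": 2}, row)
```

Both calls give the same value, which shows that the two formats are interchangeable in this reading.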

Referenced in

Pow

EmptyDocument

The empty response object may contain more information in the future.

Format

{}

Examples coming later.

FieldProposition

FieldProposition expresses statements about a field in a table.

For example, the following expression

"price": {"$lt": 500 }

describes a statement that price is under 500.

Format

{

"fieldName":

}

Examples coming later.

Referenced in

From

From expression declares the examined table.

Format

FromWhere

FromTablequery

Examples

{
  "from": "impressions",
  "where": { "click": true }
}

"impressions"

{
  "from": {
    "from": "impressions",
    "where": { "click": true }
  },
  "orderBy": "$p",
  "where": { "query": "laptop" }
}

{ "from": "impressions" }

Referenced in

FromTablemodify

From expression declares the table to operate on.

Format

string

Examples

"impressions"

"products"

"customers"

"messages"

Referenced in

Copy a table

Delete entries

Rename a table

FromTablequery

From expression declares the examined table.

Format

string

Examples

"impressions"

"products"

"customers"

"messages"

Referenced in

From

FromWhere

FromWhere expression allows you to narrow the examined table.

When you use FromWhere, Aito only considers that narrowed slice of the table.

For instance, this query:

{
  "from": {
    "from": "impressions",
    "where": {
      "context.user": "larry"
    }
  },
  "match": "product"
}

is different from:

{
  "from": "impressions",
  "where": {
    "context.user": "larry"
  },
  "match": "product"
}

In the first query, Aito matches Larry with products based only on Larry's impressions data, while in the second query, Aito matches Larry with products based on both Larry's and other users' impressions data.
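The difference between the two forms comes down to which rows are available as data. The toy Python sketch below illustrates only that point; it does not reproduce Aito's inference, and the rows and the `narrow` helper are hypothetical.

```python
# Hypothetical impressions rows.
impressions = [
    {"context.user": "larry", "product": "milk"},
    {"context.user": "larry", "product": "bread"},
    {"context.user": "anna", "product": "milk"},
]


def narrow(rows, where):
    """Keep only rows whose fields equal every value in `where` (exact match)."""
    return [r for r in rows if all(r.get(k) == v for k, v in where.items())]


# FromWhere form: the table is narrowed first, so only Larry's rows are data.
data_from_where = narrow(impressions, {"context.user": "larry"})

# Plain "where" form: the whole table stays available as data, and the
# condition on context.user is used as evidence rather than as a filter.
data_plain_where = impressions
```

In the first case two rows are available, in the second case all three.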

Format

{

// From expression declares the examined table. Required.

"from": From,

// Proposition expression describes a fact, or a statement. // Required.

"where": Proposition,

"limit": integer

}

Example

{
  "from": "impressions",
  "where": { "click": true }
}

Referenced in

From

Get

Get expression defines what items are returned as query results.

By default, the hits are from the table defined in the "from" clause. In some cases, you may want to declare propositions like 'query is laptop' in the impressions table, while returning results from the separate products table, based on click likelihood. In this case, you may use a query such as:

{
  "from": "impressions",
  "where": { "query": "laptop" },
  "get": "product",
  "orderBy": {
    "$p": {
      "$context": { "click": true }
    }
  }
}

The "get" expression takes a field name as a parameter. If the field is a link, the returned results are from the linked table. If the field is not a link, the field values are returned as results.

Normally, the result of a query consists of the field values that best fulfill the query conditions. Field analyzers extract features from text fields, and the $feature property can be used to return features instead of complete field values. For instance, the following example demonstrates how to discover product tags which are likely to lead to sales:

{
  "from": "impressions",
  "where": { "query": "cheap phone" },
  "get": "product.tags.$feature",
  "orderBy": {
    "$p": {
      "$context": { "click": true }
    }
  }
}

The $feature syntax also allows you to examine the values/features of a link field as if it were a regular field.

Format

string

Examples

"product"

"user"

"text.$feature"

"link.field"

"link.$feature"

"link.text.$feature"

Referenced in

Create jobs

EvaluateGenericQuery

EvaluateMatch

GetValueExpression

$get is used to access external variables in the evaluate query.

$get is currently only used in Evaluate queries. Evaluate tests a specified query by examining the table rows one by one. $get allows accessing the tested row's properties.

Consider the following example.

Given a table containing products data with the following schema:

"products": {
  "type": "table",
  "columns": {
    "title": { "type": "Text", "analyzer": "English" },
    "description": { "type": "Text", "analyzer": "English" }
  }
}

and a table containing impressions data with the following schema:

"impressions": {
  "type": "table",
  "columns": {
    "customer": { "type": "Int", "link": "customers.id" },
    "product": { "type": "Int", "link": "products.id" },
    "query": { "type": "Text", "analyzer": "English" }
  }
}

The goal is to test how well the traditional TF-IDF similarity metric works for finding a product. The $get is used in the similarity query to compare the product's title and description fields with the impression table's query field.

{
  "test": {
    "click": true
  },
  "evaluate": {
    "from": "impressions",
    "get": "product",
    "similarity": {
      "title": { "$get": "query" },
      "description": { "$get": "query" }
    }
  },
  "select": ["trainSamples", "n", "accuracy", "baseAccuracy", "meanRank", "mxe"]
}
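Conceptually, each tested row supplies the values for the $get placeholders. The Python sketch below shows that substitution for flat field names only. It is an illustration under that assumption, not Aito's implementation; `resolve_get` and the sample row are hypothetical, and dotted link paths such as "product.title" are not handled.

```python
def resolve_get(template, row):
    """Recursively replace every {"$get": name} with row[name]."""
    if isinstance(template, dict):
        if set(template) == {"$get"}:
            return row[template["$get"]]
        return {key: resolve_get(value, row) for key, value in template.items()}
    if isinstance(template, list):
        return [resolve_get(value, row) for value in template]
    return template


# The "similarity" clause from the example above.
similarity = {
    "title": {"$get": "query"},
    "description": {"$get": "query"},
}

# Hypothetical impressions row being tested.
tested_row = {"customer": 4, "product": 17, "query": "cheap laptop"}

resolved = resolve_get(similarity, tested_row)
```

For this row, both "title" and "description" are compared against the text "cheap laptop".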

Format

{

"$get": string

}

Examples

{ "$get": "query" }

{ "$get": "click" }

{ "$get": "product.title" }

Referenced in

Goal

Specifies a goal to maximize.

Results are ordered by the likelihood of the goal in descending order.

Format

Score

Examples

{ "purchase": true }

{ "click": true }

Referenced in

Recommend

Hits

Entries returned for a given query.

Format

[ResponseHit, ...]

Example

[
  {
    "$p": 0.16772371915637704,
    "category": "100",
    "id": "6410405060457",
    "name": "Pirkka bio cherry tomatoes 250g international 1st class",
    "price": 1.29,
    "tags": "fresh vegetable pirkka tomato"
  },
  {
    "$p": 0.16772371915637704,
    "category": "100",
    "id": "6410405093677",
    "name": "Pirkka iceberg salad Finland 100g 1st class",
    "price": 1.29,
    "tags": "fresh vegetable pirkka"
  }
]

Referenced in

Is

The syntax {"field": { "$is": "yourvalue" } } is equivalent to { "field": "yourvalue" }.

Format

{

// PrimitiveProposition states a field's value.

"$is": PrimitiveProposition

}

Example

{ "$is": "value" }
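The equivalence can be expressed as a small rewrite. The Python sketch below is a hypothetical client-side helper that only demonstrates the stated equivalence; it is not part of the API.

```python
def normalize_is(proposition):
    """Rewrite {"field": {"$is": v}} as {"field": v}; leave other values untouched."""
    normalized = {}
    for field, value in proposition.items():
        if isinstance(value, dict) and set(value) == {"$is"}:
            normalized[field] = value["$is"]
        else:
            normalized[field] = value
    return normalized


long_form = {"field": {"$is": "yourvalue"}}
short_form = {"field": "yourvalue"}
```

After normalization, the long form and the short form compare equal.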

Referenced in

KnnPropositionArray

Define the 'k' and the 'near' parameters of the $knn operator in the array format.

The first item of the array is the 'k' parameter and the second item of the array is the 'near' parameter.

Format

[integer, Proposition]

Example

[
  4,
  { "tags": "laptop" }
]

Referenced in

Knn

KnnPropositionObject

Define the 'k' and the 'near' parameters of the $knn operator in the object format.

Format

{

"k": integer,

"near":

}

Example

{
  "k": 4,
  "near": { "tags": "laptop" }
}

Referenced in

Knn

NnPropositionArray

Define the 'near' and the 'threshold' parameters of the $nn operator in the array format.

The first item of the array is the 'near' parameter and the second item of the array is the 'threshold' parameter.

Format

[Proposition, number]

Example

[
  { "tags": "laptop" }
]

Referenced in

NnPropositionObject

Define the 'near' and the 'threshold' parameters of the $nn operator in the object format.

Format

{

"near":

"threshold": number

}

Example

{
  "near": { "tags": "laptop" }
}

Referenced in

ModPropositionArray

Define the divisor and the remainder of the $mod operator in the array format.

The first item of the array is the divisor and the second item of the array is the remainder.

Format

[integer, integer]

Example

[2, 0]

Referenced in

Mod

ModPropositionObject

Define the divisor and the remainder of the $mod operator in the object format.

Format

{

"divisor": integer,

"remainder": integer

}

Example

{ "divisor": 2, "remainder": 0 }
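The array and object formats describe the same test: a value matches when its remainder after division by the divisor equals the given remainder. The Python sketch below is a hypothetical illustration of that rule for non-negative integers; `parse_mod` and `matches_mod` are not part of the API.

```python
def parse_mod(arg):
    """Accept [divisor, remainder] or {"divisor": ..., "remainder": ...}."""
    if isinstance(arg, dict):
        return arg["divisor"], arg["remainder"]
    divisor, remainder = arg
    return divisor, remainder


def matches_mod(value, arg):
    """True if value leaves the given remainder when divided by the divisor."""
    divisor, remainder = parse_mod(arg)
    return value % divisor == remainder


# Even values, in both formats.
even_array = [i for i in range(6) if matches_mod(i, [2, 0])]
even_object = [i for i in range(6) if matches_mod(i, {"divisor": 2, "remainder": 0})]

# Every tenth value starting from 1, as with "$mod": [10, 1].
every_tenth = [i for i in range(25) if matches_mod(i, [10, 1])]
```

A proposition such as "$mod": [10, 1] therefore selects roughly one value in ten, which is why it is convenient for picking a test sample.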

Referenced in

Mod

OnPropositionArray

Define the hypothesis and the condition of the $on operator in the array format.

The first item of the array is the hypothesis and the second item of the array is the condition.

Format

[Proposition, Proposition]

Example

[
  { "click": true },
  { "user.tags": "nyc" }
]

Referenced in

OnPropositionObject

Define the hypothesis and the condition of the $on operator in the object format.

Format

{

// Proposition expression describes a fact, or a statement. // Required.

"prop": Proposition,

// Proposition expression describes a fact, or a statement. // Required.

"on": Proposition

}

Example

{
  "on": { "user.tags": "nyc" },
  "prop": { "click": true }
}
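One way to read a hypothesis paired with a condition is as a conditional frequency: among the rows that satisfy the condition, how many also satisfy the hypothesis. The Python sketch below estimates this by naive counting with exact-match fields. This is an assumption made for illustration and not how Aito computes its estimates; `parse_on`, `conditional_frequency`, and the rows are hypothetical.

```python
def parse_on(arg):
    """Accept [prop, on] or {"prop": ..., "on": ...}."""
    if isinstance(arg, dict):
        return arg["prop"], arg["on"]
    prop, on = arg
    return prop, on


def matches(row, proposition):
    """Exact-match check of every field in the proposition."""
    return all(row.get(k) == v for k, v in proposition.items())


def conditional_frequency(rows, arg):
    """Fraction of rows matching the condition that also match the hypothesis."""
    prop, on = parse_on(arg)
    conditioned = [r for r in rows if matches(r, on)]
    if not conditioned:
        return None  # the condition never holds in the data
    return sum(matches(r, prop) for r in conditioned) / len(conditioned)


# Hypothetical rows.
rows = [
    {"click": True, "user.tags": "nyc"},
    {"click": False, "user.tags": "nyc"},
    {"click": True, "user.tags": "nyc"},
    {"click": True, "user.tags": "sf"},
]

freq_array = conditional_frequency(rows, [{"click": True}, {"user.tags": "nyc"}])
freq_object = conditional_frequency(
    rows, {"on": {"user.tags": "nyc"}, "prop": {"click": True}}
)
```

Both formats yield the same frequency: two of the three "nyc" rows have a click.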

Referenced in

OrderBy

Declares the sorting order of the result by a field or by a user-defined score.

Format

$desc

$asc

Value

Examples

"product.price"

{ "$asc": "product.price" }

{ "$desc": "product.price" }

{
  "$multiply": ["$p", "prices"]
}

{
  "$pow": ["product.width", 2]
}
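The $asc and $desc forms can be pictured as an ordinary sort on one field. The Python sketch below covers only those two forms and treats the field name as a flat dictionary key. It is a hypothetical illustration, not the API's implementation, and it does not cover score expressions such as $multiply or $pow.

```python
def sort_hits(hits, order_by):
    """Sort hits by a field given as {"$asc": field} or {"$desc": field}."""
    if len(order_by) != 1:
        raise ValueError("expected exactly one of $asc or $desc")
    direction, field = next(iter(order_by.items()))
    if direction not in ("$asc", "$desc"):
        raise ValueError(f"unsupported direction: {direction}")
    return sorted(hits, key=lambda hit: hit[field], reverse=(direction == "$desc"))


# Hypothetical hits with a flattened "product.price" key.
hits = [
    {"product.price": 2.5},
    {"product.price": 1.29},
    {"product.price": 3.0},
]

ascending = sort_hits(hits, {"$asc": "product.price"})
descending = sort_hits(hits, {"$desc": "product.price"})
```

The ascending result starts with the cheapest product and the descending result with the most expensive one.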

Referenced in

Create jobs

EvaluateGenericQuery

PrimitiveProposition

PrimitiveProposition states a field's value.

It should always be used inside a field declaration of a document proposition. For example, in the proposition { "field": "value" } the string "value" is the primitive proposition.

Format

integer

number

null

$toString

boolean

string

Json

Examples

3.1

false

null

"text"

Referenced in

Proposition

Proposition expression describes a fact, or a statement.

For instance:

{ "customer.id": 4 } describes a customer with the id of 4
{ "clicked": true } describes that the customer has clicked the item

You can also combine multiple propositions by declaring them in an object clause. The propositions will be combined by the $and operator. For instance:

{ 
  "price": { 
    "$gt": 20, 
    "$lte": 40 
  } 
}

describes an item whose price is greater than 20 and less than or equal to 40.

This proposition is equivalent to:

{ 
  "price": { 
    "$and": [
      { "$gt": 20 }, 
      { "$lte": 40 }
    ]
  } 
}
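The equivalence of the implicit and the explicit $and form can be checked with a small evaluator. The Python sketch below handles only the comparison operators $gt, $gte, $lt, $lte and $and on a single numeric value; `holds` is a hypothetical helper written for this illustration.

```python
import operator

# Comparison operators supported by this sketch.
COMPARATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def holds(value, condition):
    """True if value satisfies every operator in the condition object."""
    for op, arg in condition.items():
        if op == "$and":
            if not all(holds(value, sub) for sub in arg):
                return False
        elif not COMPARATORS[op](value, arg):
            return False
    return True


implicit = {"$gt": 20, "$lte": 40}
explicit = {"$and": [{"$gt": 20}, {"$lte": 40}]}

# Prices on and around both boundaries.
prices = [20, 21, 40, 41]
implicit_results = [holds(p, implicit) for p in prices]
explicit_results = [holds(p, explicit) for p in prices]
```

Both forms accept 21 and 40 and reject 20 and 41, matching the "greater than 20 and less than or equal to 40" reading.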

This proposition can be used, for example, in a Search Query to find an item that matches these price criteria:

{
  "from": "products",
  "where": {
    "price": { 
      "$gt": 20, 
      "$lte": 40 
    }
  }
}

Format

Examples

{
  "customer": 4,
  "query": { "$match": "laptop" }
}

{
  "price": { "$gte": 50, "$lt": 100 }
}

{
  "tags": { "$match": "laptop" }
}

PropositionSet

PropositionSet expression is used to describe a collection of propositions. This collection of statements can be the alternative values in a field.

Format

[PropositionSet, ...]

string

ParametricProbabilityScore

Examples

"product.tags"

"query"

"product"

"tags"

Referenced in

RelateOrderBy

Declares the sorting order.

The sorting order can be any attribute of the Relate query hit.

Format

$desc(Relate)

$asc(Relate)

string

Examples

{ "$desc": "info.miTrue" }

{ "$asc": "lift" }

Referenced in

Relate

ResponseHit

Entry returned for a given query.

Format

UserDefinedObject

object

Examples

{ "name": "My product", "price": 172.19 }

{
  "$score": 0.22350516297675496,
  "$value": "coffee",
  "$why": {
    "factors": [
      {
        "factors": [
          {
            "proposition": {
              "name": { "$has": "coffee" }
            },
            "type": "relatedPropositionLift",
            "value": 8.45603245079726
          }
        ],
        "proposition": "coffee",
        "type": "hitPropositionLift",
        "value": 8.45603245079726
      }
    ],
    "type": "product"
  }
}

Referenced in

AnyValue

Hits

Score

Score expression resolves to a numeric score value or probability. All scores can be used in both highlights ($highlight) and explanations ($why).

Format