Introduction

Welcome to the Aito HTTP API reference documentation. You can also test the queries in the interactive Swagger UI.

Authentication

All requests must specify an API key in the x-api-key header. There are two types of authentication keys:

  • read-only: Allows only read queries. Good for sharing access with 3rd parties.
  • read/write: Allows all queries.
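
As a minimal sketch, the headers for an authenticated request could be assembled as below. The helper name build_headers and the placeholder key are illustrative assumptions; only the x-api-key header name comes from the text above, and the content-type value mirrors the curl examples later in this document.

```python
def build_headers(api_key: str) -> dict:
    """Return HTTP headers for a JSON request authenticated with an API key."""
    if not api_key:
        raise ValueError("an API key is required")
    return {
        "content-type": "application/json",
        "x-api-key": api_key,
    }


# "YOUR_API_KEY" is a placeholder, not a real key.
headers = build_headers("YOUR_API_KEY")
```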

Rate limiting

Each environment has the following rate limits:

  • 300 req/s, allowing 900 req/s bursts
  • 1 million requests per day

Please contact us at support@aito.ai if you anticipate exceeding these limits.
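
A client can throttle itself to stay under the per-second limit. The sketch below uses a token bucket with a refill rate of 300 and a capacity of 900. This is an assumption about how to model "300 req/s with 900 req/s bursts" on the client side; the text does not say which algorithm the server uses. The class name TokenBucket is hypothetical.

```python
import time


class TokenBucket:
    """Client-side throttle: `rate` tokens are added per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill proportionally to the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Assumed mapping of the documented limits: 300 req/s sustained, bursts up to 900.
bucket = TokenBucket(rate=300, capacity=900)
```

Call bucket.try_acquire() before each request and back off briefly when it returns False. The daily limit of 1 million requests would need a separate counter.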

Pagination

Some endpoints use pagination to limit the number of results returned at once. Pagination is based on the offset and limit parameters, similar to SQL and many other APIs.

As an example, to get the first result set of 10 items with a Search query, you can request:

{
  "from": "products",
  "offset": 0,
  "limit": 10
}

The response will have a total field, which tells you how many items were found in total:

{
  "offset": 0,
  "total": 81,
  "hits": [ ... ]
}

If total exceeds the number of items in the hits array, the remaining results were left out of this response. To request the next 10 items, you can query:

{
  "from": "products",
  "offset": 10,
  "limit": 10
}

The default values for pagination parameters are the following.

Parameter   Default value
offset      0
limit       10
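
The offset and limit loop described above can be sketched as a generator. The function name iterate_hits and the fetch_page callback are assumptions made for illustration; fetch_page stands for any function that sends a query and returns the decoded response with total and hits fields.

```python
from typing import Callable, Iterator


def iterate_hits(
    fetch_page: Callable[[dict], dict], query: dict, limit: int = 10
) -> Iterator[dict]:
    """Yield every hit for `query`, requesting `limit` items per page."""
    offset = 0
    while True:
        page = fetch_page({**query, "offset": offset, "limit": limit})
        hits = page["hits"]
        yield from hits
        offset += len(hits)
        # Stop when everything reported by `total` has been seen,
        # or when a page comes back empty.
        if not hits or offset >= page["total"]:
            break
```

With a total of 81 and the default limit of 10, this loop would issue nine requests, the last one returning a single item.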

CORS

All responses are served with access-control-allow-origin: * headers. This is useful for browser applications.

Aito-specific concepts

We aim for a familiar API, but in some cases Aito's default behavior differs from what other databases might have.

Descending order by default

By default, Aito sorts everything from the largest to the smallest. This is a design choice, dictated by the fact that within the domain of statistical reasoning the highest values are often the most interesting ones.

For example, the items with the highest probabilities, the highest frequencies, the highest similarities, the highest mutual information, and the highest scores are often the most desired ones.

Use $asc to sort values from the smallest to the largest, as shown in the example:

{
  "from": "products",
  "where": {
    "category.id": 89
  },
  "orderBy": {
    "$asc": "price"
  }
}
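
To make the default concrete, the following sketch reproduces the described ordering on a local list of rows. It only illustrates the documented behavior (descending unless wrapped in $asc); it is not how Aito implements sorting, and the function name order_rows is hypothetical.

```python
def order_rows(rows: list, order_by) -> list:
    """Sort rows locally: descending by default, ascending when wrapped in $asc."""
    if isinstance(order_by, dict) and "$asc" in order_by:
        field = order_by["$asc"]
        return sorted(rows, key=lambda row: row[field])
    return sorted(rows, key=lambda row: row[order_by], reverse=True)


rows = [{"price": 1.29}, {"price": 3.95}, {"price": 0.28}]
descending = order_rows(rows, "price")          # default order
ascending = order_rows(rows, {"$asc": "price"})  # explicit ascending order
```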

Personalisation

Aito has been designed to work well even with small data sets. One example of this is how personalised recommendations work. This is easiest to understand through an example, so let's take a digital grocery store.

When requesting product recommendations for a customer who is a vegetarian, Aito also considers what non-vegetarians purchase. If, for example, the customer were the only vegetarian user of the grocery web shop, they could receive meat recommendations if the average customer purchased a lot of meat.

This behavior is usually a good default. In book, music, movie, and many other recommendations, you commonly want to find new items instead of getting recommendations only from your own history. However, in some cases the behavior might lead to unexpected predictions. For example, if we predicted how likely a vegetarian is to purchase bacon, Aito could return that it is very likely because, based on the data, that is what the average customer does.

An example recommend query could look like this:

{
  "from": "impressions",
  "where": { "session.user": "veronica" },
  "recommend": "product",
  "goal": { "purchase": true }
}

Even if we limit the data to impressions by veronica, Aito still considers other data points.

Error handling

In error cases, we respond with the appropriate HTTP status code. Error responses:

  • 400 Bad Request: Returned when there's an error in the given request payload, for example invalid query syntax.

Example error

Error returned when trying to use an incorrect table name. Instead of prodjucts, it should be products.

{
  "charOffset": 17,
  "lineNumber": 3,
  "columnNumber": 13,
  "error": "failed to open 'prodjucts'",
  "status": 400,
  "message": "3:13: failed to open 'prodjucts'\n\n      \"from\": \"prodjucts\"\n              ^\n",
  "messageLines": [
    "3:13: failed to open 'prodjucts'",
    "",
    "      \"from\": \"prodjucts\"",
    "              ^"
  ]
}
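
A client might turn such a payload into an exception. The sketch below is one way to do it; the names AitoQueryError and raise_for_error are hypothetical, and treating every status of 400 or above as an error is an assumption (the text above only documents 400 Bad Request).

```python
import json


class AitoQueryError(Exception):
    """Hypothetical exception type carrying the fields of an error response."""

    def __init__(self, payload: dict):
        self.status = payload.get("status")
        self.line = payload.get("lineNumber")
        self.column = payload.get("columnNumber")
        super().__init__(payload.get("message") or payload.get("error", "unknown error"))


def raise_for_error(status_code: int, body: str) -> dict:
    """Return the decoded body, or raise AitoQueryError for error statuses."""
    payload = json.loads(body)
    if status_code >= 400:  # assumption: all 4xx/5xx bodies share this shape
        raise AitoQueryError(payload)
    return payload
```

The lineNumber and columnNumber fields point at the offending position in the request payload, which makes them useful for logging.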

Feedback & bug reports

We take our quality seriously and aim for the smoothest developer experience possible. If you run into problems, please send an email to support@aito.ai containing reproduction steps and we'll fix it as soon as possible.

Query API

The query language operations.

Search

POST /api/v1/_search

Search rows.

Allows you to search, filter, and order rows. You can also select only specific columns. Similar to SELECT in SQL.

The results are in descending order by default.

Aito supports intuitive link following. If your products table has a link column called category which links to another table called categories, you can use the following convenience in the query:

{
  "from": "products",
  "where": {
    "category.id": 89
  },
  "orderBy": "price"
}

Get all rows

You can easily select all rows from a table with the following query:

{
  "from": "products"
}

Note: the number of results is limited to 10 by default.

Highlighted results

If you want to get search results with highlights, see Generic query.

Parameters

Name              Type    Description
body (required)   object  Search query

Successful responses

Response  Type    Description
200 OK    object  Search results

Request format

{
  // From expression declares the examined table. Required.
  "from": From,
  // Proposition expression describes a fact, or a statement.
  "where": Proposition,
  // Declares the sorting order of the result by a field or by a
  // user-defined score.
  "orderBy": OrderBy,
  // Describes the fields and/or built-in attributes to return.
  "select": Selection,
  // The number of results to skip from the beginning.
  // Default: 0
  "offset": integer,
  // The maximum number of results to retrieve.
  // Default: 10
  "limit": integer
}

Response format

{
  "offset": integer,
  "total": integer,
  // Entries returned for a given query. Required.
  "hits": Hits
}
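
For readers who prefer Python over curl, a Search request could be prepared with the standard library as sketched below. The function name build_search_request, the placeholder host, and the placeholder key are assumptions; the endpoint path and header names come from this document. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_search_request(base_url: str, api_key: str, query: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to the _search endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/_search",
        data=json.dumps(query).encode("utf-8"),
        headers={"content-type": "application/json", "x-api-key": api_key},
        method="POST",
    )


request = build_search_request(
    "https://example.invalid",  # placeholder host, replace with your environment URL
    "YOUR_API_KEY",             # placeholder key
    {"from": "products", "where": {"price": {"$gt": 1.5}}, "limit": 2},
)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```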

Find by id

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

You can copy-paste the example curl command to your terminal.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_search \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": { "id": "6411300000494" }
  }'

Response

{
  "offset": 0,
  "total": 1,
  "hits": [
    {
      "category": "108",
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "price": 3.95,
      "tags": "coffee"
    }
  ]
}

Where price is greater than

You can copy-paste the example curl command to your terminal.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_search \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": {
      "price": { "$gt": 1.5 }
    },
    "limit": 2
  }'

Response

{
  "offset": 0,
  "total": 21,
  "hits": [
    {
      "category": "101",
      "id": "6437002001454",
      "name": "VAASAN Ruispalat 660g 12 pcs fullcorn rye bread",
      "price": 1.69,
      "tags": "gluten bread"
    },
    {
      "category": "101",
      "id": "6411402202208",
      "name": "Fazer Puikula fullcorn rye bread 9 pcs/500g",
      "price": 1.85,
      "tags": "gluten bread"
    }
  ]
}

Find products with search term

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_search \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": {
      "name": { "$match": "coffee" }
    }
  }'

Response

{
  "offset": 0,
  "total": 4,
  "hits": [
    {
      "category": "108",
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "price": 3.95,
      "tags": "coffee"
    },
    {
      "category": "108",
      "id": "6420101441542",
      "name": "Kulta Katriina filter coffee 500g",
      "price": 3.45,
      "tags": "coffee"
    },
    {
      "category": "108",
      "id": "6411300164653",
      "name": "Juhla Mokka Dark Roast coffee 500g hj",
      "price": 3.95,
      "tags": "coffee"
    },
    {
      "category": "108",
      "id": "6410405181190",
      "name": "Pirkka Costa Rica filter coffee 500g UTZ",
      "price": 2.89,
      "tags": "coffee pirkka"
    }
  ]
}

More complex where proposition

Find all products priced over 1.5€ that either have the tag drink or whose name matches coffee.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_search \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": {
      "$and": [
        {
          "$or": [
            {
              "tags": { "$has": "drink" }
            },
            {
              "name": { "$match": "coffee" }
            }
          ]
        },
        {
          "price": { "$gt": 1.5 }
        }
      ]
    },
    "limit": 2
  }'

Response

{
  "offset": 0,
  "total": 6,
  "hits": [
    {
      "category": "104",
      "id": "6408430000258",
      "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
      "price": 1.95,
      "tags": "lactose-free drink"
    },
    {
      "category": "108",
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "price": 3.95,
      "tags": "coffee"
    }
  ]
}

Predict

POST /api/v1/_predict

Predict the likelihood of a feature given a hypothesis.

For example, predict what other products a user could add to their ecommerce shopping cart, based on the existing cart. To understand why Aito predicts certain results, you can select "$why".

Related information

  • The exclusiveness option is explained in the Exclusiveness chapter.
  • The chapter Personalisation also explains a characteristic of predictions in Aito.

Parameters

Name              Type    Description
body (required)   object  Predict query

Successful responses

Response  Type    Description
200 OK    object  Predict results

Request format

{
  // From expression declares the examined table. Required.
  "from": From,
  // Proposition expression describes a fact, or a statement.
  "where": Proposition,
  // PropositionSet expression is used to describe a collection
  // of propositions. Required.
  "predict": PropositionSet,
  // Exclusiveness dictates that only one feature can be true at
  // the same time and that one feature will be true.
  // Default: true
  "exclusiveness": boolean,
  // Describes the fields and/or built-in attributes to return.
  "select": Selection,
  // The number of results to skip from the beginning.
  // Default: 0
  "offset": integer,
  // The maximum number of results to retrieve.
  // Default: 10
  "limit": integer
}

Response format

{
  "offset": integer,
  "total": integer,
  // Entries returned for a given query. Required.
  "hits": Hits
}

Predict purchase likelihood

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In the example, we're predicting how likely the customer with username larry is to purchase the product "Finnish bread cheese 120g lactose-free" (6410405197764). In the example data, Larry purchases a lot of lactose-free products but has never purchased any cheese. Aito detects that the "lactose-free" tag is a commonly occurring feature in the data, and predicts that Larry would also quite likely purchase the cheese.

The query format depends on how the data has been structured in Aito (the schema). In the example dataset, the impressions table contains each individual product a user has seen during a shop visit (a session) and whether they bought the product or not.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_predict \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "session.user": "larry", "product.id": "6410405216120" },
    "predict": "purchase"
  }'

Response

{
  "offset": 0,
  "total": 2,
  "hits": [
    { "$p": 0.7841966771599878, "field": "purchase", "feature": true },
    { "$p": 0.21580332284001236, "field": "purchase", "feature": false }
  ]
}
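
Because exclusiveness defaults to true, exactly one feature is expected to hold, so the probabilities in the response above should add up to 1. The sketch below checks this on the example hits and picks the most likely one; the helper name most_likely is hypothetical.

```python
import math

# Hits copied from the example response above.
hits = [
    {"$p": 0.7841966771599878, "field": "purchase", "feature": True},
    {"$p": 0.21580332284001236, "field": "purchase", "feature": False},
]


def most_likely(hits: list) -> dict:
    """Return the hit with the highest probability `$p`."""
    return max(hits, key=lambda hit: hit["$p"])


best = most_likely(hits)
total_p = sum(hit["$p"] for hit in hits)
sums_to_one = math.isclose(total_p, 1.0)
```

With "exclusiveness": false, as in the tag prediction example later in this section, the probabilities are not expected to sum to 1.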

Explain the prediction

Same example as above, but we ask Aito to explain why it predicted the results. To understand the response, see the "$why" section.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_predict \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "session.user": "larry", "product.id": "6410405216120" },
    "select": ["$why"],
    "predict": "purchase"
  }'

Response

{
  "offset": 0,
  "total": 2,
  "hits": [
    {
      "$why": {
        "type": "product",
        "factors": [
          { "type": "baseP", "value": 0.5 },
          { "type": "normalizer", "name": "exclusiveness", "value": 1 },
          {
            "type": "relatedVariableLift",
            "variable": "product.id:6410405216120",
            "value": 1.5683933543199755
          }
        ]
      }
    },
    {
      "$why": {
        "type": "product",
        "factors": [
          { "type": "baseP", "value": 0.5 },
          { "type": "normalizer", "name": "exclusiveness", "value": 1 },
          {
            "type": "relatedVariableLift",
            "variable": "product.id:6410405216120",
            "value": 0.4316066456800247
          }
        ]
      }
    }
  ]
}
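
In this example, multiplying the factor values reproduces the probabilities from the previous response: 0.5 × 1 × 1.5683933543199755 ≈ 0.784. The sketch below performs that multiplication. It assumes that "type": "product" means the factors are multiplied together, which is consistent with these numbers but should be confirmed against the "$why" section. The function name probability_from_why is hypothetical.

```python
import math


def probability_from_why(why: dict) -> float:
    """Multiply the `value` of every factor in a "product"-type explanation."""
    if why["type"] != "product":
        raise ValueError(f"unsupported explanation type: {why['type']}")
    return math.prod(factor["value"] for factor in why["factors"])


# First "$why" entry copied from the example response above.
why_purchase_true = {
    "type": "product",
    "factors": [
        {"type": "baseP", "value": 0.5},
        {"type": "normalizer", "name": "exclusiveness", "value": 1},
        {
            "type": "relatedVariableLift",
            "variable": "product.id:6410405216120",
            "value": 1.5683933543199755,
        },
    ],
}

p_true = probability_from_why(why_purchase_true)
```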

Example request

In the example, we're predicting three suitable tags for a hypothetical new product based on its name. Tags are predicted based on the tags that existing products have.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_predict \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": { "name": "Hovis Seed Sensations Seven Seeds Original 800g" },
    "predict": "tags",
    "exclusiveness": false,
    "limit": 3
  }'

Response

{
  "offset": 0,
  "total": 22,
  "hits": [
    { "$p": 0.36, "field": "tags", "feature": "pirkka" },
    { "$p": 0.32, "field": "tags", "feature": "food" },
    { "$p": 0.28, "field": "tags", "feature": "meat" }
  ]
}

Recommend

POST /api/v1/_recommend

Recommend a row which optimizes a given goal.

For example, you could ask Aito to choose a product, which maximises the click likelihood, when user id equals 4543.

Recommend differs from predict and match in the following way: recommend always optimizes a goal, while predict and match merely mimic the existing behavior patterns in the data. As an example, consider the problem of matching employees to projects. With predict and match, you can mimic the way the projects are currently staffed, and Aito will mimic both the good and the bad staffing practices. With recommend, Aito seeks to maximize the success rate and avoid decisions that lead to bad outcomes, even if these decisions were a popular practice.

The chapter Personalisation also explains a characteristic of the recommendations.

Parameters

Name              Type    Description
body (required)   object  Recommend query

Successful responses

Response  Type    Description
200 OK    object  Recommend results

Request format

{
  // From expression declares the examined table. Required.
  "from": From,
  // Proposition expression describes a fact, or a statement.
  "where": Proposition,
  // Get expression defines what items are returned as query
  // results. Required.
  "recommend": Get,
  // Specifies a goal to maximize. Required.
  "goal": Goal,
  // Describes the fields and/or built-in attributes to return.
  "select": Selection,
  // The number of results to skip from the beginning.
  // Default: 0
  "offset": integer,
  // The maximum number of results to retrieve.
  // Default: 10
  "limit": integer
}

Response format

{
  "offset": integer,
  "total": integer,
  // Entries returned for a given query. Required.
  "hits": Hits
}

Recommend top 5 products for a customer

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In the example, we're recommending the top 5 products which veronica (user id) would most likely purchase, based on her behavior history stored in the impressions table. The table contains information about which products she has seen and which of those were bought.

This query could be used to generate a campaign email which recommends relevant products for a customer.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_recommend \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "session.user": "veronica" },
    "recommend": "product",
    "goal": { "purchase": true },
    "limit": 5
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$p": 0.17054047089825275,
      "category": "100",
      "id": "6410405060457",
      "name": "Pirkka bio cherry tomatoes 250g international 1st class",
      "price": 1.29,
      "tags": "fresh vegetable pirkka tomato"
    },
    {
      "$p": 0.15926506786365752,
      "category": "100",
      "id": "6410405093677",
      "name": "Pirkka iceberg salad Finland 100g 1st class",
      "price": 1.29,
      "tags": "fresh vegetable pirkka"
    },
    {
      "$p": 0.11742288011689808,
      "category": "104",
      "id": "6410405216120",
      "name": "Pirkka lactose-free semi-skimmed milk drink 1l",
      "price": 1.25,
      "tags": "lactose-free drink pirkka"
    },
    {
      "$p": 0.1028880526906814,
      "category": "100",
      "id": "2000503600002",
      "name": "Chiquita banana",
      "price": 0.28054,
      "tags": "fresh fruit"
    },
    {
      "$p": 0.08033724662149096,
      "category": "100",
      "id": "2000604700007",
      "name": "Cucumber Finland",
      "price": 0.9765,
      "tags": "fresh vegetable"
    }
  ]
}

Recommend top products with additional filtering

This example is the same as above, but we're adding an additional criterion: the product name should match the search query 'Banana'.

This query could be used to build a personalised search functionality.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_recommend \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": {
      "product.name": { "$match": "Banana" },
      "session.user": "veronica"
    },
    "recommend": "product",
    "goal": { "purchase": true },
    "limit": 5
  }'

Response

{
  "offset": 0,
  "total": 2,
  "hits": [
    {
      "$p": 0.6886792452830188,
      "category": "100",
      "id": "2000503600002",
      "name": "Chiquita banana",
      "price": 0.28054,
      "tags": "fresh fruit"
    },
    {
      "$p": 0.3113207547169811,
      "category": "100",
      "id": "2000818700008",
      "name": "Pirkka banana",
      "price": 0.166,
      "tags": "fresh fruit pirkka"
    }
  ]
}

Evaluate

POST /api/v1/_evaluate

Evaluate performance and accuracy.

The query supports evaluation of Predict, Match, Similarity, and Generic queries.

Evaluate operation is in alpha stage. The syntax might change in the future.

The evaluation is performed by first specifying the train and test data split:

  • The training data: The data that will be used to train Aito.
  • The testing data: The data that will be hidden from Aito and will be used to measure an Aito query's performance.

The testing data is specified using the test proposition or the TestSource. The training data is the remaining data that is not the testing data.

The query to evaluate is specified after the evaluate keyword.

After that, a simulated evaluation scenario is run: Aito simulates inserting the training data into a table, then runs the given query for each sample (a row in the table) in the test data and measures how good the results were.

It is also possible to group multiple entries into a single test case and evaluate using the EvaluateGroupedQuery.

Parameters

Name              Type    Description
body (required)           Evaluate query

Successful responses

Response  Type    Description
200 OK    object  Evaluate results

Request format

Response format

{
  // The amount of samples used for testing. Required.
  "n": integer,
  // The amount of samples used for testing. Required.
  "testSamples": integer,
  // The average amount of samples used for training. Required.
  "trainSamples": number,
  // The average number of features. Required.
  "features": number,
  // Complement of `accuracy` (=`1 - accuracy`). Required.
  "error": number,
  // Complement of `baseAccuracy` (=`1 - baseAccuracy`).
  // Required.
  "baseError": number,
  // The accuracy of predictions. Required.
  "accuracy": number,
  // The simulated accuracy of predictions based on taking the
  // most frequent value. Required.
  "baseAccuracy": number,
  // How much better results Aito was able to provide compared to
  // a naive prediction. Required.
  "accuracyGain": number,
  // Average rank of the best prediction. Required.
  "meanRank": number,
  // . Required.
  "baseMeanRank": number,
  // Improvement of meanRank upon baseMeanRank. Required.
  "rankGain": number,
  // A measurement which describes the quality of probabilities
  // (=`h - mxe`). Required.
  "informationGain": number,
  // Mean cross entropy. Required.
  "mxe": number,
  // Entropy. Required.
  "h": number,
  // The mean geometric probability of the predictions. Required.
  "geomMeanP": number,
  // Base geometric mean probability. Required.
  "baseGmp": number,
  // Geometric mean lift. Required.
  "geomMeanLift": number,
  // The mean execution time of the queries in nanoseconds.
  // Required.
  "meanNs": number,
  // The mean execution time of the queries in microseconds.
  // Required.
  "meanUs": number,
  // The mean execution time of the queries in milliseconds.
  // Required.
  "meanMs": number,
  // The time spent for warm-up of indexes and caches for the
  // given query in milliseconds. Required.
  "warmingMs": number
}
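
Several of these fields are derived from others. The comments above state three relations directly (error = 1 - accuracy, baseError = 1 - baseAccuracy, informationGain = h - mxe). The sketch below recomputes them, together with three further relations that are assumptions inferred from the numbers in the grocery store example response, not documented definitions. The function name derive_metrics is hypothetical.

```python
import math


def derive_metrics(result: dict) -> dict:
    """Recompute derived evaluation metrics from the fields they depend on."""
    return {
        # Relations stated in the response format comments.
        "error": 1 - result["accuracy"],
        "baseError": 1 - result["baseAccuracy"],
        "informationGain": result["h"] - result["mxe"],
        # Assumed relations, inferred from the example response values.
        "accuracyGain": result["accuracy"] - result["baseAccuracy"],
        "rankGain": result["baseMeanRank"] - result["meanRank"],
        "geomMeanLift": result["geomMeanP"] / result["baseGmp"],
    }


# Values copied from the example response in this section.
sample = {
    "accuracy": 0.9090909090909091, "baseAccuracy": 0.2727272727272727,
    "error": 0.09090909090909094, "baseError": 0.7272727272727273,
    "accuracyGain": 0.6363636363636364,
    "meanRank": 2, "baseMeanRank": 4.545454545454546, "rankGain": 2.545454545454546,
    "h": 3.2840393064938755, "mxe": 1.569385022541934,
    "informationGain": 1.7146542839519419,
    "geomMeanP": 0.336951996098374, "baseGmp": 0.10266104053682168,
    "geomMeanLift": 3.2821798253400583,
}

derived = derive_metrics(sample)
all_match = all(math.isclose(derived[k], sample[k], rel_tol=1e-9) for k in derived)
```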

Example request

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In the example, we're evaluating how good the results are that Aito provides when we predict tags for a new hypothetical product. The results give us the accuracy and performance of the prediction example shown in the Predict operation's documentation.

$index is a built-in variable which tells the insertion index of a row. In the example, we select 1/4 of the rows in the products table to be used as test data. The rest of the rows are automatically used as training data.

Aito iterates through each product in the test data, and tests how accurate the prediction of tags for a given product name was.
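
The split can be illustrated locally. The sketch below assumes that {"$mod": [4, 0]} selects rows whose insertion index divided by 4 leaves remainder 0, that indexes start at zero, and that the products table holds 42 rows (the sum of testSamples and trainSamples in the example response). The function name split_by_index is hypothetical.

```python
def split_by_index(row_count: int, divisor: int, remainder: int):
    """Split insertion indexes into (test, train) lists using index % divisor."""
    test = [i for i in range(row_count) if i % divisor == remainder]
    train = [i for i in range(row_count) if i % divisor != remainder]
    return test, train


# Assumed table size: 42 rows, zero-based insertion indexes.
test_rows, train_rows = split_by_index(42, 4, 0)
```

Under these assumptions the split yields 11 test rows and 31 training rows, which agrees with the n and trainSamples values in the example response.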

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_evaluate \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "test": {
      "$index": {
        "$mod": [4, 0]
      }
    },
    "evaluate": {
      "from": "products",
      "where": {
        "name": { "$get": "name" }
      },
      "predict": "tags"
    }
  }'

Response

{
  "mxe": 1.569385022541934,
  "baseAccuracy": 0.2727272727272727,
  "meanUs": 26986.81890909091,
  "accuracyGain": 0.6363636363636364,
  "n": 11,
  "accurateOffsets": [0, 1, 2, 3, 4, 5, 6, 7, 8, 10],
  "errorOffsets": [9],
  "rankGain": 2.545454545454546,
  "warmingMs": 0,
  "features": 249,
  "accuracy": 0.9090909090909091,
  "trainSamples": 31,
  "geomMeanP": 0.336951996098374,
  "accurateCases": [
    {
      "offset": 0,
      "testCase": {
        "category": "100",
        "id": "2000818700008",
        "name": "Pirkka banana",
        "price": 0.166,
        "tags": "fresh fruit pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.3407515625765043, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.3407515625765043,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 1,
      "testCase": {
        "category": "100",
        "id": "6410405093677",
        "name": "Pirkka iceberg salad Finland 100g 1st class",
        "price": 1.29,
        "tags": "fresh vegetable pirkka"
      },
      "accurate": true,
      "top": {
        "$p": 0.22827306303475953,
        "field": "tags",
        "feature": "pirkka"
      },
      "correct": {
        "$p": 0.22827306303475953,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 2,
      "testCase": {
        "category": "101",
        "id": "6413467282508",
        "name": "Fazer Puikula fullcorn rye bread 330g",
        "price": 1.29,
        "tags": "gluten bread"
      },
      "accurate": true,
      "top": { "$p": 0.4051323599486031, "field": "tags", "feature": "bread" },
      "correct": {
        "$p": 0.4051323599486031,
        "field": "tags",
        "feature": "bread"
      }
    },
    {
      "offset": 3,
      "testCase": {
        "category": "102",
        "id": "6410405205483",
        "name": "Pirkka Finnish beef-pork minced meat 20% 400g",
        "price": 2.79,
        "tags": "meat food protein pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.4136638488736063, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.4136638488736063,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 4,
      "testCase": {
        "category": "103",
        "id": "6412000030026",
        "name": "Saarioinen Maksalaatikko liver casserole 400g",
        "price": 1.99,
        "tags": "meat food"
      },
      "accurate": true,
      "top": { "$p": 0.18155317896362574, "field": "tags", "feature": "food" },
      "correct": {
        "$p": 0.18155317896362574,
        "field": "tags",
        "feature": "food"
      }
    },
    {
      "offset": 5,
      "testCase": {
        "category": "104",
        "id": "6410405082657",
        "name": "Pirkka Finnish semi-skimmed milk 1l",
        "price": 0.81,
        "tags": "lactose drink pirkka"
      },
      "accurate": true,
      "top": {
        "$p": 0.24032786009102414,
        "field": "tags",
        "feature": "pirkka"
      },
      "correct": {
        "$p": 0.24032786009102414,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 6,
      "testCase": {
        "category": "104",
        "id": "6408430000258",
        "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
        "price": 1.95,
        "tags": "lactose-free drink"
      },
      "accurate": true,
      "top": { "$p": 0.23328849278580868, "field": "tags", "feature": "drink" },
      "correct": {
        "$p": 0.23328849278580868,
        "field": "tags",
        "feature": "drink"
      }
    },
    {
      "offset": 7,
      "testCase": {
        "category": "108",
        "id": "6420101441542",
        "name": "Kulta Katriina filter coffee 500g",
        "price": 3.45,
        "tags": "coffee"
      },
      "accurate": true,
      "top": { "$p": 0.4438018262549842, "field": "tags", "feature": "coffee" },
      "correct": {
        "$p": 0.4438018262549842,
        "field": "tags",
        "feature": "coffee"
      }
    },
    {
      "offset": 8,
      "testCase": {
        "category": "109",
        "id": "6411401015090",
        "name": "Fazer Sininen milk chocolate slab 200g",
        "price": 2.19,
        "tags": "candy lactose"
      },
      "accurate": true,
      "top": { "$p": 0.26966069438857154, "field": "tags", "feature": "candy" },
      "correct": {
        "$p": 0.26966069438857154,
        "field": "tags",
        "feature": "candy"
      }
    },
    {
      "offset": 10,
      "testCase": {
        "category": "115",
        "id": "6410402010318",
        "name": "Pirkka tuna fish pieces in oil 200g/150g",
        "price": 1.69,
        "tags": "meat food protein pirkka"
      },
      "accurate": true,
      "top": {
        "$p": 0.40031633822623264,
        "field": "tags",
        "feature": "pirkka"
      },
      "correct": {
        "$p": 0.40031633822623264,
        "field": "tags",
        "feature": "pirkka"
      }
    }
  ],
  "errorCases": [
    {
      "offset": 9,
      "testCase": {
        "category": "111",
        "id": "6413200330206",
        "name": "Lotus Soft Embo 8 rll toilet paper",
        "price": 3.35,
        "tags": "toilet-paper"
      },
      "accurate": false,
      "top": {
        "$p": 0.20135125095839776,
        "field": "tags",
        "feature": "paper-towels"
      }
    }
  ],
  "baseGmp": 0.10266104053682168,
  "meanMs": 26.98681890909091,
  "error": 0.09090909090909094,
  "baseError": 0.7272727272727273,
  "testSamples": 11,
  "geomMeanLift": 3.2821798253400583,
  "alpha_binByTopScore": [
    {
      "meanScore": 0.21111649643564792,
      "maxScore": 0.23328849278580868,
      "minScore": 0.18155317896362574,
      "accuracy": 0.75,
      "n": 4,
      "accurateOffsets": [4, 1, 6],
      "errorOffsets": [9]
    },
    {
      "meanScore": 0.31276411382058317,
      "maxScore": 0.40031633822623264,
      "minScore": 0.24032786009102414,
      "accuracy": 1,
      "n": 4,
      "accurateOffsets": [5, 8, 0, 10],
      "errorOffsets": []
    },
    {
      "meanScore": 0.4208660116923979,
      "maxScore": 0.4438018262549842,
      "minScore": 0.4051323599486031,
      "accuracy": 1,
      "n": 3,
      "accurateOffsets": [2, 3, 7],
      "errorOffsets": []
    }
  ],
  "meanRank": 2,
  "meanNs": 26986818.90909091,
  "h": 3.2840393064938755,
  "informationGain": 1.7146542839519419,
  "baseMeanRank": 4.545454545454546,
  "cases": [
    {
      "offset": 0,
      "testCase": {
        "category": "100",
        "id": "2000818700008",
        "name": "Pirkka banana",
        "price": 0.166,
        "tags": "fresh fruit pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.3407515625765043, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.3407515625765043,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 1,
      "testCase": {
        "category": "100",
        "id": "6410405093677",
        "name": "Pirkka iceberg salad Finland 100g 1st class",
        "price": 1.29,
        "tags": "fresh vegetable pirkka"
      },
      "accurate": true,
      "top": {
        "$p": 0.22827306303475953,
        "field": "tags",
        "feature": "pirkka"
      },
      "correct": {
        "$p": 0.22827306303475953,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 2,
      "testCase": {
        "category": "101",
        "id": "6413467282508",
        "name": "Fazer Puikula fullcorn rye bread 330g",
        "price": 1.29,
        "tags": "gluten bread"
      },
      "accurate": true,
      "top": { "$p": 0.4051323599486031, "field": "tags", "feature": "bread" },
      "correct": {
        "$p": 0.4051323599486031,
        "field": "tags",
        "feature": "bread"
      }
    },
    {
      "offset": 3,
      "testCase": {
        "category": "102",
        "id": "6410405205483",
        "name": "Pirkka Finnish beef-pork minced meat 20% 400g",
        "price": 2.79,
        "tags": "meat food protein pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.4136638488736063, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.4136638488736063,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 4,
      "testCase": {
        "category": "103",
        "id": "6412000030026",
        "name": "Saarioinen Maksalaatikko liver casserole 400g",
        "price": 1.99,
        "tags": "meat food"
      },
      "accurate": true,
      "top": { "$p": 0.18155317896362574, "field": "tags", "feature": "food" },
      "correct": {
        "$p": 0.18155317896362574,
        "field": "tags",
        "feature": "food"
      }
    },
    {
      "offset": 5,
      "testCase": {
        "category": "104",
        "id": "6410405082657",
        "name": "Pirkka Finnish semi-skimmed milk 1l",
        "price": 0.81,
        "tags": "lactose drink pirkka"
      },
      "accurate": true,
      "top": {
        "$p": 0.24032786009102414,
        "field": "tags",
        "feature": "pirkka"
      },
      "correct": {
        "$p": 0.24032786009102414,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 6,
      "testCase": {
        "category": "104",
        "id": "6408430000258",
        "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
        "price": 1.95,
        "tags": "lactose-free drink"
      },
      "accurate": true,
      "top": { "$p": 0.23328849278580868, "field": "tags", "feature": "drink" },
      "correct": {
        "$p": 0.23328849278580868,
        "field": "tags",
        "feature": "drink"
      }
    },
    {
      "offset": 7,
      "testCase": {
        "category": "108",
        "id": "6420101441542",
        "name": "Kulta Katriina filter coffee 500g",
        "price": 3.45,
        "tags": "coffee"
      },
      "accurate": true,
      "top": { "$p": 0.4438018262549842, "field": "tags", "feature": "coffee" },
      "correct": {
        "$p": 0.4438018262549842,
        "field": "tags",
        "feature": "coffee"
      }
    },
    {
      "offset": 8,
      "testCase": {
        "category": "109",
        "id": "6411401015090",
        "name": "Fazer Sininen milk chocolate slab 200g",
        "price": 2.19,
        "tags": "candy lactose"
      },
      "accurate": true,
      "top": { "$p": 0.26966069438857154, "field": "tags", "feature": "candy" },
      "correct": {
        "$p": 0.26966069438857154,
        "field": "tags",
        "feature": "candy"
      }
    },
    {
      "offset": 9,
      "testCase": {
        "category": "111",
        "id": "6413200330206",
        "name": "Lotus Soft Embo 8 rll toilet paper",
        "price": 3.35,
        "tags": "toilet-paper"
      },
      "accurate": false,
      "top": {
        "$p": 0.20135125095839776,
        "field": "tags",
        "feature": "paper-towels"
      }
    },
    {
      "offset": 10,
      "testCase": {
        "category": "115",
        "id": "6410402010318",
        "name": "Pirkka tuna fish pieces in oil 200g/150g",
        "price": 1.69,
        "tags": "meat food protein pirkka"
      },
      "accurate": true,
      "top": {
        "$p": 0.40031633822623264,
        "field": "tags",
        "feature": "pirkka"
      },
      "correct": {
        "$p": 0.40031633822623264,
        "field": "tags",
        "feature": "pirkka"
      }
    }
  ]
}

Similarity

POST /api/v1/_similarity

Similarity can be used to return entries that are similar to the given sample object.

The sample object can be either a complete or a partial row. The similarity operation uses TF-IDF for scoring the documents.

The chapter Personalisation also explains a characteristic of the similarity model.
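
To give an intuition for TF-IDF scoring, here is a minimal Python sketch. It is not Aito's implementation: the whitespace tokenizer, the smoothed IDF formula, and the function name `tfidf_scores` are assumptions made for illustration, so the absolute scores will not match the `$score` values Aito returns.

```python
import math
from collections import Counter
from typing import List


def tfidf_scores(query: str, documents: List[str]) -> List[float]:
    """Score each document against the query with a simple TF-IDF sum."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    query_terms = set(query.lower().split())
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term in term_freq:
                # Smoothed IDF (an assumption): rare terms weigh more.
                idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
                score += term_freq[term] * idf
        scores.append(score)
    return scores


# Product names taken from the grocery store examples in this document.
names = [
    "Juhla Mokka coffee 500g sj",
    "Juhla Mokka Dark Roast coffee 500g hj",
    "Kulta Katriina filter coffee 500g",
    "Pirkka banana",
]
scores = tfidf_scores("Juhla Mokka coffee 500g sj", names)
for name, score in zip(names, scores):
    print(f"{score:.3f}  {name}")
```

In this toy setup the identical name scores highest, names sharing the rarer words "juhla" and "mokka" come next, and a name with no shared words scores zero.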

Parameters
Name | Type | Description
body (required) | object | Similarity query
Successful responses
Response | Type | Description
200 OK | object | Similarity results
Request format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Proposition expression describes a fact, or a statement.
// Required.
"similarity": Proposition,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
// The number of results to skip from the beginning.
// Default: 0
"offset": integer,
// The maximum number of results to retrieve.
// Default: 10
"limit": integer
}
Response format
{
"offset": integer,
"total": integer,
// Entries returned for a given query. Required.
"hits": Hits
}
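
The `offset`, `limit`, and `total` fields make it possible to walk through all results page by page. The following Python sketch only builds the request bodies; the helper name `page_queries` is hypothetical, and `total` is assumed to have been read from the first response.

```python
import json
from typing import Any, Dict, Iterator


def page_queries(
    base_query: Dict[str, Any], total: int, limit: int = 10
) -> Iterator[Dict[str, Any]]:
    """Yield one query body per page, advancing "offset" by "limit"."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    for offset in range(0, total, limit):
        # Copy the base query so the caller's dict is left untouched.
        yield {**base_query, "offset": offset, "limit": limit}


base = {"from": "products", "similarity": {"tags": "coffee"}}
pages = list(page_queries(base, total=42))
print(json.dumps(pages[-1]))
```

With `total` at 42 and the default `limit` of 10, this produces five bodies; the last one requests the remaining two items.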

Example request

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In the example we're finding products similar to a given existing product. Aito assumes that the given sample object is a hypothetical new object, which is why in this example the exact same product is also in the results.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_similarity \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "similarity": {
      "category": "108",
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "price": 3.95,
      "tags": "coffee"
    },
    "limit": 3
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$score": 22.571874297422074,
      "category": "108",
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "price": 3.95,
      "tags": "coffee"
    },
    {
      "$score": 7.456341582830473,
      "category": "108",
      "id": "6411300164653",
      "name": "Juhla Mokka Dark Roast coffee 500g hj",
      "price": 3.95,
      "tags": "coffee"
    },
    {
      "$score": 2.93753333245045,
      "category": "108",
      "id": "6420101441542",
      "name": "Kulta Katriina filter coffee 500g",
      "price": 3.45,
      "tags": "coffee"
    }
  ]
}
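
For readers who prefer Python over curl, the same kind of request can be prepared with the standard library. This is a sketch under assumptions: the helper name `build_request` and the `YOUR_API_KEY` placeholder are hypothetical, and the actual network call is left commented out.

```python
import json
import urllib.request
from typing import Any, Dict


def build_request(
    base_url: str, api_key: str, endpoint: str, query: Dict[str, Any]
) -> urllib.request.Request:
    """Prepare a POST request for an Aito query endpoint such as "_similarity"."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/{endpoint}",
        data=json.dumps(query).encode("utf-8"),
        headers={"content-type": "application/json", "x-api-key": api_key},
        method="POST",
    )


request = build_request(
    "https://aito-grocery-store.api.aito.ai",
    "YOUR_API_KEY",  # placeholder, replace with a real key
    "_similarity",
    {"from": "products", "similarity": {"tags": "coffee"}, "limit": 3},
)
print(request.get_method(), request.full_url)

# Sending is left out so the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```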

Example request

In the example we're finding similar products based on just a product name.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_similarity \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "similarity": { "name": "Hovis Seed Sensations Seven Seeds Original 800g" },
    "limit": 3
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$score": 1,
      "category": "100",
      "id": "2000818700008",
      "name": "Pirkka banana",
      "price": 0.166,
      "tags": "fresh fruit pirkka"
    },
    {
      "$score": 1,
      "category": "100",
      "id": "2000604700007",
      "name": "Cucumber Finland",
      "price": 0.9765,
      "tags": "fresh vegetable"
    },
    {
      "$score": 1,
      "category": "100",
      "id": "6410405060457",
      "name": "Pirkka bio cherry tomatoes 250g international 1st class",
      "price": 1.29,
      "tags": "fresh vegetable pirkka tomato"
    }
  ]
}

Match

POST /api/v1/_match

Match the most likely value/feature of a column or any column of a linked table to a given hypothesis.

Differences to Predict

While Match is similar to the Predict query, there are fine-grained differences, explained below.

Predict returns features, while match can return values

Match can return A) the row behind a link or B) the value inside a text field. If match is done against a non-analyzed field, it works similarly to predict, except that the inference algorithm is somewhat different.

The inference model is different

Predict treats features as 'black boxes', and it does statistical reasoning purely based on the feature's own statistics. Match does 'glass box' statistical reasoning by using all the features found behind the link or within a field.

For example, if you are predicting a product, the predict query will look at the history of each individual product id. If there is no history for the product, Aito will not be able to do proper inference. On the other hand, if you are matching the product, Aito will look at the product category, title and description. This enables Aito to match products it has never seen before, as long as it is familiar with their internal features.

The chapter Personalisation also explains a characteristic of the matching.

Parameters
Name | Type | Description
body (required) | object | Match query
Successful responses
Response | Type | Description
200 OK | object | Match results
Request format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
// Get expression defines what items are returned as query
// results. Required.
"match": Get,
// PropositionSet expression is used to describe a collection
// of propositions.
"basedOn": PropositionSet,
// The number of results to skip from the beginning.
// Default: 0
"offset": integer,
// The maximum number of results to retrieve.
// Default: 10
"limit": integer
}
Response format
{
"offset": integer,
"total": integer,
// Entries returned for a given query. Required.
"hits": Hits
}

Match user to products

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In the example we're matching a user to products.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_match \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "session.user": "larry" },
    "match": "product",
    "limit": 5
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$p": 0.31434534071969406,
      "category": "104",
      "id": "6410405216120",
      "name": "Pirkka lactose-free semi-skimmed milk drink 1l",
      "price": 1.25,
      "tags": "lactose-free drink pirkka"
    },
    {
      "$p": 0.11650351373760817,
      "category": "104",
      "id": "6408430000258",
      "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
      "price": 1.95,
      "tags": "lactose-free drink"
    },
    {
      "$p": 0.03486558261528519,
      "category": "100",
      "id": "6410405060457",
      "name": "Pirkka bio cherry tomatoes 250g international 1st class",
      "price": 1.29,
      "tags": "fresh vegetable pirkka tomato"
    },
    {
      "$p": 0.0332897370733514,
      "category": "100",
      "id": "2000604700007",
      "name": "Cucumber Finland",
      "price": 0.9765,
      "tags": "fresh vegetable"
    },
    {
      "$p": 0.03013804598948381,
      "category": "107",
      "id": "6410400033524",
      "name": "Pirkka frozen potato-onion mix 500g",
      "price": 0.79,
      "tags": "food carbohydrate pirkka"
    }
  ]
}

Relate

POST /api/v1/_relate

Relate provides statistical information about relationships in the data.

It calculates correlations between a pair of features, which can be used, for example, to find causation and correlation.

By default the hits are ordered by the relation.mi field, which indicates how strong the correlation is.

Parameters
Name | Type | Description
body (required) | object | Relate query
Successful responses
Response | Type | Description
200 OK | object | Relate results
Request format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// PropositionSet expression is used to describe a collection
// of propositions. Required.
"relate": PropositionSet,
// Declares the sorting order.
"orderBy": RelateOrderBy,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
// The number of results to skip from the beginning.
// Default: 0
"offset": integer,
// The maximum number of results to retrieve.
// Default: 10
"limit": integer
}
Response format
{
"offset": integer,
"total": integer,
// Entries returned for a given query. Required.
"hits": Hits
}

What features of products affect purchasing

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In the example we ask Aito to explain which properties of products affect whether people purchase them. With $exists, we tell Aito to get all properties of the product (the impressions table links to the products table), and relate those to the condition { "purchase": true }.

The response may seem overwhelming but it contains a lot of useful information.

When looking at the second hit, we can see that when product.tags:vegetable, the "lift" value is high (compared to 1.0). It means that when the product tags contain the tag vegetable, it is ~1.9x more likely that the product will be purchased compared to the average product (= base probability).

The lift is the probability of { "purchase": true } given the condition, divided by the base probability of { "purchase": true }. With the response field names, the formula is: ps.pOnCondition / ps.p.

In the example data set, people purchase 50% of the products they see, so the base probability is 0.5.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_relate \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "$exists": "product" },
    "relate": [
      { "purchase": true }
    ],
    "limit": 2
  }'

Response

{
  "offset": 0,
  "total": 96,
  "hits": [
    {
      "related": "purchase:true",
      "lift": 0.25611447987925023,
      "condition": "product.name:fazer",
      "fs": {
        "f": 1680,
        "fOnCondition": 40,
        "fOnNotCondition": 1640,
        "fCondition": 320,
        "n": 3360
      },
      "ps": {
        "p": 0.5,
        "pOnCondition": 0.12805723993962512,
        "pOnNotCondition": 0.5391517423843831,
        "pCondition": 0.09523781551820762
      },
      "info": {
        "h": 1,
        "mi": 0.44791401701173766,
        "miTrue": -0.25165031157192846,
        "miFalse": 0.6995643285836661
      },
      "relation": {
        "n": 3360,
        "varFs": [320, 1680],
        "stateFs": [1400, 280, 1640, 40],
        "mi": 0.046664130610992574
      }
    },
    {
      "related": "purchase:true",
      "lift": 0.18455932520038865,
      "condition": "product.name:puikula",
      "fs": {
        "f": 1680,
        "fOnCondition": 16,
        "fOnNotCondition": 1664,
        "fCondition": 184,
        "n": 3360
      },
      "ps": {
        "p": 0.5,
        "pOnCondition": 0.09227966260019432,
        "pOnNotCondition": 0.5236210173698082,
        "pCondition": 0.05476177303338896
      },
      "info": {
        "h": 1,
        "mi": 0.5559663947062397,
        "miTrue": -0.2249633720193698,
        "miFalse": 0.7809297667256094
      },
      "relation": {
        "n": 3360,
        "varFs": [184, 1680],
        "stateFs": [1512, 168, 1664, 16],
        "mi": 0.03196802243310381
      }
    }
  ]
}
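
The lift formula `ps.pOnCondition / ps.p` can be checked against the two hits in the response above. The Python sketch below copies only the relevant fields from that response; the function name `lift` is our own.

```python
from typing import Any, Dict


def lift(hit: Dict[str, Any]) -> float:
    """Recompute the lift of a Relate hit from its probability fields."""
    ps = hit["ps"]
    return ps["pOnCondition"] / ps["p"]


# Values copied from the example response above.
hits = [
    {
        "condition": "product.name:fazer",
        "lift": 0.25611447987925023,
        "ps": {"p": 0.5, "pOnCondition": 0.12805723993962512},
    },
    {
        "condition": "product.name:puikula",
        "lift": 0.18455932520038865,
        "ps": {"p": 0.5, "pOnCondition": 0.09227966260019432},
    },
]

for hit in hits:
    print(hit["condition"], lift(hit), hit["lift"])
```

Both recomputed values agree with the returned `lift` fields. Since both are well below 1.0, products with these words in their names are purchased less often than the average product.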

Generic query

POST /api/v1/_query

Generic query is a powerful expert interface.

It provides the functionality of every other query type in the API. Search, Similarity, Match, and Recommend can be seen as convenience APIs for the generic query.

The query format resembles the Search-query, except that it supports a "get" statement. Since this endpoint provides functionality of all other queries, "get": "product" is used as a replacement for "predict": "product", "recommend": "product", and "match": "product" counterparts.

The chapter Personalisation also explains a characteristic of the inference model.

Namespace shifting of "get"

The "get" operation changes the namespaces of "select" and "orderBy" operations. The namespace is changed from the "from" table to the linked table (specified with "get").

As an example, consider this query. The impressions table has a column called product which links to a row in the products table. The price and title fields are columns of products.

{
  "from": "impressions",
  "where": {
    "query": "macbook air 2018"
  },
  "get": "product",
  "orderBy": ["price"],
  "select": ["title", "$highlight"]
}

When using "select" and "orderBy", we are already in the products table namespace, instead of having to use product.title or product.price.

Parameters
Name | Type | Description
body (required) | object | Generic query
Successful responses
Response | Type | Description
200 OK | object | Query results
Request format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Get expression defines what items are returned as query
// results.
"get": Get,
// Declares the sorting order of the result by a field or by a
// user-defined score.
"orderBy": OrderBy,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
// The number of results to skip from the beginning.
// Default: 0
"offset": integer,
// The maximum number of results to retrieve.
// Default: 10
"limit": integer
}
Response format
{
"offset": integer,
"total": integer,
// Entries returned for a given query. Required.
"hits": Hits
}

Search query

Simple search query with the generic query.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": { "id": "6410402010318" }
  }'

Response

{
  "offset": 0,
  "total": 1,
  "hits": [
    {
      "category": "115",
      "id": "6410402010318",
      "name": "Pirkka tuna fish pieces in oil 200g/150g",
      "price": 1.69,
      "tags": "meat food protein pirkka"
    }
  ]
}

Search query with highlighted results

A search query which returns related products ordered by similarity. The response also contains the highlighted words which matched the search term.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": {
      "name": { "$match": "coffee" }
    },
    "select": ["id", "name", "tags", "price", "$score", "$highlight"],
    "orderBy": "$similarity"
  }'

Response

{
  "offset": 0,
  "total": 4,
  "hits": [
    {
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "tags": "coffee",
      "price": 3.95,
      "$score": 1.3346734907091136,
      "$highlight": [
        {
          "score": 1.1194647495169912,
          "field": "name",
          "highlight": "Juhla Mokka <font color=\"green\">coffee</font> 500g sj"
        }
      ]
    },
    {
      "id": "6420101441542",
      "name": "Kulta Katriina filter coffee 500g",
      "tags": "coffee",
      "price": 3.45,
      "$score": 1.3346734907091136,
      "$highlight": [
        {
          "score": 1.1194647495169912,
          "field": "name",
          "highlight": "Kulta Katriina filter <font color=\"green\">coffee</font> 500g"
        }
      ]
    },
    {
      "id": "6411300164653",
      "name": "Juhla Mokka Dark Roast coffee 500g hj",
      "tags": "coffee",
      "price": 3.95,
      "$score": 1.3018159138535317,
      "$highlight": [
        {
          "score": 1.1194647495169912,
          "field": "name",
          "highlight": "Juhla Mokka Dark Roast <font color=\"green\">coffee</font> 500g hj"
        }
      ]
    },
    {
      "id": "6410405181190",
      "name": "Pirkka Costa Rica filter coffee 500g UTZ",
      "tags": "coffee pirkka",
      "price": 2.89,
      "$score": 1.288697684625764,
      "$highlight": [
        {
          "score": 1.1194647495169912,
          "field": "name",
          "highlight": "Pirkka Costa Rica filter <font color=\"green\">coffee</font> 500g UTZ"
        }
      ]
    }
  ]
}
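
The `highlight` strings in this response wrap matched words in `<font color="green">` tags. A client may want the matched terms or the plain text instead. The Python sketch below assumes the markup always has exactly this form, as it does in the response above; the helper names are hypothetical. After JSON decoding, the escaped `\"` becomes a plain quote.

```python
import re
from typing import List

# Matches the markup seen in the example response; an assumption, not a spec.
HIGHLIGHT_PATTERN = re.compile(r'<font color="green">(.*?)</font>')


def highlighted_terms(highlight: str) -> List[str]:
    """Return the words that were wrapped in highlight markup."""
    return HIGHLIGHT_PATTERN.findall(highlight)


def strip_markup(highlight: str) -> str:
    """Return the highlight text with the markup removed."""
    return HIGHLIGHT_PATTERN.sub(r"\1", highlight)


sample = 'Juhla Mokka <font color="green">coffee</font> 500g sj'
print(highlighted_terms(sample))
print(strip_markup(sample))
```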

Generic similarity query

In the example we're finding similar products based on the given hypothetical new product name.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "orderBy": {
      "$similarity": { "name": "Atria bratwurst 175g" }
    },
    "limit": 2
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$score": 1.407073967353227,
      "category": "102",
      "id": "6407870070333",
      "name": "Atria lauantaimakkara bread sausage 225g",
      "price": 0.89,
      "tags": "meat sausage with-bread"
    },
    {
      "$score": 1.407073967353227,
      "category": "102",
      "id": "6407870071224",
      "name": "Atria Gotler ham sausage 300g",
      "price": 1.75,
      "tags": "meat sausage with-bread"
    }
  ]
}

Generic predict query

In the example we're predicting which tags a new hypothetical product could have.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": { "name": "Atria bratwurst 175g" },
    "get": "tags.$feature",
    "orderBy": "$p",
    "limit": 5
  }'

Response

{
  "offset": 0,
  "total": 22,
  "hits": [
    { "$p": 0.5149108201954783, "field": "", "feature": "sausage" },
    { "$p": 0.062190920487759184, "field": "", "feature": "pirkka" },
    { "$p": 0.05389879775605796, "field": "", "feature": "food" },
    { "$p": 0.04560667502435674, "field": "", "feature": "meat" },
    { "$p": 0.0331684909268049, "field": "", "feature": "gluten" }
  ]
}

Recommend products which a customer would most likely purchase

In the example we're finding the top 5 products which veronica (user id) would most likely purchase, based on her behavior history stored in the impressions table.

This example is the same as in the documentation of the Recommendation endpoint, but made with the generic query.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "session.user": "veronica" },
    "get": "product",
    "orderBy": {
      "$p": {
        "$context": { "purchase": true }
      }
    },
    "limit": 5
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$p": 0.17054047089825275,
      "category": "100",
      "id": "6410405060457",
      "name": "Pirkka bio cherry tomatoes 250g international 1st class",
      "price": 1.29,
      "tags": "fresh vegetable pirkka tomato"
    },
    {
      "$p": 0.15926506786365752,
      "category": "100",
      "id": "6410405093677",
      "name": "Pirkka iceberg salad Finland 100g 1st class",
      "price": 1.29,
      "tags": "fresh vegetable pirkka"
    },
    {
      "$p": 0.11742288011689808,
      "category": "104",
      "id": "6410405216120",
      "name": "Pirkka lactose-free semi-skimmed milk drink 1l",
      "price": 1.25,
      "tags": "lactose-free drink pirkka"
    },
    {
      "$p": 0.1028880526906814,
      "category": "100",
      "id": "2000503600002",
      "name": "Chiquita banana",
      "price": 0.28054,
      "tags": "fresh fruit"
    },
    {
      "$p": 0.08033724662149096,
      "category": "100",
      "id": "2000604700007",
      "name": "Cucumber Finland",
      "price": 0.9765,
      "tags": "fresh vegetable"
    }
  ]
}

Query with custom scoring

In the example we're finding the top 5 products which veronica (user id) would most likely purchase, and in addition we're boosting products which have a higher price. This recommends products which are relevant for the user but also bring higher revenue to the shop. It demonstrates a situation where multiple factors should be considered in recommendations.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "session.user": "veronica" },
    "get": "product",
    "orderBy": {
      "$multiply": [
        {
          "$p": {
            "$context": { "purchase": true }
          }
        },
        "price"
      ]
    },
    "limit": 3
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$score": 0.21999720745874604,
      "category": "100",
      "id": "6410405060457",
      "name": "Pirkka bio cherry tomatoes 250g international 1st class",
      "price": 1.29,
      "tags": "fresh vegetable pirkka tomato"
    },
    {
      "$score": 0.2054519375441182,
      "category": "100",
      "id": "6410405093677",
      "name": "Pirkka iceberg salad Finland 100g 1st class",
      "price": 1.29,
      "tags": "fresh vegetable pirkka"
    },
    {
      "$score": 0.15009340872358526,
      "category": "104",
      "id": "6408430000258",
      "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
      "price": 1.95,
      "tags": "lactose-free drink"
    }
  ]
}
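
The `$multiply` expression scores each product as `$p` times `price`. The Python sketch below reproduces that arithmetic on the client side, using the five hits from the earlier recommendation response (only `id`, `$p`, and `price` are copied). The function name is hypothetical, and this is an illustration of the formula rather than a replacement for the query: Aito scores all 42 products, whereas the sketch only sees five of them.

```python
from typing import Any, Dict, List


def rerank_by_probability_times_price(
    hits: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Attach "$score" = "$p" * "price" to each hit and sort descending."""
    scored = [{**hit, "$score": hit["$p"] * hit["price"]} for hit in hits]
    return sorted(scored, key=lambda hit: hit["$score"], reverse=True)


# Copied from the recommendation response for veronica shown earlier.
recommended = [
    {"id": "6410405060457", "$p": 0.17054047089825275, "price": 1.29},
    {"id": "6410405093677", "$p": 0.15926506786365752, "price": 1.29},
    {"id": "6410405216120", "$p": 0.11742288011689808, "price": 1.25},
    {"id": "2000503600002", "$p": 0.1028880526906814, "price": 0.28054},
    {"id": "2000604700007", "$p": 0.08033724662149096, "price": 0.9765},
]

reranked = rerank_by_probability_times_price(recommended)
for hit in reranked:
    print(hit["id"], hit["$score"])
```

The first two scores agree with the `$score` values in the response above. The third hit differs because the higher-priced product returned by Aito is not among the five hits used as input here.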

Database API

Operations which manipulate the Aito database.

Get database schema

GET /api/v1/schema

Get the schema for the database.

Successful responses
Response | Type | Description
200 OK | object | The current active schema
Response format
{
// Database tables. Required.
"schema": {
// Any schema which is a valid Aito table schema.
"<yourTableName>": UserDefinedTableSchema
}
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.api.aito.ai/api/v1/schema \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "schema": {
    "products": {
      "type": "table",
      "columns": {
        "id": { "type": "Int", "nullable": false },
        "name": { "type": "String", "nullable": false },
        "price": { "type": "Decimal", "nullable": false },
        "description": {
          "type": "Text",
          "nullable": false,
          "analyzer": "english"
        }
      }
    }
  }
}
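
A schema response like the one above is a nested dictionary once decoded. As a small Python sketch (the function name `column_types` is our own), the column types of every table can be summarised like this:

```python
from typing import Any, Dict


def column_types(schema_response: Dict[str, Any]) -> Dict[str, Dict[str, str]]:
    """Map each table name to a {column name: column type} dictionary."""
    return {
        table: {name: spec["type"] for name, spec in definition["columns"].items()}
        for table, definition in schema_response["schema"].items()
    }


# The example response above, as a Python dictionary.
schema_response = {
    "schema": {
        "products": {
            "type": "table",
            "columns": {
                "id": {"type": "Int", "nullable": False},
                "name": {"type": "String", "nullable": False},
                "price": {"type": "Decimal", "nullable": False},
                "description": {
                    "type": "Text",
                    "nullable": False,
                    "analyzer": "english",
                },
            },
        }
    }
}

print(column_types(schema_response))
```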

Create database schema

PUT /api/v1/schema

Create or update the schema for the entire database.

Note:

  • An existing table that is not included in the updated schema will not be deleted.
  • An existing table that is included in the updated schema will be updated if the table has no data.
Parameters
Name | Type | Description
body (required) | object | The aito schema definition
Successful responses
Response | Type | Description
200 OK | object | The current active schema
Request format
{
// Database tables. Required.
"schema": {
// Any schema which is a valid Aito table schema.
"<yourTableName>": UserDefinedTableSchema
}
}
Response format
{
// Database tables. Required.
"schema": {
// Any schema which is a valid Aito table schema.
"<yourTableName>": UserDefinedTableSchema
}
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X PUT \
  https://your-env-name.api.aito.ai/api/v1/schema \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "schema": {
      "products": {
        "type": "table",
        "columns": {
          "id": { "type": "Int" },
          "name": { "type": "String" },
          "price": { "type": "Decimal" },
          "description": { "type": "Text", "analyzer": "English" }
        }
      }
    }
  }'

Response

{
  "schema": {
    "products": {
      "type": "table",
      "columns": {
        "id": { "type": "Int" },
        "name": { "type": "String" },
        "price": { "type": "Decimal" },
        "description": { "type": "Text", "analyzer": "english" }
      }
    }
  }
}

Delete database

DELETE /api/v1/schema

Delete the entire database schema.

The operation deletes all data and contents of the database! The action is irreversible.

Successful responses
Response | Type | Description
200 OK | object | The summary of deletion
Response format
{
// Array of table names deleted. Required.
"deleted": [string, ...]
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X DELETE \
  https://your-env-name.api.aito.ai/api/v1/schema \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "deleted": ["products"]
}

Get table schema

GET /api/v1/schema/{table}

Get the schema of the specified table.

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
Successful responses
Response | Type | Description
200 OK | object | The current schema of the table
Response format
{
// Type of the database schema item.
"type": string,
// Table columns.
"columns": {
// Type of the column.
"<yourColumnName>": ColumnType
}
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.api.aito.ai/api/v1/schema/products \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "type": "table",
  "columns": {
    "id": { "type": "Int", "nullable": false },
    "name": { "type": "String", "nullable": false },
    "price": { "type": "Decimal", "nullable": false },
    "description": { "type": "Text", "nullable": false, "analyzer": "english" }
  }
}

Create table schema

PUT /api/v1/schema/{table}

Create or update the schema of the specified table.

Note: The table schema cannot be updated if it contains data.

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
body (required) | object | The new schema of the table
Successful responses
Response | Type | Description
200 OK | object | The current schema of the table
Request format
{
// Type of the database schema item.
"type": string,
// Table columns.
"columns": {
// Type of the column.
"<yourColumnName>": ColumnType
}
}
Response format
{
// Type of the database schema item.
"type": string,
// Table columns.
"columns": {
// Type of the column.
"<yourColumnName>": ColumnType
}
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X PUT \
  https://your-env-name.api.aito.ai/api/v1/schema/products \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "type": "table",
    "columns": {
      "id": { "type": "Int" },
      "name": { "type": "String" },
      "price": { "type": "Decimal" },
      "description": { "type": "Text", "analyzer": "English" }
    }
  }'

Response

{
  "type": "table",
  "columns": {
    "id": { "type": "Int" },
    "name": { "type": "String" },
    "price": { "type": "Decimal" },
    "description": { "type": "Text", "analyzer": "english" }
  }
}

Delete table

DELETE /api/v1/schema/{table}

Delete a single table in the schema.

The operation deletes all data and contents of the table! The action is irreversible.

Note: The delete operation fails if it would leave the database schema in a broken state.

For example, given the following schema:

{
  "schema": {
    "users": {
      "type": "table",
      "columns": {
        "username": { "type": "String" }
      }
    },
    "sessions" : {
      "type": "table",
      "columns": {
         "id"     : { "type" : "String" },
         "user"   : { "type" : "String", "link": "users.username" }
      }
    }
  }
}

The users table cannot be deleted until the sessions table is changed so that sessions.user no longer links to the users table.
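
Before deleting a table, a client can inspect the schema for links that would block the deletion. The Python sketch below is a client-side pre-check only, not how Aito validates the request; the function name `links_to_table` is hypothetical, and it assumes links are written as `"table.column"` as in the schema above.

```python
from typing import Any, Dict, List, Tuple


def links_to_table(schema: Dict[str, Any], table: str) -> List[Tuple[str, str]]:
    """Return (table, column) pairs in other tables that link to `table`."""
    blocking = []
    for table_name, definition in schema["schema"].items():
        if table_name == table:
            continue  # links inside the table itself disappear with it
        for column_name, spec in definition["columns"].items():
            target = spec.get("link")
            if target and target.split(".", 1)[0] == table:
                blocking.append((table_name, column_name))
    return blocking


# The example schema above, as a Python dictionary.
schema = {
    "schema": {
        "users": {
            "type": "table",
            "columns": {"username": {"type": "String"}},
        },
        "sessions": {
            "type": "table",
            "columns": {
                "id": {"type": "String"},
                "user": {"type": "String", "link": "users.username"},
            },
        },
    }
}

print(links_to_table(schema, "users"))
print(links_to_table(schema, "sessions"))
```

An empty result suggests that no other table links to the one being deleted.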

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
Successful responses
Response | Type | Description
200 OK | object | The summary of deletion
Response format
{
// Array of table names deleted. Required.
"deleted": [string, ...]
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X DELETE \
  https://your-env-name.api.aito.ai/api/v1/schema/products \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "deleted": ["products"]
}

Get column schema

GET /api/v1/schema/{table}/{column}

Get the schema of a column.

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
column (required) | string | The name of the column
Successful responses
Response | Type | Description
200 OK | object | The current schema of the column
Response format

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.api.aito.ai/api/v1/schema/products/name \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "type": "String",
  "nullable": false
}

Add or replace column

PUT /api/v1/schema/{table}/{column}

Add or replace a column of a table.

If a column with the same name already exists then the operation deletes all data and contents of the column! The action is irreversible.

Parameters
Name | Type | Description
table (required) | string | The name of the table
column (required) | string | The name of the column
body (required) | object | The schema of the column
Successful responses
Response | Type | Description
200 OK | object | The schema of the column
Request format
{
// Value that existing rows get.
"value": integer or number or boolean or null or string
}
Response format

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X PUT \
  https://your-env-name.api.aito.ai/api/v1/schema/products/quantity \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "type": "Int",
    "nullable": false,
    "value": 0
  }'

Response

{
  "type": "Int",
  "nullable": false,
  "value": 0
}

Delete column

DELETE /api/v1/schema/{table}/{column}

Delete a column from a table.

The operation deletes all data and contents of the column! The action is irreversible.

Note: The delete operation fails if it would leave the database schema in a broken state.

For example, given the following schema:

{
  "schema": {
    "users": {
      "type": "table",
      "columns": {
        "username": { "type": "String" },
        "name": { "type": "String" }
      }
    },
    "sessions" : {
      "type": "table",
      "columns": {
         "id"     : { "type" : "String" },
         "user"   : { "type" : "String", "link": "users.username" }
      }
    }
  }
}

The column username of the users table cannot be deleted until the sessions table is changed so that sessions.user no longer links to users.username.

Parameters
Name | Type | Description
table (required) | string | The name of the table
column (required) | string | The name of the column
Successful responses
Response | Type | Description
200 OK | object | The summary of deletion
Response format
{
// Array of column names deleted. Required.
"deleted": [string, ...]
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X DELETE \
  https://your-env-name.api.aito.ai/api/v1/schema/products/description \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {}'

Response

{
  "deleted": ["description"]
}

Insert entry

POST /api/v1/data/{table}

Insert an entry into a table.

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
body (required) | object | Any object which is valid according to the provisioned schema
Successful responses
Response | Type | Description
200 OK | object | The inserted entry
Request format
Response format

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.api.aito.ai/api/v1/data/products \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "id": 1,
    "name": "Apple iPhone 8 64 Gt, spacegray",
    "price": 648.9,
    "description": "A11 processor and wireless charging."
  }'

Response

{
  "id": 1,
  "name": "Apple iPhone 8 64 Gt, spacegray",
  "price": 648.9,
  "description": "A11 processor and wireless charging."
}

Insert multiple entries

POST /api/v1/data/{table}/batch

Import multiple entries into the database.

The batch import can be used to upload multiple entries to a single table. The payload needs to be a valid JSON array (instead of ndjson).

Note: The batch API accepts payloads of at most 10 MB.

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
body (required) | array | An array of objects which are valid according to the provisioned schema
Successful responses
Response | Type | Description
200 OK | object | Summary of the inserted entries
Request format
Response format
{
// How many entries were inserted. Required.
"entries": integer,
// Status text. Required.
"status": string
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.api.aito.ai/api/v1/data/products/batch \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  [
    {
      "id": 1,
      "name": "Apple iPhone 8 64 Gt, spacegray",
      "price": 648.9,
      "description": "A11 processor and wireless charging."
    },
    {
      "id": 2,
      "name": "Apple iPhone X 32 GB, space gray",
      "price": 1048.9,
      "description": "All‑screen design. Longest battery life ever in an iPhone."
    },
    {
      "id": 3,
      "name": "Samsung Galaxy S9",
      "price": 698.2,
      "description": "The Camera. Reimagined."
    }
  ]'

Response

{
  "status": "ok",
  "entries": 3
}
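When a data set is larger than the batch payload limit, one option is to split it into several batch requests. The following is a minimal client-side sketch of that idea. The function name chunk_entries is hypothetical, and the sketch assumes that the limit applies to the UTF-8 encoded JSON body and that 10MB means 10,000,000 bytes. Neither assumption is confirmed by this reference.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

# Assumption: "10MB" is read conservatively as 10,000,000 bytes.
MAX_BATCH_BYTES = 10_000_000


def chunk_entries(entries: Iterable[Dict[str, Any]],
                  max_bytes: int = MAX_BATCH_BYTES) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of entries whose compact JSON array fits in max_bytes."""
    batch: List[Dict[str, Any]] = []
    size = 2  # the surrounding "[" and "]"
    for entry in entries:
        encoded_len = len(json.dumps(entry, separators=(",", ":")).encode("utf-8"))
        if encoded_len + 2 > max_bytes:
            raise ValueError("a single entry exceeds the payload limit")
        extra = encoded_len + (1 if batch else 0)  # plus the separating comma
        if batch and size + extra > max_bytes:
            yield batch
            batch, size = [], 2
            extra = encoded_len
        batch.append(entry)
        size += extra
    if batch:
        yield batch


def encode_batch(batch: List[Dict[str, Any]]) -> bytes:
    """Serialize a batch with the same compact separators used for sizing."""
    return json.dumps(batch, separators=(",", ":")).encode("utf-8")
```

Each yielded batch can be serialized with encode_batch and sent as the body of one batch request. For very large data sets, the file upload API described below is likely the better fit.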

Delete entries

POST /api/v1/data/_delete

Delete entries with a Search-like interface.

You can describe the target table and filters for which entries to delete.

An empty proposition will match and delete everything!

Parameters
Name | Type | Description
body (required) | object | Deletes all entries that match the where-clause from the specified table
Successful responses
Response | Type | Description
200 OK | object | Summary of deleted content
Request format
{
// From expression declares the table containing the entries to delete. Required.
"from": FromTablemodify,
// Proposition expression describes a fact, or a statement. Required.
"where": Proposition
}
Response format
{
// The number of rows that were deleted. Required.
"total": integer
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.api.aito.ai/api/v1/data/_delete \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "from": "products",
    "where": { "id": 1 }
  }'

Response

{
  "total": 1
}

Initiate file upload

POST /api/v1/data/{table}/file

Initiate a file upload session.

The file API allows circumventing the batch upload API payload size limit by allowing upload of large data sets. The file API accepts data in gzip compressed ndjson format, stored into a file.

The file must be gzip compressed ndjson. Normal JSON arrays are not accepted.

The data file is uploaded to AWS S3 and processed asynchronously. The file must be compressed with gzip before uploading to reduce the size of the transferred data.

The file API is not a single endpoint. It requires a minimum of three calls (per table). The sequence is as follows:

  1. Initiate the file upload process
  2. Upload compressed ndjson file to S3, using the signed URL
  3. Trigger file processing
  4. (Optional) Poll the file processing status

You can find a bash implementation of the flow in our tools repository. See the upload-file.sh script.
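Step 2 of the sequence needs the data as gzip compressed ndjson, that is, one JSON object per line. The following is a minimal sketch of producing such a payload in Python using only the standard library. The function name to_gzipped_ndjson is hypothetical and is not part of the Aito tooling.

```python
import gzip
import json
from typing import Any, Dict, Iterable


def to_gzipped_ndjson(entries: Iterable[Dict[str, Any]]) -> bytes:
    """Encode entries as newline-delimited JSON and compress with gzip.

    json.dumps escapes newlines inside strings, so each entry occupies
    exactly one line of the ndjson document.
    """
    ndjson = "".join(json.dumps(entry) + "\n" for entry in entries)
    return gzip.compress(ndjson.encode("utf-8"))


payload = to_gzipped_ndjson([
    {"id": 1, "name": "Apple iPhone 8 64 Gt, spacegray", "price": 648.9},
    {"id": 3, "name": "Samsung Galaxy S9", "price": 698.2},
])
```

The resulting bytes would then be sent to the signed URL with the HTTP method returned by the initiate call.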

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
Successful responses
Response | Type | Description
200 OK | object | The details to execute the S3 upload and the job's id
Response format
{
// The uuid of the file upload session. Required.
"id": string,
// The presigned S3 url where to push the data. Required.
"url": string,
// The http method used for uploading to S3. Required.
"method": string,
// Defines when the presigned upload link expires. Required.
"expires": string
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.api.aito.ai/api/v1/data/products/file \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "id": "6e856f71-8dad-4aec-acef-b2bb542a275a",
  "url": "https://aitoai-customer-uploads.s3.eu-west-1.amazonaws.com/your-env-name/products/6e856f71-8dad-4aec-acef-b2bb542a275a?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20191212T100154Z&X-Amz-SignedHeaders=host&X-Amz-Expires=1199&X-Amz-Credential=AKIAI5RFK3NF3YVDPC6A%2F20191212%2Feu-west-1%2Fs3%2Faws4_request&X-Amz-Signature=f526effb2f4ed878de556612cc16796e0a59d6e294f0456ca00432847a916dd6",
  "method": "PUT",
  "expires": "2019-12-12T10:21:54"
}

Trigger file processing

POST /api/v1/data/{table}/file/{uuid}

Start the processing of a previously uploaded file.

Note: This operation is part of the file upload sequence. If you want to read how to execute a full file upload flow, see Initiate file upload documentation.

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
uuid (required) | string | The assigned id of the operation
Successful responses
Response | Type | Description
200 OK | object | Processing started status
Response format
{
// The id of the operation. Required.
"id": string,
// Textual description of the job. Required.
"status": string
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.api.aito.ai/api/v1/data/products/file/ee4fd6f1-9822-41c2-8cf8-8e9e26e8eb00 \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "id": "ee4fd6f1-9822-41c2-8cf8-8e9e26e8eb00",
  "status": "started"
}

Get file processing status

GET /api/v1/data/{table}/file/{uuid}

Get the file upload progress.

The response is approximate and might not contain the very latest result, since the status update is asynchronous and the upload happens in multiple parallel streams. The response will, however, give an idea of the approximate progress.

Note: This operation is part of the file upload sequence. If you want to read how to execute a full file upload flow, see Initiate file upload documentation.

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
uuid (required) | string | The assigned id of the operation
Successful responses
Response | Type | Description
200 OK | object | The file processing status
Response format
{
"status": {
// Total duration of the file processing elapsed. Required.
"totalDurationMs": number,
// Total duration of the file processing elapsed as human readable units. Required.
"totalDuration": string,
// Throughput of lines in human readable units.
"throughput": string,
// When the file processing was started. Required.
"startedAt": string,
// When the file processing was finished. Required.
"finishedAt": string,
// Is the job finished or not. Required.
"finished": boolean,
// The number of lines completed so far. Required.
"completedCount": integer,
// Any object which is valid according to the database schema. Required.
"lastSuccessfulElement": UserDefinedObject
},
"errors": {
// Human consumable description. Required.
"message": string,
// Array of failing elements.
"rows": [UserDefinedObject, ...]
}
}

Request during processing

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.api.aito.ai/api/v1/data/products/file/3480ff34-ad1a-4662-91aa-da6218aeb8db \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

The example shows what the response looks like while data processing is still in progress.

{
  "status": {
    "phase": "AitoDatabaseInsert",
    "finished": false,
    "completedCount": 0,
    "startedAt": "20191212T100157.009Z",
    "throughput": "0/s"
  },
  "errors": { "message": "Last 0 failing rows", "rows": null }
}

Request after processing

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.api.aito.ai/api/v1/data/products/file/224eb018-360e-492a-96ab-3fa4c4065a76 \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

The example shows what the response looks like after data processing has completed successfully.

{
  "status": {
    "totalDurationMs": 1929,
    "phase": "AitoDatabaseInsert",
    "finished": true,
    "completedCount": 3,
    "lastSuccessfulElement": {
      "id": 3,
      "name": "Samsung Galaxy S9",
      "price": 698.2,
      "description": "The Camera. Reimagined."
    },
    "totalDuration": "1 second and 929 milliseconds",
    "startedAt": "20191212T100158.473Z",
    "finishedAt": "20191212T100200.402Z",
    "throughput": "1.56/s"
  },
  "errors": { "message": "Last 0 failing rows", "rows": null }
}
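The optional polling step can be sketched as a small loop that repeats the status request until "finished" is true. In the sketch below, the function name wait_until_finished, the poll interval, and the maximum number of polls are all assumptions. The HTTP call itself is left to a caller-supplied fetch_status function, which is expected to return the parsed JSON of the status response shown above.

```python
import time
from typing import Any, Callable, Dict


def wait_until_finished(
    fetch_status: Callable[[], Dict[str, Any]],
    poll_interval_s: float = 2.0,   # assumed interval, tune to your data size
    max_polls: int = 30,            # assumed upper bound on attempts
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until response["status"]["finished"] is true, then return it."""
    for attempt in range(max_polls):
        response = fetch_status()
        if response["status"]["finished"]:
            return response
        if attempt < max_polls - 1:
            sleep(poll_interval_s)
    raise TimeoutError(f"file processing not finished after {max_polls} polls")
```

After the loop returns, the "errors" object of the final response is worth inspecting, because a finished job may still report failing rows.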

Optimize the database

POST /api/v1/data/{table}/optimize

Optimize the database for query performance.

The Aito.ai database is implemented as a log-structured merge-tree. Because of this architecture, Aito's tables are implemented internally as a tree of table segments.

The complexity of the table tree has major implications for both query speed and write speed. The fewer segments Aito maintains in the tree, the faster the queries are, but the slower the writes are, because Aito needs to rewrite parts of the tree regularly. Similarly, the more segments are allowed, the slower the queries are, but the faster the writes become.

Aito seeks to maintain approximately O(log N) segments in the table tree to keep a reasonable compromise between query and write speeds.

Still, there can be situations where it is beneficial to rewrite the entire database as a single segment to get the optimal query speed. The optimize operation does this.

It may take minutes or hours to optimize a big table. This means that optimize should be used to improve query performance only when the database and the results need to be updated rarely, for example nightly.

Optimize holds a write lock on the database for the entire operation. This means that you cannot add data while the optimize operation is running. Queries will still work normally. After the optimize has finished, the optimized table needs to be reloaded, which can induce significant latency for the following query.

Parameters
Name | Type | Description
body (required) | object | An empty object
Successful responses
Response | Type | Description
200 OK | object | An empty object
Request format
object
Response format
object

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.api.aito.ai/api/v1/data/products/optimize \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {}'

Response

{}

Database Schema

The Aito database requires a schema to operate. The schema defines:

  • The name of the tables
  • The name and the ColumnType of the columns in each table
  • The Analyzer of a column if needed
  • The relationships between tables

Please refer to the Defining a database schema guide for more details.

UserDefinedTableSchema

Any schema which is a valid Aito table schema.

Table schema describes the structure of the table in a formal language. The schema describes all fields (or columns), the data types of the fields, and information to help Aito preprocess your data, for example the language of a text field.

The contents of the schema depend on the data that will be inserted into the database.

Format
{
// Type of the database schema item.
"type": string,
// Table columns.
"columns": {
// Type of the column.
"<yourColumnName>": ColumnType
}
}

Example

{
  "type": "table",
  "columns": {
    "id": { "type": "Int", "nullable": false },
    "name": { "type": "String", "nullable": false },
    "price": { "type": "Decimal", "nullable": false },
    "description": { "type": "Text", "nullable": false, "analyzer": "English" }
  }
}

ColumnType

Type of the column.

Describes an individual field (or column), its type, and information to help Aito preprocess your data, for example the language of a text field.

Format

Examples

{ "type": "int", "nullable": false }
{ "type": "string", "nullable": false }
{ "type": "decimal", "nullable": false }
{ "type": "text", "nullable": false, "analyzer": "english" }

BooleanType

Boolean column type.

When a column is a boolean, the only accepted values are true and false.

Format
{
// Type of the column. Required.
"type": string,
// When true, `null` values are allowed. Default: true
"nullable": boolean,
// Path to a column of a linked row. Default: null
"link": string
}

Example

{ "type": "boolean" }

DecimalType

Double-precision floating-point number.

Format
{
// Type of the column. Required.
"type": string,
// When true, `null` values are allowed. Default: true
"nullable": boolean,
// Path to a column of a linked row. Default: null
"link": string
}

Example

{ "type": "Decimal", "nullable": false }

IntType

Integer column type.

Format
{
// Type of the column. Required.
"type": string,
// When true, `null` values are allowed. Default: true
"nullable": boolean,
// Path to a column of a linked row. Default: null
"link": string
}

Examples

{ "type": "Int" }
{ "type": "Int", "link": "users.id" }

StringType

String column type.

The string data type is a primitive version of the Text type. The value is turned into a single feature. For example "lazy black cat" becomes 1 feature: "lazy black cat".

Format
{
// Type of the column. Required.
"type": string,
// When true, `null` values are allowed. Default: true
"nullable": boolean,
// Path to a column of a linked row. Default: null
"link": string
}

Examples

{ "type": "String", "nullable": false }
{ "type": "String", "link": "messages.id" }

TextType

Text column type.

The text data type enables smart textual analysis of strings. A text column has an analyzer which defines how the text can be split into words or tokens, which are used as features during inference.

Format
{
// Type of the column. Required.
"type": string,
// Aito analyzers break the Text type data into features that can be used for inference.
"analyzer": Analyzer,
// When true, `null` values are allowed. Default: true
"nullable": boolean,
// Path to a column of a linked row. Default: null
"link": string
}

Example

{ "type": "Text", "analyzer": "English", "nullable": false }

Analyzer

Aito analyzers break the Text type data into features that can be used for inference.
As an example, consider predicting the tags of a product from its description, using the following data:

description | tags
Brazilian organic orange | organic, fruit, imported
Local organic spinach | organic, vegetable, local
Lentil snack | snack

  • Given a description of "organic tomatoes", we would like to predict the tag of this product.
  • If no analyzer is defined, the description is treated as a String type and the description "organic tomatoes" is turned into only 1 feature: "organic tomatoes". Since there is no entry in the given data containing the description "organic tomatoes", Aito is not able to provide any meaningful prediction for the tags.
  • Using the default English analyzer, "organic tomatoes" will be turned into 2 features: "organ" (the English stem of the word "organic") and "tomato". "Brazilian organic orange" will be turned into 3 features: "brazilian", "organ", and "orang". Other descriptions are turned into features in a similar fashion.
  • Aito can now find patterns between these features. For example, when the description has the feature "organ", the tag is likely "organic". Hence, using the analyzer, Aito can return a reasonable prediction for an unseen entry.
Format

Examples

"standard"
"whitespace"
"english"
"en"
{ "type": "delimiter", "delimiter": "," }
{ "type": "language", "language": "en" }
{ "type": "char-ngram", "minGram": 2, "maxGram": 3 }
{ "type": "token-ngram", "source": "english", "minGram": 1, "maxGram": 2 }

AliasAnalyzer

Aito has several built-in analyzers and they are selected by using their name in the "analyzer" field of a text column. For instance:

{ "analyzer": "english" }

The built-in analyzers include:

  • Standard Analyzer:
    • Name: "standard"
    • A good default analyzer that works well in most languages. The analyzer generates features based on the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. The standard analyzer filters out English stop words, which are normally not useful.
    • E.g.: "the cats are running" will be broken down into "cats" and "running".
  • Whitespace Analyzer:
    • Name: "whitespace"
    • The analyzer breaks the text into features whenever it encounters a whitespace character. Adjacent sequences of non-whitespace characters form tokens.
    • E.g.: "the cats are running" will be broken down into "the", "cats", "are", and "running".
  • Language Analyzer:
    • Alias: the language name or the language's ISO 639-1 code (with some exceptions)
    • A Language Analyzer with the default settings (no stop words or keywords).
    • See Language Analyzer for the supported languages and their aliases.
Format
string

Examples

"standard"
"whitespace"
"english"
"en"

CharNGramAnalyzer

The Character N-gram Analyzer breaks text into n-gram features.

For example, the following n-gram analyzer:

{ "type": "char-ngram", "minGram": 3, "maxGram": 3 }

would break the text "the cats are running" into the following list of features:

["the", "he ", "e c", " ca", "cat", "ats", "ts ", "s a", " ar", "are", "re ", "e r", " ru", "run", "unn", "nni", "nin", "ing"]

The analyzer can be useful for languages that don’t use spaces or that have long compound words, like German.
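To make the n-gram splitting concrete, the following is a minimal Python sketch that reproduces the feature list above for minGram = maxGram = 3. It is an illustration only and not Aito's implementation. The function name char_ngrams is hypothetical, and the ordering of features when minGram differs from maxGram is an assumption.

```python
from typing import List


def char_ngrams(text: str, min_gram: int, max_gram: int) -> List[str]:
    """Return all character n-grams of length min_gram..max_gram.

    Assumption: shorter n-grams are listed before longer ones.
    """
    features = []
    for n in range(min_gram, max_gram + 1):
        # Slide a window of width n over the text, one character at a time.
        for start in range(len(text) - n + 1):
            features.append(text[start:start + n])
    return features


trigrams = char_ngrams("the cats are running", 3, 3)
print(trigrams[:5])  # ['the', 'he ', 'e c', ' ca', 'cat']
```

Note that whitespace is part of the n-grams, so a 20 character text yields 18 trigrams.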

Format
{
// Type of the analyzer. Required.
"type": string,
// The minimum length of characters in a feature. Required.
"minGram": integer,
// The maximum length of characters in a feature. Required.
"maxGram": integer
}

Example

{ "type": "char-ngram", "minGram": 2, "maxGram": 3 }

DelimiterAnalyzer

The Delimiter Analyzer breaks text into features whenever it encounters the specified delimiter character.

With the trimWhitespace option, the analyzer trims the whitespace surrounding a feature.

For example, the following analyzer:

{
  "type": "delimiter",
  "delimiter": ",",
  "trimWhitespace": true
}

would break the text "the, cats,are, running" into 4 features:

["the", "cats", "are", "running"]
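The behavior described above can be mimicked with a few lines of Python. This is a minimal sketch for illustration, not Aito's implementation, and the function name delimiter_features is hypothetical.

```python
from typing import List


def delimiter_features(text: str, delimiter: str,
                       trim_whitespace: bool = True) -> List[str]:
    """Split text at the delimiter and optionally trim each feature."""
    parts = text.split(delimiter)
    if trim_whitespace:
        parts = [part.strip() for part in parts]
    return parts


print(delimiter_features("the, cats,are, running", ","))
# ['the', 'cats', 'are', 'running']
```

With trim_whitespace set to False, the same call keeps the leading spaces, for example " cats" and " running".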
Format
{
// Type of the analyzer. Required.
"type": string,
// The delimiter. Required.
"delimiter": string,
// Trims leading and trailing whitespace of the features. Default: true
"trimWhitespace": boolean
}

Examples

{ "type": "delimiter", "delimiter": "," }
{ "type": "delimiter", "delimiter": "\n", "trimWhitespace": true }

LanguageAnalyzer

Language Analyzers aim to analyze text of a specific language.

When using a language analyzer, text is analyzed into lower-case word stem features. For example, using the following English analyzer:

{ "type": "language", "language": "english" }

a text "the cats are running" will be broken into 4 word stem features:

["the", "cat", "ar", "run"]

The value of the "language" parameter specifies which language will be used. The value can be the name or the ISO 639-1 code of the language. The full list is shown below:

Language | Name | ISO code
Arabic | arabic | ar
Armenian | armenian | hy
Basque | basque | eu
Brazilian Portuguese | brazilian | pt-br
Bulgarian | bulgarian | bg
Catalan | catalan | ca
Chinese, Japanese, Korean | cjk | cjk
Czech | czech | cs
Danish | danish | da
Dutch | dutch | nl
English | english | en
Finnish | finnish | fi
French | french | fr
Galician | galician | gl
German | german | de
Greek | greek | el
Hindi | hindi | hi
Hungarian | hungarian | hu
Indonesian | indonesian | id
Irish | irish | ga
Italian | italian | it
Latvian | latvian | lv
Norwegian | norwegian | no
Persian | persian | fa
Portuguese | portuguese | pt
Romanian | romanian | ro
Russian | russian | ru
Spanish | spanish | es
Swedish | swedish | sv
Thai | thai | th
Turkish | turkish | tr

The language analyzers support filtering stopwords (common words that are normally not useful). Each language has a list of default stopwords that can be enabled through the "useDefaultStopWords" parameter. Some common English stopwords are:

  "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", 
  "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", 
  "such", "that", "the", "their", "then", "there", "these", 
  "they", "this", "to", "was", "will", "with"

By default, "useDefaultStopWords" is set to false. The following analyzer:

{
  "type": "language",
  "language": "english",
  "useDefaultStopWords": true
}

would break the text "the cats are running" into 2 features:

["cat", "run"]

It is also possible to specify a set of words that would be filtered through the "customStopWords" parameter and a set of words that would not be analyzed through the "customKeyWords" parameter. The following analyzer:

{
  "type": "language",
  "language": "english",
  "useDefaultStopWords": false,
  "customStopWords": ["cats"],
  "customKeyWords": ["running"]
}

would break the text "the cats are running" into 3 features:

["the", "ar", "running"]
Format
{
// Type of the analyzer. Required.
"type": string,
// Name or code of the language. Required.
"language": string,
// Use the language default stopwords. Default: false
"useDefaultStopWords": boolean,
// List of words that will be filtered. Default:
"customStopWords": [string, ...],
// List of words that will not be featurized. Default:
"customKeyWords": [string, ...]
}

Examples

{ "type": "language", "language": "en" }
{
  "type": "language",
  "language": "english",
  "useDefaultStopWords": true,
  "customStopWords": ["flower"],
  "customKeyWords": ["animal"]
}

TokenNGramAnalyzer

The Token N-gram Analyzer breaks text into token n-grams (shingles) based on a source analyzer. In other words, it combines the features of the source analyzer into new features.

For example, the following Token N-gram Analyzer:

{
  "type": "token-ngram",
  "source": "english",
  "minGram": 1,
  "maxGram": 2,
  "tokenSeparator": "_"
}

would break the text "the cats are running" into the following list of features:

["the", "the_cat", "cat", "cat_ar", "ar", "ar_run", "run"]
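The combination step can be sketched in Python as follows. The sketch starts from the word stems that the English language analyzer produces for "the cats are running" (see LanguageAnalyzer) and joins neighboring stems. It is an illustration only, the function name token_ngrams is hypothetical, and the output ordering is chosen to match the list above.

```python
from typing import List


def token_ngrams(tokens: List[str], min_gram: int, max_gram: int,
                 token_separator: str = " ") -> List[str]:
    """Join runs of min_gram..max_gram neighboring tokens into features."""
    features = []
    for start in range(len(tokens)):
        for n in range(min_gram, max_gram + 1):
            if start + n <= len(tokens):
                features.append(token_separator.join(tokens[start:start + n]))
    return features


# Stems of "the cats are running" from the English analyzer example.
source_features = ["the", "cat", "ar", "run"]
shingles = token_ngrams(source_features, 1, 2, token_separator="_")
print(shingles)
# ['the', 'the_cat', 'cat', 'cat_ar', 'ar', 'ar_run', 'run']
```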
Format
{
// Type of the analyzer. Required.
"type": string,
// Aito analyzers break the Text type data into features that can be used for inference. Required.
"source": Analyzer,
// The minimum number of features to be combined. Required.
"minGram": integer,
// The maximum number of features to be combined. Required.
"maxGram": integer,
// The string used to join the features of the source analyzer. Default: " "
"tokenSeparator": string
}

Examples

{ "type": "token-ngram", "source": "english", "minGram": 1, "maxGram": 2 }
{
  "type": "token-ngram",
  "source": { "type": "delimiter", "delimiter": "," },
  "minGram": 1,
  "maxGram": 3,
  "tokenSeparator": "_"
}

Query language

The reference documentation for Aito query language.

Common concepts

Features

To analyze the data better, Aito splits fields into features under the hood. How the featurization is done depends on the field type. For example, the Text type supports an "analyzer" option which allows you to control how a text field is split into features.

Some queries, for example Relate, return the features instead of the actual values of the field.

Exclusiveness

Exclusiveness is an option in predictions. In summary, it describes whether the predicted field can have multiple values at the same time or not.

Understanding the concept is easiest through an example. If we were predicting tags for a product, we would want to set "exclusiveness": false, because a product can have multiple tags. A product could be described with the following tags:

[Figure: Venn diagram of tags]

However if we were predicting the user, who would most likely purchase a product, we would want to use "exclusiveness": true (default behavior) because the value can only be one user at a time.

[Figure: Venn diagram of users]

$p vs $lift

If we were trying to find a customer, who is best characterized by a message, we'd need to understand the difference between $p and $lift. To make the difference clear, consider the following situation:

  • Alice messages often, but she doesn't mention iPhone often
  • Bob messages rarely, but only about iPhones

Querying users by $p quite likely finds Alice, because she may be overall the more likely person to mention "iPhone". Querying users by $lift, on the other hand, will almost certainly find Bob, because $lift describes how characteristic the feature "iPhone" is for the user.

A more mathematical and technical description for the phenomenon is the following:

Aito uses Bayesian probability inference to estimate p(X|context) so that p(X|context) = p(X) * lift(X|context), where the probability lift component is lift(X|context) = p(context|X) / p(context).

The probability lift component describes how much more likely X is to be true in the specified context, compared to the average.

In Aito query syntax, $p stands for p(X|context), while $lift stands for the lift(X|context) component.
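A small worked example may help to see how the two scores can rank users differently. The numbers below are made up for illustration: Alice is assumed to send 95% of all messages and to mention "iPhone" in 10% of hers, while Bob sends 5% of the messages and mentions "iPhone" in all of them. The sketch also assumes that Alice and Bob are the only users, so that p(context) is the sum of p(X) * p(context|X) over both. It applies the formulas p(X|context) = p(X) * lift(X|context) and lift(X|context) = p(context|X) / p(context) directly and does not reflect how Aito estimates these values internally.

```python
from typing import Dict


def posterior_and_lift(prior: Dict[str, float],
                       likelihood: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """Compute $p and $lift for each candidate X.

    prior[x] is p(X) and likelihood[x] is p(context|X).
    Assumption: the candidates are mutually exclusive and exhaustive.
    """
    p_context = sum(prior[x] * likelihood[x] for x in prior)
    scores = {}
    for x in prior:
        lift = likelihood[x] / p_context
        scores[x] = {"$p": prior[x] * lift, "$lift": lift}
    return scores


# Hypothetical numbers, chosen only to illustrate the difference.
prior = {"alice": 0.95, "bob": 0.05}
likelihood = {"alice": 0.10, "bob": 1.00}

scores = posterior_and_lift(prior, likelihood)
best_by_p = max(scores, key=lambda x: scores[x]["$p"])
best_by_lift = max(scores, key=lambda x: scores[x]["$lift"])
print(best_by_p, best_by_lift)  # alice bob
```

With these numbers Alice has the higher $p (about 0.66 against 0.34), because she writes most of the messages, while Bob has the higher $lift (about 6.9 against 0.69), because "iPhone" is far more characteristic of him.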

Text operators

Useful for creating conditional queries with text fields.

$match

Operator to check if a textual field fuzzy matches a given string.

Case insensitive. The matched text is split into tokens with the analyzer specified for the field in the schema. For example, { "$match": "great programmers" } will match the strings "Bob is the greatest programmer!" and "Programmers are having great fun" if the field is properly analyzed with the English analyzer.

Format
{
"$match": string
}

Examples

{ "$match": "coffee" }
{
  "from": "products",
  "where": {
    "name": { "$match": "coffee" }
  }
}

$startsWith

Operator to check if a textual field starts with a given string. Case sensitive.

Format
{
"$startsWith": string
}

Examples

{ "$startsWith": "Cucumber" }
{
  "from": "products",
  "where": {
    "name": { "$startsWith": "Cucumber" }
  }
}

Comparison operators

Useful for creating conditional queries.

$gt

Operator to check if a field is greater than a given value.

Format
{
"$gt": integer or number or null or boolean or string
}

Examples

{ "$gt": 8 }
{ "$gt": 231.1 }
{ "$gt": "20150308" }
{
  "from": "products",
  "where": {
    "price": { "$gt": 2.14 }
  }
}

$gte

Operator to check if a field is greater than or equal to a given value.

Format
{
"$gte": integer or number or null or boolean or string
}

Examples

{ "$gte": -2 }
{ "$gte": 0 }
{ "$gte": "20180502" }
{
  "from": "products",
  "where": {
    "price": { "$gte": 2 }
  }
}

$lt

Operator to check if a field is less than a given value.

Format
{
"$lt": integer or number or null or boolean or string
}

Examples

{ "$lt": 4 }
{ "$lt": -12.1 }
{ "$lt": "20180502" }
{
  "from": "products",
  "where": {
    "price": { "$lt": 1.24 }
  }
}

$lte

Operator to check if a field is less than or equal to a given value.

Format
{
"$lte": integer or number or null or boolean or string
}

Examples

{ "$lte": 8 }
{ "$lte": 0 }
{ "$lte": "20180502" }
{
  "from": "products",
  "where": {
    "price": { "$lte": 1 }
  }
}

$has

Has operation checks whether the field has the specified feature.

$has is a low-level operation that operates at the feature level. The features can differ significantly from the original data, especially for text fields where analyzers are used.

For example, if you have a field called content with the text "programmers and horses", the field would have the features 'programmer' and 'hors', which are the stems produced by the English analyzer.

Format
{
"$has": integer or number or null or boolean or string
}

Examples

{ "$has": "drink" }
{
  "from": "products",
  "where": {
    "tags": { "$has": "drink" }
  }
}

$defined

Operator to select rows based on whether a nullable field has been defined or not.

Format
{
"$defined": boolean
}

Example

{ "$defined": true }

$exists

An operator to get features of given field(s).

Format
{
// PropositionSet expression is used to describe a collection of propositions.
"$exists": PropositionSet
}

Examples

{
  "$exists": ["query", "product.tags"]
}
{
  "from": "impressions",
  "where": {
    "$on": [
      {
        "$exists": ["query", "customer.tags"]
      },
      { "click": true }
    ]
  },
  "relate": ["product.title", "product.tags"]
}

Logical operators

Useful for combining multiple conditions in conditional queries.

$and

Performs a logical and operation on the given array containing two or more Propositions.

Format
{
"$and": [Proposition, ...]
}

Examples

{
  "$and": [
    { "$gt": 10 },
    { "$lt": 20 }
  ]
}
{
  "from": "products",
  "where": {
    "price": {
      "$and": [
        { "$gt": 1.5 },
        { "$lt": 2.1 }
      ]
    }
  }
}

$or

Performs a logical or operation on the given array containing two or more Propositions.

Format
{
"$or": [Proposition, ...]
}

Examples

{
  "$or": [
    { "tags": "cover" },
    { "tags": "laptop" }
  ]
}
{
  "from": "products",
  "where": {
    "price": {
      "$or": [
        { "$lt": 0.9 },
        { "$gt": 2.1 }
      ]
    }
  }
}

$not

Performs a logical not operation on the given Proposition.

Format
{
// Proposition expression describes a fact, or a statement.
"$not": Proposition
}

Examples

{
  "$not": { "tags": "laptop" }
}
{
  "$not": { "$lt": 0 }
}
{
  "from": "products",
  "where": {
    "price": {
      "$not": { "$lt": 1.1 }
    }
  }
}
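
The way $and, $or and $not combine comparison operators can be mimicked locally. The sketch below is only an illustration of the logic, not Aito's implementation: the `matches` helper is hypothetical, and it assumes plain Python comparison semantics for the values.

```python
from typing import Any


def matches(value: Any, prop: dict) -> bool:
    """Evaluate a single-operator proposition against one field value.

    Hypothetical local helper for illustration only.
    """
    op, arg = next(iter(prop.items()))
    if op == "$and":
        return all(matches(value, p) for p in arg)
    if op == "$or":
        return any(matches(value, p) for p in arg)
    if op == "$not":
        return not matches(value, arg)
    if op == "$gt":
        return value > arg
    if op == "$gte":
        return value >= arg
    if op == "$lt":
        return value < arg
    if op == "$lte":
        return value <= arg
    raise ValueError(f"unsupported operator: {op}")


# The same propositions as in the "price" examples above.
in_range = {"$and": [{"$gt": 1.5}, {"$lt": 2.1}]}
outside = {"$or": [{"$lt": 0.9}, {"$gt": 2.1}]}
not_cheap = {"$not": {"$lt": 1.1}}
```

With these definitions, `matches(1.8, in_range)` holds while `matches(2.1, in_range)` does not, because $lt is strict.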

Sort operators

Can be used in "orderBy" clause to declare the sorting order of the result.

$asc

Sort returned hits in ascending order (A-Z) based on the given attribute or custom scoring function.

Format
{
// Value expression resolves to a primitive like int or json, score, probability or.
"$asc": Value
}

Examples

{ "$asc": "price" }
{ "$asc": "product.price" }
{
  "$asc": {
    "$multiply": ["product.price", "$p"]
  }
}

$asc(Relate)

Sort returned hits in ascending order (A-Z) based on the given attribute (or column).

Format
{
"$asc": string
}

Example

{ "$asc": "lift" }

$desc

Sort returned hits in descending order (Z-A) based on the given attribute or custom scoring function.

Format
{
// Value expression resolves to a primitive like int or json, score, probability or.
"$desc": Value
}

Examples

{ "$desc": "price" }
{ "$desc": "product.price" }
{
  "$desc": {
    "$multiply": ["product.price", "$p"]
  }
}
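
The effect of $asc and $desc on a result set can be pictured with a local sort. This is a sketch under assumptions: the `order_hits` helper and the sample hits are hypothetical, and only the plain field-name form is handled, not custom scoring functions.

```python
hits = [
    {"name": "pen", "price": 1.2},
    {"name": "laptop", "price": 900.0},
    {"name": "cable", "price": 5.0},
]


def order_hits(hits, order_by):
    """Sort hits by the field named in {"$asc": field} or {"$desc": field}."""
    direction, field = next(iter(order_by.items()))
    if direction not in ("$asc", "$desc"):
        raise ValueError(f"unsupported sort operator: {direction}")
    return sorted(hits, key=lambda h: h[field], reverse=(direction == "$desc"))


ascending = [h["name"] for h in order_hits(hits, {"$asc": "price"})]
descending = [h["name"] for h in order_hits(hits, {"$desc": "price"})]
```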

$desc(Relate)

Sort returned hits in descending (Z-A) order based on the given attribute (or column).

Format
{
"$desc": string
}

Example

{ "$desc": "info.miTrue" }

Arithmetic operators

Can be used in conditional queries or scoring in "orderBy" clauses.

$mod

Operator to check if the value of a field divided by a divisor has the specified remainder.

In other words, it performs a modulo operation. This operator supports both an object form and an array form. Note that the field will be converted to an integer (effectively a math floor) before the modulo operation.

Format

Examples

{
  "$mod": [2, 0]
}
{
  "$mod": { "divisor": 2, "remainder": 0 }
}
{
  "from": "products",
  "where": {
    "price": {
      "$mod": { "divisor": 2, "remainder": 0 }
    }
  }
}
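
The floor-then-modulo behaviour described for $mod can be sketched as follows. The `matches_mod` helper is hypothetical, and the array form is assumed to be ordered as [divisor, remainder], which is inferred from the paired examples above.

```python
import math
from typing import Union


def matches_mod(value: float, mod: Union[list, dict]) -> bool:
    """Check a value against a $mod argument given in array or object form."""
    if isinstance(mod, dict):
        divisor, remainder = mod["divisor"], mod["remainder"]
    else:
        # Assumption: the array form is [divisor, remainder].
        divisor, remainder = mod
    # The field is converted to an integer (math floor) before the modulo.
    return math.floor(value) % divisor == remainder
```

For instance, a price of 4.7 is floored to 4 first, so it matches `[2, 0]`.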

$multiply

Multiplication operation of given items.

Format
{
"$multiply": [Score, ...]
}

Example

{
  "$multiply": ["price", 2]
}

$pow

Exponentiation operation. First item raised to the power of the second.

Format

Example

{
  "$pow": ["width", 2]
}

$sum

Calculates sum of given items.

Format
{
"$sum": [Score, ...]
}

Example

{
  "$sum": ["priceNet", "priceVat"]
}
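
To make the arithmetic operators concrete, the sketch below evaluates $multiply, $pow and $sum expressions against a single row. It is a local illustration only: the `score` helper and the sample row are hypothetical, and strings are assumed to refer to numeric fields of the row.

```python
import math


def score(expr, row):
    """Evaluate a nested arithmetic expression against one row."""
    if isinstance(expr, (int, float)):
        return float(expr)          # numeric literal
    if isinstance(expr, str):
        return float(row[expr])     # field reference
    op, args = next(iter(expr.items()))
    values = [score(a, row) for a in args]
    if op == "$multiply":
        return math.prod(values)
    if op == "$sum":
        return sum(values)
    if op == "$pow":
        base, exponent = values     # first item raised to the power of the second
        return base ** exponent
    raise ValueError(f"unsupported operator: {op}")


row = {"price": 3.0, "width": 4.0, "priceNet": 10.0, "priceVat": 2.4}
doubled = score({"$multiply": ["price", 2]}, row)
area = score({"$pow": ["width", 2]}, row)
gross = score({"$sum": ["priceNet", "priceVat"]}, row)
```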

Advanced operators

More advanced operators which can improve query results in certain situations.

$atomic

Transforms a statement into a 'black box' proposition.

This prevents Aito from analyzing the proposition and using its parts separately in the statistical reasoning.

In practice, the difference between normal 'white box' expressions and $atomic's 'black box' expressions is that atomic expressions have a smaller bias but a higher measurement error.

Consider the following example:

{
  "tags": "pen",
  "price": { "$gte": 200 }
}

During the statistical reasoning, Aito may recognize that pens are often sold, and that purchases of products costing over 200€ are somewhat common. As a result, Aito might assume that a pen costing over 200€ is a popular product.

Now, consider the expression:

{
  "$atomic": {
    "tags": "pen",
    "price": { "$gte" : 200 }
  }
}

The results of this expression will depend on the amount of data. If there are no pens costing over 200€ in the data, Aito will make no assumptions about the proposition's effect. On the other hand, if you have the data, Aito will correctly recognize that pens costing over 200€ are bought extremely rarely.
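
The bias versus measurement error trade-off can be illustrated with simple frequency counts. This is a rough sketch, not Aito's algorithm: the rows are made up, and the 'white box' estimate is assumed to combine the per-feature lifts as if they were independent.

```python
# Hypothetical purchase data.
rows = [
    {"tags": "pen", "price": 2, "purchase": True},
    {"tags": "pen", "price": 3, "purchase": True},
    {"tags": "pen", "price": 250, "purchase": False},
    {"tags": "laptop", "price": 900, "purchase": True},
    {"tags": "laptop", "price": 1200, "purchase": True},
    {"tags": "cable", "price": 5, "purchase": False},
]


def p_purchase(rows, condition=lambda r: True):
    """Purchase frequency among rows matching the condition, or None if no rows match."""
    subset = [r for r in rows if condition(r)]
    if not subset:
        return None  # no data: make no assumption
    return sum(r["purchase"] for r in subset) / len(subset)


def is_pen(r):
    return r["tags"] == "pen"


def is_expensive(r):
    return r["price"] >= 200


base = p_purchase(rows)
# White box (assumed independence): each part contributes its own lift.
white_box = base * (p_purchase(rows, is_pen) / base) * (p_purchase(rows, is_expensive) / base)
# Black box: the whole statement is measured as one unit.
black_box = p_purchase(rows, lambda r: is_pen(r) and is_expensive(r))
```

Here the white box estimate stays high because pens and expensive products each sell well on their own, while the black box estimate rests on the single expensive pen in the data, which was not purchased.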

Format
{
// Proposition expression describes a fact, or a statement.
"$atomic": Proposition
}

Examples

{
  "$atomic": {
    "tags": "pen",
    "price": { "$gte": 200 }
  }
}
{
  "from": "products",
  "where": {
    "$atomic": {
      "tags": "pen",
      "price": { "$gte": 200 }
    }
  }
}

$context

Provides ability to access the fields of the table specified in "from", instead of fields of the table in "get".

Format
{
// Proposition expression describes a fact, or a statement.
"$context": Proposition
}

Examples

{
  "$context": { "click": true }
}
{
  "from": "impressions",
  "where": { "customerEmail": "john.doe@aito.ai", "query": "laptop" },
  "get": "product",
  "orderBy": {
    "$p": {
      "$context": { "click": true }
    }
  }
}

$hit

Provides ability to access the fields of the hit.

Format
{
// Score expression resolves to a numeric score value or probability.
"$hit": Score
}

Examples

{ "$hit": "price" }
{ "$hit": "$similarity" }
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$multiply": [
      { "$hit": "$similarity" },
      { "$hit": "price" }
    ]
  }
}

$on

$on operator is used to define conditional propositions or hard filters.

This is useful when you have a limited amount of data and the condition would help to limit the context and provide better results. This can be done by providing a list of two items: the first object (or "prop") is the hypothesis and the second object (or "on") is the condition.

In Aito, the where clause contains propositions which aren't hard filters. Instead, Aito will turn all the propositions into features (the user's ID, every word in a text field, etc.). There are many of these and they are not statistically independent. Aito picks a subset of these features that are the best predictors of the field that is to be predicted. So what goes into the "where" is a description of the situation you're in, and Aito tells you what you should expect to find if you look in a field. But the description is not taken at face value: Aito will ignore parts of it if they don't help the prediction.

However, there is another way to achieve this: the "$on" proposition. It is modeled after conditional probability. It is divided into two parts, the normal "where" parts and the conditional part ("hard filters"). The "$on" parameters explained:

{
  "from": "...",
  "where": {
    "$on": [
      {
        "message": "hello, world",
        "something": true,
        // other things you put in your "where" clause
      },
      {
        // The subset of data that exactly matches these conditions
        "userId": 42,
        "day": "monday"
      }
    ]
  },
  "predict": "..."
}

The $on operator can also be combined with a normal query. If the $on condition is too strong, you could move parts of the filtering back to the where clause:

{
  "from": "...",
  "where": {
    "$on": [
      {
        "message": "hello, world",
        "something": true,
        // other things you put in your "where" clause
      },
      {
        // The subset of data that exactly matches these conditions
        "day": "monday"
      }
    ],
    "userId": 42
  },
  "predict": "..."
}
Format

Examples

{
  "$on": {
    "prop": { "click": true },
    "on": { "user.tags": "nyc" }
  }
}
{
  "$on": [
    { "click": true },
    { "user.tags": "nyc" }
  ]
}
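
Since $on is modeled after conditional probability, its two parts can be pictured as a hard filter followed by a frequency estimate. The sketch below is only an illustration with made-up rows and a hypothetical `conditional_frequency` helper. Aito's actual inference is statistical and more involved than this plain count.

```python
# Hypothetical impression rows.
rows = [
    {"click": True, "user.tags": "nyc"},
    {"click": True, "user.tags": "nyc"},
    {"click": True, "user.tags": "nyc"},
    {"click": False, "user.tags": "nyc"},
    {"click": False, "user.tags": "sf"},
    {"click": False, "user.tags": "sf"},
]


def holds(row, condition):
    """True if every key of the condition exactly matches the row."""
    return all(row.get(key) == value for key, value in condition.items())


def conditional_frequency(rows, prop, on):
    """Frequency of `prop` among the rows that exactly match `on`."""
    subset = [r for r in rows if holds(r, on)]  # the hard filter
    if not subset:
        return None
    return sum(holds(r, prop) for r in subset) / len(subset)


p_click_given_nyc = conditional_frequency(rows, {"click": True}, {"user.tags": "nyc"})
p_click_overall = conditional_frequency(rows, {"click": True}, {})
```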

$knn

The $knn operator is an adaptation of the classic k-nearest neighbor algorithm.

Aito's $knn operator identifies k most similar rows to the conditions defined in the 'near' parameter. The similarity metric is the same metric used in the similarity query. The k nearest rows can be used in inference.

The $knn operator can be useful in situations where there is no training data. For example:

{
  "from": "impressions",
  "where": {
    "product.name": "Columbian Coffee",
    "product.tags": "high quality coffee"
  },
  "predict": "purchase"
}

This query would not yield sensible results, since no such product exists in the current data. This can be improved by using the $knn operator:

{
  "from": "impressions",
  "where": {
    "$knn": {
      "k": 5,
      "near": {
        "product.name": "Columbian Coffee",
        "product.tags": "high quality coffee"
      }
    }
  },
  "predict": "purchase"
}

In the query above, Aito would look for the 5 entries that are most similar to the given criteria in "near" and use them for inference.
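
The idea of picking the k most similar rows can be sketched locally. Note the assumptions: the sample products and helpers are hypothetical, and Jaccard overlap of lowercase words is used as a stand-in for similarity. Aito uses the metric of its similarity query instead.

```python
def tokens(row):
    """Set of (field, word) pairs for all text fields of a row."""
    return {(field, word) for field, text in row.items() for word in text.lower().split()}


def jaccard(a, b):
    """Overlap of two sets; a stand-in similarity, not Aito's metric."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def k_nearest(rows, near, k):
    """Return the k rows most similar to the `near` conditions."""
    target = tokens(near)
    return sorted(rows, key=lambda r: jaccard(tokens(r), target), reverse=True)[:k]


products = [
    {"product.name": "Brazilian Coffee", "product.tags": "coffee"},
    {"product.name": "Green Tea", "product.tags": "tea drink"},
    {"product.name": "Espresso Beans", "product.tags": "high quality coffee"},
    {"product.name": "Rye Bread", "product.tags": "bread"},
]
near = {"product.name": "Columbian Coffee", "product.tags": "high quality coffee"}
nearest_names = [p["product.name"] for p in k_nearest(products, near, 2)]
```

The rows selected this way would then serve as the evidence for inference, even though no row matches the "near" conditions exactly.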

Format

Examples

{
  "$knn": [
    4,
    { "tags": "laptop" }
  ]
}
{
  "$knn": {
    "k": 4,
    "near": { "tags": "laptop" }
  }
}

$numeric

Operator to check if a numeric field fuzzy matches a given number.

By default, numbers are compared exactly against one another. The $numeric proposition signifies that comparisons should be inexact and that the target is somewhere close to the specified number. The size of the region depends on the spread and density of the data.

Format
{
"$numeric":
integer
or
number
or
null
}

Examples

{ "$numeric": 42 }
{ "$numeric": 3.14 }

$hash

$hash converts the field value into a hash integer.

The hash code can be used to split non-integer data pseudo-randomly in the evaluate query.

Format
{
// Proposition expression describes a fact, or a statement.
"$hash": Proposition
}

Example

{
  "$hash": {
    "$mod": [2, 1]
  }
}
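
The pseudo-random split that $hash enables together with $mod can be imitated as below. This is a sketch under assumptions: Aito's hash function is not specified here, so CRC32 is used as a stand-in, and the helper names and e-mail addresses are hypothetical.

```python
import zlib


def stable_hash(value: str) -> int:
    """Deterministic integer hash of a string (CRC32 as a stand-in)."""
    return zlib.crc32(value.encode("utf-8"))


def is_test_row(value: str, divisor: int = 2, remainder: int = 1) -> bool:
    """Mirror of {"$hash": {"$mod": [2, 1]}}: hash first, then apply the modulo."""
    return stable_hash(value) % divisor == remainder


emails = ["ann@example.com", "bob@example.com", "cid@example.com", "dee@example.com"]
test_set = [e for e in emails if is_test_row(e)]
train_set = [e for e in emails if not is_test_row(e)]
```

Because the hash is deterministic, a given value always lands in the same split, which keeps the training and test sets stable between runs.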

Scoring operators

Can be used in "orderBy" clause to sort or create an advanced scoring algorithm.

$lift

"$lift" can be used in the "orderBy" clause of the Generic query to get the most likely values based on lifts of features with regard to other features.

In the grocery dataset, the following query would yield the products with a name similar to "lactose" that have the highest lift of being purchased:

{
  "from": "impressions",
  "where": {
    "product.name": {"$match": "lactose"}
  },
  "get": "purchase",
  "orderBy": "$lift"
}

Running the following query would yield the most likely product based on all the fields of the linked product table:

{
  "from": "impressions",
  "where": {
    "session.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": "$lift"
}

Since the product field in the impressions table is linked to the products table, Aito finds all the statistical relations between what is declared inside the "where" clause and the features of every field of a product, that is, id, name, category, price, and tag. In this case, the lift score is the product of the lifts of each field's features. We can investigate this by opening up the explanation, adding the $why operator to the "select" clause (e.g. "select": ["$score", "$why"]):

"$why": {
  "type": "product",
  "factors": [
    {
      "type": "hitVariableLift",
      "variable": "id:6410405093677",
      "value": 1.9827806375460209,
      "factors": [
        {
          "type": "relatedVariableLift",
          "variable": "purchase:true",
          "value": 1.9827806375460209
        }
      ]
    },
    {
      "type": "hitVariableLift",
      "variable": "not(name:puikula)",
      "value": 1.0472308585357502,
      "factors": [
        {
          "type": "relatedVariableLift",
          "variable": "purchase:true",
          "value": 1.0472308585357502
        }
      ]
    }
    ...
  ]
}

We can see that the lift score is composed of the lift of an id feature, a name feature, and others.

See also $p and $lift.

Format
string

Examples

"$lift"
{
  "from": "messages",
  "where": {
    "message": { "$match": "dog" }
  },
  "get": "user",
  "orderBy": "$lift"
}

$p

"$p" can be used in the "orderBy" clause of the Generic query to get the most probable values. When used this way, it is similar to the Match query.

In the grocery dataset, the following query would yield the products with a name similar to "lactose" that have the highest probability of being purchased:

{
  "from": "impressions",
  "where": {
    "product.name": {"$match": "lactose"}
  },
  "get": "purchase",
  "orderBy": "$p"
}

Similar to the Match query, running the following query would yield the most likely product based on all the fields of the linked product table:

{
  "from": "impressions",
  "where": {
    "session.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": "$p"
}

Since the product field in the impressions table is linked to the products table, Aito finds all the statistical relations between what is declared inside the "where" clause and the features of every field of a product, that is, id, name, category, price, and tag. In this case, the probability score is the normalized product of the lifts of each field's features. We can investigate this by opening up the explanation, adding the $why operator to the "select" clause (e.g. "select": ["$score", "$why"]):

"$why": {
  "type": "product",
  "factors": [
    {
      "type": "hitVariableLift",
      "variable": "id:6410405093677",
      "value": 1.9827806375460209,
      "factors": [
        {
          "type": "relatedVariableLift",
          "variable": "purchase:true",
          "value": 1.9827806375460209
        }
      ]
    },
    {
      "type": "hitVariableLift",
      "variable": "not(name:puikula)",
      "value": 1.0472308585357502,
      "factors": [
        {
          "type": "relatedVariableLift",
          "variable": "purchase:true",
          "value": 1.0472308585357502
        }
      ]
    }
    ...
  ]
}

We can see that the probability score is composed of the lift of an id feature, a name feature, and others.

See also $p and $lift.

Format
string

Examples

"$p"
{
  "from": "messages",
  "where": {
    "message": { "$match": "dog" }
  },
  "get": "user",
  "orderBy": "$p"
}

$similarity

"$similarity" can be used in Generic query to get most similar rows based on the contents of the "where" clause.

Consider the following example. It returns all the products that contain 'iphone' in the title. It also sorts the results by their similarity to 'iphone' and highlights the 'iphone' term in the product title field.

{
  "from": "product",
  "where": { "title": { "$match": "iphone" } },
  "get": "message",
  "orderBy": "$similarity",
  "select": ["title", "$highlight"]
}
Format
string

Examples

"$similarity"
{
  "from": "product",
  "where": {
    "title": { "$match": "iphone" }
  },
  "get": "message",
  "orderBy": "$similarity"
}

$lift object

Conceptually similar to the plain $lift operator, but allows using a customized proposition for the lift score calculation.

This $lift operator provides more options to customize the lift score calculation, especially when getting the values of a linked table.

  1. Narrow down the fields that are used to calculate the lift:

    This is similar to the behavior of the "basedOn" clause of the Match query

    When calculating the lift of a linked field, Aito uses all the fields of the linked table (see $lift for how the lift is calculated for a linked field).

    If you would like to narrow down how the lift is calculated, you can add the field name following the $lift. For example, find the most likely product based on only the product name:

    {
      "from": "impressions",
      "where": {
        "session.user": "bob",
        "purchase": true
      },
      "get": "product",
      "orderBy": {
        "$lift": "name"
      }
    }
    

    You can also calculate the lift based on multiple fields by using the array format. For instance:

    {
      "$lift": ["category", "tag"]
    }
    
  2. Calculate the lift based on a specific context:

    This is similar to the behavior of the Recommend query

    By combining with the $context operator, the lift score can be defined as the lift of a context. For instance, to find the products with the highest lift of getting purchased:

    {
      "from": "impressions",
      "where": {
        "session.user": "bob"
      },
      "get": "product",
      "orderBy": {
        "$lift": {"$context": {"purchase": true}}
      }
    }
    
Format
{
// PropositionSet expression is used to describe a collection of propositions.
"$lift": PropositionSet
}

Examples

{ "$lift": "tags" }
{
  "$lift": ["tags", "title"]
}
{
  "$lift": {
    "$context": { "click": true }
  }
}
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": { "$lift": "tags" }
}
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$lift": ["tags", "title"]
  }
}
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$lift": {
      "$context": { "click": true }
    }
  }
}

$p object

Conceptually similar to the plain $p operator, but allows using a customized proposition for the probability score calculation.

This $p operator provides more options to customize the probability score calculation, especially when getting the values of a linked table:

  1. Narrow down the fields that are used to calculate the probability:

    This is similar to the behavior of the "basedOn" clause of the Match query

    When calculating the probability of a linked field, Aito uses all the fields of the linked table (see $p for how the probability is calculated for a linked field).

    If you would like to narrow down how the probability is calculated, you can add the field name following the $p. For example, find the most likely product based on only the product name:

    {
      "from": "impressions",
      "where": {
        "session.user": "bob",
        "purchase": true
      },
      "get": "product",
      "orderBy": {
        "$p": "name"
      }
    }
    

    You can also calculate the probability based on multiple fields by using the array format. For instance:

    {
      "$p": ["category", "tag"]
    }
    
  2. Calculate the probability based on a specific context:

    This is similar to the behavior of the Recommend query

    By combining with the $context operator, the probability score can be defined as the probability of a context. For instance, to find the products with the highest probability that the product would be purchased:

    {
      "from": "impressions",
      "where": {
        "session.user": "bob"
      },
      "get": "product",
      "orderBy": {
        "$p": {"$context": {"purchase": true}}
      }
    }
    
Format
{
// PropositionSet expression is used to describe a collection of propositions.
"$p": PropositionSet
}

Examples

{ "$p": "tags" }
{
  "$p": ["tags", "title"]
}
{
  "$p": {
    "$context": { "click": true }
  }
}
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": { "$p": "tags" }
}
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$p": ["tags", "title"]
  }
}
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$p": {
      "$context": { "click": true }
    }
  }
}

$similarity object

Conceptually similar to the plain $similarity operator, but allows using a customized proposition for the similarity score calculation.

The plain $similarity operator calculates the similarity score based on the "where" clause contents, whereas this $similarity operator calculates the similarity score based on the given proposition.

These Generic Queries would yield the same results:

{
  "from": "products",
  "where": {
    "name": {"$match": "coffee"}
  },
  "orderBy": "$similarity"
}
{
  "from": "products",
  "orderBy": {
    "$similarity": {
      "name": "coffee"
    }
  }
}

This $similarity operation is useful for customizing scoring, as in the example below. Please refer to the Generic query with custom scoring example.

{
  "from": "impressions",
  "where": {
    "session.user": "veronica"
  },
  "get": "product",
  "orderBy": {
    "$multiply": [
      {
        "$p": {
          "$context": {
            "purchase": true
          }
        }
      },
      {
        "$similarity": {
          "name": "coffee"
        }
      }
    ]
  }
}
Format
{
// Proposition expression describes a fact, or a statement.
"$similarity": Proposition
}

Examples

{
  "$similarity": { "title": "apple iphone", "tags": "premium ios phone" }
}
{
  "from": "products",
  "orderBy": {
    "$similarity": { "title": "apple iphone", "tags": "premium ios phone" }
  }
}

$f

"$f" can be used in the "orderBy" clause of the Generic query to get the frequency of a feature.

Format
string

Examples

"$f"
{ "from": "impressions", "get": "product", "orderBy": "$f" }

$normalize

The $normalize operator can be used in the "orderBy" clause of the Generic query to make the scores sum to 1. For example, you can normalize the $lift or the $lift object:

{
  "from": "impressions",
  "where": {
    "product.name": {"$match": "lactose"}
  },
  "get": "purchase",
  "orderBy": {
    "$normalize": "$lift"
  }
}

or

{
  "from": "impressions",
  "where": {
    "session.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": {
    "$normalize": {
      "$lift": { "$context": { "click": "true" } }
    }
  }
}
Format
{
// Score expression resolves to a numeric score value or probability.
"$normalize": Score
}

Examples

{ "$normalize": "$lift" }
{
  "$normalize": { "$lift": "name" }
}
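
Normalizing scores so that they sum to 1 can be sketched as follows. The lift values and the `normalize` helper are hypothetical, and the sketch assumes the normalization runs over the scores of all candidate hits.

```python
def normalize(scores):
    """Scale a mapping of non-negative scores so that they sum to 1."""
    total = sum(scores.values())
    if total == 0:
        raise ValueError("cannot normalize scores that sum to zero")
    return {key: value / total for key, value in scores.items()}


# Hypothetical lift scores of three candidate products.
lifts = {"milk": 2.0, "bread": 1.0, "butter": 1.0}
normalized = normalize(lifts)
```

The ordering of the hits is unchanged by this step. Only the scale differs, which makes the scores comparable to probabilities.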

Built-in attributes

Can be used in "select" clause.

$index

$index is a built-in variable which indicates the insertion index of a row. It can be used together with $mod to select parts of a table. It's useful for example in Evaluate query for selecting training or test data.

Format
string

Example

"$index"

$lift

"$lift" can be used in the "orderBy" clause of the Generic query to get the most likely values based on lifts of features with regard to other features.

In the grocery dataset, the following query would yield the products with a name similar to "lactose" that have the highest lift of being purchased:

{
  "from": "impressions",
  "where": {
    "product.name": {"$match": "lactose"}
  },
  "get": "purchase",
  "orderBy": "$lift"
}

Running the following query would yield the most likely product based on all the fields of the linked product table:

{
  "from": "impressions",
  "where": {
    "session.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": "$lift"
}

Since the product field in the impressions table is linked to the products table, Aito finds all the statistical relations between what is declared inside the "where" clause and the features of every field of a product, that is, id, name, category, price, and tag. In this case, the lift score is the product of the lifts of each field's features. We can investigate this by opening up the explanation, adding the $why operator to the "select" clause (e.g. "select": ["$score", "$why"]):

"$why": {
  "type": "product",
  "factors": [
    {
      "type": "hitVariableLift",
      "variable": "id:6410405093677",
      "value": 1.9827806375460209,
      "factors": [
        {
          "type": "relatedVariableLift",
          "variable": "purchase:true",
          "value": 1.9827806375460209
        }
      ]
    },
    {
      "type": "hitVariableLift",
      "variable": "not(name:puikula)",
      "value": 1.0472308585357502,
      "factors": [
        {
          "type": "relatedVariableLift",
          "variable": "purchase:true",
          "value": 1.0472308585357502
        }
      ]
    }
    ...
  ]
}

We can see that the lift score is composed of the lift of an id feature, a name feature, and others.

See also $p and $lift.

Format
string

Examples

"$lift"
{
  "from": "messages",
  "where": {
    "message": { "$match": "dog" }
  },
  "get": "user",
  "orderBy": "$lift"
}

$p

"$p" can be used in the "orderBy" clause of the Generic query to get the most probable values. When used this way, it is similar to the Match query.

In the grocery dataset, the following query would yield the products with a name similar to "lactose" that have the highest probability of being purchased:

{
  "from": "impressions",
  "where": {
    "product.name": {"$match": "lactose"}
  },
  "get": "purchase",
  "orderBy": "$p"
}

Similar to the Match query, running the following query would yield the most likely product based on all the fields of the linked product table:

{
  "from": "impressions",
  "where": {
    "session.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": "$p"
}

Since the product field in the impressions table is linked to the products table, Aito finds all the statistical relations between what is declared inside the "where" clause and the features of every field of a product, that is, id, name, category, price, and tag. In this case, the probability score is the normalized product of the lifts of each field's features. We can investigate this by opening up the explanation, adding the $why operator to the "select" clause (e.g. "select": ["$score", "$why"]):

"$why": {
  "type": "product",
  "factors": [
    {
      "type": "hitVariableLift",
      "variable": "id:6410405093677",
      "value": 1.9827806375460209,
      "factors": [
        {
          "type": "relatedVariableLift",
          "variable": "purchase:true",
          "value": 1.9827806375460209
        }
      ]
    },
    {
      "type": "hitVariableLift",
      "variable": "not(name:puikula)",
      "value": 1.0472308585357502,
      "factors": [
        {
          "type": "relatedVariableLift",
          "variable": "purchase:true",
          "value": 1.0472308585357502
        }
      ]
    }
    ...
  ]
}

We can see that the probability score is composed of the lift of an id feature, a name feature, and others.

See also $p and $lift.

Format
string

Examples

"$p"
{
  "from": "messages",
  "where": {
    "message": { "$match": "dog" }
  },
  "get": "user",
  "orderBy": "$p"
}

$why

When $why is selected, Aito opens up why a certain result was predicted. The explanation contains three different factors, which are explained below.

The three factors apply to an estimate of the form:

\[ p(x_i \mid A, B, C) \]

"baseP"

The base probability.

\[ p(X) \]

"normalizer"

The normalizer is only used, when exclusiveness is on. In this case, it is assumed that only one feature can be true at the same time, and that one feature will be true. In practice, exclusiveness enforces the probabilities of alternative features to sum to 1.0.

The normalizer is of the form:

\[ \dfrac{1}{p(X_0) + p(X_1) + \dots} \]

Probability lifts. For example: the lift may say a product is clicked with 2.3x likelihood (or 130% higher likelihood), when it has 5 stars.

A probability lift is of the form:

\[ \dfrac{p(A \mid X)}{p(A)} \]
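
The lift and normalizer formulas can be worked through with small numbers. In this sketch the probabilities are made up for illustration. Only the 2.3x lift and its reading as 130% higher likelihood come from the text above, and the helper names are hypothetical.

```python
def lift(p_a_given_x: float, p_a: float) -> float:
    """Probability lift p(A | X) / p(A)."""
    return p_a_given_x / p_a


def percent_higher(lift_value: float) -> float:
    """Express a lift as a percentage increase in likelihood."""
    return (lift_value - 1.0) * 100.0


def normalizer(probabilities) -> float:
    """1 / (p(X_0) + p(X_1) + ...), used when exclusiveness is on."""
    return 1.0 / sum(probabilities)


# Assumed numbers: 10% of products are clicked overall, 23% of 5-star products.
click_lift = lift(0.23, 0.10)
increase = percent_higher(click_lift)

# Assumed unnormalized estimates for two alternative features.
raw = [0.2, 0.3]
scaled = [p * normalizer(raw) for p in raw]
```

After scaling, the alternatives sum to 1, which is the effect exclusiveness enforces.
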
Format
string

Example

"$why"

Explanation objects

Explanation object when using the "$why" operator.

BaseLiftExplanation

Conceptually similar to BaseProbabilityExplanation, but shows the prior lift instead of the prior probability.

See more Probability vs. Lift

Format
{
// The explanation type: baseLift. Required.
"type": string,
// The prior lift. Required.
"value": number
}

Example

{ "type": "baseLift", "value": 31 }

BaseProbabilityExplanation

Explains the initial weight of a feature. It can be understood as the prior probability \(p(X)\) of a feature.

Let's take a look at an example of a Predict query:

{
  "from": "products",
  "where": {
    "name": "Columbian coffee"
  },
  "predict": "tags",
  "select": ["$p", "feature", "$why"]
}

When opening up the explanation with "$why" operator, a tag's feature "coffee" has a BaseProbabilityExplanation:

{
  "type": "baseP",
  "value": 0.16
}

This explanation tells us that Aito gives the feature "coffee" a prior probability of 0.16.

Format
{
// The explanation type: baseP. Required.
"type": string,
// The prior probability. Required.
"value": number
}

Example

{ "type": "baseP", "value": 0.5 }

ExponentExplanation

Explain how an exponent score was calculated.

The ExponentExplanation most commonly appears when opening up the explanation (e.g. using the $why operator) of an exponent score such as:

  1. The tf-idf score used to calculate the similarity in the Similarity query.
  2. The score of the $pow operator.

Format
{
// The explanation type: exponent. Required.
"type": string,
// The exponent score. Required.
"value": number,
// Explain how a score was calculated. Required.
"base": ScoreExplanation,
// Explain how a score was calculated. Required.
"power": ScoreExplanation
}

Example

{
  "type": "exponent",
  "value": 1.7551720221592049,
  "base": { "type": "idf", "value": 1.7551720221592049 },
  "power": { "type": "tf", "value": 1 }
}
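
The relation between the parts of an ExponentExplanation can be checked by hand. The sketch assumes that "value" equals the "base" value raised to the "power" value, which is consistent with the example above. The `recompute_exponent` helper is hypothetical.

```python
import math

explanation = {
    "type": "exponent",
    "value": 1.7551720221592049,
    "base": {"type": "idf", "value": 1.7551720221592049},
    "power": {"type": "tf", "value": 1},
}


def recompute_exponent(explanation: dict) -> float:
    """Recompute the score as base ** power (assumed relation)."""
    return explanation["base"]["value"] ** explanation["power"]["value"]


consistent = math.isclose(recompute_exponent(explanation), explanation["value"])
```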

FieldExplanation

Explain how a field score was calculated.

The field explanation most commonly appears when opening up the explanation (e.g: using the $why operator) of a score that was calculated using:

  1. A field value

    {
      "from" : "impressions",
      "where" : {
        "product.name":{"$match": "coffee"}
      },
      "get":"product",
      "orderBy" : {
        "$multiply": ["$p", "price"]
      },
      "select": ["$score", "$why"]
    }
    

    The explanation would contain the value of the "price" field that was used in the $multiply operator.

    {
      "type": "field",
      "field": "price",
      "value": 3.95
    }
    
  2. A field feature (e.g: $f operator for frequency):

    {
      "from" : "impressions",
      "where" : {
        "product.name":{"$match": "coffee"}
      },
      "get":"product",
      "orderBy" : "$f",
      "select": ["$score", "$why"]
    }
    

    The explanation would contain the frequency of the feature.

    {
      "type": "field",
      "field": "$f",
      "value": 152.0
    }
    
Format
{
// The explanation type: field. Required.
"type": string,
// The name or feature of the field. Required.
"field": string,
// The score value. Required.
"value": number
}

Example

{ "type": "field", "field": "price", "value": 1500 }

HitVariableLiftExplanation

Explain how a variable's lift was calculated.

A hit score is calculated by aggregating the scores of its variables (features). The HitVariableLiftExplanation explains how the lift of each variable was calculated.

A HitVariableLift can be:

  1. A similarity score

    A hit's field can contain a word that matches the stem of the given similarity condition. That word would have a HitVariableLift that is a similarity score. Let's take a look at an example of a Similarity query:

    {
      "from": "products",
      "similarity": {
        "name": "Columbian coffee",
        "tags": "expansive coffee"
      },
      "select": ["$score", "name", "tags", "$why"]
    }
    

    When opening up the explanation with "$why" operator, we can see that a hit with name "Juhla Mokka coffee 500g sj" containing the word coffee has a HitVariableLiftExplanation as follows:

    {
      "type": "hitVariableLift",
      "variable": "name:coffe",
      "value": 2.1726635013471625,
      "factors": [
        {
          "type": "exponent",
          "value": 2.1726635013471625,
          "base": {
            "type": "idf",
            "value": 2.1726635013471625
          },
          "power": {
            "type": "tf",
            "value": 1.0
          }
        }
      ]
    }
    
  2. An aggregated score of BaseLift and RelatedVariableLift

Let's take a look at an example of Match query:

{
  "from": "impressions",
  "where": {
    "session.user": "larry"
  },
  "match": "product",
  "select": ["$score", "name", "$why"]
}

When opening up the explanation with "$why" operator, the first hit has a HitVariableLiftExplanation as follows:

{
  "type": "hitVariableLift",
  "variable": "id:6410405216120",
  "value": 599.5491890842981,
  "factors": [
    {
      "type": "baseLift",
      "value": 265.0
    },
    {
      "type": "relatedVariableLift",
      "variable": "session.user:larry",
      "value": 2.2624497701294266
    }
  ]
}

This explains that the initial lift of the feature "id:6410405216120" is 265 and that, when the user is Larry, the relatedVariableLift is 2.2624497701294266. Hence the aggregated lift is 265 * 2.2624497701294266 = 599.5491890842981.
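
The arithmetic above can be checked with a few lines of Python. This is a minimal sketch, assuming the factors of a hitVariableLift are combined by multiplication, as the worked example suggests. The function name `aggregated_lift` is hypothetical and not part of any Aito client.

```python
import math

# The explanation returned for the first hit of the Match query above.
explanation = {
    "type": "hitVariableLift",
    "variable": "id:6410405216120",
    "value": 599.5491890842981,
    "factors": [
        {"type": "baseLift", "value": 265.0},
        {
            "type": "relatedVariableLift",
            "variable": "session.user:larry",
            "value": 2.2624497701294266,
        },
    ],
}


def aggregated_lift(hit_variable_lift):
    """Multiply the values of all factors (assumed aggregation rule)."""
    return math.prod(factor["value"] for factor in hit_variable_lift["factors"])


lift = aggregated_lift(explanation)
print(lift)
```

The result should agree with the "value" field of the explanation up to floating-point rounding.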

Format
{
// The explanation type: hitVariableLift. Required.
"type": string,
// The variable. Required.
"variable": string,
// The aggregated lift value. Required.
"value": number,
// The factors contributing to the aggregated lift and their explanation. Required.
"factors": [ScoreExplanation, ...]
}

Example

{
  "type": "hitVariableLift",
  "variable": "id:3",
  "value": 31,
  "factors": [
    { "type": "baseLift", "value": 31 }
  ]
}

InverseDocumentFrequencyExplanation

Explain the inverse document frequency score.

The inverse document frequency score is one component of the term frequency-inverse document frequency score which is used in Aito's similarity metrics.

Format
{
// The explanation type: idf. Required.
"type": string,
// The inverse document frequency score. Required.
"value": number
}

Example

{ "type": "idf", "value": 1.7551720221592049 }

NamedExplanation

Explain how a special named score was calculated.

The NamedExplanation currently appears only when a score is calculated with exclusiveness. In this case, it explains the normalizer that forces the probabilities of a feature to sum to 1.0.

Format
{
// Normalizer. Required.
"type": string,
// Exclusiveness. Required.
"name": string,
// The value of the normalizer. Required.
"value": number
}

Example

{ "type": "normalizer", "name": "exclusiveness", "value": 0.2982788431762749 }

PredictExplanation

Explain how a probability was calculated.

The PredictExplanation most commonly appears when opening up the explanation (e.g. using the $why operator) of a Predict query. Let's take a look at an example of a Predict query:

  {
    "from": "products",
    "where": {
      "name": "Columbian coffee"
    },
    "predict": "tags",
    "select": ["$p", "feature", "$why"],
    "limit": 22
  }

The first hit has the following explanation:

{
  "type": "product",
  "factors": [
    {
      "type": "baseP",
      "value": 0.16
    },
    {
      "type": "normalizer",
      "name": "exclusiveness",
      "value": 0.3698470421219182
    },
    {
      "type": "relatedVariableLift",
      "variable": "name:coffe",
      "value": 8.45603245079726
    }
  ]
}
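
As a rough check of how the factors combine, the sketch below multiplies them together. It assumes that the type "product" means the factor values are multiplied to obtain the hit's probability; the reference does not state the resulting $p for this hit, so no expected value is claimed here. The name `probability_from` is hypothetical.

```python
import math

# The explanation of the first hit, copied from the example above.
why = {
    "type": "product",
    "factors": [
        {"type": "baseP", "value": 0.16},
        {"type": "normalizer", "name": "exclusiveness", "value": 0.3698470421219182},
        {"type": "relatedVariableLift", "variable": "name:coffe", "value": 8.45603245079726},
    ],
}


def probability_from(explanation):
    """Multiply all factor values of a 'product' explanation (assumed rule)."""
    return math.prod(factor["value"] for factor in explanation["factors"])


p = probability_from(why)

# Print each factor with a readable label, then the combined value.
for factor in why["factors"]:
    label = factor.get("variable") or factor.get("name") or factor["type"]
    print(f"{label}: {factor['value']}")
print("combined:", p)
```
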
Format
{
// The explanation type: product. Required.
"type": string,
// The explanation of the probability's factors. Required.
"factors": [ScoreExplanation, ...]
}

Example

{
  "type": "product",
  "factors": [
    { "type": "baseP", "value": 0.8048780487804879 },
    {
      "type": "normalizer",
      "name": "exclusiveness",
      "value": 0.04604801347746731
    }
  ]
}

ProductExplanation

Explain how a product score was calculated.

The ProductExplanation most commonly appears when opening up the explanation (e.g: using the $why operator) of:

  1. Aggregated score by product. For example, in a Match query:
    {
      "from": "impressions",
      "where": {
        "session.user": "larry"
      },
      "match": "product",
      "select": ["$score", "name", "$why"]
    }
    
    The final score is a product of multiple score components:
    {
      "type": "product",
      "factors": [
        {
          "type": "hitVariableLift",
          "variable": "id:6410405216120",
          "value": 599.5491890842981,
          "factors": [
            {
              "type": "baseLift",
              "value": 265.0
            },
            {
              "type": "relatedVariableLift",
              "variable": "session.user:larry",
              "value": 2.2624497701294266
            }
          ]
        },
        ...
      ]
    }
    
  2. A score calculated using the $multiply operator.
Format
{
// The explanation type: product. Required.
"type": string,
// The explanation of the product score's factors. Required.
"factors": [ScoreExplanation, ...]
}

Example

{
  "type": "product",
  "factors": [
    {
      "type": "hitVariableLift",
      "variable": "id:3",
      "value": 31,
      "factors": [
        { "type": "baseLift", "value": 31 }
      ]
    },
    { "type": "field", "field": "price", "value": 1500 }
  ]
}

RelatedVariableLiftExplanation

Explain how a related variable's lift was calculated.

A related variable (feature) most commonly appears when doing inference with some conditions. The RelatedVariableLiftExplanation explains how a variable from the conditions affects the lift of a hit's variable.

Let's take a look at an example of Match query:

{
  "from": "impressions",
  "where": {
    "session.user": "larry"
  },
  "match": "product",
  "select": ["$score", "name", "$why"]
}

When opening up the explanation with "$why" operator, the first hit has an explanation as follows:

{
  "type": "hitVariableLift",
  "variable": "id:6410405216120",
  "value": 599.5491890842981,
  "factors": [
    {
      "type": "baseLift",
      "value": 265.0
    },
    {
      "type": "relatedVariableLift",
      "variable": "session.user:larry",
      "value": 2.2624497701294266
    }
  ]
}

This explains that the feature "session.user:larry", extracted from the condition "where": { "session.user": "larry" }, increases the likelihood of the product with id 6410405216120 by a lift of 2.2624497701294266.

Format
{
// The explanation type: relatedVariableLift. Required.
"type": string,
// The related variable. Required.
"variable": string,
// The lift value of the related variable. Required.
"value": number
}

Example

{
  "type": "relatedVariableLift",
  "variable": "title:appl",
  "value": 2.268307643781859
}

ScoreExplanation

SumExplanation

Explain how a summation was calculated.

The SumExplanation most commonly appears when opening up the explanation (e.g: using the $why operator) of a score calculated using the $sum operator.

Format
{
// The explanation type: sum. Required.
"type": string,
// The explanation of the summed score's terms. Required.
"terms": [ScoreExplanation, ...]
}

Example

{
  "type": "sum",
  "terms": [
    { "type": "field", "field": "id", "value": 4 },
    { "type": "field", "field": "price", "value": 1500 }
  ]
}
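
Because explanations nest, it can help to recompute a value from its parts. The following is a minimal sketch of a recursive evaluator, assuming that "sum" adds its terms, that "product" and "hitVariableLift" multiply their factors, and that "exponent" raises its base to its power, as the examples in this section suggest. The function name `evaluate` is hypothetical.

```python
import math


def evaluate(explanation):
    """Recompute the value of a (possibly nested) score explanation."""
    kind = explanation["type"]
    if kind == "sum":
        return sum(evaluate(term) for term in explanation["terms"])
    if kind in ("product", "hitVariableLift"):
        return math.prod(evaluate(factor) for factor in explanation["factors"])
    if kind == "exponent":
        return evaluate(explanation["base"]) ** evaluate(explanation["power"])
    # Leaf explanations (field, baseLift, idf, tf, ...) carry a plain value.
    return explanation["value"]


# The SumExplanation example from above.
sum_example = {
    "type": "sum",
    "terms": [
        {"type": "field", "field": "id", "value": 4},
        {"type": "field", "field": "price", "value": 1500},
    ],
}

# The similarity-based HitVariableLiftExplanation shown earlier (idf ** tf).
similarity_lift = {
    "type": "hitVariableLift",
    "variable": "name:coffe",
    "value": 2.1726635013471625,
    "factors": [
        {
            "type": "exponent",
            "value": 2.1726635013471625,
            "base": {"type": "idf", "value": 2.1726635013471625},
            "power": {"type": "tf", "value": 1.0},
        }
    ],
}

print(evaluate(sum_example))
print(evaluate(similarity_lift))
```
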

TermFrequencyExplanation

Explain the term frequency score.

The term frequency score is one component of the term frequency-inverse document frequency score which is used in Aito's similarity metrics.

Format
{
// The explanation type: tf. Required.
"type": string,
// The term frequency score. Required.
"value": number
}

Example

{ "type": "tf", "value": 1 }

Other types

All other API types.

ColumnName

Name of a column in a table. Links are supported.

Format
string

Examples

"id"
"age"
"product.id"

DocumentProposition

DocumentProposition expresses statements about a document.

For example the expression

{
  "tags": "laptop",
  "price": { "$lt": 500 }
}

describes a product that is tagged as a laptop and whose price is under 500.

Format
object

Examples

{ "click": true }
{ "query": "laptop" }
{
  "product.tags": { "$match": "laptop" },
  "click": true
}

EvaluateGenericQuery

A Generic query to be evaluated in the Evaluate query

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Get expression defines what items are returned as query results.
"get": Get,
// Declares the sorting order of the result by a field or by a user-defined score.
"orderBy": OrderBy,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
// The number of results to skip from the beginning. Default: 0
"offset": integer,
// The maximum number of results to retrieve. Default: 10
"limit": integer
}

Example

{ "from": "impressions", "get": "product", "offset": 10, "limit": 20 }
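
The offset and limit parameters can be stepped through programmatically. The sketch below generates one query per page; the helper name `page_queries` and the total of 45 are hypothetical, and in practice the total would come from the "total" field of a previous response.

```python
def page_queries(query, total, limit=10):
    """Yield copies of `query` with offset/limit set for each page."""
    for offset in range(0, total, limit):
        # Build a new dict so the base query is left untouched.
        yield {**query, "offset": offset, "limit": limit}


base = {"from": "impressions", "get": "product"}
pages = list(page_queries(base, total=45, limit=20))

for page in pages:
    print(page)
```
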

EvaluateGroupedOperation

Supported query to be evaluated in EvaluateGroupedQuery. Currently, only the Generic query and the Recommend query are supported.

Format

Examples

{
  "from": "impressions",
  "where": {
    "session.user": { "$get": "session.user" },
    "product.name": { "$get": "query" }
  },
  "recommend": "product",
  "goal": { "purchase": "true" }
}
{
  "from": "impressions",
  "where": {
    "session.user": { "$get": "session.user" },
    "product.name": { "$get": "query" }
  },
  "get": "product",
  "orderBy": {
    "$p": { "purchase": true }
  }
}

EvaluateGroupedQuery

The EvaluateGroupedQuery is similar to the EvaluateQuery, with an additional option to group multiple entries into a single test case.

For example, if there is a "customerCohort" identifier in the "impressions" table, we can evaluate by customerCohort instead of by individual customer with the following EvaluateGroupedQuery:

{
  "evaluate": {
    "from": "impressions",
    "where": {
      "customer": { "$get": "customer" }
    },
    "recommend": "product",
    "goal": { "purchase": true }
  },
  "group": "customerCohort",
  "test": {
    "customerCohort": { "$gte": 5 }
  },
  "select": ["trainSamples", "testSamples", "meanRank"]
}
Format
{
// Proposition expression describes a fact, or a statement.
"train": Proposition,
// Proposition expression describes a fact, or a statement.
"test": Proposition,
// TestSource enables more options to choose the testing data in the [Evaluate Query](#post-api-v1-evaluate).
"testSource": TestSource,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
"group": string,
// Proposition expression describes a fact, or a statement.
"goal": Proposition,
// Supported query to be evaluated in [EvaluateGroupedQuery](#schema-evaluate-grouped-query). Required.
"evaluate": EvaluateGroupedOperation
}

Example

{
  "evaluate": {
    "from": "impressions",
    "where": {
      "session.user": { "$get": "session.user" },
      "product.name": { "$get": "query" }
    },
    "recommend": "product",
    "goal": { "purchase": "true" }
  },
  "group": "userGroup",
  "test": {
    "userGroup": { "$gte": 5 }
  },
  "select": ["accuracy", "meanRank", "n"]
}

EvaluateMatch

A Match query to be evaluated in the Evaluate query

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
// Get expression defines what items are returned as query results. Required.
"match": Get,
// PropositionSet expression is used to describe a collection of propositions.
"basedOn": PropositionSet,
// The number of results to skip from the beginning. Default: 0
"offset": integer,
// The maximum number of results to retrieve. Default: 10
"limit": integer
}

Example

{
  "match": "prevProduct",
  "select": ["title", "description", "price"],
  "offset": 2,
  "from": "impressions",
  "where": { "customer": 4, "query": "laptop" },
  "limit": 2
}

EvaluateMultiGenericQuery

The Generic Query to be evaluated in an EvaluateGroupedQuery

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Get expression defines what items are returned as query results. Required.
"get": Get,
// Declares the sorting order of the result by a field or by a user-defined score.
"orderBy": OrderBy,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
"offset": integer,
"limit": integer
}

Example

{
  "from": "impressions",
  "where": {
    "session.user": { "$get": "session.user" },
    "product.name": { "$get": "query" }
  },
  "get": "product",
  "orderBy": {
    "$p": { "purchase": true }
  }
}

EvaluateOperation

Examples

{
  "from": "messages",
  "get": "product",
  "similarity": {
    "title": { "$get": "message" },
    "description": { "$get": "message" }
  }
}
{
  "from": "products",
  "where": {
    "name": { "$get": "name" }
  },
  "get": "product",
  "orderBy": "$p"
}
{
  "from": "products",
  "where": {
    "name": { "$get": "name" }
  },
  "predict": "category"
}
{
  "from": "messages",
  "where": {
    "message": { "$get": "message" }
  },
  "match": "product"
}

EvaluatePredict

A Predict query to be evaluated in the Evaluate query

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
"predict": string,
"exclusiveness": boolean,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
"offset": integer,
"limit": integer
}

Example

{
  "from": "products",
  "where": {
    "name": { "$get": "name" }
  },
  "predict": "category"
}

EvaluateQuery

A query to evaluate:

Format
{
// Proposition expression describes a fact, or a statement.
"train": Proposition,
// Proposition expression describes a fact, or a statement.
"test": Proposition,
// TestSource enables more options to choose the testing data in the [Evaluate Query](#post-api-v1-evaluate).
"testSource": TestSource,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
// Operation to be evaluated. Required.
"evaluate": EvaluateOperation
}

Example

{
  "test": {
    "$index": {
      "$mod": [10, 1]
    }
  },
  "evaluate": {
    "from": "products",
    "where": {
      "name": { "$get": "name" }
    },
    "predict": "category"
  },
  "select": ["accuracy", "meanRank", "n"]
}
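
For illustration, an EvaluateQuery like the one above could be packaged into an HTTP request as follows. This is only a sketch: the path `/api/v1/_evaluate` is an assumption inferred from the endpoint anchor used in this reference, the host `https://example.invalid` and the key `my-key` are placeholders, and the helper name `build_evaluate_request` is hypothetical. The request is built but not sent.

```python
import json
import urllib.request


def build_evaluate_request(base_url, api_key, evaluate_query):
    """Build (but do not send) a POST request carrying an EvaluateQuery."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/_evaluate",  # assumed path
        data=json.dumps(evaluate_query).encode("utf-8"),
        headers={"x-api-key": api_key, "content-type": "application/json"},
        method="POST",
    )


query = {
    "test": {"$index": {"$mod": [10, 1]}},
    "evaluate": {
        "from": "products",
        "where": {"name": {"$get": "name"}},
        "predict": "category",
    },
    "select": ["accuracy", "meanRank", "n"],
}

# Placeholder host and key; replace with real values before sending.
request = build_evaluate_request("https://example.invalid", "my-key", query)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it.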

EvaluateRecommend

A Recommend query to be evaluated in the Evaluate query

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Get expression defines what items are returned as query results. Required.
"recommend": Get,
// Proposition expression describes a fact, or a statement. Required.
"goal": Proposition,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
"offset": integer,
"limit": integer
}

Example

{
  "from": "impressions",
  "where": {
    "session.user": { "$get": "session.user" },
    "product.name": { "$get": "query" }
  },
  "recommend": "product",
  "goal": { "purchase": "true" }
}

EvaluateSimilarity

A Similarity query to be evaluated in the Evaluate query

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Get expression defines what items are returned as query results. Required.
"get": Get,
// Proposition expression describes a fact, or a statement. Required.
"similarity": Proposition,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
"offset": integer,
"limit": integer
}

Example

{
  "from": "messages",
  "get": "product",
  "similarity": {
    "title": { "$get": "message" },
    "description": { "$get": "message" }
  }
}

ExponentPropositionArray

Define the base and the exponent of the $pow operator in the array format.

The first item of the array is the base and the second item of the array is the exponent.

Format

Example

["width", 2]

ExponentPropositionObject

Define the base and the exponent of the $pow operator in the object format.

Format
{
// Score expression resolves to a numeric score value or probability. Required.
"base": Score,
// Score expression resolves to a numeric score value or probability. Required.
"exponent": Score
}

Example

{ "base": "width", "exponent": 2 }

From

From expression declares the examined table.

Format

Examples

{
  "from": "impressions",
  "where": { "click": true }
}
"impressions"
{
  "from": {
    "from": "impressions",
    "where": { "click": true }
  },
  "where": { "query": "laptop" },
  "orderBy": "$p"
}
{ "from": "impressions" }

FromTablemodify

From expression declares the table containing the entries to delete.

Format
string

Examples

"impressions"
"products"
"customers"
"messages"

FromTablequery

From expression declares the examined table.

Format
string

Examples

"impressions"
"products"
"customers"
"messages"

FromWhere

FromWhere expression allows you to narrow the examined table.

When using FromWhere, Aito only considers that narrowed slice of the table.

For instance, this query:

{
  "from": {
    "from": "impressions",
    "where": {
      "session.user": "larry"
    }
  },
  "match": "product"
}

is different from:

{
  "from": "impressions",
  "where": {
    "session.user": "larry"
  },
  "match": "product"
}

In the first query, Aito matches Larry with products based only on Larry's impressions data, while in the second query, Aito matches Larry with products based on the impressions data of Larry and other users.
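
The structural difference between the two queries is easy to see when they are built programmatically. In this sketch the helper name `narrowed` is hypothetical; it only wraps a table name and a proposition into a FromWhere expression.

```python
def narrowed(table, where):
    """Wrap a table name and a proposition into a FromWhere expression."""
    return {"from": table, "where": where}


condition = {"session.user": "larry"}

# Narrowing: the condition lives inside "from", so it restricts the data.
narrowed_query = {"from": narrowed("impressions", condition), "match": "product"}

# Conditioning: the condition is a top-level "where" on the whole table.
conditioned_query = {"from": "impressions", "where": condition, "match": "product"}

print(narrowed_query)
print(conditioned_query)
```
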

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement. Required.
"where": Proposition,
"limit": integer
}

Example

{
  "from": "impressions",
  "where": { "click": true }
}

Get

Get expression defines what items are returned as query results.

By default, the hits are from the table defined in the "from" clause. In some cases, you may want to declare propositions like 'query is laptop' in the impressions table, while returning results from the separate products table, based on click likelihood. In this case, you may have a query such as:

{
  "from": "impressions",
  "where": { "query": "laptop" },
  "get": "product",
  "orderBy": {
    "$p": {
      "$context": { "click": true }
    }
  }
}

The "get" expression takes a field name as a parameter. If the field is a link, the returned results are from the linked table. If the field is not a link, the field values are returned as results.

Normally, the result of a query consists of the field values that best fulfill the query conditions. Field analyzers extract features from text fields, and the $feature property can be used to return features instead of complete field values. For instance, the following example demonstrates how to discover product tags which are likely to lead to sales:

{
  "from": "impressions",
  "where": { "query": "cheap phone" },
  "get": "product.tags.$feature",
  "orderBy": {
    "$p": {
      "$context": { "click": true }
    }
  }
}

The $feature syntax also allows you to examine the values/features of a link field as if it were a regular field.

Format
string

Examples

"product"
"user"
"text.$feature"
"link.field"
"link.$feature"
"link.text.$feature"

GetValue

$get is used to access external variables in an Evaluate query.

$get is currently only used in Evaluate queries. The Evaluate query tests a specified query by examining the table rows one by one. $get allows accessing the tested row's properties.

Consider the following example.

Given a table containing products data with the following schema:

"products": {
  "type": "table",
  "columns": {
    "title": { "type": "Text", "analyzer": "English" },
    "description": { "type": "Text", "analyzer": "English" }
  }
}

and a table containing impressions data with the following schema:

"impressions": {
  "type": "table",
  "columns": {
    "customer": { "type": "Int", "link": "customers.id" },
    "product": { "type": "Int", "link": "products.id" },
    "query": { "type": "Text", "analyzer": "English" }
  }
}

The goal is to test how well the traditional TF-IDF similarity metric works for finding products. The $get is used in the similarity query to compare the product's title and description fields with the impressions table's query field.

{
  "test": {
    "click": true
  },
  "evaluate": {
    "from": "impressions",
    "get": "product",
    "similarity": {
      "title": { "$get": "query" },
      "description": { "$get": "query" }
    }
  },
  "select": ["trainSamples", "n", "accuracy", "baseAccuracy", "meanRank", "mxe"]
}

With out the "$get", it

Format
{
"$get": string
}

Examples

{ "$get": "query" }
{ "$get": "click" }
{ "$get": "product.title" }
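
To illustrate the idea behind "$get", the sketch below fills a query template with the values of one tested row. It is a conceptual illustration only, not a description of how Aito implements evaluation. The function name `resolve_get` and the sample row are hypothetical, and linked paths such as "product.title" are not handled.

```python
def resolve_get(expression, row):
    """Replace every {"$get": name} in `expression` with row[name]."""
    if isinstance(expression, dict):
        if set(expression) == {"$get"}:
            return row[expression["$get"]]
        return {key: resolve_get(value, row) for key, value in expression.items()}
    if isinstance(expression, list):
        return [resolve_get(item, row) for item in expression]
    # Strings, numbers, booleans and null are returned unchanged.
    return expression


template = {
    "from": "impressions",
    "get": "product",
    "similarity": {
        "title": {"$get": "query"},
        "description": {"$get": "query"},
    },
}

# Hypothetical tested row from the impressions table.
row = {"customer": 4, "product": 12, "query": "cheap laptop"}

concrete = resolve_get(template, row)
print(concrete)
```
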

Goal

Specifies a goal to maximize.

Results are ordered by the likelihood of the goal in descending order.

Format

Examples

{ "purchase": true }
{ "click": true }

Hits

Entries returned for a given query.

Format

Example

[
  {
    "name": "Pirkka bio cherry tomatoes 250g international 1st class",
    "$p": 0.16772371915637704,
    "tags": "fresh vegetable pirkka tomato",
    "price": 1.29,
    "id": "6410405060457",
    "category": "100"
  },
  {
    "name": "Pirkka iceberg salad Finland 100g 1st class",
    "$p": 0.16772371915637704,
    "tags": "fresh vegetable pirkka",
    "price": 1.29,
    "id": "6410405093677",
    "category": "100"
  }
]

Is

The syntax {"field": { "$is": "yourvalue" } } is equivalent to { "field": "yourvalue" }.

Format
{
// PrimitiveProposition states a field's value.
"$is": PrimitiveProposition
}

Example

{ "$is": "value" }
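
The stated equivalence can be expressed as a small normalization step. This is a sketch under the assumption that only the top-level fields of a proposition are inspected; the function name `strip_is` is hypothetical.

```python
def strip_is(proposition):
    """Rewrite {"field": {"$is": v}} as {"field": v}; leave the rest alone."""
    result = {}
    for field, value in proposition.items():
        if isinstance(value, dict) and set(value) == {"$is"}:
            result[field] = value["$is"]
        else:
            result[field] = value
    return result


print(strip_is({"field": {"$is": "yourvalue"}}))
print(strip_is({"price": {"$lt": 500}}))
```
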

KnnPropositionArray

Define the 'k' and the 'near' parameter of the $knn operator in the array format.

The first item of the array is the 'k' parameter and the second item of the array is the 'near' parameter.

Format
[
integer
or
,
integer
or
]

Example

[
  4,
  { "tags": "laptop" }
]

KnnPropositionObject

Define the 'k' and the 'near' parameter of the $knn operator in the object format.

Format
{
"k": integer,
// Proposition expression describes a fact, or a statement. Required.
"near": Proposition
}

Example

{
  "k": 4,
  "near": { "tags": "laptop" }
}

ModPropositionArray

Define the divisor and the remainder of the $mod operator in the array format.

The first item of the array is the divisor and the second item of the array is the remainder.

Format
[integer, integer]

Example

[2, 0]

ModPropositionObject

Define the divisor and the remainder of the $mod operator in the object format.

Format
{
"divisor": integer,
"remainder": integer
}

Example

{ "divisor": 2, "remainder": 0 }
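
The two formats carry the same information. The sketch below normalizes either format and applies it to a list of row indexes, assuming that $mod matches a value when the value modulo the divisor equals the remainder (the usual meaning of a modulo condition; this reference does not spell it out). The function names are hypothetical.

```python
def mod_params(mod):
    """Return (divisor, remainder) from the array or the object format."""
    if isinstance(mod, dict):
        return mod["divisor"], mod["remainder"]
    divisor, remainder = mod
    return divisor, remainder


def matches_mod(value, mod):
    """True when value % divisor == remainder (assumed semantics)."""
    divisor, remainder = mod_params(mod)
    return value % divisor == remainder


# Which of the first 30 row indexes would { "$mod": [10, 1] } select?
test_rows = [index for index in range(30) if matches_mod(index, [10, 1])]
print(test_rows)
```

With a condition such as `{ "$index": { "$mod": [10, 1] } }`, this would pick roughly every tenth row as test data.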

OnPropositionArray

Define the hypothesis and the condition of the $on operator in the array format.

The first item of the array is the hypothesis and the second item of the array is the condition.

Format

Example

[
  { "click": true },
  { "user.tags": "nyc" }
]

OnPropositionObject

Define the hypothesis and the condition of the $on operator in the object format.

Format
{
// Proposition expression describes a fact, or a statement. Required.
"prop": Proposition,
// Proposition expression describes a fact, or a statement. Required.
"on": Proposition
}

Example

{
  "prop": { "click": true },
  "on": { "user.tags": "nyc" }
}

OrderBy

Declares the sorting order of the result by a field or by a user-defined score.

Format

Examples

"product.price"
{ "$asc": "product.price" }
{ "$desc": "product.price" }
{
  "$multiply": ["$p", "prices"]
}
{
  "$pow": ["product.width", 2]
}
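
The field-based forms can be mimicked locally to see the resulting order. This sketch assumes that a plain field name sorts in descending order, following the "Descending order by default" rule described earlier in this reference. The function name `sort_hits` and the sample hits are hypothetical, and score-based forms such as $multiply and $pow are not handled.

```python
def sort_hits(hits, order_by):
    """Sort hits by a plain field name or by {"$asc": f} / {"$desc": f}."""
    if isinstance(order_by, str):
        field, descending = order_by, True  # descending is the default
    elif "$asc" in order_by:
        field, descending = order_by["$asc"], False
    else:
        field, descending = order_by["$desc"], True
    return sorted(hits, key=lambda hit: hit[field], reverse=descending)


hits = [
    {"name": "a", "price": 5},
    {"name": "b", "price": 9},
    {"name": "c", "price": 1},
]

print([hit["name"] for hit in sort_hits(hits, "price")])
print([hit["name"] for hit in sort_hits(hits, {"$asc": "price"})])
```
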

PrimitiveProposition

PrimitiveProposition states a field's value.

It should always be used inside a field declaration of a document proposition. For example, in the proposition { "field": "value" } the string "value" is the primitive proposition.

Format
integer
or
number
or
null
or
boolean
or
string

Examples

4
3.1
false
null
"text"

Proposition

Proposition expression describes a fact, or a statement.

A proposition may, for example, say that 'customer 4 clicked an item', that 'the product price is under 100€', or that 'the tags contain the term laptop'.

Format

Examples

{
  "customer": 4,
  "query": { "$match": "laptop" }
}
{
  "price": { "$lt": 100 }
}
{
  "tags": { "$matches": "laptop" }
}

PropositionSet

PropositionSet expression is used to describe a collection of propositions. This collection of statements can be the alternative values in a field.

Format
string
or

Examples

"product.tags"
"query"
"product"
"tags"

RelateOrderBy

Declares the sorting order.

The sorting order can be any attribute of the Relate query hit.

Format

Examples

{ "$desc": "info.miTrue" }
{ "$asc": "lift" }

ResponseHit

Entry returned for a given query.

Format

Examples

{ "name": "My product", "price": 172.19 }
{
  "$score": 0.22350516297675496,
  "$value": "coffee",
  "$why": {
    "type": "product",
    "factors": [
      {
        "type": "hitVariableLift",
        "variable": ":coffee",
        "value": 8.45603245079726,
        "factors": [
          {
            "type": "relatedVariableLift",
            "variable": "name:coffe",
            "value": 8.45603245079726
          }
        ]
      }
    ]
  }
}

Score

Score expression resolves to a numeric score value or probability. All scores can be used in both highlights ($highlight) and explanations ($why).

Format

Examples

2
"product.margin"
"$p"
"$similarity"
{
  "$multiply": ["$p", "margin"]
}
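
To make the semantics of composed scores concrete, the sketch below evaluates a small subset of Score expressions against a single row. It is an illustration only: the function name `score` and the sample row are hypothetical, and built-in scores such as "$p" are simply looked up in the row as if they had already been computed.

```python
def score(expression, row):
    """Evaluate a small subset of Score expressions against one row."""
    if isinstance(expression, (int, float)):
        return expression
    if isinstance(expression, str):
        return row[expression]  # field name or built-in such as "$p"
    if "$multiply" in expression:
        result = 1
        for factor in expression["$multiply"]:
            result *= score(factor, row)
        return result
    if "$pow" in expression:
        power = expression["$pow"]
        if isinstance(power, dict):  # object format
            base, exponent = power["base"], power["exponent"]
        else:  # array format: [base, exponent]
            base, exponent = power
        return score(base, row) ** score(exponent, row)
    raise ValueError(f"unsupported score expression: {expression!r}")


# Hypothetical row with a precomputed probability.
row = {"$p": 0.25, "margin": 8, "width": 3}

print(score({"$multiply": ["$p", "margin"]}, row))
print(score({"$pow": ["width", 2]}, row))
print(score({"$pow": {"base": "width", "exponent": 2}}, row))
```
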

Selection

Describes the fields and/or built-in attributes to return.

Format
[
string
or
$index
or
,
...]

Examples

["user.name", "query", "product.title", "click"]
["$why"]

TestSource

TestSource enables more options to choose the testing data in the Evaluate Query. Using the TestSource, you can specify the testing data as a specific slice of the same table as the training data or as a slice of a completely different table.

Format
{
// From expression declares the examined table.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
"offset": integer,
"limit": integer,
"select": [string, ...]
}

Example

{
  "where": {
    "$index": {
      "$mod": [5, 1]
    }
  },
  "limit": 100,
  "select": ["query"]
}

UserDefinedObject

Any object which is valid according to the database schema.

The contents of the object depend on the data inserted into the database. For example, if you have a products table which has the fields name and price, your object could look like:

{ "name": "My product", "price": 172.19 }
Format
object

Example

{ "name": "My product", "price": 172.19 }

Value

Value expression resolves to a primitive like int or json, score, probability or individual feature.

Value expression can refer to any field in the table with expressions like "query" or "product.price". Value expression can refer to the narrowed document's overall likelihood, for example "$p", after "get": "message", to refer to the message's likelihood. Value can also refer to the likelihood of a proposition with expressions such as { "$p": { "tags": "cover" } } or { "$p": { "$context": "click" } } to refer to the context table's fields.

Format
string
or

Example

"product.id"