Introduction

Welcome to the Aito HTTP API reference documentation. You can also try out the queries in the interactive Swagger UI.

Authentication

All requests must specify an API key in the x-api-key header. There are two types of authentication keys:

  • read-only: Allows only read queries. Good for sharing access with third parties.
  • read/write: Allows all queries.

Client configuration

We recommend setting up your Aito instance configuration using environment variables as follows:

Environment variable    Value
AITO_INSTANCE_URL       your-aito-instance-url
AITO_API_KEY            your-api-key

These environment variables are recognized by the Aito Python SDK. The URL and API keys can be found on the instance overview page in the Aito Console.
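
As a minimal sketch of how a client might pick up this configuration without the SDK, the snippet below reads the two variables and prepares the request headers. The function name load_aito_config and the shape of the returned dictionary are assumptions made for illustration, not part of any Aito library.

```python
import os


def load_aito_config(env=os.environ):
    """Read the instance URL and API key from the variables named above."""
    try:
        url = env["AITO_INSTANCE_URL"]
        key = env["AITO_API_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing.args[0]}") from None
    return {
        # Drop a trailing slash so endpoint paths can be appended safely.
        "url": url.rstrip("/"),
        # Every request must carry the API key in the x-api-key header.
        "headers": {"x-api-key": key, "content-type": "application/json"},
    }
```

Passing a plain dictionary instead of os.environ makes the function easy to exercise in tests.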

Limitations

Your instance might have monthly and burst API call limits. Refer to the limits on the pricing page (https://aito.ai).

Payload size is limited to 10 MB per message. This includes the data and all headers. To upload larger datasets to the database, use the file upload API to get around this limit.
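
A client could estimate the message size before sending, as in the rough sketch below. It assumes that 10 MB means 10,000,000 bytes (the conservative reading) and counts only the JSON body plus the headers you pass in; headers added by the HTTP library are not included, so treat the result as an approximation. Both function names are hypothetical.

```python
import json

# Assumption: "10 MB" is read as 10 * 1000 * 1000 bytes, the stricter interpretation.
MAX_MESSAGE_BYTES = 10 * 1000 * 1000


def message_size(body, headers):
    """Approximate bytes sent: the serialized JSON body plus one line per header."""
    body_bytes = len(json.dumps(body).encode("utf-8"))
    header_bytes = sum(len(f"{name}: {value}\r\n".encode("utf-8"))
                       for name, value in headers.items())
    return body_bytes + header_bytes


def fits_in_one_message(body, headers):
    """True when the estimate stays within the assumed limit."""
    return message_size(body, headers) <= MAX_MESSAGE_BYTES
```

When the check fails, the data is a candidate for the file upload API mentioned above.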

Queries must be completed within 29 seconds, or they will time out.

API access cannot be restricted by IP address or hostname; authentication is based on an API key. The API is served only over HTTPS.

Pagination

Some endpoints use pagination to limit the number of results returned at once. The pagination is based on offset and limit parameters, similar to SQL and many other APIs.

As an example, to get the first result set of 10 items with a Search query, you can request:

{
  "from": "products",
  "offset": 0,
  "limit": 10
}

The response will have a total field, which tells you how many items were found in total:

{
  "offset": 0,
  "total": 81,
  "hits": [ ... ]
}

If total exceeds the number of items in the hits array, the response contains only part of the results. To request the next 10 items, you can query:

{
  "from": "products",
  "offset": 10,
  "limit": 10
}

The default values for pagination parameters are the following.

Parameter    Default value
offset       0
limit        10
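
The offset and limit mechanics can be sketched as a small generator that keeps requesting pages until total is reached. The post argument is a hypothetical callable that sends a query dictionary to an endpoint and returns the decoded response; it stands in for whatever HTTP client you use.

```python
from typing import Callable, Iterator


def iter_hits(post: Callable[[dict], dict], query: dict, page_size: int = 10) -> Iterator[dict]:
    """Yield every hit of a paginated query by advancing offset page by page."""
    offset = 0
    while True:
        page = post({**query, "offset": offset, "limit": page_size})
        hits = page["hits"]
        yield from hits
        offset += len(hits)
        # Stop when the server has no more rows or we have seen them all.
        if not hits or offset >= page["total"]:
            break
```

Stopping on an empty page as well as on total guards against looping forever if rows are removed while iterating.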

CORS

All responses are served with an access-control-allow-origin: * header. This is useful for browser applications.

Aito-specific concepts

We aim for a familiar API, but in some cases Aito's default behavior differs from what other databases might have.

Descending order by default

By default, Aito sorts everything from the largest to the smallest. This is a design choice: in statistical reasoning, the highest values are often the most interesting ones.

For example: the items with the highest probabilities, the highest frequencies, the highest similarities, the highest mutual information, and the highest scores are often the most desired ones.

Use $asc to sort values from the smallest to the largest, as shown in the example:

{
  "from": "products",
  "where": {
    "category.id": 89
  },
  "orderBy": {
    "$asc": "price"
  }
}

Personalisation

Aito has been designed to work well even with small data sets. One example of this is how personalised recommendations work. This is easiest to understand through an example, so let's take a digital grocery store.

When requesting product recommendations for a customer who is a vegetarian, Aito also considers what non-vegetarians purchase. If, for example, the customer were the only vegetarian user of the grocery web shop, they could receive meat recommendations if the average customer purchased a lot of meat.

This behavior is usually a good default. In book, music, movie, and many other recommendations, you commonly want to find new items instead of getting recommendations only from your own history. However, in some cases the behavior might lead to unexpected predictions. For example, if we predicted how likely a vegetarian is to purchase bacon, Aito could return that it is very likely, because in the data that is the common average.

An example recommend query could look like this:

{
  "from": "impressions",
  "where": { "session.user": "veronica" },
  "recommend": "product",
  "goal": { "purchase": true }
}

Even if we limit the data to impressions by veronica, Aito still considers other data points.

Error handling

When a request fails, the API responds with an appropriate HTTP status code. Error responses:

  • 400 Bad Request: Returned when there is an error in the request payload, for example invalid query syntax.

Example error

The error returned when a query uses an incorrect table name. Instead of prodjucts, it should be products.

{
  "charOffset": 17,
  "lineNumber": 3,
  "columnNumber": 13,
  "error": "failed to open 'prodjucts'",
  "status": 400,
  "message": "3:13: failed to open 'prodjucts'\n\n      \"from\": \"prodjucts\"\n              ^\n",
  "messageLines": [
    "3:13: failed to open 'prodjucts'",
    "",
    "      \"from\": \"prodjucts\"",
    "              ^"
  ]
}
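
A client might turn such a response into a readable message along the following lines. This is only a sketch: the function name describe_error is hypothetical, and it assumes that fields other than those in the example above may be missing, so it falls back step by step.

```python
import json


def describe_error(status, body):
    """Build a readable message from an error response body (a JSON string)."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON at all: fall back to the raw text.
        return f"HTTP {status}: {body}"
    if not isinstance(payload, dict):
        return f"HTTP {status}: {body}"
    # Prefer the pre-split lines, then the single message, then the short error.
    lines = payload.get("messageLines") or [payload.get("message") or payload.get("error", "")]
    return "\n".join([f"HTTP {status}"] + list(lines))


example = json.dumps({
    "error": "failed to open 'prodjucts'",
    "status": 400,
    "messageLines": [
        "3:13: failed to open 'prodjucts'",
        "",
        "      \"from\": \"prodjucts\"",
        "              ^",
    ],
})
print(describe_error(400, example))
```

Printing messageLines one per line keeps the caret aligned under the offending part of the query.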

Valid table names

Table names in Aito cannot contain whitespace (spaces, tabs, line feeds, etc.) or any of the following characters:

/".$

These name validation rules were progressively applied in June. If you have tables with invalid names, you can use the schema/_rename endpoint to rename them.
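
The naming rules above can be checked on the client side before a table is created. The sketch below is an illustration only; the function name is hypothetical, and rejecting an empty name is an extra assumption that the rules above do not state.

```python
# The four characters listed above that a table name must not contain.
FORBIDDEN_CHARS = set('/".$')


def is_valid_table_name(name: str) -> bool:
    """Return True if the name has no whitespace and no forbidden character."""
    if not name:
        # Assumption: an empty name is treated as invalid.
        return False
    return not any(ch.isspace() or ch in FORBIDDEN_CHARS for ch in name)
```

For example, is_valid_table_name("products") holds, while names such as "my table" or "prices.2019" are rejected.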

Feedback & bug reports

We take our quality seriously and aim for the smoothest developer experience possible. If you run into problems, please send an email to support@aito.ai containing reproduction steps and we'll fix it as soon as possible.

Query API

The query language operations.

Search

POST /api/v1/_search

Search rows.

Allows you to search, filter, and order rows. You can also select only specific columns. Similar to SELECT in SQL.

The results are in descending order by default.

Aito supports intuitive link following. If your products table has a link column called category, which links to another table called categories, you can refer to the columns of the linked table directly in the query:

{
  "from": "products",
  "where": {
    "category.id": 89
  },
  "orderBy": "price"
}

Get all rows

You can easily select all rows from a table with the following query:

{
  "from": "products"
}

Note: the number of results is limited to 10 by default.

Highlighted results

If you want to get search results with highlights, see Generic query.

Parameters

Name              Type     Description
body (required)   object   Search query

Successful responses

Response   Type     Description
200 OK     object   Search results

Request format

{
  // From expression declares the examined table. Required.
  "from": From,
  // Proposition expression describes a fact, or a statement.
  "where": Proposition,
  // Declares the sorting order of the result by a field or by a
  // user-defined score.
  "orderBy": OrderBy,
  // Describes the fields and/or built-in attributes to return.
  "select": Selection,
  // The number of results to skip from the beginning.
  // Default: 0
  "offset": integer,
  // The maximum number of results to retrieve.
  // Default: 10
  "limit": integer
}

Response format

{
  "offset": integer,
  "total": integer,
  // Entries returned for a given query. Required.
  "hits": Hits
}

Find by id

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

You can copy-paste the example curl command to your terminal.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_search \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": { "id": "6411300000494" }
  }'

Response

{
  "offset": 0,
  "total": 1,
  "hits": [
    {
      "category": "108",
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "price": 3.95,
      "tags": "coffee"
    }
  ]
}
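
The same request could be issued from Python with only the standard library, roughly as sketched below. The function names are hypothetical, and the 30-second client timeout is an assumption chosen to sit just above the 29-second query limit described under Limitations.

```python
import json
import urllib.request


def build_search_request(instance_url, api_key, query):
    """Prepare a POST to /api/v1/_search carrying the query as a JSON body."""
    return urllib.request.Request(
        url=f"{instance_url.rstrip('/')}/api/v1/_search",
        data=json.dumps(query).encode("utf-8"),
        headers={"content-type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def search(instance_url, api_key, query, timeout=30):
    """Send the request and decode the JSON response (offset, total, hits)."""
    request = build_search_request(instance_url, api_key, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling search with the instance URL, the API key, and {"from": "products", "where": {"id": "6411300000494"}} corresponds to the curl command above. A 400 response surfaces as urllib.error.HTTPError, whose body carries the error JSON.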

Where price is greater than

You can copy-paste the example curl command to your terminal.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_search \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": {
      "price": { "$gt": 1.5 }
    },
    "limit": 2
  }'

Response

{
  "offset": 0,
  "total": 21,
  "hits": [
    {
      "category": "101",
      "id": "6437002001454",
      "name": "VAASAN Ruispalat 660g 12 pcs fullcorn rye bread",
      "price": 1.69,
      "tags": "gluten bread"
    },
    {
      "category": "101",
      "id": "6411402202208",
      "name": "Fazer Puikula fullcorn rye bread 9 pcs/500g",
      "price": 1.85,
      "tags": "gluten bread"
    }
  ]
}

Find products with search term

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_search \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": {
      "name": { "$match": "coffee" }
    }
  }'

Response

{
  "offset": 0,
  "total": 4,
  "hits": [
    {
      "category": "108",
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "price": 3.95,
      "tags": "coffee"
    },
    {
      "category": "108",
      "id": "6420101441542",
      "name": "Kulta Katriina filter coffee 500g",
      "price": 3.45,
      "tags": "coffee"
    },
    {
      "category": "108",
      "id": "6411300164653",
      "name": "Juhla Mokka Dark Roast coffee 500g hj",
      "price": 3.95,
      "tags": "coffee"
    },
    {
      "category": "108",
      "id": "6410405181190",
      "name": "Pirkka Costa Rica filter coffee 500g UTZ",
      "price": 2.89,
      "tags": "coffee pirkka"
    }
  ]
}

More complex where proposition

Find all products priced over 1.5€ that either have the tag drink or whose name matches coffee.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_search \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": {
      "$and": [
        {
          "$or": [
            {
              "tags": { "$has": "drink" }
            },
            {
              "name": { "$match": "coffee" }
            }
          ]
        },
        {
          "price": { "$gt": 1.5 }
        }
      ]
    },
    "limit": 2
  }'

Response

{
  "offset": 0,
  "total": 6,
  "hits": [
    {
      "category": "104",
      "id": "6408430000258",
      "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
      "price": 1.95,
      "tags": "lactose-free drink"
    },
    {
      "category": "108",
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "price": 3.95,
      "tags": "coffee"
    }
  ]
}
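
Nested propositions like the one above are easier to write with a few small helpers. The helper names all_of, any_of, and field are hypothetical; they only build the same JSON structure that the curl example sends.

```python
def all_of(*propositions):
    """Combine propositions so that every one of them must hold ($and)."""
    return {"$and": list(propositions)}


def any_of(*propositions):
    """Combine propositions so that at least one of them must hold ($or)."""
    return {"$or": list(propositions)}


def field(name, operator, value):
    """A single-field proposition such as {"price": {"$gt": 1.5}}."""
    return {name: {operator: value}}


# The same query as in the curl example above.
query = {
    "from": "products",
    "where": all_of(
        any_of(field("tags", "$has", "drink"), field("name", "$match", "coffee")),
        field("price", "$gt", 1.5),
    ),
    "limit": 2,
}
```

Because the helpers return plain dictionaries, the result can be passed straight to json.dumps.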

Predict

POST /api/v1/_predict

Predict the likelihood of a feature given a hypothesis.

For example, predict which other products a user could add to their e-commerce shopping cart, based on the existing cart. To understand why Aito predicts certain results, you can select "$why".

Related information

  • The exclusiveness option is explained in the Exclusiveness chapter.
  • The Personalisation chapter also explains a characteristic of predictions in Aito.

Parameters

Name              Type     Description
body (required)   object   Predict query

Successful responses

Response   Type     Description
200 OK     object   Predict results

Request format

{
  // From expression declares the examined table. Required.
  "from": From,
  // Proposition expression describes a fact, or a statement.
  "where": Proposition,
  // PropositionSet expression is used to describe a collection
  // of propositions. Required.
  "predict": PropositionSet,
  // Exclusiveness dictates that only one feature can be true at
  // the same time and that one feature will be true.
  // Default: true
  "exclusiveness": boolean,
  // Describes the fields and/or built-in attributes to return.
  "select": Selection,
  // The number of results to skip from the beginning.
  // Default: 0
  "offset": integer,
  // The maximum number of results to retrieve.
  // Default: 10
  "limit": integer
}

Response format

{
  "offset": integer,
  "total": integer,
  // Entries returned for a given query. Required.
  "hits": Hits
}

Predict purchase likelihood

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In the example we're predicting how likely the customer with the username larry is to purchase the product "Finnish bread cheese 120g lactose-free" (6410405197764). In the example data, Larry purchases a lot of lactose-free products, but has never purchased any cheese. Aito detects that the "lactose-free" tag is a commonly occurring feature in the data, and predicts that Larry would also quite likely purchase the cheese.

The query format depends on how the data has been structured in Aito (the schema). In the example dataset, the impressions table contains each individual product a user has seen during a shop visit (a session) and whether they bought the product.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_predict \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "session.user": "larry", "product.id": "6410405216120" },
    "predict": "purchase"
  }'

Response

{
  "offset": 0,
  "total": 2,
  "hits": [
    { "$p": 0.9975001237562499, "field": "purchase", "feature": true },
    { "$p": 0.0024998762437503097, "field": "purchase", "feature": false }
  ]
}
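
A caller usually wants the most probable feature from such a response. The sketch below picks it out of the hits shown above; the function name most_likely is hypothetical. It selects by "$p" explicitly instead of relying on the order of the hits.

```python
def most_likely(response):
    """Return the hit with the highest probability "$p", or None if there are no hits."""
    hits = response["hits"]
    if not hits:
        return None
    return max(hits, key=lambda hit: hit["$p"])


# The response shown above.
response = {
    "offset": 0,
    "total": 2,
    "hits": [
        {"$p": 0.9975001237562499, "field": "purchase", "feature": True},
        {"$p": 0.0024998762437503097, "field": "purchase", "feature": False},
    ],
}
best = most_likely(response)
print(best["field"], best["feature"], best["$p"])
```

In this example the two probabilities add up to 1 within floating-point precision, which matches the default exclusiveness of true: exactly one of the features is assumed to hold.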

Explain the prediction

This is the same example as above, but we ask Aito to explain why it predicted the results. To understand the response, see the "$why" section.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_predict \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "session.user": "larry", "product.id": "6410405216120" },
    "select": ["$why"],
    "predict": "purchase"
  }'

Response

{
  "offset": 0,
  "total": 2,
  "hits": [
    {
      "$why": {
        "type": "product",
        "factors": [
          {
            "type": "baseP",
            "value": 0.5,
            "proposition": {
              "purchase": { "$has": true }
            }
          },
          {
            "type": "product",
            "factors": [
              {
                "type": "normalizer",
                "name": "exclusiveness",
                "value": 1.0000000000000002
              },
              {
                "type": "normalizer",
                "name": "trueFalseExclusiveness",
                "value": 1
              }
            ]
          },
          {
            "type": "relatedPropositionLift",
            "proposition": {
              "$and": [
                {
                  "product.id": { "$has": "6410405216120" }
                },
                {
                  "session.user": { "$has": "larry" }
                }
              ]
            },
            "value": 1.9950002475124993
          }
        ]
      }
    },
    {
      "$why": {
        "type": "product",
        "factors": [
          {
            "type": "baseP",
            "value": 0.5,
            "proposition": {
              "purchase": { "$has": false }
            }
          },
          {
            "type": "product",
            "factors": [
              {
                "type": "normalizer",
                "name": "exclusiveness",
                "value": 1.0000000000000002
              },
              {
                "type": "normalizer",
                "name": "trueFalseExclusiveness",
                "value": 1
              }
            ]
          },
          {
            "type": "relatedPropositionLift",
            "proposition": {
              "$and": [
                {
                  "product.id": { "$has": "6410405216120" }
                },
                {
                  "session.user": { "$has": "larry" }
                }
              ]
            },
            "value": 0.0049997524875006185
          }
        ]
      }
    }
  ]
}
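
One way to read this response is that a node of type product multiplies its factors. That reading is an inference from the numbers in this example, not a statement taken from the "$why" section: multiplying the factors of the first hit reproduces the probability 0.9975001237562499 returned by the earlier query. The function name factor_value is hypothetical.

```python
from math import prod


def factor_value(node):
    """Value of one "$why" node: a product node multiplies its child factors."""
    if node["type"] == "product":
        return prod(factor_value(child) for child in node["factors"])
    return node["value"]


# "$why" of the first hit above, with the proposition fields left out for brevity.
why_purchase_true = {
    "type": "product",
    "factors": [
        {"type": "baseP", "value": 0.5},
        {"type": "product", "factors": [
            {"type": "normalizer", "name": "exclusiveness", "value": 1.0000000000000002},
            {"type": "normalizer", "name": "trueFalseExclusiveness", "value": 1},
        ]},
        {"type": "relatedPropositionLift", "value": 1.9950002475124993},
    ],
}
print(factor_value(why_purchase_true))
```

The same calculation for the second hit, with a lift of 0.0049997524875006185, gives approximately 0.0025, the probability of purchase being false.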

Example request

In the example we're predicting three suitable tags for a hypothetical new product based on its name. Tags are predicted based on what tags existing products have.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_predict \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": { "name": "Hovis Seed Sensations Seven Seeds Original 800g" },
    "predict": "tags",
    "exclusiveness": false,
    "limit": 3
  }'

Response

{
  "offset": 0,
  "total": 22,
  "hits": [
    { "$p": 0.3409090909090909, "field": "tags", "feature": "pirkka" },
    { "$p": 0.29545454545454547, "field": "tags", "feature": "food" },
    { "$p": 0.25, "field": "tags", "feature": "meat" }
  ]
}

Recommend

POST /api/v1/_recommend

Recommend a row which optimizes a given goal.

For example, you could ask Aito to choose a product, which maximizes the click likelihood, when user id equals 4543.

Recommend differs from predict and match in the following way: recommend always optimizes a goal, while predict and match merely mimic the existing behavior patterns in the data. As an example, consider the problem of matching employees to projects. With predict and match, you can mimic the way the projects are staffed currently, and Aito will mimic both the good and the bad staffing practices. With recommend, Aito seeks to maximize the success rate and avoid decisions that lead to bad outcomes, even if these decisions were a popular practice.

The chapter Personalisation also explains a characteristic of the recommendations.

Parameters

Name              Type     Description
body (required)   object   Recommend query

Successful responses

Response   Type     Description
200 OK     object   Recommend results

Request format

{
  // From expression declares the examined table. Required.
  "from": From,
  // Proposition expression describes a fact, or a statement.
  "where": Proposition,
  // Get expression defines what items are returned as query
  // results. Required.
  "recommend": Get,
  // Specifies a goal to maximize. Required.
  "goal": Goal,
  // Describes the fields and/or built-in attributes to return.
  "select": Selection,
  // The number of results to skip from the beginning.
  // Default: 0
  "offset": integer,
  // The maximum number of results to retrieve.
  // Default: 10
  "limit": integer
}

Response format

{
  "offset": integer,
  "total": integer,
  // Entries returned for a given query. Required.
  "hits": Hits
}

Recommend top 5 products for a customer

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In the example we're recommending the top 5 products that veronica (user id) would most likely purchase, based on her behavior history stored in the impressions table. The table records which products she has seen and which of those were bought.

This query could be used to generate a campaign email that recommends relevant products to a customer.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_recommend \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "session.user": "veronica" },
    "recommend": "product",
    "goal": { "purchase": true },
    "limit": 5
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$p": 0.9922839884327421,
      "category": "100",
      "id": "6410405093677",
      "name": "Pirkka iceberg salad Finland 100g 1st class",
      "price": 1.29,
      "tags": "fresh vegetable pirkka"
    },
    {
      "$p": 0.9880776334264704,
      "category": "100",
      "id": "2000503600002",
      "name": "Chiquita banana",
      "price": 0.28054,
      "tags": "fresh fruit"
    },
    {
      "$p": 0.9627692001227784,
      "category": "108",
      "id": "6410405181190",
      "name": "Pirkka Costa Rica filter coffee 500g UTZ",
      "price": 2.89,
      "tags": "coffee pirkka"
    },
    {
      "$p": 0.9370075696184778,
      "category": "109",
      "id": "6420256014134",
      "name": "Tupla Maxi chocolate bar 50g UTZ",
      "price": 0.62,
      "tags": "candy lactose"
    },
    {
      "$p": 0.9181227425357429,
      "category": "101",
      "id": "6437002001454",
      "name": "VAASAN Ruispalat 660g 12 pcs fullcorn rye bread",
      "price": 1.69,
      "tags": "gluten bread"
    }
  ]
}

Recommend top products with additional filtering

This example is the same as above, but we're adding one more criterion: the product name should match the search query 'Banana'.

This query could be used to build a personalised search functionality.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_recommend \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": {
      "product.name": { "$match": "Banana" },
      "session.user": "veronica"
    },
    "recommend": "product",
    "goal": { "purchase": true },
    "limit": 5
  }'

Response

{
  "offset": 0,
  "total": 2,
  "hits": [
    {
      "$p": 0.9880776334264704,
      "category": "100",
      "id": "2000503600002",
      "name": "Chiquita banana",
      "price": 0.28054,
      "tags": "fresh fruit"
    },
    {
      "$p": 0.4921703347781217,
      "category": "100",
      "id": "2000818700008",
      "name": "Pirkka banana",
      "price": 0.166,
      "tags": "fresh fruit pirkka"
    }
  ]
}
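
Both recommend queries in this section differ only in the optional search term, so an application might build them with one function. This is a sketch; the function name recommend_products and its parameters are assumptions made for illustration.

```python
def recommend_products(user, limit=5, name_search=None):
    """Build a recommend query for the impressions table of the demo dataset."""
    where = {"session.user": user}
    if name_search is not None:
        # Narrow the candidates to products whose name matches the search term.
        where["product.name"] = {"$match": name_search}
    return {
        "from": "impressions",
        "where": where,
        "recommend": "product",
        "goal": {"purchase": True},
        "limit": limit,
    }
```

recommend_products("veronica") yields the first query of this section, and recommend_products("veronica", name_search="Banana") yields the personalised search variant.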

Evaluate

POST /api/v1/_evaluate

Evaluate performance and accuracy.

The query supports evaluation of Predict, Match, Similarity, and Generic queries.

Evaluate operation is in alpha stage. The syntax might change in the future.

The evaluation is performed by first specifying the train and test data split:

  • The training data: The data that will be used to train Aito.
  • The testing data: The data that will be hidden from Aito and will be used to measure an Aito query's performance.

The testing data is specified using the test proposition or the TestSource. The training data is the remaining data that is not the testing data.

The evaluating query is specified following the evaluate keyword.

After that, a simulated evaluation scenario is run: Aito simulates inserting the training data into a table, then runs the given query for each sample (a row in a table) in the test data and measures how good the results were.

It is also possible to group multiple entries into a single test case and evaluate them using the EvaluateGroupedQuery.

Parameters

Name              Type     Description
body (required)            Evaluate query

Successful responses

Response   Type     Description
200 OK     object   Evaluate results

Request format

Response format

{
  // The amount of samples used for testing. Required.
  "n": integer,
  // The amount of samples used for testing. Required.
  "testSamples": integer,
  // The average amount of samples used for training. Required.
  "trainSamples": number,
  // The average number of features. Required.
  "features": number,
  // Complement of `accuracy` (=`1 - accuracy`). Required.
  "error": number,
  // Complement of `baseAccuracy` (=`1 - baseAccuracy`).
  // Required.
  "baseError": number,
  // The accuracy of predictions. Required.
  "accuracy": number,
  // The simulated accuracy of predictions based on taking the
  // most frequent value. Required.
  "baseAccuracy": number,
  // How much better results Aito was able to provide compared to
  // a naive prediction. Required.
  "accuracyGain": number,
  // Average rank of the best prediction. Required.
  "meanRank": number,
  "baseMeanRank": number,
  // Improvement of meanRank upon baseMeanRank. Required.
  "rankGain": number,
  // A measurement which describes the quality of probabilities
  // (=`h - mxe`). Required.
  "informationGain": number,
  // Mean cross entropy. Required.
  "mxe": number,
  // Entropy. Required.
  "h": number,
  // The mean geometric probability of the predictions. Required.
  "geomMeanP": number,
  // Base geometric mean probability. Required.
  "baseGmp": number,
  // Geometric mean lift. Required.
  "geomMeanLift": number,
  // The mean execution time of the queries in nanoseconds.
  // Required.
  "meanNs": number,
  // The mean execution time of the queries in microseconds.
  // Required.
  "meanUs": number,
  // The mean execution time of the queries in milliseconds.
  // Required.
  "meanMs": number,
  // The median execution time of the queries in nanoseconds.
  // Required.
  "medianNs": number,
  // The median execution time of the queries in microseconds.
  // Required.
  "medianUs": number,
  // The median execution time of the queries in milliseconds.
  // Required.
  "medianMs": number,
  // The time spent for warm-up of indexes and caches for the
  // given query in milliseconds. Required.
  "warmingMs": number
}
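
The comments in the response format state three identities: error is 1 - accuracy, baseError is 1 - baseAccuracy, and informationGain is h - mxe. The sketch below checks them against a handful of values copied from the example response in this section. The function name is hypothetical, and the comparison uses a tolerance because the values are floating-point numbers.

```python
from math import isclose


def check_documented_identities(result):
    """Check the three identities stated in the response format comments."""
    return {
        "error": isclose(result["error"], 1 - result["accuracy"]),
        "baseError": isclose(result["baseError"], 1 - result["baseAccuracy"]),
        "informationGain": isclose(result["informationGain"], result["h"] - result["mxe"]),
    }


# Values copied from the example response of this section.
example = {
    "error": 0.09090909090909094,
    "baseError": 0.7272727272727273,
    "accuracy": 0.9090909090909091,
    "baseAccuracy": 0.2727272727272727,
    "informationGain": 2.1768738588975234,
    "mxe": 1.4084101290438387,
    "h": 3.585283987941362,
}
print(check_documented_identities(example))
```

A check like this can serve as a quick sanity test when parsing evaluate responses in client code.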

Example request

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In the example we're evaluating how good Aito's results are when we predict tags for a new hypothetical product. The results give us the accuracy and performance of the prediction example shown in the Predict operation's documentation.

$index is a built-in variable which tells the insertion index of a row. In the example, we select 1/4 of the rows in the products table to be used as test data. The rest of the rows are automatically used as training data.

Aito iterates through each product in the test data, and tests how accurate the prediction of tags for a given product name was.
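
The split can be pictured with a few lines of Python. The sketch rests on three assumptions that are not stated in this section: that "$mod": [4, 0] selects rows whose index leaves remainder 0 when divided by 4, that $index starts from 0, and that the products table holds 42 rows (the sum of testSamples and trainSamples in the example response). Under those assumptions the sizes agree with the 11 test samples and 31 training samples reported there.

```python
def split_by_index(row_count, divisor, remainder):
    """Split row indexes into test rows (index % divisor == remainder) and training rows."""
    test = [i for i in range(row_count) if i % divisor == remainder]
    train = [i for i in range(row_count) if i % divisor != remainder]
    return test, train


# Assumed: 42 rows with zero-based insertion indexes.
test_rows, train_rows = split_by_index(42, 4, 0)
print(len(test_rows), len(train_rows))
```

Changing the remainder, for example to [4, 1], would pick a different quarter of the rows as test data under the same reading.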

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_evaluate \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "test": {
      "$index": {
        "$mod": [4, 0]
      }
    },
    "evaluate": {
      "from": "products",
      "where": {
        "name": { "$get": "name" }
      },
      "predict": "tags"
    }
  }'

Response

{
  "n": 11,
  "testSamples": 11,
  "trainSamples": 31,
  "features": 202,
  "error": 0.09090909090909094,
  "baseError": 0.7272727272727273,
  "accuracy": 0.9090909090909091,
  "baseAccuracy": 0.2727272727272727,
  "accuracyGain": 0.6363636363636364,
  "meanRank": 1.9090909090909092,
  "baseMeanRank": 4.454545454545454,
  "rankGain": 2.545454545454545,
  "informationGain": 2.1768738588975234,
  "mxe": 1.4084101290438387,
  "h": 3.585283987941362,
  "geomMeanP": 0.3767266164020222,
  "baseGmp": 0.08331476557218853,
  "geomMeanLift": 4.5217268969640845,
  "meanNs": 3120423.727272727,
  "meanUs": 3120.423727272727,
  "meanMs": 3.120423727272727,
  "medianNs": 2767849,
  "medianUs": 2767.849,
  "medianMs": 2.767849,
  "allNs": [
    3050340,
    4740278,
    3648785,
    3705769,
    1908513,
    2634641,
    4211584,
    2403868,
    2668070,
    2584964,
    2767849
  ],
  "allUs": [3050, 4740, 3648, 3705, 1908, 2634, 4211, 2403, 2668, 2584, 2767],
  "allMs": [3, 4, 3, 3, 1, 2, 4, 2, 2, 2, 2],
  "warmingMs": 1,
  "accurateOffsets": [0, 1, 2, 3, 4, 5, 6, 7, 8, 10],
  "errorOffsets": [9],
  "cases": [
    {
      "offset": 0,
      "testCase": {
        "category": "100",
        "id": "2000818700008",
        "name": "Pirkka banana",
        "price": 0.166,
        "tags": "fresh fruit pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.3854246183889548, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.3854246183889548,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 1,
      "testCase": {
        "category": "100",
        "id": "6410405093677",
        "name": "Pirkka iceberg salad Finland 100g 1st class",
        "price": 1.29,
        "tags": "fresh vegetable pirkka"
      },
      "accurate": true,
      "top": {
        "$p": 0.41960307640793315,
        "field": "tags",
        "feature": "pirkka"
      },
      "correct": {
        "$p": 0.41960307640793315,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 2,
      "testCase": {
        "category": "101",
        "id": "6413467282508",
        "name": "Fazer Puikula fullcorn rye bread 330g",
        "price": 1.29,
        "tags": "gluten bread"
      },
      "accurate": true,
      "top": { "$p": 0.445147520245974, "field": "tags", "feature": "bread" },
      "correct": {
        "$p": 0.445147520245974,
        "field": "tags",
        "feature": "bread"
      }
    },
    {
      "offset": 3,
      "testCase": {
        "category": "102",
        "id": "6410405205483",
        "name": "Pirkka Finnish beef-pork minced meat 20% 400g",
        "price": 2.79,
        "tags": "meat food protein pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.4767020957928926, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.4767020957928926,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 4,
      "testCase": {
        "category": "103",
        "id": "6412000030026",
        "name": "Saarioinen Maksalaatikko liver casserole 400g",
        "price": 1.99,
        "tags": "meat food"
      },
      "accurate": true,
      "top": { "$p": 0.2552109631934357, "field": "tags", "feature": "food" },
      "correct": {
        "$p": 0.2552109631934357,
        "field": "tags",
        "feature": "food"
      }
    },
    {
      "offset": 5,
      "testCase": {
        "category": "104",
        "id": "6410405082657",
        "name": "Pirkka Finnish semi-skimmed milk 1l",
        "price": 0.81,
        "tags": "lactose drink pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.3058803379364949, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.3058803379364949,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 6,
      "testCase": {
        "category": "104",
        "id": "6408430000258",
        "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
        "price": 1.95,
        "tags": "lactose-free drink"
      },
      "accurate": true,
      "top": { "$p": 0.34164387716484773, "field": "tags", "feature": "drink" },
      "correct": {
        "$p": 0.34164387716484773,
        "field": "tags",
        "feature": "drink"
      }
    },
    {
      "offset": 7,
      "testCase": {
        "category": "108",
        "id": "6420101441542",
        "name": "Kulta Katriina filter coffee 500g",
        "price": 3.45,
        "tags": "coffee"
      },
      "accurate": true,
      "top": { "$p": 0.5588708896430944, "field": "tags", "feature": "coffee" },
      "correct": {
        "$p": 0.5588708896430944,
        "field": "tags",
        "feature": "coffee"
      }
    },
    {
      "offset": 8,
      "testCase": {
        "category": "109",
        "id": "6411401015090",
        "name": "Fazer Sininen milk chocolate slab 200g",
        "price": 2.19,
        "tags": "candy lactose"
      },
      "accurate": true,
      "top": { "$p": 0.396203917646309, "field": "tags", "feature": "lactose" },
      "correct": {
        "$p": 0.396203917646309,
        "field": "tags",
        "feature": "lactose"
      }
    },
    {
      "offset": 9,
      "testCase": {
        "category": "111",
        "id": "6413200330206",
        "name": "Lotus Soft Embo 8 rll toilet paper",
        "price": 3.35,
        "tags": "toilet-paper"
      },
      "accurate": false,
      "top": { "$p": 0.2121561238199451, "field": "tags", "feature": "pirkka" }
    },
    {
      "offset": 10,
      "testCase": {
        "category": "115",
        "id": "6410402010318",
        "name": "Pirkka tuna fish pieces in oil 200g/150g",
        "price": 1.69,
        "tags": "meat food protein pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.28410516625444, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.28410516625444,
        "field": "tags",
        "feature": "pirkka"
      }
    }
  ],
  "accurateCases": [
    {
      "offset": 0,
      "testCase": {
        "category": "100",
        "id": "2000818700008",
        "name": "Pirkka banana",
        "price": 0.166,
        "tags": "fresh fruit pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.3854246183889548, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.3854246183889548,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 1,
      "testCase": {
        "category": "100",
        "id": "6410405093677",
        "name": "Pirkka iceberg salad Finland 100g 1st class",
        "price": 1.29,
        "tags": "fresh vegetable pirkka"
      },
      "accurate": true,
      "top": {
        "$p": 0.41960307640793315,
        "field": "tags",
        "feature": "pirkka"
      },
      "correct": {
        "$p": 0.41960307640793315,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 2,
      "testCase": {
        "category": "101",
        "id": "6413467282508",
        "name": "Fazer Puikula fullcorn rye bread 330g",
        "price": 1.29,
        "tags": "gluten bread"
      },
      "accurate": true,
      "top": { "$p": 0.445147520245974, "field": "tags", "feature": "bread" },
      "correct": {
        "$p": 0.445147520245974,
        "field": "tags",
        "feature": "bread"
      }
    },
    {
      "offset": 3,
      "testCase": {
        "category": "102",
        "id": "6410405205483",
        "name": "Pirkka Finnish beef-pork minced meat 20% 400g",
        "price": 2.79,
        "tags": "meat food protein pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.4767020957928926, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.4767020957928926,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 4,
      "testCase": {
        "category": "103",
        "id": "6412000030026",
        "name": "Saarioinen Maksalaatikko liver casserole 400g",
        "price": 1.99,
        "tags": "meat food"
      },
      "accurate": true,
      "top": { "$p": 0.2552109631934357, "field": "tags", "feature": "food" },
      "correct": {
        "$p": 0.2552109631934357,
        "field": "tags",
        "feature": "food"
      }
    },
    {
      "offset": 5,
      "testCase": {
        "category": "104",
        "id": "6410405082657",
        "name": "Pirkka Finnish semi-skimmed milk 1l",
        "price": 0.81,
        "tags": "lactose drink pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.3058803379364949, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.3058803379364949,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 6,
      "testCase": {
        "category": "104",
        "id": "6408430000258",
        "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
        "price": 1.95,
        "tags": "lactose-free drink"
      },
      "accurate": true,
      "top": { "$p": 0.34164387716484773, "field": "tags", "feature": "drink" },
      "correct": {
        "$p": 0.34164387716484773,
        "field": "tags",
        "feature": "drink"
      }
    },
    {
      "offset": 7,
      "testCase": {
        "category": "108",
        "id": "6420101441542",
        "name": "Kulta Katriina filter coffee 500g",
        "price": 3.45,
        "tags": "coffee"
      },
      "accurate": true,
      "top": { "$p": 0.5588708896430944, "field": "tags", "feature": "coffee" },
      "correct": {
        "$p": 0.5588708896430944,
        "field": "tags",
        "feature": "coffee"
      }
    },
    {
      "offset": 8,
      "testCase": {
        "category": "109",
        "id": "6411401015090",
        "name": "Fazer Sininen milk chocolate slab 200g",
        "price": 2.19,
        "tags": "candy lactose"
      },
      "accurate": true,
      "top": { "$p": 0.396203917646309, "field": "tags", "feature": "lactose" },
      "correct": {
        "$p": 0.396203917646309,
        "field": "tags",
        "feature": "lactose"
      }
    },
    {
      "offset": 10,
      "testCase": {
        "category": "115",
        "id": "6410402010318",
        "name": "Pirkka tuna fish pieces in oil 200g/150g",
        "price": 1.69,
        "tags": "meat food protein pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.28410516625444, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.28410516625444,
        "field": "tags",
        "feature": "pirkka"
      }
    }
  ],
  "errorCases": [
    {
      "offset": 9,
      "testCase": {
        "category": "111",
        "id": "6413200330206",
        "name": "Lotus Soft Embo 8 rll toilet paper",
        "price": 3.35,
        "tags": "toilet-paper"
      },
      "accurate": false,
      "top": { "$p": 0.2121561238199451, "field": "tags", "feature": "pirkka" }
    }
  ],
  "alpha_binByTopScore": [
    {
      "meanScore": 0.26433814780107895,
      "maxScore": 0.3058803379364949,
      "minScore": 0.2121561238199451,
      "accuracy": 0.75,
      "n": 4,
      "accurateOffsets": [4, 10, 5],
      "errorOffsets": [9]
    },
    {
      "meanScore": 0.3857188724020112,
      "maxScore": 0.41960307640793315,
      "minScore": 0.34164387716484773,
      "accuracy": 1,
      "n": 4,
      "accurateOffsets": [6, 0, 8, 1],
      "errorOffsets": []
    },
    {
      "meanScore": 0.493573501893987,
      "maxScore": 0.5588708896430944,
      "minScore": 0.445147520245974,
      "accuracy": 1,
      "n": 3,
      "accurateOffsets": [2, 3, 7],
      "errorOffsets": []
    }
  ]
}
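
As a rough cross-check of the alpha_binByTopScore entries in the response above, the accuracy reported for each bin appears to equal the share of accurate offsets among all offsets in that bin. The Python sketch below recomputes it from values copied out of the response. The helper name bin_accuracy is hypothetical and not part of the API, and the relationship itself is an observation from this example, not a documented guarantee.

```python
# Bins copied from the "alpha_binByTopScore" array in the response above
# (the score fields are omitted for brevity).
bins = [
    {"accuracy": 0.75, "n": 4, "accurateOffsets": [4, 10, 5], "errorOffsets": [9]},
    {"accuracy": 1, "n": 4, "accurateOffsets": [6, 0, 8, 1], "errorOffsets": []},
    {"accuracy": 1, "n": 3, "accurateOffsets": [2, 3, 7], "errorOffsets": []},
]


def bin_accuracy(bin_):
    """Share of accurate test cases in one bin (hypothetical helper)."""
    accurate = len(bin_["accurateOffsets"])
    errors = len(bin_["errorOffsets"])
    return accurate / (accurate + errors)


recomputed = [bin_accuracy(b) for b in bins]

for b, value in zip(bins, recomputed):
    # Print the bin size, the recomputed accuracy and the reported accuracy.
    print(b["n"], value, b["accuracy"])
```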

Similarity

POST /api/v1/_similarity

Similarity returns entries that are similar to a given sample object.

The sample object can be either a complete or a partial row. The Similarity operation scores documents using TF-IDF.

The chapter Personalisation also explains a characteristic of the similarity model.

Parameters

Name             Type    Description
body (required)  object  Similarity query

Successful responses

Response  Type    Description
200 OK    object  Similarity results

Request format

{
  // From expression declares the examined table. Required.
  "from": From,
  // Proposition expression describes a fact, or a statement.
  "where": Proposition,
  // Proposition expression describes a fact, or a statement.
  // Required.
  "similarity": Proposition,
  // Describes the fields and/or built-in attributes to return.
  "select": Selection,
  // The number of results to skip from the beginning.
  // Default: 0
  "offset": integer,
  // The maximum number of results to retrieve.
  // Default: 10
  "limit": integer
}

Response format

{
  "offset": integer,
  "total": integer,
  // Entries returned for a given query. Required.
  "hits": Hits
}

Example request

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In this example, we find products similar to a given existing product. Aito assumes that the given sample object is a hypothetical new object, which is why the exact same product also appears in the results.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_similarity \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "similarity": {
      "category": "108",
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "price": 3.95,
      "tags": "coffee"
    },
    "limit": 3
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$score": 4347.31478753839,
      "category": "108",
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "price": 3.95,
      "tags": "coffee"
    },
    {
      "$score": 368.87273582619054,
      "category": "108",
      "id": "6411300164653",
      "name": "Juhla Mokka Dark Roast coffee 500g hj",
      "price": 3.95,
      "tags": "coffee"
    },
    {
      "$score": 18.108373975215677,
      "category": "108",
      "id": "6420101441542",
      "name": "Kulta Katriina filter coffee 500g",
      "price": 3.45,
      "tags": "coffee"
    }
  ]
}
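
The same request could also be sent from Python using only the standard library. The sketch below is not the Aito Python SDK: the function names build_similarity_request and find_similar are hypothetical, and it assumes that AITO_INSTANCE_URL holds the full base URL including the https:// scheme.

```python
import json
import urllib.request


def build_similarity_request(instance_url, api_key, query):
    """Build (but do not send) a POST request for the _similarity endpoint."""
    return urllib.request.Request(
        url=instance_url.rstrip("/") + "/api/v1/_similarity",
        data=json.dumps(query).encode("utf-8"),
        headers={"content-type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def find_similar(instance_url, api_key, query, timeout=30):
    """Send the query and return the decoded JSON response."""
    request = build_similarity_request(instance_url, api_key, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same query body as in the curl example above.
query = {
    "from": "products",
    "similarity": {
        "category": "108",
        "id": "6411300000494",
        "name": "Juhla Mokka coffee 500g sj",
        "price": 3.95,
        "tags": "coffee",
    },
    "limit": 3,
}

# Usage (needs network access and both environment variables set):
#   import os
#   result = find_similar(os.environ["AITO_INSTANCE_URL"],
#                         os.environ["AITO_API_KEY"], query)
#   for hit in result["hits"]:
#       print(hit["$score"], hit["name"])
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers and body without touching the network.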

Example request

In the example we're finding similar products based on just a product name.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_similarity \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "similarity": { "name": "Hovis Seed Sensations Seven Seeds Original 800g" },
    "limit": 3
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$score": 1,
      "category": "100",
      "id": "2000818700008",
      "name": "Pirkka banana",
      "price": 0.166,
      "tags": "fresh fruit pirkka"
    },
    {
      "$score": 1,
      "category": "100",
      "id": "2000604700007",
      "name": "Cucumber Finland",
      "price": 0.9765,
      "tags": "fresh vegetable"
    },
    {
      "$score": 1,
      "category": "100",
      "id": "6410405060457",
      "name": "Pirkka bio cherry tomatoes 250g international 1st class",
      "price": 1.29,
      "tags": "fresh vegetable pirkka tomato"
    }
  ]
}

Match

POST /api/v1/_match

Match the most likely value/feature of a column or any column of a linked table to a given hypothesis.

Differences to Predict

While Match is similar to the Predict query, there are subtle differences, explained below.

Predict returns features, while match can return values

Match can return A) the row behind a link or B) the value inside a text field. If Match is done against a non-analyzed field, it works similarly to Predict, except that the inference algorithm is somewhat different.

The inference model is different

Predict treats features as 'black boxes' and does statistical reasoning based purely on each feature's own statistics. Match does 'glass box' statistical reasoning by using all the features found behind the link or within a field.

For example, if you are predicting a product, the Predict query looks at the history of each individual product ID. If there is no history for the product, Aito will not be able to do proper inference. If you are matching the product, on the other hand, Aito looks at the product category, title and description. This enables Aito to match products it has never seen before, as long as it is familiar with their internal features.

The chapter Personalisation also explains a characteristic of the matching.

Parameters

Name             Type    Description
body (required)  object  Match query

Successful responses

Response  Type    Description
200 OK    object  Match results

Request format

{
  // From expression declares the examined table. Required.
  "from": From,
  // Proposition expression describes a fact, or a statement.
  "where": Proposition,
  // Describes the fields and/or built-in attributes to return.
  "select": Selection,
  // Get expression defines what items are returned as query
  // results. Required.
  "match": Get,
  // PropositionSet expression is used to describe a collection
  // of propositions.
  "basedOn": PropositionSet,
  // The number of results to skip from the beginning.
  // Default: 0
  "offset": integer,
  // The maximum number of results to retrieve.
  // Default: 10
  "limit": integer
}

Response format

{
  "offset": integer,
  "total": integer,
  // Entries returned for a given query. Required.
  "hits": Hits
}

Match user to products

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In this example, we match a user to products.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_match \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "session.user": "larry" },
    "match": "product",
    "limit": 5
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$p": 0.2348757987154449,
      "category": "104",
      "id": "6410405216120",
      "name": "Pirkka lactose-free semi-skimmed milk drink 1l",
      "price": 1.25,
      "tags": "lactose-free drink pirkka"
    },
    {
      "$p": 0.22064218551887363,
      "category": "104",
      "id": "6408430000258",
      "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
      "price": 1.95,
      "tags": "lactose-free drink"
    },
    {
      "$p": 0.049472827117544374,
      "category": "104",
      "id": "6410405082626",
      "name": "Pirkka Finnish nonfat milk 1l",
      "price": 0.75,
      "tags": "lactose drink pirkka"
    },
    {
      "$p": 0.0314845453058081,
      "category": "100",
      "id": "6410405060457",
      "name": "Pirkka bio cherry tomatoes 250g international 1st class",
      "price": 1.29,
      "tags": "fresh vegetable pirkka tomato"
    },
    {
      "$p": 0.03005362318721598,
      "category": "100",
      "id": "2000604700007",
      "name": "Cucumber Finland",
      "price": 0.9765,
      "tags": "fresh vegetable"
    }
  ]
}

Relate

POST /api/v1/_relate

Relate provides statistical information of data relationships.

It calculates correlations between pairs of features, which can be used, for example, to find causation and correlation.

By default, the hits are ordered by the relation.mi field, which indicates how strong the correlation is.

Parameters

Name             Type    Description
body (required)  object  Relate query

Successful responses

Response  Type    Description
200 OK    object  Relate results

Request format

{
  // From expression declares the examined table. Required.
  "from": From,
  // Proposition expression describes a fact, or a statement.
  "where": Proposition,
  // PropositionSet expression is used to describe a collection
  // of propositions. Required.
  "relate": PropositionSet,
  // Declares the sorting order.
  "orderBy": RelateOrderBy,
  // Describes the fields and/or built-in attributes to return.
  "select": Selection,
  // The number of results to skip from the beginning.
  // Default: 0
  "offset": integer,
  // The maximum number of results to retrieve.
  // Default: 10
  "limit": integer
}

Response format

{
  "offset": integer,
  "total": integer,
  // Entries returned for a given query. Required.
  "hits": Hits
}

What features of products affect purchasing

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

In this example, we ask Aito to explain which product features affect whether people purchase them. With $exists, we tell Aito to get all properties of the product (the impressions table links to the products table) and relate them to the condition { "purchase": true }.

The response may seem overwhelming but it contains a lot of useful information.

Looking at the first hit in the response, the condition is { "product.name": { "$has": "fazer" } } and the "lift" value is low (compared to 1.0). It means that when the product name contains the word fazer, the product is only ~0.2x as likely to be purchased as the average product (=base probability). A lift above 1.0 would correspondingly mean that the condition makes a purchase more likely.

The lift is the probability of { "purchase": true } given the condition, divided by the base probability of { "purchase": true }. Using the response field names, the formula is ps.pOnCondition / ps.p.

In the example data set, people purchase 50% of products they see. This causes the base probability to be 0.5.
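
As a small worked example of the ps.pOnCondition / ps.p formula, the Python sketch below recomputes the lift for the two hits in the response of this example. The numbers are copied from that response; the helper name lift is hypothetical.

```python
import math

# "lift" and "ps" values copied from the two hits in the example response.
hits = [
    {"lift": 0.2069031456162157,
     "ps": {"p": 0.5, "pOnCondition": 0.10345157280810785}},
    {"lift": 0.1827956989247312,
     "ps": {"p": 0.5, "pOnCondition": 0.0913978494623656}},
]


def lift(ps):
    """Probability given the condition divided by the base probability."""
    return ps["pOnCondition"] / ps["p"]


recomputed = [lift(hit["ps"]) for hit in hits]

for hit, value in zip(hits, recomputed):
    # The recomputed value should agree with the reported "lift" field.
    print(value, math.isclose(value, hit["lift"]))
```

With a base probability of 0.5, dividing by ps.p simply doubles ps.pOnCondition, so both hits end up well below 1.0.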

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_relate \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "$exists": "product" },
    "relate": [
      { "purchase": true }
    ],
    "limit": 2
  }'

Response

{
  "offset": 0,
  "total": 2,
  "hits": [
    {
      "related": {
        "purchase": { "$has": true }
      },
      "condition": {
        "product.name": { "$has": "fazer" }
      },
      "lift": 0.2069031456162157,
      "fs": {
        "f": 1680,
        "fOnCondition": 32.8125,
        "fOnNotCondition": 1647.1875,
        "fCondition": 324.84375,
        "n": 3360
      },
      "ps": {
        "p": 0.5,
        "pOnCondition": 0.10345157280810785,
        "pOnNotCondition": 0.5424421109653041,
        "pCondition": 0.0966811520436604
      },
      "info": {
        "h": 1,
        "mi": 0.520157704908141,
        "miTrue": -0.2351425817047657,
        "miFalse": 0.7553002866129067
      },
      "relation": {
        "n": 3360,
        "varFs": [324.84375, 1680],
        "stateFs": [1387.96875, 292.03125, 1647.1875, 32.8125],
        "mi": 0.054990612532037894
      }
    },
    {
      "related": {
        "purchase": { "$has": true }
      },
      "condition": {
        "product.name": { "$has": "puikula" }
      },
      "lift": 0.1827956989247312,
      "fs": {
        "f": 1680,
        "fOnCondition": 16,
        "fOnNotCondition": 1664,
        "fCondition": 184,
        "n": 3360
      },
      "ps": {
        "p": 0.5,
        "pOnCondition": 0.0913978494623656,
        "pOnNotCondition": 0.5236734611408386,
        "pCondition": 0.05476473921098032
      },
      "info": {
        "h": 1,
        "mi": 0.5588814740153283,
        "miTrue": -0.22407973918054175,
        "miFalse": 0.7829612131958701
      },
      "relation": {
        "n": 3360,
        "varFs": [184, 1680],
        "stateFs": [1512, 168, 1664, 16],
        "mi": 0.03213703255071114
      }
    }
  ]
}

Generic query

POST /api/v1/_query

Generic query is a powerful expert interface.

It provides the functionality of every other query type in the API. Search, Similarity, Match, and Recommend can be seen as convenience APIs for the generic query.

The query format resembles the Search-query, except that it supports a "get" statement. Since this endpoint provides functionality of all other queries, "get": "product" is used as a replacement for "predict": "product", "recommend": "product", and "match": "product" counterparts.

The chapter Personalisation also explains a characteristic of the inference model.

Namespace shifting of "get"

The "get" operation changes the namespaces of "select" and "orderBy" operations. The namespace is changed from the "from" table to the linked table (specified with "get").

As an example, consider the query below. The impressions table has a column called product, which links to a row in the products table. The price and title fields are columns of products.

{
  "from": "impressions",
  "where": {
    "query": "macbook air 2018"
  },
  "get": "product",
  "orderBy": ["price"],
  "select": ["title", "$highlight"]
}

When using "select" and "orderBy", we are already in the products table namespace, instead of having to use product.title or product.price.

Parameters

Name             Type    Description
body (required)  object  Generic query

Successful responses

Response  Type    Description
200 OK    object  Query results

Request format

{
  // From expression declares the examined table. Required.
  "from": From,
  // Proposition expression describes a fact, or a statement.
  "where": Proposition,
  // Get expression defines what items are returned as query
  // results.
  "get": Get,
  // Declares the sorting order of the result by a field or by a
  // user-defined score.
  "orderBy": OrderBy,
  // Describes the fields and/or built-in attributes to return.
  "select": Selection,
  // The number of results to skip from the beginning.
  // Default: 0
  "offset": integer,
  // The maximum number of results to retrieve.
  // Default: 10
  "limit": integer
}

Response format

{
  "offset": integer,
  "total": integer,
  // Entries returned for a given query. Required.
  "hits": Hits
}

Search query

Simple search query with the generic query.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": { "id": "6410402010318" }
  }'

Response

{
  "offset": 0,
  "total": 1,
  "hits": [
    {
      "category": "115",
      "id": "6410402010318",
      "name": "Pirkka tuna fish pieces in oil 200g/150g",
      "price": 1.69,
      "tags": "meat food protein pirkka"
    }
  ]
}

Search query with highlighted results

A search query that returns related products ordered by similarity. The response also contains the highlighted words that matched the search term.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": {
      "name": { "$match": "coffee" }
    },
    "select": ["id", "name", "tags", "price", "$score", "$highlight"],
    "orderBy": "$similarity"
  }'

Response

{
  "offset": 0,
  "total": 4,
  "hits": [
    {
      "id": "6411300000494",
      "name": "Juhla Mokka coffee 500g sj",
      "tags": "coffee",
      "price": 3.95,
      "$score": 2.1726635013471625,
      "$highlight": [
        {
          "score": 1.1194647495169912,
          "field": "name",
          "highlight": "Juhla Mokka <font color=\"green\">coffee</font> 500g sj"
        }
      ]
    },
    {
      "id": "6420101441542",
      "name": "Kulta Katriina filter coffee 500g",
      "tags": "coffee",
      "price": 3.45,
      "$score": 2.1726635013471625,
      "$highlight": [
        {
          "score": 1.1194647495169912,
          "field": "name",
          "highlight": "Kulta Katriina filter <font color=\"green\">coffee</font> 500g"
        }
      ]
    },
    {
      "id": "6411300164653",
      "name": "Juhla Mokka Dark Roast coffee 500g hj",
      "tags": "coffee",
      "price": 3.95,
      "$score": 2.1726635013471625,
      "$highlight": [
        {
          "score": 1.1194647495169912,
          "field": "name",
          "highlight": "Juhla Mokka Dark Roast <font color=\"green\">coffee</font> 500g hj"
        }
      ]
    },
    {
      "id": "6410405181190",
      "name": "Pirkka Costa Rica filter coffee 500g UTZ",
      "tags": "coffee pirkka",
      "price": 2.89,
      "$score": 2.1726635013471625,
      "$highlight": [
        {
          "score": 1.1194647495169912,
          "field": "name",
          "highlight": "Pirkka Costa Rica filter <font color=\"green\">coffee</font> 500g UTZ"
        }
      ]
    }
  ]
}
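
The highlight strings in the response above embed HTML markup around the matched words. The Python sketch below shows one way a client might extract the matched terms or strip the markup. It assumes the markup always has exactly the <font color="green">...</font> form seen in this example, which the source does not guarantee; the function names are hypothetical.

```python
import re

# Matches the markup seen in the example response (assumed to be stable).
HIGHLIGHT_TAG = re.compile(r'<font color="green">(.*?)</font>')


def highlighted_terms(highlight):
    """Return the words wrapped in highlight markup."""
    return HIGHLIGHT_TAG.findall(highlight)


def strip_markup(highlight):
    """Return the highlight text with the markup removed."""
    return HIGHLIGHT_TAG.sub(r"\1", highlight)


# Value of the first hit's "highlight" field after JSON decoding.
sample = 'Juhla Mokka <font color="green">coffee</font> 500g sj'

terms = highlighted_terms(sample)
plain = strip_markup(sample)
print(terms, plain)
```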

Generic similarity query

In the example we're finding similar products based on the given hypothetical new product name.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "orderBy": {
      "$similarity": { "name": "Atria bratwurst 175g" }
    },
    "limit": 2
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$score": 2.7310670795288075,
      "category": "102",
      "id": "6407870070333",
      "name": "Atria lauantaimakkara bread sausage 225g",
      "price": 0.89,
      "tags": "meat sausage with-bread"
    },
    {
      "$score": 2.7310670795288075,
      "category": "102",
      "id": "6407870071224",
      "name": "Atria Gotler ham sausage 300g",
      "price": 1.75,
      "tags": "meat sausage with-bread"
    }
  ]
}

Generic predict query

In the example we're predicting which tags a new hypothetical product could have.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "products",
    "where": { "name": "Atria bratwurst 175g" },
    "get": "tags.$feature",
    "orderBy": "$p",
    "limit": 5
  }'

Response

{
  "offset": 0,
  "total": 22,
  "hits": [
    { "$p": 0.3542669225099545, "field": "", "feature": "meat" },
    { "$p": 0.23564865431624232, "field": "", "feature": "sausage" },
    { "$p": 0.06458531089293539, "field": "", "feature": "food" },
    { "$p": 0.037776313918509385, "field": "", "feature": "protein" },
    { "$p": 0.03716482390586358, "field": "", "feature": "pirkka" }
  ]
}

Recommend products which a customer would most likely purchase

In this example, we find the top 5 products which veronica (user id) would most likely purchase, based on her behavior history stored in the impressions table.

This example is the same as in the documentation of the Recommendation endpoint, but made with the generic query.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "session.user": "veronica" },
    "get": "product",
    "orderBy": {
      "$p": {
        "$context": { "purchase": true }
      }
    },
    "limit": 5
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$p": 0.9922839884327421,
      "category": "100",
      "id": "6410405093677",
      "name": "Pirkka iceberg salad Finland 100g 1st class",
      "price": 1.29,
      "tags": "fresh vegetable pirkka"
    },
    {
      "$p": 0.9880776334264704,
      "category": "100",
      "id": "2000503600002",
      "name": "Chiquita banana",
      "price": 0.28054,
      "tags": "fresh fruit"
    },
    {
      "$p": 0.9627692001227784,
      "category": "108",
      "id": "6410405181190",
      "name": "Pirkka Costa Rica filter coffee 500g UTZ",
      "price": 2.89,
      "tags": "coffee pirkka"
    },
    {
      "$p": 0.9370075696184778,
      "category": "109",
      "id": "6420256014134",
      "name": "Tupla Maxi chocolate bar 50g UTZ",
      "price": 0.62,
      "tags": "candy lactose"
    },
    {
      "$p": 0.9181227425357429,
      "category": "101",
      "id": "6437002001454",
      "name": "VAASAN Ruispalat 660g 12 pcs fullcorn rye bread",
      "price": 1.69,
      "tags": "gluten bread"
    }
  ]
}

Query with custom scoring

In this example, we find the products which veronica (user id) would most likely purchase, and in addition boost products that have a higher price. This recommends products which are relevant for the user but also bring higher revenue to the shop. It demonstrates a situation where multiple factors should be considered in recommendations.

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "from": "impressions",
    "where": { "session.user": "veronica" },
    "get": "product",
    "orderBy": {
      "$multiply": [
        {
          "$p": {
            "$context": { "purchase": true }
          }
        },
        "price"
      ]
    },
    "limit": 3
  }'

Response

{
  "offset": 0,
  "total": 42,
  "hits": [
    {
      "$score": 2.78240298835483,
      "category": "108",
      "id": "6410405181190",
      "name": "Pirkka Costa Rica filter coffee 500g UTZ",
      "price": 2.89,
      "tags": "coffee pirkka"
    },
    {
      "$score": 1.9339160813499057,
      "category": "109",
      "id": "6411401015090",
      "name": "Fazer Sininen milk chocolate slab 200g",
      "price": 2.19,
      "tags": "candy lactose"
    },
    {
      "$score": 1.7219800724348475,
      "category": "111",
      "id": "6410405207722",
      "name": "Pirkka paper towel 4 rl",
      "price": 1.95,
      "tags": "paper-towels pirkka"
    }
  ]
}
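
The arithmetic behind "$multiply" can be checked by hand for a product that appears in both this response and the plain recommendation response earlier in this section. The Python sketch below does this for "Pirkka Costa Rica filter coffee 500g UTZ". It assumes the purchase probability is the same in both queries, which the numbers suggest but the source does not state.

```python
import math

# "$p" copied from the recommendation response, "price" from the same hit.
p_purchase = 0.9627692001227784
price = 2.89

# "$multiply": [{"$p": ...}, "price"] expresses the product of the two values.
score = p_purchase * price

# "$score" reported for the same product in the response above.
reported_score = 2.78240298835483

print(score, math.isclose(score, reported_score, rel_tol=1e-9))
```

A cheap product with a high purchase probability can therefore rank below a more expensive product with a somewhat lower probability.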

Create jobs

POST /api/v1/jobs/{query}

Create a job for queries that last longer than 30 seconds. The regular endpoints reach a timeout after 30 seconds.

You can create a job for the Predict, Match, Similarity, Generic and Evaluate endpoints. The query body is the same as you would send to the regular endpoint.

Parameters

Name              Type    Description
query (required)  string  Any of the Aito query endpoints

Successful responses

Response  Type    Description
200 OK    object  Job info

Response format

{
  // JobID.
  "id": string,
  // Empty for the query API endpoints.
  "parameters": object,
  // The query path used.
  "path": string,
  // When the job was started.
  "startedAt": string
}

Example request

The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.

The example query is exactly the same as it would be when using the regular _evaluate endpoint.

In this example, we evaluate how good Aito's results are when we predict tags for a new hypothetical product. The results give us the accuracy and performance of the prediction example shown in the Predict operation's documentation.

$index is a built-in variable which tells the insertion index of a row. In the example, we select 1/4 of the rows in the products table to be used as test data. The rest of the rows are automatically used as training data.

Aito iterates through each product in the test data, and tests how accurate the prediction of tags for a given product name was.
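
To make the test/train split concrete, the Python sketch below mimics the "$mod": [4, 0] selection locally. It assumes that the pair means [divisor, remainder], that $index starts at 0, and that the products table holds the 42 rows reported as "total" in the examples. Under those assumptions the split gives 11 test rows and 31 training rows, which matches the testSamples and trainSamples values in the job result example.

```python
TOTAL_ROWS = 42            # "total" reported for the products table
DIVISOR, REMAINDER = 4, 0  # taken from "$mod": [4, 0]

# Rows whose insertion index satisfies the $mod condition become test data.
test_rows = [i for i in range(TOTAL_ROWS) if i % DIVISOR == REMAINDER]

# Everything else is used as training data.
train_rows = [i for i in range(TOTAL_ROWS) if i % DIVISOR != REMAINDER]

print(len(test_rows), len(train_rows))
```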

curl -X POST \
  https://aito-grocery-store.api.aito.ai/api/v1/jobs/_evaluate \
  -H 'content-type: application/json' \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9' \
  -d '
  {
    "test": {
      "$index": {
        "$mod": [4, 0]
      }
    },
    "evaluate": {
      "from": "products",
      "where": {
        "name": { "$get": "name" }
      },
      "predict": "tags"
    }
  }'

Response

{
  "id": "1c84915c-fa91-44f5-84b2-dacf607b1619",
  "parameters": {  },
  "path": "_evaluate",
  "startedAt": "2020-09-17T10:59:37.443Z"
}

Get status of all jobs

GET /api/v1/jobs/

List all jobs that currently exist.

Successful responses

Response  Type    Description
200 OK    object  Job statuses

Response format

{
  // Job result will not be available after the date.
  "expiresAt": string,
  // Job finished running.
  "finishedAt": string,
  // JobID.
  "id": string,
  // Empty for the query API endpoints.
  "parameters": object,
  // The query path used.
  "path": string,
  // When the job was started.
  "startedAt": string
}

Example request

curl -X GET \
  https://aito-grocery-store.api.aito.ai/api/v1/jobs/ \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9'

Response

[
  {
    "id": "1c84915c-fa91-44f5-84b2-dacf607b1619",
    "parameters": {  },
    "path": "_evaluate",
    "startedAt": "2020-09-17T10:59:37.443Z"
  }
]

Get status of a job

GET /api/v1/jobs/{uuid}

If you have started a job for one of the queries, this endpoint returns the status of the job by its ID.

Successful responses

Response  Type    Description
200 OK    object  Job status

Response format

{
  // Job result will not be available after the date.
  "expiresAt": string,
  // Job finished running.
  "finishedAt": string,
  // JobID.
  "id": string,
  // Empty for the query API endpoints.
  "parameters": object,
  // The query path used.
  "path": string,
  // When the job was started.
  "startedAt": string
}

Example request

curl -X GET \
  https://aito-grocery-store.api.aito.ai/api/v1/jobs/1c84915c-fa91-44f5-84b2-dacf607b1619 \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9'

Response

{
  "expiresAt": "2020-09-17T11:14:37.689Z",
  "finishedAt": "2020-09-17T10:59:37.689Z",
  "id": "1c84915c-fa91-44f5-84b2-dacf607b1619",
  "parameters": {  },
  "path": "_evaluate",
  "startedAt": "2020-09-17T10:59:37.443Z"
}
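
A client typically polls this endpoint until the job is done. The Python sketch below is one possible approach using only the standard library. It assumes that a status without a finishedAt field means the job is still running, which the examples suggest but the source does not state explicitly. The function names, the polling interval and the poll limit are all hypothetical.

```python
import json
import time
import urllib.request


def fetch_job_status(instance_url, api_key, job_id):
    """GET /api/v1/jobs/{uuid} and return the decoded JSON status."""
    request = urllib.request.Request(
        url=f"{instance_url.rstrip('/')}/api/v1/jobs/{job_id}",
        headers={"x-api-key": api_key},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_job(get_status, poll_seconds=2.0, max_polls=150, sleep=time.sleep):
    """Call get_status() until the returned status contains "finishedAt"."""
    for _ in range(max_polls):
        status = get_status()
        if status.get("finishedAt"):
            return status
        sleep(poll_seconds)
    raise TimeoutError("job did not finish within the polling budget")


# Usage (needs network access; url, key and job_id are placeholders):
#   status = wait_for_job(lambda: fetch_job_status(url, key, job_id))
#   print(status["finishedAt"], status["expiresAt"])
```

Passing the status lookup in as a callable keeps the polling logic independent of the HTTP code, so it can be exercised without a live instance.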

Get result of a job

GET /api/v1/jobs/{uuid}/result

Get the query result for a created job.

Successful responses
ResponseTypeDescription
200 OKobjectEvaluate job result
Response format
{
// The amount of samples used for testing. Required.
"n": integer,
// The amount of samples used for testing. Required.
"testSamples": integer,
// The average amount of samples used for training. Required.
"trainSamples": number,
// The average number of features. Required.
"features": number,
// Complement of `accuracy` (=`1 - accuracy`). Required.
"error": number,
// Complement of `baseAccuracy` (=`1 - baseAccuracy`). // Required.
"baseError": number,
// The accuracy of predictions. Required.
"accuracy": number,
// The simulated accuracy of predictions based on taking the most frequent value. Required.
"baseAccuracy": number,
// How much better results Aito was able to provide compared to a naive prediction. Required.
"accuracyGain": number,
// Average rank of the best prediction. Required.
"meanRank": number,
"baseMeanRank": number,
// Improvement of meanRank upon baseMeanRank. Required.
"rankGain": number,
// A measurement which describes the quality of probabilities (=`h - mxe`). Required.
"informationGain": number,
// Mean cross entropy. Required.
"mxe": number,
// Entropy. Required.
"h": number,
// The mean geometric probability of the predictions. Required.
"geomMeanP": number,
// Base geometric mean probability. Required.
"baseGmp": number,
// Geometric mean lift. Required.
"geomMeanLift": number,
// The mean execution time of the queries in nanoseconds. Required.
"meanNs": number,
// The mean execution time of the queries in microseconds. Required.
"meanUs": number,
// The mean execution time of the queries in milliseconds. Required.
"meanMs": number,
// The median execution time of the queries in nanoseconds. Required.
"medianNs": number,
// The median execution time of the queries in microseconds. Required.
"medianUs": number,
// The median execution time of the queries in milliseconds. Required.
"medianMs": number,
// The time spent for warm-up of indexes and caches for the given query in milliseconds. Required.
"warmingMs": number
}

Example request

curl -X GET \
  https://aito-grocery-store.api.aito.ai/api/v1/jobs/1c84915c-fa91-44f5-84b2-dacf607b1619/result \
  -H 'x-api-key: bc4Ck3nDwM1ILVjNahNJL8hPEAzCes8t2vGMUyo9'

Response

{
  "n": 11,
  "testSamples": 11,
  "trainSamples": 31,
  "features": 202,
  "error": 0.09090909090909094,
  "baseError": 0.7272727272727273,
  "accuracy": 0.9090909090909091,
  "baseAccuracy": 0.2727272727272727,
  "accuracyGain": 0.6363636363636364,
  "meanRank": 1.9090909090909092,
  "baseMeanRank": 4.454545454545454,
  "rankGain": 2.545454545454545,
  "informationGain": 2.1768738588975234,
  "mxe": 1.4084101290438387,
  "h": 3.585283987941362,
  "geomMeanP": 0.3767266164020222,
  "baseGmp": 0.08331476557218853,
  "geomMeanLift": 4.5217268969640845,
  "meanNs": 3298335,
  "meanUs": 3298.335,
  "meanMs": 3.298335,
  "medianNs": 3065594,
  "medianUs": 3065.594,
  "medianMs": 3.065594,
  "allNs": [
    2853327,
    4401258,
    3450942,
    4502371,
    1963998,
    3065594,
    4391326,
    2419506,
    2897127,
    3017426,
    3318810
  ],
  "allUs": [2853, 4401, 3450, 4502, 1963, 3065, 4391, 2419, 2897, 3017, 3318],
  "allMs": [2, 4, 3, 4, 1, 3, 4, 2, 2, 3, 3],
  "warmingMs": 0,
  "accurateOffsets": [0, 1, 2, 3, 4, 5, 6, 7, 8, 10],
  "errorOffsets": [9],
  "cases": [
    {
      "offset": 0,
      "testCase": {
        "category": "100",
        "id": "2000818700008",
        "name": "Pirkka banana",
        "price": 0.166,
        "tags": "fresh fruit pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.3854246183889548, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.3854246183889548,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 1,
      "testCase": {
        "category": "100",
        "id": "6410405093677",
        "name": "Pirkka iceberg salad Finland 100g 1st class",
        "price": 1.29,
        "tags": "fresh vegetable pirkka"
      },
      "accurate": true,
      "top": {
        "$p": 0.41960307640793315,
        "field": "tags",
        "feature": "pirkka"
      },
      "correct": {
        "$p": 0.41960307640793315,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 2,
      "testCase": {
        "category": "101",
        "id": "6413467282508",
        "name": "Fazer Puikula fullcorn rye bread 330g",
        "price": 1.29,
        "tags": "gluten bread"
      },
      "accurate": true,
      "top": { "$p": 0.445147520245974, "field": "tags", "feature": "bread" },
      "correct": {
        "$p": 0.445147520245974,
        "field": "tags",
        "feature": "bread"
      }
    },
    {
      "offset": 3,
      "testCase": {
        "category": "102",
        "id": "6410405205483",
        "name": "Pirkka Finnish beef-pork minced meat 20% 400g",
        "price": 2.79,
        "tags": "meat food protein pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.4767020957928926, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.4767020957928926,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 4,
      "testCase": {
        "category": "103",
        "id": "6412000030026",
        "name": "Saarioinen Maksalaatikko liver casserole 400g",
        "price": 1.99,
        "tags": "meat food"
      },
      "accurate": true,
      "top": { "$p": 0.2552109631934357, "field": "tags", "feature": "food" },
      "correct": {
        "$p": 0.2552109631934357,
        "field": "tags",
        "feature": "food"
      }
    },
    {
      "offset": 5,
      "testCase": {
        "category": "104",
        "id": "6410405082657",
        "name": "Pirkka Finnish semi-skimmed milk 1l",
        "price": 0.81,
        "tags": "lactose drink pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.3058803379364949, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.3058803379364949,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 6,
      "testCase": {
        "category": "104",
        "id": "6408430000258",
        "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
        "price": 1.95,
        "tags": "lactose-free drink"
      },
      "accurate": true,
      "top": { "$p": 0.34164387716484773, "field": "tags", "feature": "drink" },
      "correct": {
        "$p": 0.34164387716484773,
        "field": "tags",
        "feature": "drink"
      }
    },
    {
      "offset": 7,
      "testCase": {
        "category": "108",
        "id": "6420101441542",
        "name": "Kulta Katriina filter coffee 500g",
        "price": 3.45,
        "tags": "coffee"
      },
      "accurate": true,
      "top": { "$p": 0.5588708896430944, "field": "tags", "feature": "coffee" },
      "correct": {
        "$p": 0.5588708896430944,
        "field": "tags",
        "feature": "coffee"
      }
    },
    {
      "offset": 8,
      "testCase": {
        "category": "109",
        "id": "6411401015090",
        "name": "Fazer Sininen milk chocolate slab 200g",
        "price": 2.19,
        "tags": "candy lactose"
      },
      "accurate": true,
      "top": { "$p": 0.396203917646309, "field": "tags", "feature": "lactose" },
      "correct": {
        "$p": 0.396203917646309,
        "field": "tags",
        "feature": "lactose"
      }
    },
    {
      "offset": 9,
      "testCase": {
        "category": "111",
        "id": "6413200330206",
        "name": "Lotus Soft Embo 8 rll toilet paper",
        "price": 3.35,
        "tags": "toilet-paper"
      },
      "accurate": false,
      "top": { "$p": 0.2121561238199451, "field": "tags", "feature": "pirkka" }
    },
    {
      "offset": 10,
      "testCase": {
        "category": "115",
        "id": "6410402010318",
        "name": "Pirkka tuna fish pieces in oil 200g/150g",
        "price": 1.69,
        "tags": "meat food protein pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.28410516625444, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.28410516625444,
        "field": "tags",
        "feature": "pirkka"
      }
    }
  ],
  "accurateCases": [
    {
      "offset": 0,
      "testCase": {
        "category": "100",
        "id": "2000818700008",
        "name": "Pirkka banana",
        "price": 0.166,
        "tags": "fresh fruit pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.3854246183889548, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.3854246183889548,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 1,
      "testCase": {
        "category": "100",
        "id": "6410405093677",
        "name": "Pirkka iceberg salad Finland 100g 1st class",
        "price": 1.29,
        "tags": "fresh vegetable pirkka"
      },
      "accurate": true,
      "top": {
        "$p": 0.41960307640793315,
        "field": "tags",
        "feature": "pirkka"
      },
      "correct": {
        "$p": 0.41960307640793315,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 2,
      "testCase": {
        "category": "101",
        "id": "6413467282508",
        "name": "Fazer Puikula fullcorn rye bread 330g",
        "price": 1.29,
        "tags": "gluten bread"
      },
      "accurate": true,
      "top": { "$p": 0.445147520245974, "field": "tags", "feature": "bread" },
      "correct": {
        "$p": 0.445147520245974,
        "field": "tags",
        "feature": "bread"
      }
    },
    {
      "offset": 3,
      "testCase": {
        "category": "102",
        "id": "6410405205483",
        "name": "Pirkka Finnish beef-pork minced meat 20% 400g",
        "price": 2.79,
        "tags": "meat food protein pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.4767020957928926, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.4767020957928926,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 4,
      "testCase": {
        "category": "103",
        "id": "6412000030026",
        "name": "Saarioinen Maksalaatikko liver casserole 400g",
        "price": 1.99,
        "tags": "meat food"
      },
      "accurate": true,
      "top": { "$p": 0.2552109631934357, "field": "tags", "feature": "food" },
      "correct": {
        "$p": 0.2552109631934357,
        "field": "tags",
        "feature": "food"
      }
    },
    {
      "offset": 5,
      "testCase": {
        "category": "104",
        "id": "6410405082657",
        "name": "Pirkka Finnish semi-skimmed milk 1l",
        "price": 0.81,
        "tags": "lactose drink pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.3058803379364949, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.3058803379364949,
        "field": "tags",
        "feature": "pirkka"
      }
    },
    {
      "offset": 6,
      "testCase": {
        "category": "104",
        "id": "6408430000258",
        "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l",
        "price": 1.95,
        "tags": "lactose-free drink"
      },
      "accurate": true,
      "top": { "$p": 0.34164387716484773, "field": "tags", "feature": "drink" },
      "correct": {
        "$p": 0.34164387716484773,
        "field": "tags",
        "feature": "drink"
      }
    },
    {
      "offset": 7,
      "testCase": {
        "category": "108",
        "id": "6420101441542",
        "name": "Kulta Katriina filter coffee 500g",
        "price": 3.45,
        "tags": "coffee"
      },
      "accurate": true,
      "top": { "$p": 0.5588708896430944, "field": "tags", "feature": "coffee" },
      "correct": {
        "$p": 0.5588708896430944,
        "field": "tags",
        "feature": "coffee"
      }
    },
    {
      "offset": 8,
      "testCase": {
        "category": "109",
        "id": "6411401015090",
        "name": "Fazer Sininen milk chocolate slab 200g",
        "price": 2.19,
        "tags": "candy lactose"
      },
      "accurate": true,
      "top": { "$p": 0.396203917646309, "field": "tags", "feature": "lactose" },
      "correct": {
        "$p": 0.396203917646309,
        "field": "tags",
        "feature": "lactose"
      }
    },
    {
      "offset": 10,
      "testCase": {
        "category": "115",
        "id": "6410402010318",
        "name": "Pirkka tuna fish pieces in oil 200g/150g",
        "price": 1.69,
        "tags": "meat food protein pirkka"
      },
      "accurate": true,
      "top": { "$p": 0.28410516625444, "field": "tags", "feature": "pirkka" },
      "correct": {
        "$p": 0.28410516625444,
        "field": "tags",
        "feature": "pirkka"
      }
    }
  ],
  "errorCases": [
    {
      "offset": 9,
      "testCase": {
        "category": "111",
        "id": "6413200330206",
        "name": "Lotus Soft Embo 8 rll toilet paper",
        "price": 3.35,
        "tags": "toilet-paper"
      },
      "accurate": false,
      "top": { "$p": 0.2121561238199451, "field": "tags", "feature": "pirkka" }
    }
  ],
  "alpha_binByTopScore": [
    {
      "meanScore": 0.26433814780107895,
      "maxScore": 0.3058803379364949,
      "minScore": 0.2121561238199451,
      "accuracy": 0.75,
      "n": 4,
      "accurateOffsets": [4, 10, 5],
      "errorOffsets": [9]
    },
    {
      "meanScore": 0.3857188724020112,
      "maxScore": 0.41960307640793315,
      "minScore": 0.34164387716484773,
      "accuracy": 1,
      "n": 4,
      "accurateOffsets": [6, 0, 8, 1],
      "errorOffsets": []
    },
    {
      "meanScore": 0.493573501893987,
      "maxScore": 0.5588708896430944,
      "minScore": 0.445147520245974,
      "accuracy": 1,
      "n": 3,
      "accurateOffsets": [2, 3, 7],
      "errorOffsets": []
    }
  ]
}
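
Several of the metrics are derived from one another. The Python sketch below checks those relationships against the numbers in the example response above. The relations for `error`, `baseError` and `informationGain` come from the field descriptions. The relations for `accuracyGain`, `rankGain`, `geomMeanLift` and the time units are assumptions read off the example values, not documented definitions. The function name `check_evaluate_metrics` is hypothetical.

```python
import math

# Values copied from the example response above.
result = {
    "error": 0.09090909090909094,
    "baseError": 0.7272727272727273,
    "accuracy": 0.9090909090909091,
    "baseAccuracy": 0.2727272727272727,
    "accuracyGain": 0.6363636363636364,
    "meanRank": 1.9090909090909092,
    "baseMeanRank": 4.454545454545454,
    "rankGain": 2.545454545454545,
    "informationGain": 2.1768738588975234,
    "mxe": 1.4084101290438387,
    "h": 3.585283987941362,
    "geomMeanP": 0.3767266164020222,
    "baseGmp": 0.08331476557218853,
    "geomMeanLift": 4.5217268969640845,
    "meanNs": 3298335,
    "meanUs": 3298.335,
    "meanMs": 3.298335,
}


def check_evaluate_metrics(r: dict, rel_tol: float = 1e-9) -> dict:
    """Return a mapping from each relation to whether it holds for r."""
    def close(a: float, b: float) -> bool:
        return math.isclose(a, b, rel_tol=rel_tol)

    return {
        # Documented in the field descriptions.
        "error = 1 - accuracy": close(r["error"], 1 - r["accuracy"]),
        "baseError = 1 - baseAccuracy": close(r["baseError"], 1 - r["baseAccuracy"]),
        "informationGain = h - mxe": close(r["informationGain"], r["h"] - r["mxe"]),
        # Assumptions inferred from the example values.
        "accuracyGain = accuracy - baseAccuracy": close(
            r["accuracyGain"], r["accuracy"] - r["baseAccuracy"]),
        "rankGain = baseMeanRank - meanRank": close(
            r["rankGain"], r["baseMeanRank"] - r["meanRank"]),
        "geomMeanLift = geomMeanP / baseGmp": close(
            r["geomMeanLift"], r["geomMeanP"] / r["baseGmp"]),
        "meanUs = meanNs / 1e3": close(r["meanUs"], r["meanNs"] / 1e3),
        "meanMs = meanNs / 1e6": close(r["meanMs"], r["meanNs"] / 1e6),
    }


for relation, holds in check_evaluate_metrics(result).items():
    print(f"{relation}: {holds}")
```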

Database API

Operations which manipulate the Aito database.

Get database schema

GET /api/v1/schema

Get the schema for the database.

Successful responses
Response | Type | Description
200 OK | object | The current active schema
Response format
{
// Database tables. Required.
"schema": {
// Any schema which is a valid Aito table schema.
"<yourTableName>": UserDefinedTableSchema
}
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.aito.app/api/v1/schema \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "schema": {
    "products": {
      "columns": {
        "description": {
          "analyzer": "english",
          "nullable": false,
          "type": "Text"
        },
        "id": { "nullable": false, "type": "Int" },
        "name": { "nullable": false, "type": "String" },
        "price": { "nullable": false, "type": "Decimal" }
      },
      "type": "table"
    }
  }
}

Create database schema

PUT /api/v1/schema

Create or update the schema for the entire database.

Note:

  • An existing table that is not included in the updated schema will not be deleted.
  • An existing table that is included in the updated schema will be updated if the table has no data.
  • The new table names must be valid. See the Valid Table Names section for more information.
Parameters
Name | Type | Description
body (required) | object | The Aito schema definition
Successful responses
Response | Type | Description
200 OK | object | The current active schema
Request format
{
// Database tables. Required.
"schema": {
// Any schema which is a valid Aito table schema.
"<yourTableName>": UserDefinedTableSchema
}
}
Response format
{
// Database tables. Required.
"schema": {
// Any schema which is a valid Aito table schema.
"<yourTableName>": UserDefinedTableSchema
}
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X PUT \
  https://your-env-name.aito.app/api/v1/schema \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "schema": {
      "products": {
        "type": "table",
        "columns": {
          "id": { "type": "Int" },
          "name": { "type": "String" },
          "price": { "type": "Decimal" },
          "description": { "type": "Text", "analyzer": "English" }
        }
      }
    }
  }'

Response

{
  "schema": {
    "products": {
      "columns": {
        "description": { "analyzer": "english", "type": "Text" },
        "id": { "type": "Int" },
        "name": { "type": "String" },
        "price": { "type": "Decimal" }
      },
      "type": "table"
    }
  }
}
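
The same request can be assembled in Python. The sketch below uses only the standard library and reads the instance URL and API key from the `AITO_INSTANCE_URL` and `AITO_API_KEY` environment variables recommended in the client configuration section. It only builds the request object and does not send it. The function name `build_schema_request` is hypothetical.

```python
import json
import os
import urllib.request
from typing import Mapping


def build_schema_request(
    schema: dict, env: Mapping[str, str] = os.environ
) -> urllib.request.Request:
    """Build (but do not send) a PUT /api/v1/schema request."""
    base_url = env["AITO_INSTANCE_URL"].rstrip("/")
    # The endpoint expects the tables wrapped in a top-level "schema" object.
    body = json.dumps({"schema": schema}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/schema",
        data=body,
        method="PUT",
        headers={
            "content-type": "application/json",
            "x-api-key": env["AITO_API_KEY"],
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. A read/write key is needed, since this is not a read query.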

Delete database

DELETE /api/v1/schema

Delete the entire database schema.

The operation deletes all data and contents of the database! The action is irreversible.

Successful responses
Response | Type | Description
200 OK | object | The summary of deletion
Response format
{
// Array of table names deleted. Required.
"deleted": [string, ...]
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X DELETE \
  https://your-env-name.aito.app/api/v1/schema \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "deleted": ["products"]
}

Get table schema

GET /api/v1/schema/{table}

Get the schema of the specified table.

Parameters
Name | Type | Description
table (required) | string | The name of the table
Successful responses
Response | Type | Description
200 OK | object | The current schema of the table
Response format
{
// Type of the database schema item.
"type": string,
// Table columns.
"columns": {
// Type of the column.
"<yourColumnName>": ColumnType
}
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.aito.app/api/v1/schema/products \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "columns": {
    "description": { "analyzer": "english", "nullable": false, "type": "Text" },
    "id": { "nullable": false, "type": "Int" },
    "name": { "nullable": false, "type": "String" },
    "price": { "nullable": false, "type": "Decimal" }
  },
  "type": "table"
}

Create table schema

PUT /api/v1/schema/{table}

Create or update the schema of the specified table.

Note:

  • The table schema cannot be updated if the table contains data.
  • The new table name must be valid. See the Valid Table Names section for more information.
Parameters
Name | Type | Description
table (required) | string | The name of the table
body (required) | object | The new schema of the table
Successful responses
Response | Type | Description
200 OK | object | The current schema of the table
Request format
{
// Type of the database schema item.
"type": string,
// Table columns.
"columns": {
// Type of the column.
"<yourColumnName>": ColumnType
}
}
Response format
{
// Type of the database schema item.
"type": string,
// Table columns.
"columns": {
// Type of the column.
"<yourColumnName>": ColumnType
}
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X PUT \
  https://your-env-name.aito.app/api/v1/schema/products \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "type": "table",
    "columns": {
      "id": { "type": "Int" },
      "name": { "type": "String" },
      "price": { "type": "Decimal" },
      "description": { "type": "Text", "analyzer": "English" }
    }
  }'

Response

{
  "columns": {
    "description": { "analyzer": "english", "type": "Text" },
    "id": { "type": "Int" },
    "name": { "type": "String" },
    "price": { "type": "Decimal" }
  },
  "type": "table"
}

Delete table

DELETE /api/v1/schema/{table}

Delete a single table in the schema.

The operation deletes all data and contents of the table! The action is irreversible.

Note: The delete operation will fail if it would leave the database schema in a broken state.

For example, given the following schema:

{
  "schema": {
    "users": {
      "type": "table",
      "columns": {
        "username": { "type": "String" }
      }
    },
    "sessions" : {
      "type": "table",
      "columns": {
         "id"     : { "type" : "String" },
         "user"   : { "type" : "String", "link": "users.username" }
      }
    }
  }
}

The users table cannot be deleted before changing the sessions table first so that sessions.user is not linked to the users table.
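
Before deleting, you can scan a fetched schema for links that would block the operation. The following Python sketch is a client-side pre-check only, and the server remains the authority on whether a delete succeeds. It assumes that links are always written as `"table.column"` strings, as in the example above. The function name `find_links_to` is hypothetical.

```python
from typing import List, Optional


def find_links_to(schema: dict, table: str, column: Optional[str] = None) -> List[str]:
    """Return "table.column" names of columns whose "link" points at `table`.

    `schema` is the object under the top-level "schema" key.
    If `column` is given, only links to that specific column are reported.
    """
    blockers = []
    for table_name, table_schema in schema.items():
        for column_name, column_schema in table_schema.get("columns", {}).items():
            link = column_schema.get("link")
            if not link:
                continue
            # Assumption: a link has the form "<table>.<column>".
            linked_table, _, linked_column = link.partition(".")
            if linked_table == table and column in (None, linked_column):
                blockers.append(f"{table_name}.{column_name}")
    return blockers


schema = {
    "users": {"type": "table", "columns": {"username": {"type": "String"}}},
    "sessions": {
        "type": "table",
        "columns": {
            "id": {"type": "String"},
            "user": {"type": "String", "link": "users.username"},
        },
    },
}
print(find_links_to(schema, "users"))  # lists sessions.user as a blocker
```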

Parameters
Name | Type | Description
table (required) | string | The name of the table to delete
Successful responses
Response | Type | Description
200 OK | object | The summary of deletion
Response format
{
// Array of table names deleted. Required.
"deleted": [string, ...]
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X DELETE \
  https://your-env-name.aito.app/api/v1/schema/products \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "deleted": ["products"]
}

Get column schema

GET /api/v1/schema/{table}/{column}

Get the schema of a column.

Parameters
Name | Type | Description
table (required) | string | The name of the table
column (required) | string | The name of the column
Successful responses
Response | Type | Description
200 OK | object | The current schema of the column
Response format

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.aito.app/api/v1/schema/products/name \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "nullable": false,
  "type": "String"
}

Add or replace column

PUT /api/v1/schema/{table}/{column}

Add or replace a column of a table.

If a column with the same name already exists, the operation deletes all data and contents of that column! The action is irreversible.

Parameters
Name | Type | Description
table (required) | string | The name of the table
column (required) | string | The name of the column
body (required) | object | The schema of the column
Successful responses
Response | Type | Description
200 OK | object | The schema of the column
Request format
{
// Value that existing rows get.
"value": integer or number or boolean or null or string
}
Response format

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X PUT \
  https://your-env-name.aito.app/api/v1/schema/products/quantity \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "type": "Int",
    "nullable": false,
    "value": 0
  }'

Response

{
  "nullable": false,
  "type": "Int",
  "value": 0
}

Delete column

DELETE /api/v1/schema/{table}/{column}

Delete a column from a table.

The operation deletes all data and contents of the column! The action is irreversible.

Note: The delete operation will fail if it would leave the database schema in a broken state.

For example, given the following schema:

{
  "schema": {
    "users": {
      "type": "table",
      "columns": {
        "username": { "type": "String" },
        "name": { "type": "String" }
      }
    },
    "sessions" : {
      "type": "table",
      "columns": {
         "id"     : { "type" : "String" },
         "user"   : { "type" : "String", "link": "users.username" }
      }
    }
  }
}

The column username of the users table cannot be deleted before changing the sessions table first so that sessions.user is not linked to users.username.

Parameters
Name | Type | Description
table (required) | string | The name of the table
column (required) | string | The name of the column to delete
Successful responses
Response | Type | Description
200 OK | object | The summary of deletion
Response format
{
// Array of column names deleted. Required.
"deleted": [string, ...]
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X DELETE \
  https://your-env-name.aito.app/api/v1/schema/products/description \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {}'

Response

{
  "deleted": ["description"]
}

Rename a table

POST /api/v1/schema/_rename

Rename a table to the specified name.

Rename the table named in the 'from' field to the name given in the 'rename' field. Set 'replace' to true if you want to replace an existing table that already has the new name.

The new table name must be valid. See the Valid Table Names section for more information.

Parameters
Name | Type | Description
body (required) | object | The request body
Successful responses
Response | Type | Description
200 OK | object | Rename Table results
Request format
{
// The table to rename. Required.
"from": FromTablemodify,
// The name of the renamed table. Required.
"rename": FromTablemodify,
// If replace is true, operation will overwrite any existing table. Default: false
"replace": boolean
}
Response format
{}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/schema/_rename \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "from": "products",
    "rename": "renamed_products"
  }'

Response

{}

Copy a table

POST /api/v1/schema/_copy

Copy a table. This operation creates a copy of the table with the given name. The operation can be very fast, because the copying is done by copying the reference to the underlying immutable data structure.

The 'from' field must contain the name of the table to copy. The 'copy' field must contain the name of the new copy. Set the 'replace' field to true if you want to replace any existing table with the target name.

The new table name must be valid. See the Valid Table Names section for more information.

Parameters
Name | Type | Description
body (required) | object | The request body
Successful responses
Response | Type | Description
200 OK | object | Copy Table results
Request format
{
// The existing table to copy. Required.
"from": FromTablemodify,
// The target name of the new copy. Required.
"copy": FromTablemodify,
// If replace is true, operation will overwrite any existing table. Default: false
"replace": boolean
}
Response format
{}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/schema/_copy \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "from": "products",
    "copy": "old_products"
  }'

Response

{}

Insert entry

POST /api/v1/data/{table}

Insert a single entry into a table.

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
body (required) | object | Any object which is valid according to the provisioned schema
Successful responses
Response | Type | Description
200 OK | object | The inserted entry
Request format
Response format

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/data/products \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "id": 1,
    "name": "Apple iPhone 8 64 Gt, spacegray",
    "price": 648.9,
    "description": "A11 processor and wireless charging."
  }'

Response

{
  "description": "A11 processor and wireless charging.",
  "id": 1,
  "name": "Apple iPhone 8 64 Gt, spacegray",
  "price": 648.9
}

Insert multiple entries

POST /api/v1/data/{table}/batch

Import multiple entries into the database.

The batch import can be used to upload multiple entries to a single table. The payload needs to be a valid JSON array (instead of ndjson).

Note: batch API supports max 10MB payloads.

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
body (required) | array | An array of objects which are valid according to the provisioned schema
Successful responses
Response | Type | Description
200 OK | object | Summary of the inserted entries
Request format
Response format
{
// How many entries were inserted. Required.
"entries": integer,
// Status text. Required.
"status": string
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/data/products/batch \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  [
    {
      "id": 1,
      "name": "Apple iPhone 8 64 Gt, spacegray",
      "price": 648.9,
      "description": "A11 processor and wireless charging."
    },
    {
      "id": 2,
      "name": "Apple iPhone X 32 GB, space gray",
      "price": 1048.9,
      "description": "All‑screen design. Longest battery life ever in an iPhone."
    },
    {
      "id": 3,
      "name": "Samsung Galaxy S9",
      "price": 698.2,
      "description": "The Camera. Reimagined."
    }
  ]'

Response

{
  "entries": 3,
  "status": "ok"
}
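
Larger data sets can be split into several batch requests that each stay under the payload limit. The following Python sketch shows one way to do that. It is an illustration under assumptions: the 10MB limit is taken as 10 × 1024 × 1024 bytes, and `HEADER_MARGIN_BYTES` is a hypothetical safety margin, since the limit also covers headers. The size estimate only holds if each request body is serialized with the same compact `encode` helper.

```python
import json
from typing import Iterable, Iterator, List

MAX_PAYLOAD_BYTES = 10 * 1024 * 1024  # assumption: 10MB means 10 MiB
HEADER_MARGIN_BYTES = 64 * 1024       # hypothetical margin for request headers


def encode(obj) -> bytes:
    """Serialize compactly so that size estimates match the request body."""
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


def split_into_batches(
    entries: Iterable[dict],
    max_bytes: int = MAX_PAYLOAD_BYTES - HEADER_MARGIN_BYTES,
) -> Iterator[List[dict]]:
    """Yield lists of entries whose encoded JSON array fits in max_bytes."""
    batch: List[dict] = []
    size = 2  # the surrounding "[" and "]"
    for entry in entries:
        entry_size = len(encode(entry)) + 1  # +1 for the separating comma
        if entry_size + 2 > max_bytes:
            raise ValueError("a single entry exceeds the payload limit")
        if batch and size + entry_size > max_bytes:
            yield batch
            batch, size = [], 2
        batch.append(entry)
        size += entry_size
    if batch:
        yield batch
```

Each yielded batch would then be sent as the body of one `POST /api/v1/data/{table}/batch` request, encoded with `encode(batch)`. For data sets that are much larger than the limit, the file upload API is likely the better fit.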

Delete entries

POST /api/v1/data/_delete

Delete entries with a Search-like interface.

You can describe the target table and filters for which entries to delete.

An empty proposition will match and delete everything!

Parameters
Name | Type | Description
body (required) | object | The delete query: the target table and a proposition that selects the entries to delete
Successful responses
Response | Type | Description
200 OK | object | Delete results
Request format
{
// The modified table. Required.
"from": FromTablemodify,
// The entries to delete. Required.
"where": Proposition
}
Response format
{
// The number of rows that were deleted. Required.
"total": integer
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/data/_delete \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "from": "products",
    "where": { "id": 1 }
  }'

Response

{
  "total": 1
}
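
Because an empty proposition deletes everything, client code may want to guard against sending one by accident. The following Python sketch builds the request body and refuses an empty `where` unless explicitly allowed. It assumes that an empty JSON object is what counts as an empty proposition. The function name `build_delete_query` and the `allow_delete_all` flag are hypothetical and not part of the API.

```python
def build_delete_query(table: str, where: dict, allow_delete_all: bool = False) -> dict:
    """Build the body for POST /api/v1/data/_delete, guarding empty filters."""
    # Assumption: an empty object is the "empty proposition" that matches all rows.
    if not where and not allow_delete_all:
        raise ValueError(
            f"refusing to delete every entry in {table!r}; "
            "pass allow_delete_all=True to confirm"
        )
    return {"from": table, "where": where}


print(build_delete_query("products", {"id": 1}))
```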

Initiate file upload

POST /api/v1/data/{table}/file

Initiate a file upload session.

The file API circumvents the batch upload API payload size limit and allows uploading large data sets. It accepts data as a file in gzip-compressed ndjson format.

The file must be gzip-compressed ndjson; normal JSON arrays are not accepted.

The data file is uploaded to AWS S3 and processed asynchronously. The file must be compressed with gzip before uploading, which reduces the size of the transferred data.

The file API is not a single endpoint: it requires a minimum of three calls (per table). The sequence is as follows:

  1. Initiate the file upload process
  2. Upload compressed ndjson file to S3, using the signed URL
  3. Trigger file processing
  4. (Optional) Poll the file processing status

You can find a bash implementation of the flow in our tools repository. See the upload-file.sh script.
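
The four steps above can also be sketched in Python. This is an illustration under assumptions, not the official implementation. The HTTP layer is passed in as a hypothetical `http(method, url, headers, body)` callable that returns parsed JSON, so the sequence itself stays independent of any particular HTTP library. The function names are likewise hypothetical. The S3 upload is assumed to need no `x-api-key` header because the URL is presigned.

```python
import gzip
import json
from typing import Callable, Iterable, Optional

# Hypothetical transport: (method, url, headers, body) -> parsed JSON dict.
Http = Callable[[str, str, dict, Optional[bytes]], dict]


def to_gzip_ndjson(rows: Iterable[dict]) -> bytes:
    """Serialize rows as newline-delimited JSON and gzip-compress the result."""
    ndjson = "".join(json.dumps(row) + "\n" for row in rows)
    return gzip.compress(ndjson.encode("utf-8"))


def upload_file(
    instance_url: str, api_key: str, table: str, rows: Iterable[dict], http: Http
) -> dict:
    """Run the file upload sequence once and return the first status response."""
    auth = {"x-api-key": api_key}
    file_url = f"{instance_url.rstrip('/')}/api/v1/data/{table}/file"

    # 1. Initiate the upload session; the response carries the signed S3 URL.
    session = http("POST", file_url, auth, None)

    # 2. Upload the compressed ndjson to S3 with the method and URL from step 1.
    http(session["method"], session["url"], {}, to_gzip_ndjson(rows))

    # 3. Trigger processing of the uploaded file.
    http("POST", f"{file_url}/{session['id']}", auth, None)

    # 4. (Optional) Fetch the processing status once; real code would poll.
    return http("GET", f"{file_url}/{session['id']}", auth, None)
```

The upload in step 2 has to complete before the time given in the `expires` field of the step 1 response, since the presigned link stops working after that.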

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
Successful responses
Response | Type | Description
200 OK | object | The details to execute the S3 upload and the job's id
Response format
{
// The uuid of the file upload session. Required.
"id": string,
// The presigned S3 url where to push the data. Required.
"url": string,
// The http method used for uploading to S3. Required.
"method": string,
// Defines when the presigned upload link expires. Required.
"expires": string
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/data/products/file \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "expires": "2020-09-17T11:19:41",
  "id": "e707e89a-ff38-4bb9-a029-4973516c3986",
  "method": "PUT",
  "url": "https://aitoai-dev-customer-uploads.s3.eu-west-1.amazonaws.com/localhost/products/e707e89a-ff38-4bb9-a029-4973516c3986?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20200917T105941Z&X-Amz-SignedHeaders=host&X-Amz-Expires=1199&X-Amz-Credential=AKIAJ4K3DQXQTXQLFFEQ%2F20200917%2Feu-west-1%2Fs3%2Faws4_request&X-Amz-Signature=285a7e4a8355175152ce1e7e0233b303f720fcd22e9d92db36277c44f7cad034"
}

Trigger file processing

POST /api/v1/data/{table}/file/{uuid}

Start the processing of a previously uploaded file.

Note: This operation is part of the file upload sequence. For the full file upload flow, see the Initiate file upload documentation.

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
uuid (required) | string | The assigned id of the operation
Successful responses
Response | Type | Description
200 OK | object | Processing started status
Response format
{
// The id of the operation. Required.
"id": string,
// Textual description of the job. Required.
"status": string
}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/data/products/file/9f22c8fc-d033-4010-90ef-3d5a87025221 \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

{
  "id": "9f22c8fc-d033-4010-90ef-3d5a87025221",
  "status": "started"
}

Get file processing status

GET /api/v1/data/{table}/file/{uuid}

Get the file upload progress.

The response is probabilistic and might not contain the very last result, since the status update is asynchronous, and the upload happens in multiple parallel streams. The response, however, will give an idea of approximate progress.

Note: This operation is part of the file upload sequence. If you want to read how to execute a full file upload flow, see Initiate file upload documentation.

Parameters
Name | Type | Description
table (required) | string | The name of the table to add data to
uuid (required) | string | The assigned id of the operation
Successful responses
Response | Type | Description
200 OK | object | The file processing status
Response format
{
"status": {
// Total duration of the file processing elapsed. Required.
"totalDurationMs": number,
// Total duration of the file processing elapsed, in human-readable units. Required.
"totalDuration": string,
// Throughput of lines in human readable units.
"throughput": string,
// When the file processing was started. Required.
"startedAt": string,
// When the file processing was finished. Required.
"finishedAt": string,
// Is the job finished or not. Required.
"finished": boolean,
// The number of lines completed so far. Required.
"completedCount": integer,
// Any object which is valid according to the database schema. Required.
"lastSuccessfulElement": UserDefinedObject
},
"errors": {
// Human consumable description. Required.
"message": string,
// Array of failing elements.
"rows": [UserDefinedObject, ...]
}
}

Request during processing

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.aito.app/api/v1/data/products/file/b348a62a-11fd-4e2e-b6eb-f9763f646b4f \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

The example shows what the response looks like while data processing is still in progress.

{
  "errors": { "message": "Last 0 failing rows", "rows": null },
  "status": {
    "phase": "AitoDatabaseInsert",
    "finished": false,
    "completedCount": 0,
    "startedAt": "20200917T105942.756Z",
    "throughput": "0/s"
  }
}

Request after processing

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X GET \
  https://your-env-name.aito.app/api/v1/data/products/file/ce82536e-de93-4453-ab3d-fb4abe9274f9 \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'

Response

The example shows what the response looks like after data processing has completed successfully.

{
  "errors": { "message": "Last 0 failing rows", "rows": null },
  "status": {
    "totalDurationMs": 0,
    "finished": true,
    "completedCount": 3,
    "lastSuccessfulElement": {
      "description": "The Camera. Reimagined.",
      "id": 3,
      "name": "Samsung Galaxy S9",
      "price": 698.2
    },
    "totalDuration": "0 milliseconds",
    "startedAt": "20200917T105945.226Z",
    "finishedAt": "20200917T105945.226Z",
    "throughput": "0/s"
  }
}
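
The optional polling step could be sketched in Python as follows. This is a minimal sketch, not an official client: `wait_until_finished` and `failed_rows` are hypothetical helpers, and `fetch_status` stands for any function of your own that performs the GET request and returns the parsed JSON body.

```python
import time


def wait_until_finished(fetch_status, poll_interval_s=1.0, max_polls=60, sleep=time.sleep):
    """Poll until status.finished is true, then return the last response.

    fetch_status is any zero-argument callable returning the parsed JSON
    body of GET /api/v1/data/{table}/file/{uuid}.
    """
    for _ in range(max_polls):
        response = fetch_status()
        if response["status"]["finished"]:
            return response
        sleep(poll_interval_s)
    raise TimeoutError(f"file processing not finished after {max_polls} polls")


def failed_rows(response):
    """Return the failing rows, treating a null "rows" value as an empty list."""
    return response["errors"]["rows"] or []
```

Passing the fetcher and the sleep function as arguments keeps the loop easy to exercise with canned responses. The interval and the poll limit are arbitrary defaults rather than recommended values.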

Optimize the database

POST /api/v1/data/{table}/optimize

Optimize the database for query performance.

The Aito.ai database is implemented as a log-structured merge-tree. Because of this architecture, Aito's tables are implemented internally as a tree of table segments.

The complexity of the table tree has major implications for both query speed and write speed. The fewer segments Aito maintains in the tree, the faster the queries are, but the slower the writes are, because Aito needs to rewrite parts of the tree regularly. Similarly, the more segments are allowed, the slower the queries are, but the faster the writes become.

Aito seeks to maintain approximately O(log N) segments in the table tree in order to keep a reasonable compromise between query and write speeds.

Still, there can be situations where it is beneficial to rewrite the entire database as a single segment to get the optimal query speed. The optimize operation does this.

It may take minutes or hours to optimize a big table. This means that optimize should be used to improve query performance only in situations where the database and the results need to be updated rarely, for example nightly.

Optimize holds a write lock on the database for the entire operation. This means that you cannot add data while the optimize operation is running. Queries will still work normally. After the optimize is finished, the optimized table needs to be reloaded, which can induce significant latency for the following query.

Parameters
Name | Type | Description
body (required) | object | An empty object
Successful responses
Response | Type | Description
200 OK | object | An empty object
Request format
{}
Response format
{}

Example request

This request sample is not directly copy-pasteable. Your own Aito environment is required.

curl -X POST \
  https://your-env-name.aito.app/api/v1/data/products/optimize \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {}'

Response

{}

Database Schema

The Aito database requires a schema to operate. The schema defines:

  • The name of the tables
  • The name and the ColumnType of the columns in each table
  • The Analyzer of a column if needed
  • The relationships between tables

Please refer to the Defining a database schema guide for more details.

UserDefinedTableSchema

Any schema which is a valid Aito table schema.

A table schema describes the structure of the table in a formal language. The schema describes all fields (or columns), the data types of the fields, and information that helps Aito preprocess your data, for example the language of a text field.

The contents of the schema depend on the data that will be inserted into the database.

Format
{
// Type of the database schema item.
"type": string,
// Table columns.
"columns": {
// Type of the column.
"<yourColumnName>": ColumnType
}
}

Example

{
  "type": "table",
  "columns": {
    "id": { "type": "Int", "nullable": false },
    "name": { "type": "String", "nullable": false },
    "price": { "type": "Decimal", "nullable": false },
    "description": { "type": "Text", "nullable": false, "analyzer": "English" }
  }
}
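
To make the role of the schema concrete, the sketch below checks a row against a table schema on the client side before uploading. It only illustrates the idea and does not reproduce Aito's own validation: the `check_row` helper, the mapping from column types to Python types, and the case-insensitive handling of type names are all assumptions. The sketch treats "nullable" as true when omitted, as stated in the column type formats.

```python
# Assumed mapping from Aito column types to Python types (illustrative only).
PYTHON_TYPES = {
    "int": (int,),
    "decimal": (int, float),
    "string": (str,),
    "text": (str,),
    "boolean": (bool,),
}


def check_row(schema, row):
    """Return a list of problems found when comparing one row to a table schema."""
    problems = []
    for name, column in schema["columns"].items():
        value = row.get(name)
        if value is None:
            if not column.get("nullable", True):
                problems.append(f"{name}: null not allowed")
            continue
        expected = PYTHON_TYPES[column["type"].lower()]
        # bool is a subclass of int in Python, so reject it for numeric columns.
        is_stray_bool = isinstance(value, bool) and bool not in expected
        if is_stray_bool or not isinstance(value, expected):
            problems.append(f"{name}: expected {column['type']}, got {type(value).__name__}")
    return problems
```

Calling `check_row` with the example schema above and a complete product row returns an empty list, while a row without a "name" value reports that null is not allowed for that column.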

ColumnType

Type of the column.

Describes an individual field (or column), its type, and information that helps Aito preprocess your data, for example the language of a text field.

Format

Examples

{ "type": "int", "nullable": false }
{ "type": "string", "nullable": false }
{ "type": "decimal", "nullable": false }
{ "type": "text", "nullable": false, "analyzer": "english" }

BooleanType

Boolean column type.

When column is a boolean, the only accepted values are true and false.

Format
{
// Type of the column. Required.
"type": string,
// When true, `null` values are allowed. Default: true
"nullable": boolean,
// Path to a column of a linked row. Default: null
"link": string
}

Example

{ "type": "boolean" }

DecimalType

Double-precision floating-point number.

Format
{
// Type of the column. Required.
"type": string,
// When true, `null` values are allowed. Default: true
"nullable": boolean,
// Path to a column of a linked row. Default: null
"link": string
}

Example

{ "type": "Decimal", "nullable": false }

IntType

Integer column type.

Format
{
// Type of the column. Required.
"type": string,
// When true, `null` values are allowed. Default: true
"nullable": boolean,
// Path to a column of a linked row. Default: null
"link": string
}

Examples

{ "type": "Int" }
{ "type": "Int", "link": "users.id" }

StringType

String column type.

The string data type is a primitive version of the Text type. The value is turned into a single feature. For example "lazy black cat" becomes 1 feature: "lazy black cat".

Format
{
// Type of the column. Required.
"type": string,
// When true, `null` values are allowed. Default: true
"nullable": boolean,
// Path to a column of a linked row. Default: null
"link": string
}

Examples

{ "type": "String", "nullable": false }
{ "type": "String", "link": "messages.id" }

TextType

Text column type.

The text data type enables smart textual analysis of strings. A text column has an analyzer which defines how the text can be split into words or tokens, which are used as features during inference.

Format
{
// Type of the column. Required.
"type": string,
// Aito analyzers break the Text type data into features that can be used for inference.
"analyzer": Analyzer,
// When true, `null` values are allowed. Default: true
"nullable": boolean,
// Path to a column of a linked row. Default: null
"link": string
}

Example

{ "type": "Text", "analyzer": "English", "nullable": false }

Analyzer

Aito analyzers break the Text type data into features that can be used for inference.
Let's take a look at an example of predicting the tags of a product from its description, using the following data:

description | tags
Brazilian organic orange | organic, fruit, imported
Local organic spinach | organic, vegetable, local
Lentil snack | snack
  • Given a description of "organic tomatoes", we would like to predict the tags of this product.
  • If no analyzer is defined, the description is treated as a String type and the description "organic tomatoes" is turned into only 1 feature, "organic tomatoes". Since no entry in the given data contains the description "organic tomatoes", Aito is not able to provide any meaningful prediction for the tags.
  • Using the default English analyzer, "organic tomatoes" is turned into 2 features: "organ" (the English stem of the word "organic") and "tomato". "Brazilian organic orange" is turned into 3 features: "brazilian", "organ", and "orang". The other descriptions are turned into features in a similar fashion.
  • Aito can now find patterns between these features. For example, when the description has the feature "organ", the tag is likely "organic". Hence, using the analyzer, Aito can return a reasonable prediction for an unseen entry.
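
The difference between the two treatments can be illustrated with a few lines of Python. This is only a sketch: it uses a lower-cased whitespace split as a stand-in for the English analyzer (so there is no stemming), and the helper names are hypothetical.

```python
def string_features(text):
    """String type: the whole value is a single feature."""
    return {text}


def word_features(text):
    """Stand-in for an analyzer: one feature per lower-cased word."""
    return set(text.lower().split())


descriptions = ["Brazilian organic orange", "Local organic spinach", "Lentil snack"]
query = "organic tomatoes"

# Rows that share at least one feature with the query under each treatment.
string_overlap = [d for d in descriptions if string_features(d) & string_features(query)]
word_overlap = [d for d in descriptions if word_features(d) & word_features(query)]
```

Here `string_overlap` is empty, while `word_overlap` contains the two organic products, which is the evidence a prediction can then build on.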
Format

Examples

"standard"
"whitespace"
"english"
"en"
{ "type": "delimiter", "delimiter": "," }
{ "type": "language", "language": "en" }
{ "type": "char-ngram", "minGram": 2, "maxGram": 3 }
{ "type": "token-ngram", "source": "english", "minGram": 1, "maxGram": 2 }

AliasAnalyzer

Aito has several built-in analyzers and they are selected by using their name in the "analyzer" field of a text column. For instance:

{ "analyzer": "english" }

The built-in analyzers include:

  • Standard Analyzer:
    • Name: "standard"
    • A good default analyzer that works well in most languages. The analyzer generates features based on the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. The standard analyzer filters out English stop words, which are normally not useful.
    • E.g. "the cats are running" will be broken down into "cats" and "running".
  • Whitespace Analyzer:
    • Name: "whitespace"
    • The analyzer breaks the text into features whenever it encounters a whitespace character. Adjacent sequences of non-whitespace characters form tokens.
    • E.g. "the cats are running" will be broken down into "the", "cats", "are", and "running".
  • Language Analyzer:
    • Alias: the language name or the language's ISO 639-1 code (with some special cases)
    • A Language Analyzer with the default settings (no stop words or keywords).
    • See Language Analyzer for the supported languages and their aliases.
Format
string

Examples

"standard"
"whitespace"
"english"
"en"

CharNGramAnalyzer

The Character N-gram Analyzer breaks text into n-gram features.

For example, the following n-gram analyzer:

{ "type": "char-ngram", "minGram": 3, "maxGram": 3 }

would break the text "the cats are running" into the following list of features:

["the", "he ", "e c", " ca", "cat", "ats", "ts ", "s a", " ar", "are", "re ", "e r", " ru", "run", "unn", "nni", "nin", "ing"]

The analyzer can be useful for languages that don’t use spaces or that have long compound words, like German.
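
A plain sliding window reproduces the feature list above. The function below is a sketch of that idea rather than Aito's implementation. The name `char_ngrams` is hypothetical, and the order in which features of different lengths are emitted when minGram and maxGram differ is an assumption.

```python
def char_ngrams(text, min_gram, max_gram):
    """Emit every substring whose length is between min_gram and max_gram."""
    features = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(text):
                features.append(text[start:start + size])
    return features


features = char_ngrams("the cats are running", 3, 3)
```

For the 20-character example text this produces 18 features of length 3, starting with "the" and ending with "ing".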

Format
{
// Type of the analyzer. Required.
"type": string,
// The minimum length of characters in a feature. Required.
"minGram": integer,
// The maximum length of characters in a feature. Required.
"maxGram": integer
}

Example

{ "type": "char-ngram", "minGram": 2, "maxGram": 3 }

DelimiterAnalyzer

The Delimiter Analyzer breaks text into features whenever it encounters the specified delimiter character.

With the trimWhitespace option, the analyzer trims the whitespace surrounding a feature.

For example, the following analyzer:

{
  "type": "delimiter",
  "delimiter": ",",
  "trimWhitespace": true
}

would break the text "the, cats,are, running" into 4 features:

["the", "cats", "are", "running"]
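
The same behaviour can be sketched in Python with a split followed by an optional strip. The function name `delimiter_features` is hypothetical, and keeping empty features (for example from two consecutive delimiters) is an assumption, since the text above does not say how they are handled.

```python
def delimiter_features(text, delimiter, trim_whitespace=True):
    """Split on the delimiter and optionally trim whitespace around each part."""
    parts = text.split(delimiter)
    if trim_whitespace:
        parts = [part.strip() for part in parts]
    return parts


trimmed = delimiter_features("the, cats,are, running", ",")
untrimmed = delimiter_features("the, cats,are, running", ",", trim_whitespace=False)
```

With trimming disabled, the leading spaces stay attached, so " cats" and "cats" would count as different features.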
Format
{
// Type of the analyzer. Required.
"type": string,
// The delimiter. Required.
"delimiter": string,
// Trims leading and trailing whitespace of the features. Default: true
"trimWhitespace": boolean
}

Examples

{ "type": "delimiter", "delimiter": "," }
{ "type": "delimiter", "delimiter": "\n", "trimWhitespace": true }

LanguageAnalyzer

Language Analyzers aim to analyze text of a specific language.

When using a language analyzer, text is analyzed into lower-case word stem features. For example, using the following English analyzer:

{ "type": "language", "language": "english" }

the text "the cats are running" will be broken into 4 word stem features:

["the", "cat", "ar", "run"]

The value of the "language" parameter specifies which language will be used. The value can be the name or the ISO 639-1 code of the language. The full list is shown below:

Language | Name | ISO code
Arabic | arabic | ar
Armenian | armenian | hy
Basque | basque | eu
Brazilian Portuguese | brazilian | pt-br
Bulgarian | bulgarian | bg
Catalan | catalan | ca
Chinese, Japanese, Korean | cjk | cjk
Czech | czech | cs
Danish | danish | da
Dutch | dutch | nl
English | english | en
Finnish | finnish | fi
French | french | fr
Galician | galician | gl
German | german | de
Greek | greek | el
Hindi | hindi | hi
Hungarian | hungarian | hu
Indonesian | indonesian | id
Irish | irish | ga
Italian | italian | it
Latvian | latvian | lv
Norwegian | norwegian | no
Persian | persian | fa
Portuguese | portuguese | pt
Romanian | romanian | ro
Russian | russian | ru
Spanish | spanish | es
Swedish | swedish | sv
Thai | thai | th
Turkish | turkish | tr

The language analyzers support filtering stop words (common words that are normally not useful). Each language has a list of default stop words, and filtering by that list can be enabled through the "useDefaultStopWords" parameter. Some common English stop words are:

  "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", 
  "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", 
  "such", "that", "the", "their", "then", "there", "these", 
  "they", "this", "to", "was", "will", "with"

By default, "useDefaultStopWords" is set to false. The following analyzer:

{
  "type": "language",
  "language": "english",
  "useDefaultStopWords": true
}

would break the text "the cats are running" into 2 features:

["cat", "run"]

It is also possible to specify a set of words that would be filtered through the "customStopWords" parameter and a set of words that would not be analyzed through the "customKeyWords" parameter. The following analyzer:

{
  "type": "language",
  "language": "english",
  "useDefaultStopWords": false,
  "customStopWords": ["cats"],
  "customKeyWords": ["running"]
}

would break the text "the cats are running" into 3 features:

["the", "ar", "running"]
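
The three examples above can be reproduced with a toy pipeline: lower-case the text, drop stop words, keep keywords as they are, and stem everything else. The sketch below is not Aito's implementation. The stem table covers only the example sentence, the stop word set is a subset of the list above, and the order in which stop words and keywords are checked is an assumption.

```python
# Toy stem table covering only the example sentence; a real analyzer would
# apply a proper stemming algorithm for the chosen language.
TOY_STEMS = {"cats": "cat", "are": "ar", "running": "run"}
# Subset of the English stop words listed above.
DEFAULT_STOP_WORDS = {"a", "an", "and", "are", "is", "the", "to", "with"}


def language_features(text, use_default_stop_words=False,
                      custom_stop_words=(), custom_key_words=()):
    """Turn text into word stem features, honouring stop words and keywords."""
    stop_words = set(custom_stop_words)
    if use_default_stop_words:
        stop_words |= DEFAULT_STOP_WORDS
    features = []
    for word in text.lower().split():
        if word in stop_words:
            continue  # filtered out entirely
        if word in custom_key_words:
            features.append(word)  # keywords are kept unstemmed
        else:
            features.append(TOY_STEMS.get(word, word))
    return features


plain = language_features("the cats are running")
filtered = language_features("the cats are running", use_default_stop_words=True)
custom = language_features("the cats are running",
                           custom_stop_words=["cats"], custom_key_words=["running"])
```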
Format
{
// Type of the analyzer. Required.
"type": string,
// Name or code of the language. Required.
"language": string,
// Use the language default stopwords. Default: false
"useDefaultStopWords": boolean,
// List of words that will be filtered. // Default:
"customStopWords": [string, ...],
// List of words that will not be featurized. // Default:
"customKeyWords": [string, ...]
}

Examples

{ "type": "language", "language": "en" }
{
  "type": "language",
  "language": "english",
  "useDefaultStopWords": true,
  "customStopWords": ["flower"],
  "customKeyWords": ["animal"]
}

TokenNGramAnalyzer

The Token N-gram Analyzer breaks text into token n-grams (shingles) based on a source analyzer. In other words, it combines the features of the source analyzer into new features.

For example, the following Token N-gram Analyzer:

{
  "type": "token-ngram",
  "source": "english",
  "minGram": 1,
  "maxGram": 2,
  "tokenSeparator": "_"
}

would break the text "the cats are running" into the following list of features:

["the", "the_cat", "cat", "cat_ar", "ar", "ar_run", "run"]
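
Taking the source analyzer's output as given (the stemmed tokens "the", "cat", "ar", "run"), the combination step can be sketched as a sliding window over tokens. The function name `token_ngrams` is hypothetical, and the emission order is inferred from the feature list above rather than from Aito's implementation.

```python
def token_ngrams(tokens, min_gram, max_gram, token_separator=" "):
    """Join every run of min_gram..max_gram consecutive tokens into one feature."""
    features = []
    for start in range(len(tokens)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(tokens):
                features.append(token_separator.join(tokens[start:start + size]))
    return features


shingles = token_ngrams(["the", "cat", "ar", "run"], 1, 2, token_separator="_")
```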
Format
{
// Type of the analyzer. Required.
"type": string,
// Source analyzer to generate features before being combined into n-grams. Required.
"source": Analyzer,
// The minimum number of features to be combined. Required.
"minGram": integer,
// The maximum number of features to be combined. Required.
"maxGram": integer,
// The string used to join the features of the source analyzer. Default: " "
"tokenSeparator": string
}

Examples

{ "type": "token-ngram", "source": "english", "minGram": 1, "maxGram": 2 }
{
  "type": "token-ngram",
  "source": { "type": "delimiter", "delimiter": "," },
  "minGram": 1,
  "maxGram": 3,
  "tokenSeparator": "_"
}

Query language

The reference documentation for Aito query language.

Common concepts

Features

To analyze the data better, Aito splits fields into features under the hood. How the featurization is done depends on the field type. For example, the Text type supports an "analyzer" option which allows you to control how a text field is split into features.

Some queries, for example Relate, return the features instead of the actual values of the field.

Exclusiveness

Exclusiveness is an option in predictions. In summary, it describes whether the predicted field can have multiple values at the same time or not.

Understanding the concept is easiest through an example. If we were predicting tags for a product, we would want to set "exclusiveness": false, because a product can have multiple tags. A product could be described with the following tags:

[Figure: Venn diagram of tags]

However, if we were predicting the user who would most likely purchase a product, we would want to use "exclusiveness": true (the default behavior), because the value can only be one user at a time.

[Figure: Venn diagram of users]

$p vs $lift

If we were trying to find the customer who is best characterized by a message, we'd need to understand the difference between $p and $lift. To make the difference clear, consider the following situation:

  • Alice messages often, but she doesn't mention iPhone often
  • Bob messages rarely, but only about iPhones

Querying users by $p quite likely finds Alice, because overall she may be the more likely person to mention "iPhone". Querying users by $lift, on the other hand, will almost certainly find Bob, because $lift describes how characteristic the feature "iPhone" is for the user.

A more mathematical and technical description for the phenomenon is the following:

Aito uses Bayesian probability inference to estimate p(X|context) so that p(X|context) = p(X) * lift(X|context), where the probability lift component is lift(X|context) = p(context|X) / p(context).

The probability lift component describes how much more likely X is to be true in the specified context, compared to the average.

In Aito query syntax, $p stands for p(X|context), while $lift stands for the lift(X|context) component.
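
A small numeric illustration of the Alice and Bob situation, using plain frequency counts. The message counts are made up for the example, and raw frequencies stand in for Aito's Bayesian estimates, so this only shows how the two quantities relate.

```python
# Hypothetical counts: Alice writes a lot, Bob writes only about iPhones.
message_counts = {"alice": 1000, "bob": 5}   # all messages per user
iphone_counts = {"alice": 10, "bob": 5}      # messages mentioning "iPhone"

total_messages = sum(message_counts.values())
p_context = sum(iphone_counts.values()) / total_messages  # p(context)


def prior(user):
    """p(X): how often the user is the author of any message."""
    return message_counts[user] / total_messages


def lift(user):
    """lift(X|context) = p(context|X) / p(context)."""
    return (iphone_counts[user] / message_counts[user]) / p_context


def posterior(user):
    """p(X|context) = p(X) * lift(X|context)."""
    return prior(user) * lift(user)
```

With these counts, Alice has the higher probability (about 0.67 against 0.33 for Bob), while Bob has by far the higher lift (67 against 0.67 for Alice). This matches the difference between ordering by $p and ordering by $lift.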

Text operators

Useful for creating conditional queries with text fields.

$match

Operator to check if a textual field fuzzy matches a given string.

Case insensitive. The matched text is split into tokens with the analyzer specified for the field in the schema. For example, { "$match": "great programmers" } will match the strings "Bob is the greatest programmer!" and "Programmers are having great fun" if the field is properly analyzed with the English analyzer.

Format
{
"$match": string
}

Examples

{ "$match": "coffee" }
{
  "from": "products",
  "where": {
    "name": { "$match": "coffee" }
  }
}

$startsWith

Operator to check if a textual field starts with a given string. Case sensitive.

Format
{
"$startsWith": string
}

Examples

{ "$startsWith": "Cucumber" }
{
  "from": "products",
  "where": {
    "name": { "$startsWith": "Cucumber" }
  }
}

Comparison operators

Useful for creating conditional queries.

$gt

Operator to check if a field is greater than a given value.

Format
{
"$gt": integer or number or null or boolean or string
}

Examples

{ "$gt": 8 }
{ "$gt": 231.1 }
{ "$gt": "20150308" }
{
  "from": "products",
  "where": {
    "price": { "$gt": 2.14 }
  }
}

$gte

Operator to check if a field is greater than or equal to a given value.

Format
{
"$gte": integer or number or null or boolean or string
}

Examples

{ "$gte": -2 }
{ "$gte": 0 }
{ "$gte": "20180502" }
{
  "from": "products",
  "where": {
    "price": { "$gte": 2 }
  }
}

$lt

Operator to check if a field is less than a given value.

Format
{
"$lt": integer or number or null or boolean or string
}

Examples

{ "$lt": 4 }
{ "$lt": -12.1 }
{ "$lt": "20180502" }
{
  "from": "products",
  "where": {
    "price": { "$lt": 1.24 }
  }
}

$lte

Operator to check if a field is less than or equal to a given value.

Format
{
"$lte": integer or number or null or boolean or string
}

Examples

{ "$lte": 8 }
{ "$lte": 0 }
{ "$lte": "20180502" }
{
  "from": "products",
  "where": {
    "price": { "$lte": 1 }
  }
}

$has

The $has operation checks whether the field has the specified feature.

$has is a low-level operation that operates at the feature level. The features can differ significantly from the original data, especially in the case of text, when analyzers are used.

For example, if you have a field called content with the text "programmers and horses", the field would have the features 'programmer' and 'hors', which are the stems produced by the English analyzer.

Format
{
"$has": integer or number or null or boolean or string
}

Examples

{ "$has": "drink" }
{
  "from": "products",
  "where": {
    "tags": { "$has": "drink" }
  }
}

$defined

Operator to select rows based on whether a nullable field has been defined or not.

Format
{
"$defined": boolean
}

Example

{ "$defined": true }

$exists

An operator to get features of given field(s).

Format
{
// PropositionSet expression is used to describe a collection of propositions.
"$exists": PropositionSet
}

Examples

{
  "$exists": ["query", "product.tags"]
}
{
  "from": "impressions",
  "where": {
    "$on": [
      {
        "$exists": ["query", "customer.tags"]
      },
      { "click": true }
    ]
  },
  "relate": ["product.title", "product.tags"]
}

Logical operators

Useful for combining multiple conditions in conditional queries.

$and

Performs a logical and operation on the given array containing two or more Propositions.

  • With a non-inference query (e.g. Search, Similarity), the $and operator guarantees that all propositions are met. For instance, the following search query:
    {
      "from": "products",
      "where": {
        "$and": [
          { "description": "super slim laptop" },
          { "price": { "$gt" : 200 } }
        ]
      }
    }
    
    will always find products whose description is "super slim laptop" and whose price is greater than 200.
  • With an inference query (e.g. Predict, Match, Recommend), the $and operator does not guarantee that all propositions are met. For instance, the following predict query:
    {
      "from": "products",
      "where": {
        "$and": [
          { "description": "super slim laptop" },
          { "price": { "$gt" : 200 } }
        ]
      },
      "predict": "tags"
    }
    Aito might look for products that have a price greater than 200 but do not match the description "super slim laptop", or for products that match the description but do not meet the price condition. This is because there might not be enough data (e.g. not enough products in the price range) to make a reliable prediction.
    

To guarantee that all propositions are met in an inference query, refer to $atomic.

Format
{
"$and": [Proposition, ...]
}

Examples

{
  "$and": [
    { "$gt": 10 },
    { "$lt": 20 }
  ]
}
{
  "from": "products",
  "where": {
    "price": {
      "$and": [
        { "$gt": 1.5 },
        { "$lt": 2.1 }
      ]
    }
  }
}

$or

Performs a logical or operation on the given array containing two or more Propositions.

Format
{
"$or": [Proposition, ...]
}

Examples

{
  "$or": [
    { "tags": "cover" },
    { "tags": "laptop" }
  ]
}
{
  "from": "products",
  "where": {
    "price": {
      "$or": [
        { "$lt": 0.9 },
        { "$gt": 2.1 }
      ]
    }
  }
}

$not

Performs a logical not operation on the given Proposition.

Format

Examples

{
  "$not": { "tags": "laptop" }
}
{
  "$not": { "$lt": 0 }
}
{
  "from": "products",
  "where": {
    "price": {
      "$not": { "$lt": 1.1 }
    }
  }
}
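
When $and, $or and $not act as hard filters (as in a Search query), their effect on a single field value can be mimicked locally. The evaluator below is a hypothetical sketch that covers only the field-level comparison operators shown in the examples. It does not model inference queries, where propositions are treated statistically.

```python
import operator

COMPARISONS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(value, condition):
    """Evaluate a single-operator condition against one field value."""
    (name, argument), = condition.items()
    if name in COMPARISONS:
        return COMPARISONS[name](value, argument)
    if name == "$and":
        return all(matches(value, part) for part in argument)
    if name == "$or":
        return any(matches(value, part) for part in argument)
    if name == "$not":
        return not matches(value, argument)
    raise ValueError(f"unsupported operator: {name}")


in_range = matches(2.0, {"$and": [{"$gt": 1.5}, {"$lt": 2.1}]})
outside = matches(1.5, {"$or": [{"$lt": 0.9}, {"$gt": 2.1}]})
not_cheap = matches(1.5, {"$not": {"$lt": 1.1}})
```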

Sort operators

Can be used in the "orderBy" clause to declare the sorting order of the result.

$asc

Sort returned hits in ascending order (A-Z) based on the given attribute or custom scoring function.

Format
{
// Value expression resolves to a primitive like int or json, // score, probability or.
"$asc": Value
}

Examples

{ "$asc": "price" }
{ "$asc": "product.price" }
{
  "$asc": {
    "$multiply": ["product.price", "$p"]
  }
}

$asc(Relate)

Sort returned hits in ascending order (A-Z) based on the given attribute (or column).

Format
{
"$asc": string
}

Example

{ "$asc": "lift" }

$desc

Sort returned hits in descending order (Z-A) based on the given attribute or custom scoring function.

Format
{
// Value expression resolves to a primitive like int or json, // score, probability or.
"$desc": Value
}

Examples

{ "$desc": "price" }
{ "$desc": "product.price" }
{
  "$desc": {
    "$multiply": ["product.price", "$p"]
  }
}

$desc(Relate)

Sort returned hits in descending (Z-A) order based on the given attribute (or column).

Format
{
"$desc": string
}

Example

{ "$desc": "info.miTrue" }

Arithmetic operators

Can be used in conditional queries or scoring in "orderBy" clauses.

$mod

Operator to check if the value of a field divided by a divisor has the specified remainder.

In other words perform a modulo operation. This operator supports object or array form. Note that the field will be converted to an integer (effectively a math floor) before the modulo operation.

Format

Examples

{
  "$mod": [2, 0]
}
{
  "$mod": { "divisor": 2, "remainder": 0 }
}
{
  "from": "products",
  "where": {
    "price": {
      "$mod": { "divisor": 2, "remainder": 0 }
    }
  }
}
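
The floor-then-modulo behaviour described above can be sketched as follows. The helper names are hypothetical, the array form is assumed to be ordered as [divisor, remainder] (which is what the paired examples suggest), and the handling of negative values follows Python's `%` operator, which may differ from Aito's.

```python
import math


def parse_mod(spec):
    """Accept either the object form or the array form of a $mod argument."""
    if isinstance(spec, dict):
        return spec["divisor"], spec["remainder"]
    divisor, remainder = spec  # assumed order: [divisor, remainder]
    return divisor, remainder


def matches_mod(value, spec):
    """Floor the value to an integer, then compare the remainder."""
    divisor, remainder = parse_mod(spec)
    return math.floor(value) % divisor == remainder


even_price = matches_mod(2.9, [2, 0])
```

A price of 2.9 is floored to 2 before the check, so it counts as having remainder 0 for divisor 2.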

$multiply

Multiplication operation of given items.

Format
{
"$multiply": [Score, ...]
}

Example

{
  "$multiply": ["price", 2]
}

$divide

Division operation.

Format
{
"$divide":
{
// Score expression resolves to a numeric score value or probability.
"dividend": Score,
// Score expression resolves to a numeric score value or probability.
"divisor": Score
}
or
}

Example

{
  "$divide": ["cost", 4]
}

$pow

Exponentiation operation. First item raised to the power of the second.

Format

Example

{
  "$pow": ["width", 2]
}

$sum

Calculates sum of given items.

Format
{
"$sum": [Score, ...]
}

Example

{
  "$sum": ["priceNet", "priceVat"]
}

$subtract

Subtraction operation.

Format
{
"$subtract":
{
// Score expression resolves to a numeric score value or probability.
"minuend": Score,
// Score expression resolves to a numeric score value or probability.
"subtrahend": Score
}
or
}

Example

{
  "$subtract": ["price", 2]
}

Advanced operators

More advanced operators which can improve query results in certain situations.

$atomic

Transforms a statement into a 'black box' proposition.

This prevents Aito from analyzing the proposition and using its parts separately in the statistical reasoning.

In practice, the difference between normal 'white box' expressions and $atomic's 'black box' expressions is that atomic expressions have a smaller bias but a higher measurement error.

Consider the following example:

{
  "tags": "pen",
  "price": { "$gte": 200 }
}

During the statistical reasoning, Aito may recognize that pens are often sold, and that purchases of products costing over 200€ are somewhat common. As a result, Aito might assume a pen costing over 200€ to be a popular product.

Now, consider the expression:

{
  "$atomic": {
    "tags": "pen",
    "price": { "$gte" : 200 }
  }
}

The results of this expression will depend on the amount of data. If there are no pens costing over 200€ in the data, Aito will make no assumptions about the proposition's effect. On the other hand, if you have the data, Aito will correctly recognize that pens costing over 200€ are bought extremely rarely.

Format
{
// Proposition expression describes a fact, or a statement.
"$atomic": Proposition
}

Examples

{
  "$atomic": {
    "tags": "pen",
    "price": { "$gte": 200 }
  }
}
{
  "from": "products",
  "where": {
    "$atomic": {
      "tags": "pen",
      "price": { "$gte": 200 }
    }
  }
}

$context

Provides the ability to access the fields of the table specified in "from", instead of the fields of the table in "get".

Format
{
// Proposition expression describes a fact, or a statement.
"$context": Proposition
}

Examples

{
  "$context": { "click": true }
}
{
  "from": "impressions",
  "where": { "customerEmail": "john.doe@aito.ai", "query": "laptop" },
  "get": "product",
  "orderBy": {
    "$p": {
      "$context": { "click": true }
    }
  }
}

$hit

Provides the ability to access the fields of the hit.

Format
{
// Score expression resolves to a numeric score value or probability.
"$hit": Score
}

Examples

{ "$hit": "price" }
{ "$hit": "$similarity" }
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$multiply": [
      { "$hit": "$similarity" },
      { "$hit": "price" }
    ]
  }
}

$on

The $on operator is used to define conditional propositions or hard filters.

This is useful when you have a limited amount of data and the condition would help to limit the context and provide better results. It is used by providing a list of two items: the first object (or "prop") is the hypothesis and the second object (or "on") is the condition.

In Aito, the where clause contains propositions which aren't hard filters. Instead, Aito will turn all the propositions into features (the user's ID, every word in a text field, etc.). There are many of these and they are not statistically independent. Aito picks a subset of these features that are the best predictors of the field that is to be predicted. So what goes into the "where" is a description of the situation you're in, and Aito tells you what you should expect to find if you look in a field. But the description is not taken at face value: Aito will ignore parts of it if they don't help the prediction.

To apply hard filters instead, use the "$on" proposition. It is modeled after conditional probability. It is divided into two parts, the normal "where" part and the conditional part ("hard filters"). The "$on" parameters explained:

{
  "from": "...",
  "where": {
    "$on": [
      {
        "message": "hello, world",
        "something": true,
        // other things you put in your "where" clause
      },
      {
        // The subset of data that exactly matches these conditions
        "userId": 42,
        "day": "monday"
      }
    ]
  },
  "predict": "..."
}

The $on operator can also be combined with ordinary propositions. If the $on condition is too strict, you can move parts of the filtering back to the where clause:

{
  "from": "...",
  "where": {
    "$on": [
      {
        "message": "hello, world",
        "something": true,
        // other things you put in your "where" clause
      },
      {
        // The subset of data that exactly matches these conditions
        "day": "monday"
      }
    ],
    "userId": 42
  },
  "predict": "..."
}
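
A query of this shape can be assembled as a plain dictionary before it is sent. The following is a minimal Python sketch: the helper name `on_where`, the table name "messages", and the target field "category" are assumptions made for illustration and are not part of the Aito API. Only the two-item "$on" list layout comes from the description above.

```python
import json


def on_where(prop, on, **extra):
    """Build a "where" clause holding an "$on" pair plus optional soft propositions.

    prop  -- the hypothesis: ordinary propositions that Aito may use or ignore
    on    -- the conditional: propositions applied as hard filters
    extra -- further ordinary propositions placed next to "$on"
    """
    where = {"$on": [dict(prop), dict(on)]}
    where.update(extra)
    return where


# Hypothetical query in the shape of the examples above.
query = {
    "from": "messages",      # assumed table name
    "where": on_where(
        {"message": "hello, world", "something": True},
        {"day": "monday"},
        userId=42,           # soft proposition kept outside the hard filter
    ),
    "predict": "category",   # assumed target field
}

print(json.dumps(query, indent=2))
```

Keeping the hard filter as a separate argument makes it easy to move a condition between the filter and the soft propositions when the filter turns out to be too strict.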
Format

Examples

{
  "$on": {
    "prop": { "click": true },
    "on": { "user.tags": "nyc" }
  }
}
{
  "$on": [
    { "click": true },
    { "user.tags": "nyc" }
  ]
}
Referenced in

$knn

The $knn operator is an adaptation of the classic k-nearest neighbor algorithm.

Aito's $knn operator identifies the k rows most similar to the conditions defined in the "near" parameter. The similarity metric is the same one used in the Similarity query. The k nearest rows can then be used in inference.

The $knn operator can be useful in situations where there is no matching training data. For example:

{
  "from": "impressions",
  "where": {
    "product.name": "Columbian Coffee",
    "product.tags": "high quality coffee"
  },
  "predict": "purchase"
}

This query would not yield sensible results, since no such product exists in the current data. This can be improved by using the $knn operator:

{
  "from": "impressions",
  "where": {
    "$knn": {
      "k": 5,
      "near": {
        "product.name": "Columbian Coffee",
        "product.tags": "high quality coffee"
      }
    }
  },
  "predict": "purchase"
}

In the query above, Aito looks for the 5 entries that are most similar to the criteria given in "near" and uses them for inference.
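
The idea of inferring from the k nearest rows can be sketched in a few lines of Python. This is a toy illustration, not Aito's implementation: the Jaccard token overlap below is a stand-in for Aito's similarity metric, and the rows and the helper names (`tokens`, `jaccard`, `knn_estimate`) are made up for the example.

```python
def tokens(text):
    """Split a text into a set of lowercase words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Token overlap between two sets (stand-in similarity, NOT Aito's metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_estimate(rows, near_text, k):
    """Average the "purchase" flag over the k rows most similar to near_text."""
    target = tokens(near_text)
    ranked = sorted(
        rows,
        key=lambda row: jaccard(tokens(row["tags"]), target),
        reverse=True,
    )
    nearest = ranked[:k]
    return sum(row["purchase"] for row in nearest) / len(nearest)


# Made-up rows, for illustration only.
rows = [
    {"tags": "coffee dark roast", "purchase": True},
    {"tags": "high quality tea", "purchase": False},
    {"tags": "quality coffee beans", "purchase": True},
    {"tags": "milk lactose free", "purchase": False},
]

estimate = knn_estimate(rows, "high quality coffee", k=2)
print(estimate)
```

Even though no row has exactly the tags "high quality coffee", the two closest rows still give a usable estimate, which is the situation the $knn operator is meant for.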

Format

Examples

{
  "$knn": [
    4,
    { "tags": "laptop" }
  ]
}
{
  "$knn": {
    "k": 4,
    "near": { "tags": "laptop" }
  }
}
Referenced in

$numeric

Operator to check if a numeric field fuzzy matches a given number.

By default, numbers are compared exactly against one another. The $numeric proposition signifies that comparisons should be inexact and that the target is somewhere close to the specified number. The size of the region depends on the spread and density of the data.

Format
{
"$numeric": integer or number or null
}

Examples

{ "$numeric": 42 }
{ "$numeric": 3.14 }
Referenced in

$hash

$hash converts the field value into a hash integer.

The hash code can be used to split non-integer data pseudo-randomly in the evaluate query.
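
The splitting idea can be sketched in Python as follows. The sketch rests on two assumptions: CRC-32 is used as a stand-in hash (Aito's actual hash function is not specified here), and a pair such as [2, 1] is read as "divisor 2, remainder 1". The helper names and the e-mail addresses are hypothetical.

```python
import zlib


def stable_hash(value):
    """Stand-in hash: CRC-32 of the UTF-8 text form (NOT Aito's hash function)."""
    return zlib.crc32(str(value).encode("utf-8"))


def matches_mod(value, divisor, remainder):
    """True when hash(value) % divisor == remainder (assumed reading of the pair)."""
    return stable_hash(value) % divisor == remainder


# Hypothetical non-integer keys to split into two groups.
emails = [
    "alice@example.com",
    "bob@example.com",
    "carol@example.com",
    "dave@example.com",
]

test_rows = [e for e in emails if matches_mod(e, 2, 1)]
train_rows = [e for e in emails if not matches_mod(e, 2, 1)]

print("test:", test_rows)
print("train:", train_rows)
```

Because the hash depends only on the value, the same row always lands in the same group, so the split is repeatable without storing any extra column.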

Format

Example

{
  "$hash": {
    "$mod": [2, 1]
  }
}
Referenced in

$toString

The $toString operator is used to convert a numeric value to a string.

This is useful when you want to use a numeric value as input for an operator or a field that requires text input. For example:

{
  "description": {
    "$match": {
      "$toString": { "$get": "id" } 
    }
  }
}
Format
{
"$toString": integer or number or null or boolean or string
}

Example

{ "$toString": 4 }

Scoring operators

Can be used in the "orderBy" clause to sort results or to build an advanced scoring algorithm.

$lift

"$lift" can be used in the "orderBy" clause of the Generic query to get the most likely values based on lifts of features with regard to other features.

In the grocery dataset, running the following query would yield the products whose name is similar to "lactose" and that have the highest lift of being purchased:

{
  "from": "impressions",
  "where": {
    "product.name": {"$match": "lactose"}
  },
  "get": "purchase",
  "orderBy": "$lift"
}

Running the following query would yield the most likely product based on all the fields of the linked product table:

{
  "from": "impressions",
  "where": {
    "session.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": "$lift"
}

Since the product field in the impressions table is linked to the products table, Aito finds all the statistical relations between what is declared in the "where" clause and all the field features of a product, that is, id, name, category, price, and tag. In this case, the lift score is the product of the lifts of each field's features. We can investigate this by opening up the explanation, adding the $why operator to the "select" clause (e.g. "select": ["$score", "$why"]):

"$why": {
  "type": "product",
  "factors": [
    {
      "type": "hitPropositionLift",
      "proposition": { "id" : 6410405093677 },
      "value": 1.9827806375460209,
      "factors": [
        {
          "type": "relatedPropositionLift",
          "proposition": { "purchase" : true },
          "value": 1.9827806375460209
        }
      ]
    },
    {
      "type": "hitPropositionLift",
      "proposition": { "$not": { "name" : {"$has": "puikula" } } },
      "value": 1.0472308585357502,
      "factors": [
        {
          "type": "relatedPropositionLift",
          "proposition" : { "purchase" : true },
          "value": 1.0472308585357502
        }
      ]
    }
    ...
  ]
}

We can see that the lift score is composed of the lift of an id feature, the lift of a name feature, and others.
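
As a small worked example, the two factor values visible in the excerpt can be multiplied together in Python. This is only a partial product: the real explanation contains further factors behind the "...", so the number below is not the final score of the hit.

```python
import math

# Only the factors visible in the "$why" excerpt; the rest are elided there.
why = {
    "type": "product",
    "factors": [
        {"type": "hitPropositionLift", "value": 1.9827806375460209},
        {"type": "hitPropositionLift", "value": 1.0472308585357502},
    ],
}

# Multiply the top-level factor values together.
partial_lift = math.prod(factor["value"] for factor in why["factors"])
print(partial_lift)
```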

See also $p and $lift.

Format
string

Examples

"$lift"
{
  "from": "messages",
  "get": "user",
  "orderBy": "$lift",
  "where": {
    "message": { "$match": "dog" }
  }
}
Referenced in

$p

"$p" can be used in the "orderBy" clause of the Generic query to get the most probable values. When used this way, it is similar to the Match query.

In the grocery dataset, running the following query would yield the products whose name is similar to "lactose" and that have the highest probability of being purchased:

{
  "from": "impressions",
  "where": {
    "product.name": {"$match": "lactose"}
  },
  "get": "purchase",
  "orderBy": "$p"
}

Similar to the Match query, running the following query would yield the most likely product based on all the fields of the linked product table:

{
  "from": "impressions",
  "where": {
    "session.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": "$p"
}

Since the product field in the impressions table is linked to the products table, Aito finds all the statistical relations between what is declared in the "where" clause and all the field features of a product, that is, id, name, category, price, and tag. In this case, the probability score is the normalized product of the lifts of each field's features. We can investigate this by opening up the explanation, adding the $why operator to the "select" clause (e.g. "select": ["$score", "$why"]):

"$why": {
  "type": "product",
  "factors": [
    {
      "type": "hitPropositionLift",
      "proposition": { "id" : 6410405093677 },
      "value": 1.9827806375460209,
      "factors": [
        {
          "type": "relatedPropositionLift",
          "proposition": { "purchase" : true },
          "value": 1.9827806375460209
        }
      ]
    },
    {
      "type": "hitPropositionLift",
      "proposition": { "$not" : { "name" : { "$has": "puikula" } } },
      "value": 1.0472308585357502,
      "factors": [
        {
          "type": "relatedPropositionLift",
          "proposition": { "purchase" : true },
          "value": 1.0472308585357502
        }
      ]
    }
    ...
  ]
}

We can see that the probability score is composed of the lift of an id feature, the lift of a name feature, and others.

See also $p and $lift.

Format
string

Examples

"$p"
{
  "from": "messages",
  "get": "user",
  "orderBy": "$p",
  "where": {
    "message": { "$match": "dog" }
  }
}
Referenced in

$similarity

"$similarity" can be used in Generic query to get most similar rows based on the contents of the "where" clause.

Consider the following example. It returns all the products that contain 'iphone' in the title. It also sorts the results by their similarity to 'iphone' and highlights the 'iphone' term in the product title field.

{
  "from": "product",
  "where": { "title": { "$match": "iphone" } },
  "get": "message",
  "orderBy": "$similarity",
  "select": ["title", "$highlight"]
}
Format
string

Examples

"$similarity"
{
  "from": "product",
  "get": "message",
  "orderBy": "$similarity",
  "where": {
    "title": { "$match": "iphone" }
  }
}
Referenced in

$lift object

Conceptually similar to the plain $lift operator, but allows using a customized proposition for the lift score calculation.

This $lift operator gives more options for customizing the lift score calculation, especially when getting the values of a linked table.

  1. Narrow down the fields that are used to calculate the lift:

    This is similar to the behavior of the "basedOn" clause of the Match query.

    When calculating the lift of a linked field, Aito uses all the fields of the linked table (see $lift for how the lift is calculated for a linked field).

    If you would like to narrow down how the lift is calculated, add the field name after the $lift. For example, to find the most likely product based only on the product name:

    {
      "from": "impressions",
      "where": {
        "session.user": "bob",
        "purchase": true
      },
      "get": "product",
      "orderBy": {
        "$lift": "name"
      }
    }
    

    You can also calculate the lift based on multiple fields by using the array format. For instance:

    {
      "$lift": ["category", "tag"]
    }
    
  2. Calculate the lift based on a specific context:

    This is similar to the behavior of the Recommend query.

    By combining with the $context operator, the lift score can be defined as the lift of a context. For instance, to find the products with the highest lift of getting purchased:

    {
      "from": "impressions",
      "where": {
        "session.user": "bob"
      },
      "get": "product",
      "orderBy": {
        "$lift": {"$context": {"purchase": true}}
      }
    }
    
Format
{
// PropositionSet expression is used to describe a collection of propositions.
"$lift": PropositionSet
}

Examples

{ "$lift": "tags" }
{
  "$lift": ["tags", "title"]
}
{
  "$lift": {
    "$context": { "click": true }
  }
}
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": { "$lift": "tags" }
}
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$lift": ["tags", "title"]
  }
}
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$lift": {
      "$context": { "click": true }
    }
  }
}
Referenced in

$p object

Conceptually similar to the plain $p operator, but allows using a customized proposition for the probability score calculation.

This $p operator gives more options for customizing the probability score calculation, especially when getting the values of a linked table:

  1. Narrow down the fields that are used to calculate the probability:

    This is similar to the behavior of the "basedOn" clause of the Match query.

    When calculating the probability of a linked field, Aito uses all the fields of the linked table (see $p for how the probability is calculated for a linked field).

    If you would like to narrow down how the probability is calculated, add the field name after the $p. For example, to find the most likely product based only on the product name:

    {
      "from": "impressions",
      "where": {
        "session.user": "bob",
        "purchase": true
      },
      "get": "product",
      "orderBy": {
        "$p": "name"
      }
    }
    

    You can also calculate the probability based on multiple fields by using the array format. For instance:

    {
      "$p": ["category", "tag"]
    }
    
  2. Calculate the probability based on a specific context:

    This is similar to the behavior of the Recommend query.

    By combining with the $context operator, the probability score can be defined as the probability of a context. For instance, to find the products with the highest probability that the product would be purchased:

    {
      "from": "impressions",
      "where": {
        "session.user": "bob"
      },
      "get": "product",
      "orderBy": {
        "$p": {"$context": {"purchase": true}}
      }
    }
    
Format
{
// PropositionSet expression is used to describe a collection of propositions.
"$p": PropositionSet
}

Examples

{ "$p": "tags" }
{
  "$p": ["tags", "title"]
}
{
  "$p": {
    "$context": { "click": true }
  }
}
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": { "$p": "tags" }
}
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$p": ["tags", "title"]
  }
}
{
  "from": "impressions",
  "where": {
    "product.title": { "$match": "iphone" }
  },
  "get": "product",
  "orderBy": {
    "$p": {
      "$context": { "click": true }
    }
  }
}
Referenced in

$similarity object

Conceptually similar to the plain $similarity operator, but allows using a customized proposition for the similarity score calculation.

The plain $similarity operator calculates the similarity score based on the "where" clause contents, whereas this $similarity operator calculates the similarity score based on the given proposition.

These Generic Queries would yield the same results:

{
  "from": "products",
  "where": {
    "name": {"$match": "coffee"}
  },
  "orderBy": "$similarity"
}
{
  "from": "products",
  "orderBy": {
    "$similarity": {
      "name": "coffee"
    }
  }
}

This $similarity operator is useful for customizing scoring, as in the example below. See also the Generic query custom scoring example.

{
  "from": "impressions",
  "where": {
    "session.user": "veronica"
  },
  "get": "product",
  "orderBy": {
    "$multiply": [
      {
        "$p": {
          "$context": {
            "purchase": true
          }
        }
      },
      {
        "$similarity": {
          "name": "coffee"
        }
      }
    ]
  }
}
Format
{}

Examples

{
  "$similarity": { "title": "apple iphone", "tags": "premium ios phone" }
}
{
  "from": "products",
  "orderBy": {
    "$similarity": { "title": "apple iphone", "tags": "premium ios phone" }
  }
}
Referenced in

$f

"$f" can be used in the "orderBy" clause of the Generic query to get the frequency of a feature.

Format
string

Examples

"$f"
{ "from": "impressions", "get": "product", "orderBy": "$f" }
Referenced in

$normalize

The $normalize operator can be used in the "orderBy" clause of the Generic query to scale a score so that it sums to 1. For example, you can normalize the $lift or the $lift object:

{
  "from": "impressions",
  "where": {
    "product.name": {"$match": "lactose"}
  },
  "get": "purchase",
  "orderBy": {
    "$normalize": "$lift"
  }
}

or

{
  "from": "impressions",
  "where": {
    "session.user": "bob",
    "purchase": true
  },
  "get": "product",
  "orderBy": {
    "$normalize": {
      "$lift": { "$context": { "click": true } }
    }
  }
}
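
The effect of normalization can be illustrated with a short Python sketch. It assumes that normalizing means dividing each score by the sum of all scores, which is one straightforward way to make scores sum to 1; the function name `normalize` and the lift values are hypothetical.

```python
def normalize(scores):
    """Scale non-negative scores so that they sum to 1 (assumed behavior)."""
    total = sum(scores.values())
    if total == 0:
        raise ValueError("cannot normalize scores that sum to zero")
    return {key: value / total for key, value in scores.items()}


# Hypothetical lift scores for three candidate products.
lifts = {"coffee": 3.0, "tea": 1.0, "milk": 1.0}
normalized = normalize(lifts)
print(normalized)
```

The ordering of the candidates is unchanged by the scaling; only the magnitudes become comparable to probabilities.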
Format
{
// Score expression resolves to a numeric score value or probability.
"$normalize": Score
}

Examples

{ "$normalize": "$lift" }
{
  "$normalize": { "$lift": "name" }
}
Referenced in

Built-in attributes

Can be used in "select" clause.

$sort

$sort is a built-in field that can be used to access the sort value used in the orderBy-clause.

Format
string

Example

"$sort"
Referenced in

$score

$score is a built-in field that can be used to access the sort value used in the orderBy-clause, when the sort-value is a numeric score like a probability.

Format
string

Example

"$score"
Referenced in

$index

$index is a built-in variable which indicates the insertion index of a row. It can be used together with $mod to select parts of a table. It's useful for example in Evaluate query for selecting training or test data.

Format
string

Example

"$index"
Referenced in

$p

$p is a built-in field that can be used to access the value used by orderBy-clause, when the sort-value is a probability. See $p for more information.

Format
string

Example

"$p"
Referenced in

$why

When $why is selected, Aito opens up why a certain result was predicted. The explanation contains three different factors, which are explained below.

The three factors apply to an estimate of the form:

p(x_i | A, B, C)

"baseP"

The base probability.

p(X)

"normalizer"

Aito has two different normalizers:

  • exclusiveness normalizer
  • trueFalseExclusiveness normalizer

These normalizers are often grouped into a single 'product' component.

{
  "type" : "product",
  "factors" : [ {
    "type" : "normalizer",
    "name" : "exclusiveness",
    "value" : 1.0119918068684681
  }, {
    "type" : "normalizer",
    "name" : "trueFalseExclusiveness",
    "value" : 1.09917613448721
  } ]
}

The exclusiveness normalizer is used only when exclusiveness is on. In this case, it is assumed that exactly one of the alternative features is true at a time. In practice, exclusiveness forces the probabilities of the alternative features to sum to 1.0.

The normalizer is of the form:

\dfrac{1}{p(X_0) + p(X_1) + \dots}

Aito estimates the probability of both X and ¬X in the background and uses the trueFalseExclusiveness normalizer to ensure that the probabilities p(X) and p(¬X) sum to 1.0.

The normalizer is of the form:

\dfrac{1}{p(X) + p(\neg X)}

Probability lifts. For example, the lift may say that a product is clicked with 2.3x likelihood (that is, 130% higher likelihood) when it has 5 stars.

A probability lift is of the form:

\dfrac{p(A | X)}{p(A)}
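
The three formulas can be written out as small Python functions. This is a sketch of the arithmetic only; the function names and all the input numbers are made up for illustration and do not come from an Aito response.

```python
def exclusiveness_normalizer(alternative_probabilities):
    """1 / (p(X_0) + p(X_1) + ...), over mutually exclusive alternatives."""
    return 1.0 / sum(alternative_probabilities)


def true_false_normalizer(p_x, p_not_x):
    """1 / (p(X) + p(not X))."""
    return 1.0 / (p_x + p_not_x)


def lift(p_a_given_x, p_a):
    """p(A | X) / p(A)."""
    return p_a_given_x / p_a


# Made-up, unnormalized estimates for three exclusive alternatives.
raw = [0.5, 0.25, 0.5]
scale = exclusiveness_normalizer(raw)
normalized = [p * scale for p in raw]

print(normalized)                        # scaled estimates sum to 1 up to rounding
print(true_false_normalizer(0.6, 0.2))   # scale that makes p(X) + p(not X) equal 1
print(lift(0.46, 0.2))                   # a lift of roughly 2.3, as in the text
```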
Format
string

Example

"$why"
Referenced in

$value

$value is a built-in field which contains the value of the returned object. $value can be used to access the field value referred in the predict, match, recommend and get-clauses, when the returned item is either a field value or a field feature/proposition. $value is intended to replace the 'feature' field in the long term.

The $value field has been added to contain the information in the ‘feature’ so that for query:

  {
    "from" : "products",
    "where" : {
      "title" : "apple iphone"
    },
    "predict": "tags",
    "select" : ["$p", "$value"],
    "limit":3
  }

The result will be the following:

  {
    "offset" : 0,
    "total" : 10,
    "hits" : [ {
      "$p" : 0.3656914544001758,
      "$value" : "premium"
    }, {
      "$p" : 0.1546922568903658,
      "$value" : "cover"
    }, {
      "$p" : 0.09493670104339776,
      "$value" : "macosx"
    } ]
  }
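
On the client side, a response of this shape can be read with a few lines of Python. The sketch below parses the response shown above from a string; how the response is fetched is left out, and the variable names are illustrative.

```python
import json

# The response body shown above, as it would arrive over HTTP.
response_text = """
{
  "offset": 0,
  "total": 10,
  "hits": [
    {"$p": 0.3656914544001758, "$value": "premium"},
    {"$p": 0.1546922568903658, "$value": "cover"},
    {"$p": 0.09493670104339776, "$value": "macosx"}
  ]
}
"""

response = json.loads(response_text)

# Pair each predicted value with its probability.
predictions = [(hit["$value"], hit["$p"]) for hit in response["hits"]]
best_value, best_p = max(predictions, key=lambda pair: pair[1])

print(best_value, best_p)
print("more results available:", response["total"] > len(response["hits"]))
```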

$value works similarly when predicting the field value using the Generic query.

  {
    "from" : "products",
    "where" : {
      "title" : "apple iphone"
    },
    "get": "tags",
    "orderBy" : "$p",
    "select" : ["$p", "$value"],
    "limit":3
  }

Or when predicting the field features with the Generic query:

  {
    "from" : "products",
    "where" : {
      "title" : "apple iphone"
    },
    "get": "tags.$feature",
    "orderBy" : "$p",
    "select" : ["$p", "$value"],
    "limit":3
  }
Format
string

Example

"$value"
Referenced in

$proposition

$proposition is a built-in field which contains the proposition object of the returned feature. The returned proposition is compatible with the proposition format and it can be used as such in the where clause.

Consider the following query:

  {
    "from": "products",
    "where": {
      "title": "Apple"
    },
    "predict": {
      "$on": [
        { "$exists": "tags" },
        { "$and": [
          { "tags": { "$match": "phone" } },
          { "$not": { "tags": { "$match": "laptop" } } }
        ] }
      ]
    },
    "select": ["$p", "$value", "$proposition"],
    "limit": 1
  }

This provides the following results:

  {
    "offset" : 0,
    "total" : 10,
    "hits" : [ {
      "$p" : 0.22622976807854914,
      "$value" : "phone",
      "$proposition" : {
        "$on" : [ {
          "tags" : {
            "$has" : "phone"
          }
        }, {
          "$and" : [ {
            "tags" : {
              "$has" : "phone"
            }
          }, {
            "$not" : {
              "tags" : {
                "$has" : "laptop"
              }
            }
          } ]
        } ]
      }
    } ]
  }
Format
string

Example

"$proposition"
Referenced in

Explanation objects

Explanation objects returned when using the "$why" operator.

BaseLiftExplanation

Conceptually similar to BaseProbabilityExplanation, but shows the prior lift instead of the prior probability.

See more Probability vs. Lift

Format
{
// The explanation type: baseLift. Required.
"type": string,
// The prior lift. Required.
"value": number
}

Example

{ "type": "baseLift", "value": 31 }
Referenced in

BaseProbabilityExplanation

Explain the initial weight of a feature. It can be understood as the prior probability p(X) of a feature.

Let's take a look at an example of a Predict query:

{
  "from": "products",
  "where": {
    "name": "Columbian coffee"
  },
  "predict": "tags",
  "select": ["$p", "feature", "$why"]
}

When opening up the explanation with the "$why" operator, the tag feature "coffee" has a BaseProbabilityExplanation:

{
  "type": "baseP",
  "value": 0.16
}

This explanation tells us that Aito gives the feature "coffee" a prior probability of 0.16.

Format
{
// The explanation type: baseP. Required.
"type": string,
// The prior probability. Required.
"value": number,
// The variable.
"proposition": PropositionExplanation
}

Example

{
  "proposition": { "click": true },
  "type": "baseP",
  "value": 0.5
}
Referenced in

ExponentExplanation

Explain how an exponent score was calculated.

The ExponentExplanation most commonly appears when opening up the explanation (e.g. using the $why operator) of an exponent score such as:

  1. The tf-idf score to calculate the similarity in the Similarity query.
  2. The score of the $pow operator.

Format
{
// The explanation type: exponent. Required.
"type": string,
// The exponent score. Required.
"value": number,
// The explanation of the base score element. Required.
"base": ScoreExplanation,
// The explanation of the power score element. Required.
"power": ScoreExplanation
}

Example

{
  "base": { "type": "idf", "value": 1.7551720221592049 },
  "power": { "type": "tf", "value": 1 },
  "type": "exponent",
  "value": 1.7551720221592049
}
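
The relationship between the fields in this example can be checked in Python. The sketch assumes that "value" equals the "base" value raised to the "power" value, which is what the field names and the numbers in the example suggest.

```python
import math

# The example explanation shown above.
explanation = {
    "base": {"type": "idf", "value": 1.7551720221592049},
    "power": {"type": "tf", "value": 1},
    "type": "exponent",
    "value": 1.7551720221592049,
}

# Assumed relationship: value == base ** power.
recomputed = explanation["base"]["value"] ** explanation["power"]["value"]
print(math.isclose(recomputed, explanation["value"]))
```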
Referenced in

FieldExplanation

Explain how a field score was calculated.

The field explanation most commonly appears when opening up the explanation (e.g: using the $why operator) of a score that was calculated using:

  1. A field value

    {
      "from" : "impressions",
      "where" : {
        "product.name":{"$match": "coffee"}
      },
      "get":"product",
      "orderBy" : {
        "$multiply": ["$p", "price"]
      },
      "select": ["$score", "$why"]
    }
    

    The explanation would contain the value of the "price" field that was used in the $multiply operator.

    {
      "type": "field",
      "field": "price",
      "value": 3.95
    }
    
  2. A field feature (e.g: $f operator for frequency):

    {
      "from" : "impressions",
      "where" : {
        "product.name":{"$match": "coffee"}
      },
      "get":"product",
      "orderBy" : "$f",
      "select": ["$score", "$why"]
    }
    

    The explanation would contain the frequency of the feature.

    {
      "type": "field",
      "field": "$f",
      "value": 152.0
    }
    
Format
{
// The explanation type: field. Required.
"type": string,
// The name or feature of the field. Required.
"field": string,
// The score value. Required.
"value": number
}

Example

{ "field": "price", "type": "field", "value": 1500 }
Referenced in

HitPropositionLiftExplanation

Explain how a proposition's lift was calculated.

A hit's score is calculated by aggregating the scores of its propositions (features). The HitPropositionLiftExplanation explains how the lift of each proposition was calculated.

A HitPropositionLift can be:

  1. A similarity score

    A hit's field can contain a word that matches the stem of the given similarity condition. That word would have a HitPropositionLift that is a similarity score. Let's take a look at an example of a Similarity query:

    {
      "from": "products",
      "similarity": {
        "name": "Columbian coffee",
        "tags": "expansive coffee"
      },
      "select": ["$score", "name", "tags", "$why"]
    }
    

    When opening up the explanation with "$why" operator, we can see that a hit with name "Juhla Mokka coffee 500g sj" containing the word coffee has a HitPropositionLiftExplanation as follows:

    {
      "type": "hitPropositionLift",
      "proposition": "name:coffe",
      "value": 2.1726635013471625,
      "factors": [
        {
          "type": "exponent",
          "value": 2.1726635013471625,
          "base": {
            "type": "idf",
            "value": 2.1726635013471625
          },
          "power": {
            "type": "tf",
            "value": 1.0
          }
        }
      ]
    }
    
  2. An aggregated score of BaseLift and RelatedPropositionLift

Let's take a look at an example of Match query:

{
  "from": "impressions",
  "where": {
    "session.user": "larry"
  },
  "match": "product",
  "select": ["$score", "name", "$why"]
}

When opening up the explanation with "$why" operator, the first hit has a HitPropositionLiftExplanation as follows:

{
  "type": "hitPropositionLift",
  "proposition": { "id" : { "$has" : 6410405216120 } },
  "value": 599.5491890842981,
  "factors": [
    {
      "type": "baseLift",
      "value": 265.0
    },
    {
      "type": "relatedPropositionLift",
      "proposition": { "session.user": { "$has" : "larry" } },
      "value": 2.2624497701294266
    }
  ]
}

This explains that the initial lift of the feature "id:6410405216120" is 265 and that, when the user is larry, the relatedPropositionLift is 2.2624497701294266. Hence the aggregated lift is 265 * 2.2624497701294266 = 599.5491890842981.

Format
{
// The explanation type: hitPropositionLift. Required.
"type": string,
// The proposition. Required.
"proposition": PropositionExplanation,
// The aggregated lift value. Required.
"value": number,
// The factors contributing to the aggregated lift and their explanation. Required.
"factors": [ScoreExplanation, ...]
}

Example

{
  "factors": [
    { "type": "baseLift", "value": 31 }
  ],
  "proposition": { "field": 4 },
  "type": "hitPropositionLift",
  "value": 31
}
Referenced in

HitLinkPropositionLiftExplanation

Explain how a proposition's lift was calculated.

HitLinkPropositionLiftExplanation explains the impact of the value that links to the table containing the returned hits.

Let's consider an example in which an impression table has a numeric field 'product' that links to the product table. In such a case, the HitLinkPropositionLift would explain the significance of the field 'product' in the impression table. E.g., if the product link's value is 4, the HitLinkPropositionLiftExplanation will explain the effect of the proposition { "product" : 4 }. If the value is 2.0, it means that product 4 is estimated to be twice as probable based on the statistics of the linking column alone.

Format
{
// The explanation type: hitLinkPropositionLift. Required.
"type": string,
// The link proposition. Required.
"proposition": PropositionExplanation,
// The lift value of the linked proposition. Required.
"value": number
}

Example

{
  "proposition": { "product": 5 },
  "type": "hitLinkPropositionLift",
  "value": 2.32
}
Referenced in

InverseDocumentFrequencyExplanation

Explain the inverse document frequency score.

The inverse document frequency score is one component of the term frequency-inverse document frequency score which is used in Aito's similarity metrics.

Format
{
// The explanation type: idf. Required.
"type": string,
// The inverse document frequency score. Required.
"value": number
}

Example

{ "type": "idf", "value": 1.7551720221592049 }
Referenced in

NamedExplanation

Explain how a special named score was calculated.

The NamedExplanation currently appears only when calculating a score with exclusiveness. In this case, it explains the normalizer that forces the probabilities of the alternative features to sum to 1.0.

Format
{
// Normalizer. Required.
"type": string,
// Exclusiveness. Required.
"name": string,
// The value of the normalizer. Required.
"value": number
}

Example

{ "name": "exclusiveness", "type": "normalizer", "value": 0.2982788431762749 }
Referenced in

PredictExplanation

Explain how a probability was calculated.

The PredictExplanation most commonly appears when opening up the explanation (e.g: using the $why operator) of a Predict query. Let's take a look at an example of Predict query:

  {
    "from": "products",
    "where": {
      "name": "Columbian coffee"
    },
    "predict": "tags",
    "select": ["$p", "feature", "$why"],
    "limit": 22
  }

The first hit has the following explanation:

{
  "type": "product",
  "factors": [
    {
      "type": "baseP",
      "value": 0.16
    },
    {
      "type" : "product",
      "factors" : [ 
        {
          "type" : "normalizer",
          "name" : "exclusiveness",
          "value" : 1.0119918068684681
        }, 
        {
          "type" : "normalizer",
          "name" : "trueFalseExclusiveness",
          "value" : 1.09917613448721
        } 
      ]
    },
    {
      "type": "relatedVariableLift",
      "variable": "name:coffe",
      "value": 8.45603245079726
    }
  ]
}
Format
{
// The explanation type: product. Required.
"type": string,
// The explanation of the probability's factors. Required.
"factors": [ScoreExplanation, ...]
}

Example

{
  "factors": [
    { "type": "baseP", "value": 0.8048780487804879 },
    {
      "name": "exclusiveness",
      "type": "normalizer",
      "value": 0.04604801347746731
    }
  ],
  "type": "product"
}
Referenced in

ProductExplanation

This explanation object explains how a product score was calculated. It occurs in $why results, if $multiply operation is used.

The ProductExplanation most commonly appears when opening up the explanation (e.g: using the $why operator) of:

  1. Aggregated score by product. For example, in a Match query:
    {
      "from": "impressions",
      "where": {
        "session.user": "larry"
      },
      "match": "product",
      "select": ["$score", "name", "$why"]
    }
    
    The final score is a product of multiple score components:
    {
      "type": "product",
      "factors": [
        {
          "type": "hitPropositionLift",
          "proposition": { "id" : 6410405216120 },
          "value": 599.5491890842981,
          "factors": [
            {
              "type": "baseLift",
              "value": 265.0
            },
            {
              "type": "relatedPropositionLift",
              "proposition": { "session.user" : "larry" },
              "value": 2.2624497701294266
            }
          ]
        },
        ...
      ]
    }
    
  2. A score calculated using the $multiply operator.
Format
{
// The explanation type: product. Required.
"type": string,
// The explanation of the product score's factors. Required.
"factors": [ScoreExplanation, ...]
}

Example

{
  "factors": [
    {
      "factors": [
        { "type": "baseLift", "value": 31 }
      ],
      "proposition": { "id": 3 },
      "type": "hitPropositionLift",
      "value": 31
    },
    { "field": "price", "type": "field", "value": 1500 }
  ],
  "type": "product"
}
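
A product explanation can be multiplied out recursively. The Python sketch below assumes that every node other than "product" contributes its own "value" and that nested "product" nodes are multiplied out in the same way; the function name `evaluate` is made up for illustration.

```python
import math


def evaluate(explanation):
    """Multiply out a "product" explanation; other nodes contribute their "value"."""
    if explanation["type"] == "product":
        return math.prod(evaluate(factor) for factor in explanation["factors"])
    return explanation["value"]


# The example explanation shown above.
why = {
    "factors": [
        {
            "factors": [{"type": "baseLift", "value": 31}],
            "proposition": {"id": 3},
            "type": "hitPropositionLift",
            "value": 31,
        },
        {"field": "price", "type": "field", "value": 1500},
    ],
    "type": "product",
}

score = evaluate(why)
print(score)  # 31 * 1500
```

The nested "factors" of the hitPropositionLift node are not descended into, because that node already carries its aggregated "value".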
Referenced in

DivisionExplanation

This explanation object explains how a division result was calculated. It occurs in $why results, if $divide operation is used.

Format
{
// The explanation type: division. Required.
"type": string,
// The value to be divided. Required.
"dividend": ScoreExplanation,
// The value to divide by. Required.
"divisor": ScoreExplanation
}

Example

{
  "dividend": { "field": "return", "type": "field", "value": 400000 },
  "divisor": { "field": "investment", "type": "field", "value": 250000 },
  "type": "division"
}
Referenced in

Explain how a related variable's lift was calculated.

A related variable (feature) most commonly appears when doing inference with some conditions. The RelatedVariableLiftExplanation explains how a variable from the conditions affects the lift of a hit's variable.

Let's take a look at an example of Match query:

{
  "from": "impressions",
  "where": {
    "session.user": "larry"
  },
  "match": "product",
  "select": ["$score", "name", "$why"]
}

When opening up the explanation with "$why" operator, the first hit has an explanation as follows:

{
  "type": "hitVariableLift",
  "variable": "id:6410405216120",
  "value": 599.5491890842981,
  "factors": [
    {
      "type": "baseLift",
      "value": 265.0
    },
    {
      "type": "relatedVariableLift",
      "variable": "session.user:larry",
      "value": 2.2624497701294266
    }
  ]
}

This explains that the feature "session.user:larry", extracted from the condition "where": { "session.user": "larry" }, increases the likelihood of the product with id 6410405216120 by a lift of 2.2624497701294266.
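
In the explanation above, the hit's lift equals the base lift multiplied by the related variable's lift. The Python sketch below checks this for the quoted numbers. The helper name factors_match_value is hypothetical, and it is an assumption (not a documented guarantee) that every lift explanation is the plain product of its factors.

```python
import math

# The explanation of the first hit, copied from the example above.
hit_explanation = {
    "type": "hitVariableLift",
    "variable": "id:6410405216120",
    "value": 599.5491890842981,
    "factors": [
        {"type": "baseLift", "value": 265.0},
        {
            "type": "relatedVariableLift",
            "variable": "session.user:larry",
            "value": 2.2624497701294266,
        },
    ],
}


def factors_match_value(expl, rel_tol=1e-9):
    """Return True if the factor values multiply to the explained value."""
    product = math.prod(factor["value"] for factor in expl["factors"])
    return math.isclose(product, expl["value"], rel_tol=rel_tol)


print(factors_match_value(hit_explanation))  # True
```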

Format
{
// The explanation type: relatedPropositionLift. Required.
"type": string,
// The related proposition. Required.
"proposition": PropositionExplanation,
// The lift value of the related proposition. Required.
"value": number
}

Examples coming later.

Referenced in

ScoreExplanation

SumExplanation

Explain how a summation was calculated.

The SumExplanation most commonly appears when opening up the explanation (e.g: using the $why operator) of a score calculated using the $sum operator.

Format
{
// The explanation type: sum. Required.
"type": string,
// The explanation of the summed score's terms. Required.
"terms": [ScoreExplanation, ...]
}

Example

{
  "terms": [
    { "field": "id", "type": "field", "value": 4 },
    { "field": "price", "type": "field", "value": 1500 }
  ],
  "type": "sum"
}
Referenced in

SubtractionExplanation

This explanation object explains how a subtraction result was calculated. It occurs in $why results if the $subtract operation is used.

Format
{
// The explanation type: subtraction. Required.
"type": string,
// The value to subtract from. Required.
"minuend": ScoreExplanation,
// The value to subtract. Required.
"subtrahend": ScoreExplanation
}

Example

{
  "minuend": { "field": "price", "type": "field", "value": 119.5 },
  "subtrahend": { "field": "cost", "type": "field", "value": 100.5 },
  "type": "subtraction"
}
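
The sum, subtraction and division explanations are small expression trees. As a minimal sketch, the Python function below evaluates the three examples given for DivisionExplanation, SumExplanation and SubtractionExplanation. The function name evaluate is hypothetical, and the sketch assumes that any other explanation type (such as "field") is a leaf that already carries its "value".

```python
def evaluate(expl):
    """Recursively evaluate an arithmetic explanation (illustrative only)."""
    kind = expl["type"]
    if kind == "sum":
        return sum(evaluate(term) for term in expl["terms"])
    if kind == "subtraction":
        return evaluate(expl["minuend"]) - evaluate(expl["subtrahend"])
    if kind == "division":
        return evaluate(expl["dividend"]) / evaluate(expl["divisor"])
    # Assumption: every other type is a leaf with a precomputed value.
    return expl["value"]


sum_example = {
    "type": "sum",
    "terms": [
        {"field": "id", "type": "field", "value": 4},
        {"field": "price", "type": "field", "value": 1500},
    ],
}
subtraction_example = {
    "type": "subtraction",
    "minuend": {"field": "price", "type": "field", "value": 119.5},
    "subtrahend": {"field": "cost", "type": "field", "value": 100.5},
}
division_example = {
    "type": "division",
    "dividend": {"field": "return", "type": "field", "value": 400000},
    "divisor": {"field": "investment", "type": "field", "value": 250000},
}

print(evaluate(sum_example))          # 1504
print(evaluate(subtraction_example))  # 19.0
print(evaluate(division_example))     # 1.6
```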
Referenced in

TermFrequencyExplanation

Explain the term frequency score.

The term frequency score is one component of the term frequency-inverse document frequency score which is used in Aito's similarity metrics.

Format
{
// The explanation type: tf. Required.
"type": string,
// The term frequency score. Required.
"value": number
}

Example

{ "type": "tf", "value": 1 }
Referenced in

DefaultValueExplanation

Default value explanation describes the default score for some operation.

For example, TF-IDF scoring assigns a default lift of 1.0 to all rows without matching terms.

Format
{
// The explanation type: default. Required.
"type": string,
// The default value. Required.
"value": number
}

Example

{ "type": "default", "value": 1 }
Referenced in

ConstantExplanation

Constant explanation describes a constant value, typically given by the user.

Format
{
// The explanation type. Required.
"type": string,
"value": number
}

Examples coming later.

Referenced in

Explanation proposition objects

Proposition objects that appear in explanations when using the "$why" or "relate" operator.

FieldPropositionExplanation

FieldPropositionExplanation expresses a statement about a document field.

For example the expression

{
  "tags": { "$has" : "laptop" }
}

states that the tags field contains the "laptop" feature.

Format
object

Examples coming later.

HasExplanation

This format is used in the $why and relate explanations for the '$has'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
"$has": integer or number or boolean or null or string
}

Examples coming later.
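
Because explanation propositions follow the normal proposition syntax, a proposition taken from an explanation can be placed directly into a where-clause. The Python sketch below shows the idea with the "tags" proposition from the FieldPropositionExplanation section. The helper name query_from_proposition is hypothetical, and using the "products" table is an assumption made for illustration.

```python
import json


def query_from_proposition(table, proposition):
    """Wrap an explanation proposition in a query's where-clause."""
    return {"from": table, "where": proposition}


# A proposition as it might appear inside an explanation.
proposition = {"tags": {"$has": "laptop"}}

# "products" is an assumed table name used only for this illustration.
query = query_from_proposition("products", proposition)
print(json.dumps(query, indent=2))
```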

AndExplanation

This format is used in the $why and relate explanations for the '$and'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
"$and": [PropositionExplanation, ...]
}

Examples coming later.

OrExplanation

This format is used in the $why and relate explanations for the '$or'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
"$or": [PropositionExplanation, ...]
}

Examples coming later.

OnExplanation

This format is used in the $why and relate explanations for the '$on'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
"$on": [PropositionExplanation, ...]
}

Examples coming later.

NotExplanation

This format is used in the $why and relate explanations for the '$not'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
// This is the format for all propositions used in the $why and relate explanations.
}

Examples coming later.

StartsWithExplanation

This format is used in the $why and relate explanations for the '$startsWith'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
"$startsWith": string
}

Examples coming later.

GtExplanation

This format is used in the $why and relate explanations for the '$gt'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
"$gt": integer or number or boolean or null or string
}

Examples coming later.

GteExplanation

This format is used in the $why and relate explanations for the '$gte'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
"$gte": integer or number or boolean or null or string
}

Examples coming later.

LtExplanation

This format is used in the $why and relate explanations for the '$lt'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
"$lt": integer or number or boolean or null or string
}

Examples coming later.

LteExplanation

This format is used in the $why and relate explanations for the '$lte'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
"$lte": integer or number or boolean or null or string
}

Examples coming later.

DefinedExplanation

This format is used in the $why and relate explanations for the '$defined'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
"$defined": boolean
}

Examples coming later.

NumericExplanation

This format is used in the $why and relate explanations for the '$numeric'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
"$numeric": number
}

Examples coming later.

KnnExplanation

This format is used in the $why and relate explanations for the '$knn'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
{
"$knn": {
"k": integer,
"near": [PropositionExplanation, ...]
}
}

Examples coming later.

IsPropositionExplanation

This format is used in the $why and relate explanations for the is-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.

Format
integer or number or boolean or null or string

Examples coming later.

PropositionExplanation

Other types

All other API types.

ColumnName

Name of a column in a table. Links are supported.

Format
object

Examples

"id"
"age"
"product.id"
Referenced in

EvaluateGenericQuery

A Generic query to be evaluated in the Evaluate query

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Get expression defines what items are returned as query results.
"get": Get,
// Declares the sorting order of the result by a field or by a user-defined score.
"orderBy": OrderBy,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
// The number of results to skip from the beginning. Default: 0
"offset": integer,
// The maximum number of results to retrieve. Default: 10
"limit": integer
}

Example

{ "from": "impressions", "get": "product", "limit": 20, "offset": 10 }
Referenced in

EvaluateGroupedOperation

Supported query to be evaluated in EvaluateGroupedQuery. Currently, only the Generic query and the Recommend query are supported.

Format

Examples

{
  "from": "impressions",
  "goal": { "purchase": "true" },
  "recommend": "product",
  "where": {
    "product.name": { "$get": "query" },
    "session.user": { "$get": "session.user" }
  }
}
{
  "from": "impressions",
  "get": "product",
  "orderBy": {
    "$p": { "purchase": true }
  },
  "where": {
    "product.name": { "$get": "query" },
    "session.user": { "$get": "session.user" }
  }
}

EvaluateGroupedQuery

The EvaluateGroupedQuery is similar to the EvaluateQuery, with an additional option to group multiple entries into a single test case.

For example, if there exists a "customerCohort" identifier in the "impressions" table, we can evaluate by the customerCohort instead of the individual customer with the following EvaluateGroupedQuery:

{
  "evaluate": {
    "from": "impressions",
    "where": {
      "customer": { "$get": "customer" }
    },
    "recommend": "product",
    "goal": { "purchase": true }
  },
  "group": "customerCohort",
  "test": {
    "customerCohort": { "$gte": 5 }
  },
  "select": ["trainSamples", "testSamples", "meanRank"]
}
Format
{
// Proposition expression describes a fact, or a statement.
"train": Proposition,
// Proposition expression describes a fact, or a statement.
"test": Proposition,
// TestSource enables more options to choose the testing data in the Evaluate Query.
"testSource": TestSource,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
"group": object,
// Proposition expression describes a fact, or a statement.
"goal": Proposition,
// Supported query to be evaluated in EvaluateGroupedQuery. Required.
"evaluate": EvaluateGroupedOperation
}

Example

{
  "evaluate": {
    "from": "impressions",
    "goal": { "purchase": "true" },
    "recommend": "product",
    "where": {
      "product.name": { "$get": "query" },
      "session.user": { "$get": "session.user" }
    }
  },
  "group": "userGroup",
  "select": ["accuracy", "meanRank", "n"],
  "test": {
    "userGroup": { "$gte": 5 }
  }
}
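
As a hedged sketch of how such a query could be submitted, the Python snippet below builds (but does not send) an HTTP request using only the standard library. The AITO_INSTANCE_URL and AITO_API_KEY variables and the x-api-key header come from the introduction of this reference. The path /api/v1/_evaluate and the helper name build_evaluate_request are assumptions for illustration and should be checked against the Evaluate endpoint documentation.

```python
import json
import os
import urllib.request

# Environment variables recommended in the client configuration section;
# the fallback values are placeholders, not real credentials.
instance_url = os.environ.get("AITO_INSTANCE_URL", "https://your-aito-instance-url")
api_key = os.environ.get("AITO_API_KEY", "your-api-key")

# The EvaluateGroupedQuery example from above.
grouped_query = {
    "evaluate": {
        "from": "impressions",
        "goal": {"purchase": "true"},
        "recommend": "product",
        "where": {
            "product.name": {"$get": "query"},
            "session.user": {"$get": "session.user"},
        },
    },
    "group": "userGroup",
    "select": ["accuracy", "meanRank", "n"],
    "test": {"userGroup": {"$gte": 5}},
}


def build_evaluate_request(base_url, key, query):
    """Build a POST request for an evaluate query without sending it."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/_evaluate",  # assumed path
        data=json.dumps(query).encode("utf-8"),
        headers={"x-api-key": key, "content-type": "application/json"},
        method="POST",
    )


request = build_evaluate_request(instance_url, api_key, grouped_query)
print(request.get_method(), request.full_url)
```

Sending is left out on purpose. Passing the request to urllib.request.urlopen would perform the call against a real instance, subject to the 29 second query timeout mentioned in the limitations.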
Referenced in

EvaluateMatch

A Match query to be evaluated in the Evaluate query

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
// Get expression defines what items are returned as query results. Required.
"match": Get,
// PropositionSet expression is used to describe a collection of propositions.
"basedOn": PropositionSet,
// The number of results to skip from the beginning. Default: 0
"offset": integer,
// The maximum number of results to retrieve. Default: 10
"limit": integer
}

Example

{
  "from": "impressions",
  "limit": 2,
  "match": "prevProduct",
  "offset": 2,
  "select": ["title", "description", "price"],
  "where": { "customer": 4, "query": "laptop" }
}
Referenced in

EvaluateMultiGenericQuery

The Generic query to be evaluated in an EvaluateGroupedQuery

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Get expression defines what items are returned as query results. Required.
"get": Get,
// Declares the sorting order of the result by a field or by a user-defined score.
"orderBy": OrderBy,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
"offset": integer,
"limit": integer
}

Example

{
  "from": "impressions",
  "get": "product",
  "orderBy": {
    "$p": { "purchase": true }
  },
  "where": {
    "product.name": { "$get": "query" },
    "session.user": { "$get": "session.user" }
  }
}

EvaluateOperation

Examples

{
  "from": "messages",
  "get": "product",
  "similarity": {
    "description": { "$get": "message" },
    "title": { "$get": "message" }
  }
}
{
  "from": "products",
  "get": "product",
  "orderBy": "$p",
  "where": {
    "name": { "$get": "name" }
  }
}
{
  "from": "products",
  "predict": "category",
  "where": {
    "name": { "$get": "name" }
  }
}
{
  "from": "messages",
  "match": "product",
  "where": {
    "message": { "$get": "message" }
  }
}
Referenced in

EvaluatePredict

A Predict query to be evaluated in the Evaluate query

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// PropositionSet expression is used to describe a collection of propositions. Required.
"predict": PropositionSet,
"exclusiveness": boolean,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
"offset": integer,
"limit": integer
}

Example

{
  "from": "products",
  "predict": "category",
  "where": {
    "name": { "$get": "name" }
  }
}
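
The { "$get": "name" } placeholder in the example refers to a property of the row currently being tested (see GetValueExpression). The Python sketch below illustrates that substitution on a plain dict. It is only a conceptual model, not how Aito resolves $get internally, and the helper name resolve_get and the test row values are made up for illustration.

```python
def resolve_get(expression, row):
    """Replace every {"$get": name} placeholder with row[name]."""
    if isinstance(expression, dict):
        if set(expression) == {"$get"}:
            return row[expression["$get"]]
        return {key: resolve_get(value, row) for key, value in expression.items()}
    if isinstance(expression, list):
        return [resolve_get(item, row) for item in expression]
    return expression


# The EvaluatePredict example from above.
evaluated = {
    "from": "products",
    "predict": "category",
    "where": {"name": {"$get": "name"}},
}

# A made-up test row; the field values are illustrative only.
test_row = {"name": "espresso machine", "category": "kitchen"}

print(resolve_get(evaluated, test_row))
```

With the row above, the where-clause becomes { "name": "espresso machine" }, which is the concrete Predict query tested for that row in this model.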
Referenced in

EvaluateQuery

A query to evaluate:

Format
{
// Proposition expression describes a fact, or a statement.
"train": Proposition,
// Proposition expression describes a fact, or a statement.
"test": Proposition,
// TestSource enables more options to choose the testing data in the Evaluate Query.
"testSource": TestSource,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
// Operation to be evaluated. Required.
"evaluate": EvaluateOperation
}

Example

{
  "evaluate": {
    "from": "products",
    "predict": "category",
    "where": {
      "name": { "$get": "name" }
    }
  },
  "select": ["accuracy", "meanRank", "n"],
  "test": {
    "$index": {
      "$mod": [10, 1]
    }
  }
}
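
The "test" proposition in the example selects test rows by their index. As a sketch of the idea, the Python snippet below assumes that "$mod": [10, 1] means "row index modulo 10 equals 1", so roughly every tenth row is picked. Both this reading of $mod and the helper name split_by_index are assumptions made for illustration.

```python
def split_by_index(rows, divisor, remainder):
    """Split rows into (remaining, test) by row index modulo divisor."""
    remaining, test = [], []
    for index, row in enumerate(rows):
        if index % divisor == remainder:
            test.append(row)
        else:
            remaining.append(row)
    return remaining, test


# 25 made-up rows, named after their index.
rows = [f"row-{i}" for i in range(25)]
remaining, test = split_by_index(rows, 10, 1)

print(test)            # ['row-1', 'row-11', 'row-21']
print(len(remaining))  # 22
```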
Referenced in

EvaluateRecommend

A Recommend query to be evaluated in the Evaluate query

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Get expression defines what items are returned as query results. Required.
"recommend": Get,
// Proposition expression describes a fact, or a statement. Required.
"goal": Proposition,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
"offset": integer,
"limit": integer
}

Example

{
  "from": "impressions",
  "goal": { "purchase": "true" },
  "recommend": "product",
  "where": {
    "product.name": { "$get": "query" },
    "session.user": { "$get": "session.user" }
  }
}

EvaluateSimilarity

A Similarity query to be evaluated in the Evaluate query

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement.
"where": Proposition,
// Get expression defines what items are returned as query results. Required.
"get": Get,
// Describes the fields and/or built-in attributes to return.
"select": Selection,
"offset": integer,
"limit": integer
}

Example

{
  "from": "messages",
  "get": "product",
  "similarity": {
    "description": { "$get": "message" },
    "title": { "$get": "message" }
  }
}
Referenced in

ExponentPropositionArray

Define the base and the exponent of the $pow operator in the array format.

The first item of the array is the base and the second item of the array is the exponent.

Format

Example

["width", 2]
Referenced in

ExponentPropositionObject

Define the base and the exponent of the $pow operator in the object format.

Format
{
// Score expression resolves to a numeric score value or probability. Required.
"base": Score,
// Score expression resolves to a numeric score value or probability. Required.
"exponent": Score
}

Example

{ "base": "width", "exponent": 2 }
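
The array and object formats carry the same two pieces of information. The Python sketch below converts between them, using the documented rule that the first array item is the base and the second is the exponent. The helper names are hypothetical.

```python
def pow_array_to_object(pair):
    """Convert [base, exponent] into {"base": ..., "exponent": ...}."""
    base, exponent = pair  # raises ValueError unless there are exactly two items
    return {"base": base, "exponent": exponent}


def pow_object_to_array(obj):
    """Convert {"base": ..., "exponent": ...} into [base, exponent]."""
    return [obj["base"], obj["exponent"]]


print(pow_array_to_object(["width", 2]))
print(pow_object_to_array({"base": "width", "exponent": 2}))
```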
Referenced in

EmptyDocument

The empty response object may contain more information in the future.

Format
{}

Examples coming later.

FieldProposition

FieldProposition expresses statements about a field in a table.

For example, the following expression

"price": {"$lt": 500 }

describes a statement that price is under 500.

Format

Examples coming later.

Referenced in

From

From expression declares the examined table.

Format

Examples

{
  "from": "impressions",
  "where": { "click": true }
}
"impressions"
{
  "from": {
    "from": "impressions",
    "where": { "click": true }
  },
  "orderBy": "$p",
  "where": { "query": "laptop" }
}
{ "from": "impressions" }

FromTablemodify

From expression declares the used table

Format
string

Examples

"impressions"
"products"
"customers"
"messages"

FromTablequery

From expression declares the examined table.

Format
string

Examples

"impressions"
"products"
"customers"
"messages"
Referenced in

FromWhere

FromWhere expression allows you to narrow the examined table.

When FromWhere is used, Aito only considers that narrowed slice of the table.

For instance, this query:

{
  "from": {
    "from": "impressions",
    "where": {
      "session.user": "larry"
    }
  },
  "match": "product"
}

is different from:

{
  "from": "impressions",
  "where": {
    "session.user": "larry"
  },
  "match": "product"
}

In the first query, Aito matches Larry with products based only on Larry's impressions data, while in the second query, Aito matches Larry with products based on the impressions data of Larry and other users.

Format
{
// From expression declares the examined table. Required.
"from": From,
// Proposition expression describes a fact, or a statement. Required.
"where": Proposition,
"limit": integer
}

Example

{
  "from": "impressions",
  "where": { "click": true }
}
Referenced in

Get

Get expression defines what items are returned as query results.

By default, the hits are from the table defined in the "from" clause. In some cases, you may want to declare propositions like 'query is laptop' in the impressions table, while returning results from the separate products table, based on click likelihood. In this case, you may have a query such as:

{
  "from": "impressions",
  "where": { "query": "laptop" },
  "get": "product",
  "orderBy": {
    "$p": {
      "$context": { "click": true }
    }
  }
}

The "get" expression takes a field name as a parameter. If the field is a link, the returned results are from the linked table. If the field is not a link, the field values are returned as results.

Normally, the result of a query consists of the field values that best fulfill the query conditions. Field analyzers extract features from text fields and the $feature property can be used to return features instead of complete field values. For instance, the following example demonstrates how to discover product tags which are likely to lead to sales

{
  "from": "impressions",
  "where": { "query": "cheap phone" },
  "get": "product.tags.$feature",
  "orderBy": {
    "$p": {
      "$context": { "click": true }
    }
  }
}

The $feature syntax also allows you to examine the values/features of a link field as if it were a regular field.

Format
string

Examples

"product"
"user"
"text.$feature"
"link.field"
"link.$feature"
"link.text.$feature"
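
To make the structure of these strings concrete, the Python sketch below splits a Get expression into its dotted field path and a flag that tells whether the trailing $feature property is present. The helper name parse_get is hypothetical, and the sketch only inspects the string. It does not know which fields are links.

```python
def parse_get(expression):
    """Split a Get expression into (field path, wants features)."""
    parts = expression.split(".")
    wants_features = parts[-1] == "$feature"
    if wants_features:
        parts = parts[:-1]
    return parts, wants_features


for example in ["product", "text.$feature", "link.field", "link.text.$feature"]:
    print(example, "->", parse_get(example))
```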

GetValueExpression

$get is used to access external variables in the evaluate query.

$get is currently only used in the Evaluate queries. The Evaluate query tests a specified query by examining the table rows one-by-one. $get allows accessing the tested row's properties.

Consider the following example.

Given a table containing products data with the following schema:

"products": {
  "type": "table",
  "columns": {
    "title": { "type": "Text", "analyzer": "English" },
    "description": { "type": "Text", "analyzer": "English" }
  }
}

and a table containing impressions data with the following schema:

"impressions": {
  "type": "table",
  "columns": {
    "customer": { "type": "Int", "link": "customers.id" },
    "product": { "type": "Int", "link": "products.id" },
    "query": { "type": "Text", "analyzer":