MindsDB empowers organizations to harness the power of AI by abstracting AI models as Generative AI Tables. These tables are capable of learning from the input data and generating predictions from the underlying model upon being queried. This abstraction makes AI highly accessible, enabling development teams to use their existing SQL skills to build applications powered by AI.

What are Generative AI Tables?

Generative AI is a subfield of artificial intelligence that trains AI models to create new content, such as realistic text, forecasts, images, and more, by learning patterns from existing data.

MindsDB revolutionizes machine learning within enterprise databases by introducing the concept of Generative AI tables. These essentially abstract AI models as virtual AI tables, capable of producing output when given certain input.

Working with Generative AI Tables

AI tables, introduced by MindsDB, abstract AI models as virtual tables so you can simply query AI models for predictions.

With MinsdDB, you can join multiple AI tables (that abstract AI models) with multiple data tables (that provide input to the models) to get all predictions at once.

Let’s look at some examples.

Deploying AI Models as AI Tables

You can deploy an AI model as a virtual AI table using the CREATE MODEL statement.

Here we create a model that classifies sentiment of customer reviews as instructed in the prompt template message. The required input is the review and output is the sentiment predicted by the model.

CREATE MODEL sentiment_classifier_model
PREDICT sentiment
USING
    engine = 'openai_engine',
    model_name = 'gpt-4',
    prompt_template = 'describe the sentiment of the reviews
                        strictly as "positive", "neutral", or "negative".
                        "I love the product":positive
                        "It is a scam":negative
                        "{{review}}.":';

Next we create a model that generates responses to the reviews. The required input includes review, product name, and sold product quantity, and output is the response generated by the model.

CREATE MODEL response_generator_model
PREDICT response
USING
    engine = 'openai_engine',
    model_name = 'gpt-4',
    prompt_template = 'briefly respond to the customer review: {{review}}, added by a customer after buying {{product_name}} in quantity {{quantity}}';

Follow this doc page to configure the OpenAI engine in MindsDB.

Now let’s look at the data tables that we’ll use to provide input data to the AI tables.

Preparing Input Data

The amazon_reviews table stores the following columns:

+----------------------------+-----------------------------+------------------------+-------------+
| created_at                 | product_name                | review                 | customer_id |
+----------------------------+-----------------------------+------------------------+-------------+
| 2023-10-03 16:30:00.000000 | Power Adapter               | It is a great product. | 1           |
| 2023-10-03 16:31:00.000000 | Bluetooth and Wi-Fi Speaker | It is ok.              | 2           |
| 2023-10-03 16:32:00.000000 | Kindle eReader              | It doesn’t work.       | 3           |
+----------------------------+-----------------------------+------------------------+-------------+

It provides sufficient input data for the sentiment_classifier_model, but not for the response_generator_model.

The products_sold table stores the following columns:

+----------------------------+-----------------------------+-------------+----------+
| sale_date                  | product_name                | customer_id | quantity |
+----------------------------+-----------------------------+-------------+----------+
| 2023-10-03 16:30:00.000000 | Power Adapter               | 1           | 20       |
| 2023-10-03 16:31:00.000000 | Bluetooth and Wi-Fi Speaker | 2           | 5        |
| 2023-10-03 16:32:00.000000 | Kindle eReader              | 3           | 10       |
+----------------------------+-----------------------------+-------------+----------+

The reponse_generator_model requires the two tables to be joined to provide it with sufficient input data.

Making Predictions

You can query the AI tables directly or join AI tables with data tables to get the predictions.

There are two ways you can provide input to the models:

  1. If you query the AI table directly, you can provide input data in the WHERE clause, like this:

    SELECT review, sentiment
    FROM sentiment_classifier_model
    WHERE review = 'I like it';
    
  2. You can provide input data to AI tables from the joined data tables, like this:

    SELECT inp.product_name,
          inp.review,
          m1.sentiment,
          m2.response
    FROM data_integration_conn.amazon_reviews2 AS inp
    JOIN data_integration_conn.products_sold AS inp2
    ON inp.customer_id = inp2.customer_id
    JOIN sentiment_classifier_model AS m1
    JOIN response_generator_model AS m2;
    

    The sentiment_classifier_model requires a parameter named review, so the data table should contain a column named review, which is picked up by the model.

    Note that, when joining data tables, you must provide the ON clause condition, which is implemented implicitly when joining the AI tables.

Moreover, you can combine these two options and provide the input data to the AI tables partially from the data tables and partially from the WHERE clause, like this:

SELECT inp.product_name,
       inp.review,
       m1.sentiment,
       m2.response
FROM data_integration_conn.amazon_reviews2 AS inp
JOIN sentiment_classifier_model AS m1
JOIN response_generator_model AS m2
WHERE m2.quantity = 5;

Here the sentiment_classifier_model takes input data from the amazon_review table, while the response_generator_model takes input data from the amazon_reviews table and from the WHERE clause.

Furthermore, you can make use of subqueries to provide input data to the models via the WHERE clause, like this:

SELECT inp.product_name,
       inp.review,
       m1.sentiment,
       m2.response
FROM data_integration_conn.amazon_reviews2 AS inp
JOIN sentiment_classifier_model AS m1
JOIN response_generator_model AS m2
WHERE m2.quantity = (SELECT quantity
                     FROM data_integration_conn.products_sold
                     WHERE customer_id = 2);

Difference between AI Tables and Standard Tables

To understand the difference, let’s go over a simpler example. Here we will see how traditional database tables are designed to give you a deterministic response given some input, and how Generative AI Tables are designed to generate an approximate response given some input.

Let’s consider the following income_table table that stores the income and debt values.

SELECT income, debt
FROM income_table;

On execution, we get:

+------+-----+
|income|debt |
+------+-----+
|60000 |20000|
|80000 |25100|
|100000|30040|
|120000|36010|
+------+-----+

A simple visualization of the data present in the income_table table is as follows:

Querying the income table to get the debt value for a particular income value results in the following:

SELECT income, debt
FROM income_table
WHERE income = 80000;

On execution, we get:

+------+-----+
|income|debt |
+------+-----+
|80000 |25100|
+------+-----+

And here is what we get:

But what happens when querying the table for an income value that is not present there?

SELECT income, debt
FROM income_table
WHERE income = 90000;

On execution, we get:

Empty set (0.00 sec)

When the WHERE clause condition is not fulfilled for any of the rows, no value is returned.

When a table doesn’t have an exact match, the query returns an empty set or null value. This is where the AI Tables come into play!

Let’s create a debt_model model that allows us to approximate the debt value for any income value. We train the debt_model model using the data from the income_table table.

CREATE MODEL mindsdb.debt_model
FROM income_table
PREDICT debt;

On execution, we get:

Query OK, 0 rows affected (x.xxx sec)

MindsDB provides the CREATE MODEL statement. On execution of this statement, the predictive model works in the background, automatically creating a vector representation of the data that can be visualized as follows:

Let’s now look for the debt value of some random income value. To get the approximated debt value, we query the mindsdb.debt_model model instead of the income_table table.

SELECT income, debt
FROM mindsdb.debt_model
WHERE income = 90000;

On execution, we get:

+------+-----+
|income|debt |
+------+-----+
|90000 |27820|
+------+-----+

And here is how it looks: