Introduction
With MindsDB, you can train machine learning models and predict outcomes using simple SQL statements. It acts as an AI layer on top of your existing tables.
To fit into your existing data infrastructure, MindsDB integrates with a wide range of databases and machine learning frameworks. It is available in two ways: as a local deployment (using Docker or pip) and as MindsDB Cloud, both of which have a free tier for all users.
In this tutorial, we will predict logistics quality from its feature set using MindsDB Cloud.
Importing Data to MindsDB Cloud
In this tutorial, we will use this dataset from Kaggle to train our Predictor model. You can also get free datasets from Kaggle, Datahub, and Google Dataset Search.
It's time to get this dataset added to MindsDB Cloud.
Step 1: First, sign in to the MindsDB Cloud console, or register for a new account.
Step 2: A MindsDB Cloud Editor appears once you sign in. The Query Editor is at the top, where you can write queries and then press the Run (Shift+Enter) button above it to execute those queries. At the bottom, you can see the results of your queries, and finally, you have a Learning Hub on your right, which serves as a learning aid.
Step 3: Find the Add Data button in the top right corner and click it. Then switch the tab at the top from Databases to Files and click the Import File button.
Step 4: Click the Import File section, then browse to and select the dataset file you downloaded from the link above. Provide a name for the table in the Table name field (this tutorial uses logistics), and click Save and continue to proceed.
Step 5: As soon as the dataset is imported successfully, we will be redirected to the MindsDB Cloud Editor. We will now see two basic SQL queries listed in the Query Editor. Let's run them one by one and see what happens.
The first query should show a list of available tables. Check to see if there is a table in the list that matches the name we provided when importing the dataset.
SHOW TABLES FROM files;
The second query lets you check whether we have the right data records present inside the table that we just imported.
SELECT * FROM files.logistics LIMIT 10;
We are now ready with the data table and can now proceed to the next section.
Training a Predictor Model
You can train a Predictor Model with MindsDB as easily as writing a SQL query and executing it. Here are the steps we can follow for creating and training the model.
Step 1: MindsDB offers the CREATE PREDICTOR statement. Its syntax is as follows; the trailing comments mark the placeholders to replace.
CREATE PREDICTOR mindsdb.predictor_name        -- your predictor name
FROM database_name                             -- your database name
(SELECT columns FROM table_name LIMIT 10000)   -- your columns and table name
PREDICT target_parameter;                      -- your target column
A real query, with actual names in place of the placeholders, looks like the one below. Because the dataset was uploaded as a file, the database is files and the table is logistics.
CREATE PREDICTOR mindsdb.logisticService
FROM files
(SELECT * FROM logistics LIMIT 10000)
PREDICT QualityofLogisticsServices;
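If you would rather not train on every column, the same statement accepts an explicit column list. The following is only a sketch: it assumes the dataset was imported as files.logistics (as in the SELECT query earlier), the predictor name logisticService_subset is hypothetical, and the column names are the ones that appear in this tutorial's prediction query.

```sql
-- Sketch: train on an explicit column list instead of SELECT *.
-- logisticService_subset is a hypothetical predictor name.
-- Assumes the uploaded table is files.logistics.
CREATE PREDICTOR mindsdb.logisticService_subset
FROM files
(SELECT Stateslogisticsenablinginitiatives,
        Assessmentofvariableoflogisticsease,
        QualityofLogisticsServices
 FROM logistics
 LIMIT 10000)
PREDICT QualityofLogisticsServices;
```

A model trained on fewer columns may be less accurate, so treat this as an experiment rather than a replacement for the full model.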
Step 2: Training may take some time, depending on the size of the training data.
While we wait, we can check the status of the model with the command below. If the query returns complete, the model is ready to make predictions. If it returns generating or training, wait until the status changes to complete.
SELECT status
FROM mindsdb.predictors
WHERE name='Name_of_the_Predictor_Model';
The real query will be something like this.
SELECT status
FROM mindsdb.predictors
WHERE name='logisticService';
Once the status is complete, we can start predicting logistics quality.
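The status check can also return a little more context in one go. This sketch assumes your MindsDB version exposes the accuracy and error columns in mindsdb.predictors. If it does not, drop them from the SELECT list.

```sql
-- Sketch: fetch the training status together with accuracy and any error.
-- Assumes mindsdb.predictors has accuracy and error columns in your version.
SELECT name, status, accuracy, error
FROM mindsdb.predictors
WHERE name='logisticService';
```

If training fails, the error column is the first place to look before retraining.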
Describing the Predictor Model
Before making predictions, it helps to understand the details of our Predictor model.
In this section, we will inspect the model in three different ways using the DESCRIBE statement.
By Features
By Model
By Model Ensemble
By Features
This query returns the role of each column in the model, along with the encoder used on each column during training.
DESCRIBE mindsdb.predictor_name.features;
By Model
This query lists all the candidate models that were evaluated during training. The candidate with 1 in its selected column is the one used by the Predictor, and it should have the best performance value.
DESCRIBE mindsdb.predictor_name.model;
By Model Ensemble
This query returns JSON output listing the parameters that were used to determine the best candidate model for the Predictor.
DESCRIBE mindsdb.predictor_name.ensemble;
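For the model trained in this tutorial, the three statements would look roughly like this, assuming the predictor is named logisticService as in the status query above. Run them one at a time in the Query Editor.

```sql
-- Sketch: the three DESCRIBE variants for this tutorial's predictor.
-- Assumes the predictor name logisticService.
DESCRIBE mindsdb.logisticService.features;
DESCRIBE mindsdb.logisticService.model;
DESCRIBE mindsdb.logisticService.ensemble;
```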
Now it's time to move on to the interesting part: predicting the logistics quality values.
Querying the Model
In MindsDB, you can predict target values using only the SELECT statement. Here, we use a SELECT query to ask the model for the predicted quality of logistics.
Note that the quality of logistics is determined by a combination of several feature values.
We can still predict logistics quality from only a subset of the features, but accuracy may degrade when some values are left out.
SELECT target_value_name, target_value_confidence, target_value_explain
FROM mindsdb.predictor_name
WHERE feature1=value1 AND feature2=value2;
Our real query passes the feature values like this.
SELECT QualityofLogisticsServices
FROM mindsdb.logisticService
WHERE Stateslogisticsenablinginitiatives = 3 AND Assessmentofvariableoflogisticsease = 2.67;
Here, the predicted logistics quality (QualityofLogisticsServices) is 2.12.
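To see how sure the model is about a prediction, you can ask for the confidence and explanation columns as well. This sketch assumes the predictor is the logisticService model trained earlier, and that your MindsDB version names these columns after the target with _confidence and _explain suffixes.

```sql
-- Sketch: return the prediction with its confidence and explanation.
-- Assumes the predictor mindsdb.logisticService and the
-- <target>_confidence / <target>_explain column naming.
SELECT QualityofLogisticsServices,
       QualityofLogisticsServices_confidence,
       QualityofLogisticsServices_explain
FROM mindsdb.logisticService
WHERE Stateslogisticsenablinginitiatives = 3
  AND Assessmentofvariableoflogisticsease = 2.67;
```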
That's it! We have now successfully predicted the logistics quality using the Predictor Model.
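The query above predicts one value for one set of inputs. To get predictions for many rows at once, MindsDB lets you JOIN a data table with the predictor. A minimal sketch, assuming the uploaded table is files.logistics and the predictor is mindsdb.logisticService (the predicted_quality alias is only illustrative):

```sql
-- Sketch: batch predictions, one predicted value per input row.
-- Assumes the table files.logistics and the predictor mindsdb.logisticService.
SELECT t.Stateslogisticsenablinginitiatives,
       t.Assessmentofvariableoflogisticsease,
       m.QualityofLogisticsServices AS predicted_quality
FROM files.logistics AS t
JOIN mindsdb.logisticService AS m
LIMIT 10;
```

Each row of the input table supplies the feature values, so no WHERE clause with hand-typed values is needed.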
Conclusion
Now that the tutorial is finished, here is a quick recap of what we have learned. We created a MindsDB Cloud account, imported data, created a table using the cloud GUI, trained a Predictor model, described it in three different ways, and finally predicted the logistics quality. Now is a good time to sign up for your own MindsDB account and try it out.
Lastly, before you leave, don't forget to leave your feedback in the Comments section below and show some love by dropping a LIKE on this article.
Top comments (1)
Hi Arman,
Really helpful piece. What do we do if we want multiple rows of output? Meaning not for just a day, but for a date range? Is that possible?