DEV Community

Kareem Negm for AWS MENA Community

Posted on Nov 24, 2021

Amazon Machine Learning | ML Key Concepts

#machinelearning #aws #datascience

Amazon Machine Learning Key Concepts

Data sources

Term	Definition
Attribute	A unique, named property within an observation. In tabular data such as spreadsheets or CSV files, the column headings are the attributes and each row holds one value per attribute.
Datasource Name	An optional, human-readable name for a datasource, used to find and manage it in the Amazon ML console.
Input Data	Collective name for all the observations that are referred to by a datasource.
Location	Where the input data lives. Amazon ML can use data stored in Amazon S3 buckets, Amazon Redshift databases, or MySQL databases in Amazon Relational Database Service (RDS).
Observation	A single data point that is part of a datasource.
Schema	The information needed to interpret the input data, including attribute names and their assigned data types, and names of special attributes.
Statistics	Summary statistics for each attribute in the input data.
Status	Indicates the current state of the datasource, such as In Progress, Completed, or Failed.
Target Attribute	The attribute whose value a trained ML model predicts. During training, it is the attribute in the input data that holds the correct answers.
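
To make the datasource terms concrete, here is a minimal Python sketch that reads a tiny CSV, treats each row as an observation and each column heading as an attribute, and builds a schema plus summary statistics. The sample data, the `build_schema` helper, and the attribute types chosen are hypothetical; the JSON field names follow the Amazon ML schema format as I understand it, so treat them as an assumption and check them against the official documentation.

```python
import csv
import io
import json

# Hypothetical sample data: three observations, three attributes.
SAMPLE_CSV = """age,plan,clicked
34,basic,1
51,premium,0
27,basic,1
"""


def build_schema(header, attribute_types, target):
    """Describe attribute names, their data types, and the target attribute."""
    if target not in header:
        raise ValueError(f"target {target!r} is not an attribute")
    return {
        "version": "1.0",
        "targetAttributeName": target,
        "dataFormat": "CSV",
        "dataFileContainsHeader": True,
        "attributes": [
            {"attributeName": name, "attributeType": attribute_types[name]}
            for name in header
        ],
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))  # each row is one observation
header = list(rows[0].keys())                         # column headings are the attributes

schema = build_schema(
    header,
    {"age": "NUMERIC", "plan": "CATEGORICAL", "clicked": "BINARY"},
    target="clicked",
)

# Summary statistics for one numeric attribute.
ages = [float(row["age"]) for row in rows]
stats = {
    "count": len(ages),
    "min": min(ages),
    "max": max(ages),
    "mean": sum(ages) / len(ages),
}

print(json.dumps(schema, indent=2))
print(stats)
```

Here `clicked` plays the role of the target attribute, and the other two columns are the inputs the model would learn from.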

ML Models

Term	Definition
Regression	An ML model that predicts a numeric value.
Multiclass	An ML model that predicts values belonging to a limited, pre-defined set of permissible values.
Binary	An ML model that predicts values that can only have one of two states, such as true or false.
Model Size	ML models capture and store patterns. The more patterns an ML model stores, the bigger it is. ML model size is given in megabytes (MB).
Number of Passes	The number of times that you let Amazon ML use the same data records during training.
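Regularization	A machine learning technique that penalizes extreme weight values to reduce overfitting, which helps produce higher-quality models. Amazon ML offers a default setting that works well for most cases.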
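
The effect of the number of passes and of regularization can be sketched with a few lines of plain Python. This is a generic stochastic gradient descent example on made-up data, not Amazon ML's actual training code; the function names, learning rate, and L2 amount are all assumptions chosen for illustration.

```python
import random


def train_linear_sgd(records, passes, l2=0.0, learning_rate=0.05, seed=0):
    """Fit y ~ w * x + b with SGD, visiting every record once per pass."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(records)
    for _ in range(passes):
        rng.shuffle(data)
        for x, y in data:
            error = (w * x + b) - y
            # Gradient of the squared error plus the L2 penalty l2 * w**2.
            w -= learning_rate * (2 * error * x + 2 * l2 * w)
            b -= learning_rate * 2 * error
    return w, b


def mean_squared_error(records, w, b):
    return sum(((w * x + b) - y) ** 2 for x, y in records) / len(records)


# Hypothetical noiseless data drawn from y = 3x + 1.
records = [(k / 10, 3.0 * (k / 10) + 1.0) for k in range(-10, 11)]

few = train_linear_sgd(records, passes=1)
many = train_linear_sgd(records, passes=50)
regularized = train_linear_sgd(records, passes=50, l2=1.0)

print("1 pass    :", few, mean_squared_error(records, *few))
print("50 passes :", many, mean_squared_error(records, *many))
print("50 + L2   :", regularized, mean_squared_error(records, *regularized))
```

More passes let the model see each record repeatedly and fit the data more closely, while the L2 penalty pulls the weight toward zero. On real, noisy data that shrinkage is what guards against overfitting.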

Evaluations

Term	Definition
Model Insights	Amazon ML provides a metric and a number of insights that you can use to evaluate the predictive performance of your model.
Precision	The fraction of positive predictions that actually belong to the positive class: TP / (TP + FP).
Recall	The fraction of all positive examples in the dataset that the model predicts as positive: TP / (TP + FN).
AUC	Area Under the ROC Curve (AUC) measures the ability of a binary ML model to predict a higher score for positive examples than for negative examples.
Accuracy	Accuracy measures the percentage of predictions that are correct, out of all predictions made.
F1-score	The macro-averaged F1-score is used to evaluate the predictive performance of multiclass ML models.
RMSE	The Root Mean Square Error (RMSE) is a metric used to evaluate the predictive performance of regression ML models.
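Cut-off	Binary ML models generate numeric prediction scores. The cut-off is the threshold that converts a score into a 0 or 1 label: scores above the cut-off are predicted as 1, the rest as 0. The default cut-off is 0.5.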
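
The evaluation terms above are easy to compute by hand. The following Python sketch derives precision, recall, accuracy, and F1 from scores and a cut-off, and adds a pairwise AUC and an RMSE helper. The scores and labels are made up, and the function names are hypothetical; this illustrates the definitions rather than how Amazon ML computes them internally.

```python
import math


def confusion_counts(scores, labels, cutoff=0.5):
    """Apply the cut-off to each score and count TP, FP, TN, FN."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score > cutoff else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def binary_metrics(scores, labels, cutoff=0.5):
    tp, fp, tn, fn = confusion_counts(scores, labels, cutoff)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(labels)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def auc(scores, labels):
    """Share of (positive, negative) pairs where the positive scores higher."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def rmse(predicted, actual):
    """Root mean square error for regression predictions."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical binary model output.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

print(binary_metrics(scores, labels, cutoff=0.5))
print(binary_metrics(scores, labels, cutoff=0.35))
print(auc(scores, labels))
print(rmse([2.0, 4.0, 6.0], [1.0, 4.0, 8.0]))
```

Lowering the cut-off from 0.5 to 0.35 raises recall because one more positive example is caught, which shows why the cut-off is a trade-off between precision and recall. AUC does not depend on the cut-off at all.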

Batch Predictions

Term	Definition
Output Location	The results of a batch prediction are stored in an S3 bucket output location.
Manifest File	This file relates each input data file with its associated batch prediction results. It is stored in the S3 bucket output location.
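
As a rough illustration of how a manifest file might be consumed, the sketch below parses one and looks up the results file for a given input file. It assumes the manifest is a JSON object mapping each input file's S3 URI to the S3 URI of its results; that layout, the bucket names, and the `load_manifest` helper are assumptions for illustration, so verify the format against a real manifest before relying on it.

```python
import json


def load_manifest(manifest_text):
    """Parse a manifest and return a dict of input URI -> results URI."""
    manifest = json.loads(manifest_text)
    if not isinstance(manifest, dict):
        raise ValueError("expected a JSON object mapping inputs to results")
    return {str(source): str(result) for source, result in manifest.items()}


# Hypothetical manifest content; the bucket and file names are made up.
example_manifest = json.dumps({
    "s3://example-bucket/input/part-1.csv":
        "s3://example-bucket/output/result/part-1.csv.gz",
    "s3://example-bucket/input/part-2.csv":
        "s3://example-bucket/output/result/part-2.csv.gz",
})

mapping = load_manifest(example_manifest)
for source, result in mapping.items():
    print(source, "->", result)
```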

Real-time Predictions

Real-time predictions are for applications with a low latency requirement, such as interactive web, mobile, or desktop applications.

Term	Definition
Real-time Prediction API	The Real-time Prediction API accepts a single input observation in the request payload and returns the prediction in the response.
Real-time Prediction Endpoint	To use an ML model with the real-time prediction API, you need to create a real-time prediction endpoint. Once created, the endpoint contains the URL that you can use to request real-time predictions.
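
The endpoint-then-predict flow can be sketched in Python with the boto3 `machinelearning` client, which exposes `create_realtime_endpoint` and `predict`. The helper names, the model ID, and the observation below are hypothetical, and a small stub client stands in for AWS so the sketch runs without credentials. With a real account you would pass `boto3.client("machinelearning")` instead and wait for the endpoint to become ready before predicting.

```python
def create_endpoint(client, model_id):
    """Create a real-time endpoint and return the URL to send requests to."""
    response = client.create_realtime_endpoint(MLModelId=model_id)
    return response["RealtimeEndpointInfo"]["EndpointUrl"]


def predict_one(client, model_id, endpoint_url, observation):
    """Send a single observation and return the prediction from the response."""
    # The Record parameter maps attribute names to string values.
    record = {name: str(value) for name, value in observation.items()}
    response = client.predict(
        MLModelId=model_id,
        Record=record,
        PredictEndpoint=endpoint_url,
    )
    return response["Prediction"]


class StubClient:
    """Stand-in for the boto3 client; returns canned, made-up responses."""

    def create_realtime_endpoint(self, MLModelId):
        return {
            "MLModelId": MLModelId,
            "RealtimeEndpointInfo": {
                "EndpointUrl": "https://realtime.example.invalid",
                "EndpointStatus": "READY",
            },
        }

    def predict(self, MLModelId, Record, PredictEndpoint):
        self.last_record = Record
        return {"Prediction": {"predictedLabel": "1", "predictedScores": {"1": 0.87}}}


client = StubClient()  # with AWS: client = boto3.client("machinelearning")
endpoint_url = create_endpoint(client, "ml-EXAMPLE")
prediction = predict_one(client, "ml-EXAMPLE", endpoint_url, {"age": 34, "plan": "basic"})
print(endpoint_url)
print(prediction)
```

Passing the client in as an argument keeps the helpers easy to test and makes it clear that one request carries exactly one observation.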

AWS WhitePaper Summary
