Mursal Furqan

Saying hello to Amazon SageMaker

Hey, friends ✋
I have been learning machine learning lately. At first I thought about starting a series of articles on machine learning in general, but I later decided to limit it to Amazon SageMaker. 😄 Here I will share my experience with Amazon SageMaker and why it is an awesome tool for machine learning.

My goal with this article is to walk through the "WHY," "HOW," and "WHAT" of picking the appropriate platform for the vast majority of machine learning problems that companies solve today. Then I'd like to present AWS SageMaker as a viable candidate for the problem, emphasizing the reasons given in the section on why I would choose it. There are clearly other kinds of challenges and experiments in businesses for which it would not be the best solution; I'll attempt to identify a few such situations.


Why do we really need a platform?

The most efficient way to handle large machine learning problems is to give a data scientist the essential software capabilities through tidy yet effective abstractions, so that they can offer an ML solution as a highly scalable web service (API). The API can then be integrated into the relevant software systems, and the software development team can treat the ML service as just another service behind an API.
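To make the "ML service behind an API" idea concrete, here is a minimal sketch using only the Python standard library. It is not how SageMaker serves models; the toy linear model (`WEIGHTS`, `BIAS`), the `features` request field, and the port are all assumptions made up for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical toy model: a fixed linear scorer standing in for a trained model.
WEIGHTS = [0.5, -0.25]
BIAS = 1.0


def predict(features):
    """Return a score for one feature vector."""
    if len(features) != len(WEIGHTS):
        raise ValueError(f"expected {len(WEIGHTS)} features, got {len(features)}")
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))


def handle_request(body: bytes) -> bytes:
    """Decode a JSON request, run the model, and encode a JSON response."""
    payload = json.loads(body)
    return json.dumps({"prediction": predict(payload["features"])}).encode("utf-8")


class PredictionHandler(BaseHTTPRequestHandler):
    """HTTP wrapper: callers only see JSON in and JSON out, never the model."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            response = handle_request(self.rfile.read(length))
            status = 200
        except (ValueError, KeyError, TypeError) as exc:
            # Malformed JSON, a missing field, or a wrong shape is a client error.
            response = json.dumps({"error": str(exc)}).encode("utf-8")
            status = 400
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port=8080):
    """Start the service locally; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictionHandler).serve_forever()
```

Calling `serve()` starts the service locally, and a POST with the body `{"features": [2, 4]}` returns `{"prediction": 1.0}`. The software team integrating it needs to know only the request and response shapes, which is the kind of abstraction described above.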

Therefore, we need a platform that equips a data scientist with the tools needed to independently execute a machine learning project in a truly end-to-end way.

HOW can a platform solve this problem?

Because the three stages of a project (engineering and studying the data, training and tuning models, and deploying the model) are so dissimilar, we need a solution that puts the data scientist in the driver's seat. A platform can do this if it provides tidy abstractions that supply the skills a data scientist is missing in each stage, while staying effective and flexible enough for the data scientist to deliver results.

Therefore, we need a platform where data scientists can use their existing skills to engineer and study data, train and tune ML models, and finally deploy the model as a web service. The platform should dynamically provision the required hardware, orchestrate the entire flow behind simple abstractions, and provide a robust solution that scales elastically to meet demand.

Why would I choose Amazon SageMaker?

AWS SageMaker, I believe, is the best fit for us. It includes Jupyter notebooks with R/Python kernels, running on a compute instance that we can pick on demand based on our data engineering needs. Using standard tools (such as Pandas + Matplotlib, R + ggplot2, or other popular combinations), we can visualize, analyze, clean, and transform the data into the appropriate form. After data engineering, we can train the models on a different compute instance chosen for the model's compute needs, such as a memory-optimized or GPU-enabled one.

For a range of models, we can take advantage of high-performance hyperparameter tuning with sensible defaults. We can use performance-optimized algorithms from AWS's extensive library, or bring our own algorithms in industry-standard containers. We can also deploy the trained model as an API, this time on a separate compute instance sized to match business needs and to scale elastically.
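For readers new to hyperparameter tuning, the sketch below shows the basic loop that a tuning service automates: try configurations, score each one, keep the best. It is a plain random search written with the standard library, not SageMaker's tuning implementation, and `validation_error` is a made-up stand-in for "train a model and measure its validation error".

```python
import random


def validation_error(learning_rate, depth):
    """Hypothetical stand-in for training a model and scoring it.

    A real tuning job would launch a training run here. This toy function
    is smallest near learning_rate=0.1 and depth=6.
    """
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 6) ** 2


def random_search(n_trials, seed=0):
    """Try n_trials random configurations; return (best_error, best_config)."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    best = None
    for _ in range(n_trials):
        config = {
            "learning_rate": rng.uniform(0.01, 0.5),
            "depth": rng.randint(2, 10),
        }
        error = validation_error(**config)
        if best is None or error < best[0]:
            best = (error, config)
    return best


best_error, best_config = random_search(n_trials=50)
print(f"best error {best_error:.4f} with {best_config}")
```

On a managed platform each trial would be a separate training job, possibly running in parallel on its own instance, which is the part that is tedious to orchestrate by hand.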

And the entire process of provisioning hardware instances, running high-capacity data jobs, orchestrating the entire flow with simple commands that hide the underlying complexity, and finally enabling serverless elastic deployment can be done with just a few lines of code while remaining cost-effective. SageMaker is a game-changing enterprise solution.
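As a rough idea of what those few lines can look like, here is a sketch that assumes version 2 of the SageMaker Python SDK (`pip install sagemaker`). The instance types, the XGBoost version string, the hyperparameters, the default region, and the S3 layout are my own illustrative choices, and `role_arn` and `bucket` are placeholders you would have to supply. Treat it as a sketch under those assumptions rather than a recommended setup.

```python
def stage_instances():
    """Hypothetical instance choices, one per project stage."""
    return {
        "notebook": "ml.t3.medium",  # data exploration in Jupyter
        "training": "ml.m5.xlarge",  # model training
        "endpoint": "ml.m5.large",   # serving predictions
    }


def train_and_deploy(role_arn, bucket, region="us-east-1"):
    """Train a built-in XGBoost model and deploy it behind an endpoint.

    Calling this needs AWS credentials and creates billable resources.
    Assumes CSV training data already sits under s3://<bucket>/train.
    """
    # Imported lazily so this module loads even without the SDK installed.
    import sagemaker
    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput

    instances = stage_instances()

    # Look up the container image for the built-in algorithm.
    image_uri = sagemaker.image_uris.retrieve("xgboost", region, version="1.7-1")

    estimator = Estimator(
        image_uri=image_uri,
        role=role_arn,
        instance_count=1,
        instance_type=instances["training"],
        output_path=f"s3://{bucket}/output",
        hyperparameters={"objective": "reg:squarederror", "num_round": 50},
    )

    # Provisions the training instance, runs the job, then tears it down.
    estimator.fit(
        {"train": TrainingInput(f"s3://{bucket}/train", content_type="text/csv")}
    )

    # Provisions a separate instance and exposes the model as an HTTPS endpoint.
    return estimator.deploy(
        initial_instance_count=1,
        instance_type=instances["endpoint"],
    )
```

The returned predictor keeps the endpoint running (and billing) until you call `predictor.delete_endpoint()`, so remember to clean up after experimenting.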

So folks, that was all about SageMaker and its importance. My next article will be on where NOT to use SageMaker 😄
