My top 10 re:Invent 2022 sessions for Data Scientists and ML Developers

#aws #cloud #reinvent2022 #sagemaker

Once again, re:Invent is around the corner, and we are excited to attend one of the biggest cloud events, where developers, customers, and engineers come together to discuss the latest developments in the AWS cloud space, network, and deepen their knowledge.

The event is held in Las Vegas and runs from November 28th to December 2nd this year. Attending in person is paid, but you can also register and attend online for free. You can check out the registration link here, and choose which of the two options is convenient for you.

The good news about re:Invent is that a lot is covered and there are many sessions you can attend. The bad news is that, with so many sessions available, it is easy to get lost in the catalog. That is where this guide comes in.

I have looked through the machine learning sessions for developers and data scientists interested in using notebooks to solve their ML problems, and picked the 10 sessions I believe are most worth attending.

It is true there are other sessions for those interested in low-code machine learning or in building end-to-end ML pipelines through MLOps. Of course, I love using notebooks as well as building end-to-end pipelines, but let us leave the advanced MLOps pipelines aside for now and focus on sessions for data scientists and machine learning developers who want to code in notebooks.

So what then are my Top 10 sessions for re:Invent 2022?

Below are the sessions I would advise you to add to your schedule:

1.) AIM208 : Idea to production on Amazon SageMaker, with Thomson Reuters

This is the first session to attend. It ties everything together: the serverless infrastructure, the tools required, and the high-level workflow. Whether you are a beginner, a business analyst, a developer, or a data scientist, you will go through how to build, train, and deploy ML models on AWS, and see how SageMaker covers each step of the ML lifecycle.

2.) AIM210 : Solve common business problems with AWS AI/ML services

Here you will see how companies are using machine learning and artificial intelligence (AI) across different industries. It is a good way to find inspiration and get your creative juices flowing. Some of the use cases you will learn from include:

How AI/ML can be used to boost customer experience and satisfaction
How it can be used to speed up decision making in an organization
How it helps in cost cutting
How it is used in product development, to create new products.

So at the end of the session, you should be able to sell what AI/ML can do for a company.

3.) ANT301 : Democratizing your organization’s data analytics experience

With the progress made in machine learning models and frameworks, the limiting factor for most machine learning projects is now data. Taking a fresh look at data and analytics is more important than ever, and this session helps you do just that.

You will learn how to leverage the available analytics services to gain better and faster insights from your data. You will also learn how to democratize your data, using the best-suited services to ease data preparation and so reduce the data challenges that machine learning models currently face.

4.) BOA322 : Build and deploy a live, ML-powered music genre classifier

Many machine learning problems are classification problems, so attending a session on classification is a good idea. Also, SageMaker Serverless Inference, which was launched last year, is a cost-effective way to deploy ML applications with highly volatile traffic. You will use some live music to learn how to deploy a classification model with SageMaker Serverless Inference.
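To give a feel for what a serverless deployment looks like in code, here is a minimal sketch of the request you might pass to boto3's `create_endpoint_config` call. The `ServerlessConfig` keys (`MemorySizeInMB`, `MaxConcurrency`) come from the SageMaker API, but the helper function, the names, and the values are hypothetical placeholders of mine, and the session itself may well take a different route.

```python
# Memory sizes accepted by SageMaker Serverless Inference, in MB (1 GB steps).
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def serverless_endpoint_config(config_name, model_name, memory_mb=2048, max_concurrency=5):
    """Build the keyword arguments for a serverless endpoint configuration.

    This helper is hypothetical: it only assembles a dictionary and makes no AWS calls.
    """
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                # No instance type or instance count: capacity is managed for you.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


# Hypothetical names, for illustration only.
config = serverless_endpoint_config("genre-classifier-config", "genre-classifier-model")

# With AWS credentials and an existing model, this could then be sent with:
#   import boto3
#   boto3.client("sagemaker").create_endpoint_config(**config)
```

Notice that, unlike an instance-based endpoint, nothing here says which machine type to run on. You only choose a memory size and a concurrency limit.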

5.) AIM302 : Deploy ML models for inference at high performance & low cost, feat. AT&T

In this session you will learn about the wide range of options SageMaker offers for deploying your models, and how to choose the best inference option.

The right SageMaker inference option depends on the nature of your data: real-time, serverless, asynchronous, or batch inference.

These can also be split into single-model, multi-model, and multi-container endpoints.

In this session, you will learn how AT&T used SageMaker to optimize model deployment at scale.
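As a rough way to think about the four inference options, here is a small sketch that turns three yes/no questions into a suggestion. The function and its parameter names are my own, and the ordering of the questions is a simplified rule of thumb, not official AWS guidance or something taken from the session.

```python
def suggest_inference_option(offline_dataset, large_payload_or_long_processing, intermittent_traffic):
    """Suggest a SageMaker inference option from three yes/no questions.

    Hypothetical helper: a simplified rule of thumb, not official AWS guidance.
    """
    if offline_dataset:
        # Score a whole dataset in one job; no persistent endpoint needed.
        return "batch"
    if large_payload_or_long_processing:
        # Requests are queued and processed in the background.
        return "asynchronous"
    if intermittent_traffic:
        # Idle periods between traffic spurts, and cold starts are acceptable.
        return "serverless"
    # Steady traffic with interactive, low-latency requirements.
    return "real-time"


# Example: a chatbot with steady traffic versus a nightly scoring job.
chatbot_choice = suggest_inference_option(False, False, False)
nightly_choice = suggest_inference_option(True, False, False)
```

In practice, cost, latency targets, and payload limits all feed into the decision, so treat this as a starting point for the questions to ask.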

6.) BOA304 : Building a product review classifier with transfer learning

Deep learning is on the rise as data grows exponentially every day. Natural language processing (NLP) applications are now very common, and their popularity keeps increasing as people ask for solutions such as:

Text summarization
Text classification

Also, in the NLP space, you must have noticed that Hugging Face is growing fast and delivering very accurate NLP models.

So in this session you will see how to benefit from transfer learning by leveraging robust, high-performing Hugging Face Transformers models to solve your specific text problems.

7.) AIM343 : Minimizing the production impact of ML model updates with shadow testing

Most of the time, beginners think machine learning ends after deployment. In reality, deploying a model is only the halfway point. You still need to monitor and maintain it.

You will usually need to run A/B tests, release new versions of your ML models, update serving containers, or modify the underlying infrastructure. Any of these changes can cause serious performance issues.

This session will teach you how to use shadow testing to mitigate performance risks after your model has been deployed. You will see how HERE uses shadow mode to evaluate the performance of the models after deployment.
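The idea behind shadow testing can be sketched in a few lines of plain Python: every request goes to both the production model and the candidate ("shadow") model, but only the production answer is returned to the caller, while both answers are logged for later comparison. Everything below (function names, the toy models, the log format) is hypothetical and only illustrates the concept. It is not how SageMaker implements the feature.

```python
def serve_with_shadow(request, production_model, shadow_model, comparison_log):
    """Return the production prediction; record the shadow prediction on the side."""
    production_output = production_model(request)
    try:
        shadow_output = shadow_model(request)
    except Exception as error:  # A failing shadow must never affect the caller.
        shadow_output = f"error: {error}"
    comparison_log.append(
        {"request": request, "production": production_output, "shadow": shadow_output}
    )
    return production_output


def agreement_rate(comparison_log):
    """Fraction of logged requests where both models gave the same output."""
    if not comparison_log:
        return 0.0
    matches = sum(1 for row in comparison_log if row["production"] == row["shadow"])
    return matches / len(comparison_log)


# Toy stand-ins for two model versions (hypothetical).
def current_model(text):
    return "positive" if "good" in text else "negative"


def candidate_model(text):
    return "positive" if "good" in text or "great" in text else "negative"


log = []
responses = [
    serve_with_shadow(text, current_model, candidate_model, log)
    for text in ["good product", "great product", "bad product"]
]
```

Callers only ever see `responses`, which come from the current model. The `log` is what you would inspect to decide whether the candidate is ready to take over.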

8.) AIM320 : Boost ML development productivity with managed Jupyter notebooks

If you love to build models using Jupyter notebooks, Amazon offers two options for you. This session will teach you how to use the quick-start notebook templates already available on AWS.

You will learn how to launch standalone SageMaker notebook instances, which give you flexibility in how you use them for your workloads.

9.) AIM322 : Accelerate data preparation with Amazon SageMaker Data Wrangler

SageMaker Data Wrangler helps with data preparation, as it focuses on normalizing data and performing feature engineering. These are the first stages of the machine learning cycle.

They usually include data selection, data cleaning, exploratory data analysis, bias detection and visualization.

You will learn how to slash the data preparation time from weeks to minutes with SageMaker Data Wrangler.
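If "normalizing data" sounds abstract, here is a minimal sketch of one common form of it, min-max scaling, in plain Python. This is not Data Wrangler code (Data Wrangler is largely a visual tool), and the function name and sample values are made up for illustration.

```python
def min_max_scale(values):
    """Rescale numbers linearly to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# Made-up sample column.
ages = [20, 30, 40]
scaled_ages = min_max_scale(ages)  # [0.0, 0.5, 1.0]
```

Putting features on a comparable scale like this is the kind of step a data preparation tool applies for you across whole datasets.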

10.) ARC313-R : Building modern data architectures on AWS

It is good to stay up to date with the infrastructure recommended by AWS engineers, which is usually tried and tested. We are now moving from the traditional data warehouse to more modern architectures, such as data lakes and optimized combinations of analytics services. The goal is to surface deep, impactful insights as quickly as possible, because the value of data decreases over time.

There are so many sessions, but these are the 10 I would choose for data scientists and machine learning developers who prefer using notebooks for their work.

I hope this helps you get the best out of re:Invent 2022.

Wish you Good Data Luck!!!