DEV Community

Niklas Westerstråhle

My experience starting out with Deepracer (Q4/22)

What is Deepracer

I don't think I'll spend too much time writing about the history of deepracer, or what it is. You can read up on it on the AWS website: https://aws.amazon.com/deepracer/

In short: it's gamified reinforcement learning - my stepping stone into learning machine learning.

Why on earth did I start now? Or why didn't I start earlier?

In short, I have to thank a colleague for getting me started. He told me he's organising an event in our company where we're going to race live on a real track during H1/23 - the Knowit League. So I figured it's time to start learning how to do this.

Earlier I thought this was just too hard. I saw Jouni Luoma (an ex-colleague) competing at prior events at the Stockholm Summit and going on to compete at Reinvent. As I regard him as an experienced AI and machine learning guru, I thought deepracing was beyond me. Boy, how mistaken I was.

The only thing I blame Victor for is losing loads of credits running deepracer training. And myself, for not picking up on this earlier - it's lots of fun.

So I started learning

First I took a look at the console and started training an example model - just to see what happens. It gets you driving around the track quite fast.

But I wanted to do more, learn more. So some googling followed, and I found blog posts with code snippets that others had used. I copy-pasted them and put some thought into what I made of these pieces and how they would help me.

I tried things like: are the wheels pointing towards where the track is going?

```python
import math

def reward_function(params):
    heading_weight, steering_weight = 0.5, 0.5  # tunable weights

    # Direction of the track, from the two closest waypoints
    prev_wp = params['waypoints'][params['closest_waypoints'][0]]
    next_wp = params['waypoints'][params['closest_waypoints'][1]]
    track_direction = math.degrees(
        math.atan2(next_wp[1] - prev_wp[1], next_wp[0] - prev_wp[0]))

    # Difference between the track direction and the heading of the car
    direction_diff = abs(track_direction - params['heading'])
    if direction_diff > 180:
        direction_diff = 360 - direction_diff

    abs_heading_reward = 1 - (direction_diff / 180.0)
    heading_reward = abs_heading_reward * heading_weight

    # Reward if the steering angle is aligned with the direction difference
    abs_steering_reward = 1 - (abs(params['steering_angle'] - direction_diff) / 180.0)
    steering_reward = abs_steering_reward * steering_weight

    return float(heading_reward + steering_reward)
```

Or, if we're on a straight, more points for going as fast as the car can go.
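That idea can be sketched roughly like this - the maximum speed and the 5° "straight" threshold are illustrative assumptions here, not values from my actual model:

```python
def straight_speed_reward(params):
    MAX_SPEED = 4.0  # assumed top speed of the action space

    reward = 1e-3  # small default so the reward is never zero
    # Treat a near-zero steering angle as "on a straight"
    if abs(params['steering_angle']) < 5.0:
        # Scale the bonus up to 1.0 as the car approaches top speed
        reward += params['speed'] / MAX_SPEED
    return reward
```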

I trained a model that consistently laps the Reinvent 2017 track in about 11s in the virtual environment. But now I need to wait to try it on a real track.

Luckily I have access to a track; I just need to find a space for it - and borrow a car. I'll write another blog post with my experiences after I have set it up.

I have since learned that the models are most likely not going to work at all. :D Real-world conditions are different - but more on that later.

Joining the October Qualifier 2022 - Open Division

My competitive side took over after I got the model "done" for our internal race, and I decided to try my luck in the October Open, which the Deepracer League runs on the AWS Virtual Circuit in the console. You can automatically submit your models to race once they've completed training.

Watching some YouTube videos of people talking about hyperparameters and other ways to make your deepracer train better, one topic that came up was the idea of a race line: most of the fastest deepracer teams seem to follow a race line instead of following the track itself.

It's like an F1 car: it drives around the track on the shortest/fastest possible route.

Deepracer should be able to do this as well - even if you just tell it to "run as fast as you can, get more reward for a faster time". Given enough time to train, it would find that route.
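One common shape for that kind of "faster is better" reward (not necessarily the exact one I used) rewards progress per step - a faster lap finishes in fewer steps, so the average reward per step goes up:

```python
def progress_reward(params):
    # Reward proportional to progress made per step: finishing the lap
    # in fewer steps yields a higher average reward, so the agent is
    # nudged towards the fastest route rather than any particular line.
    if params['steps'] > 0:
        return (params['progress'] / params['steps']) * 100.0
    return 1e-3
```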

But to train it a bit faster, I went on to learn how to calculate a race line, using https://github.com/dgnzlz/Capstone_AWS_DeepRacer

I started up an Amazon SageMaker notebook instance, downloaded the GitHub repo and started following the notebook. I ran into some errors and needed to tweak a few things, but in the end I got a nice-looking route for our internal race on the 2017 track (as that's the only one we have as a real track), as well as one for the October qualifier.
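To give an idea of how a precomputed race line can be used in a reward function, here's a minimal sketch - the RACE_LINE points below are made-up placeholders (the Capstone notebook produces the real array for a given track), and the 0.5 m fade-out distance is my own assumption:

```python
import math

# A few points of a hypothetical precomputed race line; in practice
# this is the full array the Capstone notebook generates for the track.
RACE_LINE = [(0.0, 0.0), (1.0, 0.2), (2.0, 0.3), (3.0, 0.2)]

def race_line_reward(params):
    x, y = params['x'], params['y']
    # Distance from the car to the closest race-line point
    dist = min(math.hypot(x - px, y - py) for px, py in RACE_LINE)
    # Full reward on the line, fading to near zero half a metre away
    return max(1e-3, 1.0 - dist / 0.5)
```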

reinvent_base race line

For the most part I used their reward function as a base - honestly, I'm not even 100% sure which one I ended up using. There were a few additions from other snippets - it might have worked without them as well. I tend to do too many things at once. :)

Days of training - multiple days. I think my model trained for 3-5 days in total, in 12-18h runs.

It still drove off track sometimes, but I resubmitted the model, and the second try got around without driving off track. I stopped there.

I ended up #10 out of a total of 3940 models submitted - which I think is an excellent position. Sure, it's 9.600s slower than the winner, but those guys have been doing this for years; I, merely days.

October Qualifier 2022 leaderboard

I hear people have automation that submits models multiple times for evaluation in races, as the same model can drive faster in ideal conditions. Next time I'll know to submit a few extra times - maybe it saves a second or two off my time.

500 USD later

I kept the AWS bill in mind and took a look at what running deepracer costs - at 3.5 USD per training hour, it does rack up.
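Back-of-the-envelope, with assumed session counts (I didn't keep exact records of my hours, so the run lengths here are illustrative):

```python
# Rough console-cost arithmetic at the 3.5 USD/h training rate.
# Assumed: four 15-hour runs plus four 18-hour runs, roughly matching
# my "3-5 days in 12-18h runs" - not exact figures from my bill.
HOURLY_RATE = 3.5
hours = 4 * 15 + 4 * 18
print(f"{hours} h * {HOURLY_RATE} USD/h = {hours * HOURLY_RATE:.0f} USD")
```

Add evaluation and race-submission hours on top, and a 500 USD month is easy to reach.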

October bill

My aim was to unlock the Pro division for the next races, so I can participate in monthly races that could get me invited to a future Reinvent as a competitor in that year's championship. And that was easily done.

On the positive side, I got some nice swag for finishing the October Open in the top 10% - so you could say it's an "expensive hoodie and cap".

SWAG

Running it cheaper?

As I can't keep putting 500 USD a month into this hobby, I needed to figure out how to do this more cheaply in the future. It turns out you can even train a model on your laptop, or run training on an EC2 instance - spot or on-demand. You could even join the dark side and run it on Azure - but let's not go there.

I decided to run training on spot - which is actually more annoying than one would think.

Spot training costs

As you can see, the price per training hour is a lot less. And my feeling is that it actually trains faster when I run multiple training sessions in parallel.

You can find more details on how to do this at https://github.com/aws-deepracer-community/deepracer-for-cloud

I think I just ran out of time to get it running perfectly on spot. Spot instances kept dying, and continuing training from a pre-existing model seemed to make it do worse than training from scratch - according to some graphs, anyway (I'm still a newbie at understanding what all the graphs tell me).

I ended up trying to get a model to converge for the Reinvent 2022 Championship track, but mostly my model just kept turning right all the time, never getting around the track. I'm not sure which parts contributed to that. I need to try again for the next races and tracks.

I don't know yet how I should analyse the training logs, or what they would tell me. I can get some details out of the training I'm running - but what should it be telling me, and how should I use it to make my reward function better?
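A starting point I've seen is summing the reward per episode from the RoboMaker training logs' SIM_TRACE_LOG lines. The field positions below (episode first, reward ninth) are my reading of the format - verify them against your own log files before trusting the numbers:

```python
from collections import defaultdict

def total_reward_per_episode(log_lines):
    """Sum the reward field of SIM_TRACE_LOG lines per episode.

    Assumes the comma-separated layout of the RoboMaker logs, with
    episode as the first field and reward as the ninth - an assumption
    to check against your own logs.
    """
    totals = defaultdict(float)
    for line in log_lines:
        if 'SIM_TRACE_LOG:' not in line:
            continue
        fields = line.split('SIM_TRACE_LOG:')[1].split(',')
        episode, reward = int(fields[0]), float(fields[8])
        totals[episode] += reward
    return dict(totals)
```

Plotting these totals over time gives a first feel for whether a model is converging or plateauing.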

I'm also going to request access to on-demand P instances, as I'd rather pay the on-demand price than play with dying instances. It will still be a fraction of the cost compared to the Deepracer console.

In case you were not aware: to use these types of instances on AWS, you need to request an instance quota increase from 0, explaining your use case to AWS. Those requests go through an approval process, and they're done per region.

I initially requested us-east-1, and later eu-north-1 (as it's closer to me). I can now run, I think, 2 instances per region on spot for machine learning training.

ReInvent 2022 - reality of live cars hits me

I had the pleasure of attending Reinvent 2022, and since there were tracks, I tried some of my mostly-failing models on the Championship track - just to see them fail gloriously: stopping in the middle of the track and simply not moving.

This was just mind-blowing - what, it doesn't work at all like it does in the virtual environment?

I talked with some of the guys who have been doing this for longer, and got a lot of tips on how to train a model for real-world conditions.

Next up on my path

In the end, I'm like a newborn baby in this reinforcement learning environment. Let's see what next year of learning brings.

I need to go and relearn everything for the physical car - see you in 2023 on the virtual and live tracks. I'm aiming to join the Stockholm Summit race.

Analysing training logs better - a must learn thing.

And I'm thinking of talking about Deepracer a bit - some community event or meetup. Maybe I'll get a few more people interested in starting to race.
