AWS DeepRacer Student League Guide

I've already written a starter guide for AWS DeepRacer that can be used for both the Regular League and the Student League and outlines some valuable information for getting started. In this guide I will detail the default configuration for the Student League. I recommend you read the starter guide to learn about DeepRacer for Cloud.

Who is it for?

Any student 16 or older, whether in high school, home school, or university. There are generally scholarship prizes, and the season is typically paired with a Udacity scholarship for various Nanodegrees.

Log Analysis

Currently, there isn't a way for Student League racers to do any log analysis. The only package they can download from the console is the car package used to load the model onto the car, and none of those files give valuable insight into training performance.

Ways to Analyze the Performance

Since you don't have log analysis and only a limited number of hours to train on the Student League console, you can do some mock testing by passing sample values to your reward function, for example in Excel. The only feasible way to robustly test your reward function is to use DeepRacer for Cloud on your own hardware. That way you can save your 10 hours until you are ready to use them to get the best performance possible. I'll outline the defaults that apply to both the AWS console and DRfC.
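As a concrete illustration, here is a minimal Python sketch of that kind of mock testing. The reward function is just a placeholder and the sample values are made up, not taken from a real run; swap in numbers that match your track.

# Feed hand-built sample params into your reward function so you can
# sanity-check the returned value before spending console hours.

def reward_function(params):
    # Placeholder reward: stay near the center line.
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']

    reward = 1e-3  # small default reward
    if distance_from_center <= 0.5 * track_width:
        reward = 1.0 - (distance_from_center / (0.5 * track_width))
    return float(reward)


sample_params = {
    'track_width': 0.76,           # metres (made-up value)
    'distance_from_center': 0.10,  # metres (made-up value)
    'speed': 0.8,                  # always within your action space limits
    'steering_angle': -5.0,
    'progress': 25.0,
    'steps': 120,
    'all_wheels_on_track': True,
}

print(reward_function(sample_params))  # roughly 0.74 with these made-up numbers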

Action Space

The Student League has no control over the action space within the console. It is strictly defined as a continuous action space, with a speed range of 0.5 to 1 m/s and a steering angle of -30 to 30 degrees.

In DRfC (see the starter guide) this is defined as a JSON document like this:

{
    "action_space": {"speed": {"high": 1, "low": 0.5}, "steering_angle": {"high": 30, "low": -30}},
    "sensor": ["FRONT_FACING_CAMERA"],
    "neural_network": "DEEP_CONVOLUTIONAL_NETWORK_SHALLOW",
    "training_algorithm": "clipped_ppo", 
    "action_space_type": "continuous",
    "version": "5"
}

💡 It is important to note that the speed passed to the reward function will never exceed the max speed in your action space. A lot of people get confused by this, partly because the speed shown in the training video output is off. I believe this is caused by Python's Global Interpreter Lock: the way the data is fetched introduces timing issues, so while the math behind it is sound, something is throwing off the displayed value, and I believe that something is the GIL.

Hyperparameters

Again, this is another item the Student League has no control over, and that lack of control adds another layer of difficulty: the discount factor is 0.999, meaning the reinforcement learning looks at essentially the whole lap when calculating the future reward behind a decision, not just a little bit ahead (see the sketch after the listing below). These defaults are also defined by a JSON document within DRfC, and they are what the Student League is stuck with while training.


{
    "batch_size": 64,
    "beta_entropy": 0.01,
    "discount_factor": 0.999,
    "e_greedy_value": 0.05,
    "epsilon_steps": 10000,
    "exploration_type": "categorical",
    "loss_type": "huber",
    "lr": 0.0003,
    "num_episodes_between_training": 20,
    "num_epochs": 10,
    "stack_size": 1,
    "term_cond_avg_score": 350.0,
    "term_cond_max_episodes": 1000,
    "sac_alpha": 0.2
  }

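To get a feel for why the 0.999 discount factor matters, here is a rough back-of-the-envelope sketch. This is standard discounting math, not DeepRacer-specific code: with discount factor gamma, a reward n steps in the future is weighted by gamma**n, and 1 / (1 - gamma) is a common rule of thumb for the effective horizon.

# Rough sketch of how far ahead a discount factor "looks".

def effective_horizon(gamma):
    # Rule of thumb: rewards beyond roughly 1 / (1 - gamma) steps
    # contribute relatively little to the discounted return.
    return 1.0 / (1.0 - gamma)

for gamma in (0.9, 0.99, 0.999):
    print(f"gamma={gamma}: weight 100 steps ahead = {gamma ** 100:.3f}, "
          f"effective horizon ~ {effective_horizon(gamma):.0f} steps")

With the default of 0.999, a reward 100 steps away still carries about 90% of its weight and the effective horizon is roughly 1,000 steps, which is why the agent effectively weighs most of the lap rather than just the next turn or two.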

Reward Function

As I said in my starter guide, there are many different parameters you can pick from, and some are for object avoidance, which isn't needed for the time trial races. Again, see my starter guide for additional details on your reward function. Here, instead, I'm going to outline some issues I've seen from other racers.

  • Min and max reward cannot exceed 10k.

If it does, this will either cause a validation error or cause training to stop.

  • Importing a module that isn't available

DeepRacer doesn't use the latest NumPy, so if you attempt to import or use something that doesn't exist in that version of NumPy, it generally reports as a syntax error.

  • Logic errors and syntax errors

The validator only passes in a small sample set of data. Your carefully crafted reward function might not throw a validation error, yet still cause your training to stop later; this is generally because the sample set never fully exercised your reward function. Logic errors can also cause unexpected behavior. Always test your reward function with sample parameters to see if you get the expected reward.

Additionally, to avoid some of these errors it is best practice to initialize your variables before you get to any logic statements. This avoids edge cases where you reference a variable that was never assigned (see the sketch after this list).

  • Expecting speed to exceed the max speed of the action space

I've said it above, and I'll say it again because I've had to explain this to several people before it finally clicks: speed will never exceed the maximum speed you have set in your action space.
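Putting a few of these points together, here is an illustrative sketch of a defensively structured reward function. The weights and thresholds are arbitrary examples, not a recipe for a fast lap: every variable is initialized before any logic runs, the branches only adjust those values, and the result is clamped comfortably below the 10k limit.

# Illustrative structure only: initialize first, branch second, clamp last.

def reward_function(params):
    # 1) Initialize everything up front so no branch can leave a
    #    variable undefined.
    reward = 1e-3
    speed_bonus = 0.0

    speed = params['speed']                  # bounded by the action space (0.5 - 1.0)
    all_wheels_on_track = params['all_wheels_on_track']
    distance_from_center = params['distance_from_center']
    track_width = params['track_width']

    # 2) Logic only modifies the initialized values.
    if all_wheels_on_track and distance_from_center <= 0.5 * track_width:
        reward = 1.0
        speed_bonus = speed  # never exceeds the action-space max

    reward += speed_bonus

    # 3) Return a float kept well below the 10,000 cap.
    return float(min(reward, 1000.0))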

Udacity Scholarship

Generally, there are a few steps to complete for the Udacity Scholarship; one is finishing a lap under X number of minutes. You'll be submitting to the virtual circuit track, found on the left-hand side of the navigation panel, to meet this requirement. This step can be difficult to achieve depending on the length of the track.
