This is the second article in a series of tutorials that attempt to fill gaps in this excellent article: Announcing YOLTv4: Improved Satellite Imagery Object Detection. The first article covered building a container to run the model. In this article, I cover how to get the training data.
We need a data set to train the model. The article points to RarePlanes, an open-source data set provided by In-Q-Tel's CosmiQ Works. The data is stored in an AWS S3 bucket and is accessible with the AWS CLI. The first step to getting the data is to install the AWS CLI.
To install on macOS, you can use the brew package manager:
$ brew install awscli
or you can download the GUI installer.
To install on Linux:
$ curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
$ unzip awscliv2.zip
$ sudo ./aws/install
To install on Windows, download the installer and follow the installation prompts.
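Whichever platform you are on, it is worth confirming that the CLI is on your PATH before continuing. The snippet below is a minimal sketch rather than part of the original walkthrough; `check_aws_cli` is a hypothetical helper name, and it only wraps the standard `aws --version` command.

```shell
# Hypothetical helper: confirm the aws executable can be found, then print its version.
check_aws_cli() {
    if command -v aws >/dev/null 2>&1; then
        aws --version
    else
        echo "aws CLI not found on PATH" >&2
        return 1
    fi
}
```

Running `check_aws_cli` in the same shell should print a version string if the installation succeeded, or an error message if the executable cannot be found.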
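Accessing the training data doesn't require an AWS account. The --no-sign-request flag tells the CLI not to sign requests with credentials, so you can list the contents of the S3 bucket anonymously: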
$ aws s3 ls s3://rareplanes-public/ --no-sign-request
                           PRE real/
                           PRE synthetic/
                           PRE weights/
2020-06-09 08:27:59      20605 LICENSE.txt
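Before committing to a large download, it can help to see how much data sits under each prefix. The following is a minimal sketch, not something from the original article: `prefix_summary` is a hypothetical helper name, and it relies on the `--recursive`, `--human-readable`, and `--summarize` options of `aws s3 ls`, keeping only the two summary lines printed at the end of the listing.

```shell
# Hypothetical helper: print the object count and total size under a bucket prefix.
# $1 is a prefix inside the bucket, for example "weights/".
prefix_summary() {
    # --summarize appends "Total Objects" and "Total Size" lines to the listing;
    # tail keeps just those last two lines.
    aws s3 ls "s3://rareplanes-public/$1" \
        --no-sign-request --recursive --human-readable --summarize | tail -n 2
}
```

For example, `prefix_summary weights/` would report the totals for the weights/ prefix. A recursive listing of a large prefix can itself take a while, since every object has to be enumerated.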
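To download the data set, use the aws s3 sync command, which takes a source and a destination. The command below syncs the whole bucket into the current directory (.).
NOTE: The RarePlanes training data is over 500 GB, so make sure you have sufficient storage.
$ aws s3 sync s3://rareplanes-public/ . --no-sign-request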
Depending on the speed of your Internet connection, this can take some time.
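Because the download is so large, you may want to check free disk space before starting it. The sketch below is one way to do that under stated assumptions: `has_free_space` and `download_rareplanes` are hypothetical helper names, the threshold is passed in GiB, and the check reads the "Available" column of `df -Pk`. It is an illustration, not a step from the original article.

```shell
# Hypothetical helper: succeed if the filesystem holding directory $1
# has at least $2 GiB available.
has_free_space() {
    dir=$1
    need_gib=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # the fourth column of the second line is the available space.
    avail_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge $((need_gib * 1024 * 1024)) ]
}

# Hypothetical wrapper: sync the bucket into directory $1 only if
# at least $2 GiB are free there.
download_rareplanes() {
    dest=$1
    need_gib=$2
    if has_free_space "$dest" "$need_gib"; then
        aws s3 sync s3://rareplanes-public/ "$dest" --no-sign-request
    else
        echo "Less than $need_gib GiB free under $dest; not starting the download" >&2
        return 1
    fi
}
```

A call such as `download_rareplanes . 600` would start the sync only if roughly 600 GiB are free; the 600 figure is an assumed safety margin above the 500 GB noted earlier. Adding the `--dryrun` option to the `aws s3 sync` command prints the operations it would perform without transferring anything, which is a cheap way to preview the download.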
In the next article, I'll go through the training steps in the Python notebook in the YOLTv4 repo.