Reddit Junkie, a CLI tool for downloading images from Reddit!

Muhammadreza Haghiri (prpe)

What would you say if I told you there's a tool installed on my Ubuntu machine that makes downloading images from Reddit much easier?

Over the past few days, I was busy with a machine learning project. I needed tons of images, and I couldn't find a better place than Reddit, "the front page of the internet", for crawling and downloading them. The pictures are posted by people, and most of them are real-world photos rather than fancy advertisements for a luxury restaurant.

I couldn't crawl Reddit using BeautifulSoup or Nokogiri, but then I realized something: for a project, I had once used Reddit's JSON API to get a bunch of pictures. So I wanted to automate the downloads! I opened VS Code, grabbed a cup of coffee, put on my black metal playlist on Spotify, and started coding.
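To give an idea of what working with the JSON API looks like, here is a minimal sketch, not the actual reddit_junkie code. It assumes the usual listing shape (a `data.children` array whose items carry a `data.url` field) and treats a post as an image when its URL ends in a known extension. The helper names `listing_url` and `image_urls`, the `new` sort, and the extension list are all assumptions made for illustration.

```ruby
require "json"
require "uri"

# Extensions treated as direct image links (an assumption for this sketch).
IMAGE_EXTENSIONS = %w[.jpg .jpeg .png .gif].freeze

# Builds the JSON listing URL for a subreddit.
# The "new" sort and the limit of 25 are illustrative defaults.
def listing_url(subreddit, sort: "new", limit: 25)
  "https://www.reddit.com/r/#{subreddit}/#{sort}.json?limit=#{limit}"
end

# True when the URL's path ends in one of the image extensions.
def image_url?(url)
  path = URI.parse(url).path.to_s
  IMAGE_EXTENSIONS.include?(File.extname(path).downcase)
rescue URI::InvalidURIError
  false
end

# Takes the raw JSON body of a listing and returns the post URLs
# that look like direct image links.
def image_urls(listing_json)
  listing = JSON.parse(listing_json)
  children = listing.dig("data", "children") || []
  children
    .map { |child| child.dig("data", "url") }
    .compact
    .select { |url| image_url?(url) }
end
```

The JSON body itself would come from an HTTP client such as the httparty gem; that part is left out here so the sketch runs without network access.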

Now I have this really cool tool that helps me create datasets for my A.I. project!

Installing the reddit_junkie tool

On a Linux, BSD, macOS, or WSL machine, you need to install Ruby first. My personal preference is always RVM, but any Ruby installation that can handle the httparty gem is fine.

To install the tool, just run this command:

gem install reddit_junkie

After that, it'll be available as a command-line tool.

Downloading 25 images in the default "images" directory

reddit_junkie --subreddit SUB

For example, if you want the latest images from r/skyporn, just run:

reddit_junkie --subreddit skyporn

Downloading 25 images in a custom directory

reddit_junkie --subreddit SUB --directory DIR

For example, say you've made a folder called sky and want to save the pictures there. If the folder doesn't exist yet, reddit_junkie will create it for you.

reddit_junkie --subreddit skyporn --directory sky
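For the curious, creating the folder on demand takes very little Ruby. This is a sketch of one way to do it with the standard library, not necessarily how reddit_junkie does it; the helper name `ensure_directory` is made up for illustration.

```ruby
require "fileutils"

# Creates the target directory (and any missing parents) if needed and
# returns its absolute path. Calling it on an existing directory is a no-op.
# The "images" default mirrors the default directory named in this post.
def ensure_directory(dir = "images")
  FileUtils.mkdir_p(dir)
  File.expand_path(dir)
end
```

`FileUtils.mkdir_p` does not raise when the directory already exists, so there is no need to check for it first.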

Downloading more than 25 images in the default "images" directory

reddit_junkie --subreddit SUB --count COUNT

For example, to download 300 pictures of the sky:

reddit_junkie --subreddit skyporn --count 300

Downloading more than 25 images in a custom directory

reddit_junkie --subreddit SUB --count COUNT --directory DIR

For example, to download 300 pictures of the sky into your sky directory:

reddit_junkie --subreddit skyporn --count 300 --directory sky
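The invocations above all combine the same three flags. As a sketch of how such flags could be read with Ruby's standard OptionParser (this is not the tool's actual parser), the long flag names and the defaults of 25 images and the "images" directory come from this post, while the short flags `-s`, `-c`, `-d` and the function name `parse_options` are hypothetical.

```ruby
require "optparse"

# Parses an argv-style array into an options hash.
# Long flag names and defaults follow the examples in this post;
# the short flags are hypothetical additions for this sketch.
def parse_options(argv)
  options = { count: 25, directory: "images" }

  parser = OptionParser.new do |opts|
    opts.banner = "Usage: reddit_junkie --subreddit SUB [--count COUNT] [--directory DIR]"
    opts.on("-s", "--subreddit SUB", "Subreddit to download from") { |v| options[:subreddit] = v }
    opts.on("-c", "--count COUNT", Integer, "Number of images") { |v| options[:count] = v }
    opts.on("-d", "--directory DIR", "Directory to save into") { |v| options[:directory] = v }
  end

  # parse (without the bang) leaves the caller's array untouched.
  parser.parse(argv)

  # The subreddit has no sensible default, so it is required here.
  raise OptionParser::MissingArgument, "--subreddit" unless options[:subreddit]

  options
end
```

With this approach, flags can appear in any order, and `--count=300` is accepted as well as `--count 300`.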

Known issues / not tested

  • The CLI tool hasn't been tested with the --endpoint flag yet, though it seems to work.
  • For more than 100 images, the count has to be divisible by 100, such as 300, 1000, or 25000. Since I made this tool to help me build a dataset, I haven't spent much time fixing this.
  • Reading of CLI flags/parameters isn't great. It works just fine, but not strictly in the POSIX way.
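
The divisible-by-100 restriction presumably comes from fetching in pages: Reddit's listing endpoints return at most 100 items per request, and each response carries an `after` cursor for the next page. One possible way to lift the restriction would be to split any count into page sizes up front. The sketch below shows just that arithmetic; `page_sizes` and `MAX_PER_REQUEST` are names invented for illustration, not part of the tool.

```ruby
# Largest page a single listing request can return.
MAX_PER_REQUEST = 100

# Splits a total count into per-request sizes: as many full pages
# as fit, followed by one smaller page for the remainder (if any).
def page_sizes(count, per_request = MAX_PER_REQUEST)
  raise ArgumentError, "count must be positive" unless count.positive?

  full_pages, remainder = count.divmod(per_request)
  sizes = Array.new(full_pages, per_request)
  sizes << remainder if remainder.positive?
  sizes
end
```

For instance, `page_sizes(250)` returns `[100, 100, 50]`, and `page_sizes(300)` returns `[100, 100, 100]`. Each size would become the `limit` of one request, with the `after` value from the previous response passed along to continue where it left off.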
