Python Selenium Tutorial - Make your own bot

Renaissance Troll ・7 min read

Summary

  • What is Selenium and Why learn it?
  • Setup
  • Basics of Selenium
  • Make a Reddit Bot with Selenium
  • Most commonly used Selenium features
  • Advanced Selenium Concepts

In this tutorial you'll make a simple Reddit bot using Python and Selenium WebDriver. Although the tutorial uses Python, the code is easy to adapt to other languages because Selenium has a consistent API and client libraries for many languages, including JavaScript, C#, PHP, and more.

When this tutorial is done you'll have a basic Reddit bot that can log in. You can then use what you've learned to add features like scraping content, upvoting/downvoting, or creating a post.

What is Selenium

Selenium is a tool for automating web browsers programmatically. Its primary use is automated software testing, but it is also commonly used for scraping content where rendering JavaScript is necessary, and for any other activity that requires automation in the browser, such as bots.

Setup

To follow this tutorial you'll need 2 things:

  • Basic Python environment for development
  • Google Chrome and Chromedriver (stable version 85 used for this tutorial)

Download the Chromedriver build that matches your Chrome version and operating system here, and place the executable file in the folder you are using for this project so Python can find it.

Basics of Selenium

First you need to install the Selenium client library for Python using pip:

pip install selenium

Now let's make sure everything is working properly with a simple script. We will use Selenium to load the Google homepage with the following code:

from selenium import webdriver
import time

driver = webdriver.Chrome()
driver.get('https://google.com')
time.sleep(5)
#quit() closes the window and also shuts down the Chromedriver process
driver.quit()

The code above should open up a Chrome window, load Google's homepage, pause for 5 seconds and then close the window. If not, check that you've installed Selenium and that Chromedriver is in the same directory you are running your code from.

Working with Elements in Selenium

Once we've loaded a web page, we want to be able to interact with it as well. To do this, Selenium gives us a number of different ways to locate elements and then interact with them, just like we would using a mouse and keyboard. There are 2 main things you'll need to do when automating a task with Selenium:

  1. Find the element you want to interact with using various options to locate the DOM element on the web page
  2. Use built-in methods to interact with the element such as clicking on a button or typing into a form

Finding elements with Selenium

There are many ways to skin a cat, and with Selenium you can usually locate the same element 2-4 different ways. Where possible, use simpler methods like CSS class names or IDs. More complex tools like XPath should only be used for complex web pages that require them.

As an example, you could grab the search bar element on Google's home page using any of the following lines of code:

from selenium.webdriver.common.by import By

#find_element(By...) works on Selenium 3 and 4; the older find_element_by_* helpers were removed in Selenium 4.3
search_bar = driver.find_element(By.NAME, "q")
search_bar = driver.find_element(By.CSS_SELECTOR, ".gLFyf.gsfi")
search_bar = driver.find_element(By.XPATH, "//input[contains(@class, 'gLFyf') and contains(@class, 'gsfi')]")

Each of these calls returns only the first element matching the query. If you want a list of all matching elements, change find_element to find_elements, like this:

driver.find_elements(By.ID, 'element-ID')

Selenium will return a list of web elements that you can iterate over using a basic for loop.
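
As a rough sketch of that loop, the helper below gathers the visible text of every element that matches a locator. The name collect_texts is made up for this example. The by argument takes the same strategy strings that the By class wraps, for example "tag name" for By.TAG_NAME or "css selector" for By.CSS_SELECTOR.

```python
def collect_texts(driver, by, value):
    """Return the non-empty visible text of every element matching the locator."""
    texts = []
    # find_elements returns an empty list (it does not raise) when nothing matches
    for element in driver.find_elements(by, value):
        text = element.text.strip()
        if text:
            texts.append(text)
    return texts
```

With a live browser, something like collect_texts(driver, "tag name", "a") would give you the text of every link on the current page.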

The best way to find locators for elements is to open up your developer tools and find the element, then look at it to see what your options are. In this case, the Google search bar doesn't have an ID tag so we have to use something else. My personal order for using locators is something like this:

  1. ID, CSS class name
  2. CSS selector, HTML name attribute
  3. Xpath, link text

Always use the easier options when possible, no point making things more complicated than they need to be.
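
One way to put a preference order like this into code is to keep a list of candidate locators, simplest first, and take the first one that matches. This is only a sketch: find_first and SEARCH_BAR_LOCATORS are names invented for the example, and the Google class names are the ones used earlier in this tutorial, which may have changed since.

```python
def find_first(driver, locators):
    """Try (by, value) pairs in order and return the first matching element."""
    for by, value in locators:
        # find_elements gives back an empty list instead of raising on no match
        matches = driver.find_elements(by, value)
        if matches:
            return matches[0]
    return None


# Candidate locators for the Google search bar, simplest first.
# The strings are the values behind By.NAME, By.CSS_SELECTOR and By.XPATH.
SEARCH_BAR_LOCATORS = [
    ("name", "q"),
    ("css selector", ".gLFyf.gsfi"),
    ("xpath", "//input[contains(@class, 'gLFyf')]"),
]
```

Calling find_first(driver, SEARCH_BAR_LOCATORS) returns None when nothing matches, so you can print a clear message instead of digging through a stack trace.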

Interacting with Elements

The above locator methods will return Selenium WebElement objects which have different methods we can use to interact with the DOM element. Some of the functions you will use most often are:

  • element.click() - Click on the element
  • element.get_attribute() - Returns the value of the named attribute, or None if the element doesn't have it
  • element.send_keys() - Type value into input field
  • element.is_displayed() - Check if element is visible, useful for checking if web page is displaying properly under certain conditions such as a user being logged in
  • element.submit() - Submit the form the element belongs to
  • element.location - property that returns the element's x/y location in pixels as a dictionary
  • element.text - returns the visible text of the element
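
To show how a few of these fit together, here is a small sketch with two helpers. Both names (describe and run_search) are hypothetical, and run_search assumes the page has a search box whose name attribute is "q", as Google's does in this tutorial.

```python
def describe(element):
    """Collect a few commonly used facts about a WebElement in one dict."""
    return {
        "visible": element.is_displayed(),       # True or False
        "name": element.get_attribute("name"),   # attribute value, or None if missing
        "text": element.text,                    # visible text
        "location": element.location,            # dict with "x" and "y" in pixels
    }


def run_search(driver, query):
    """Type a query into a search box named "q" and submit its form."""
    search_bar = driver.find_element("name", "q")  # "name" is the value of By.NAME
    search_bar.click()
    search_bar.send_keys(query)
    search_bar.submit()
```

After driver.get('https://google.com'), calling run_search(driver, 'selenium python') should type the query and load the results page, assuming no consent dialog gets in the way.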

Making a Reddit Bot

Now let's put what you've learned above into practice by creating a simple bot that logs into Reddit. Here's how you can do that in a few lines of code:

from selenium import webdriver
from selenium.webdriver.common.by import By
import time

driver = webdriver.Chrome()

try:
    driver.get('https://reddit.com')

    login_btn = driver.find_element(By.CLASS_NAME, '_3Wg53T10KuuPmyWOMWsY2F')
    login_btn.click()

    frame = driver.find_element(By.CLASS_NAME, '_25r3t_lrPF3M6zD2YkWvZU')

    #switch to iframe popup context
    driver.switch_to.frame(frame)

    username_field = driver.find_element(By.ID, 'loginUsername')
    username_field.click()
    username_field.send_keys('username')

    password_field = driver.find_element(By.ID, 'loginPassword')
    password_field.click()
    password_field.send_keys('password123')

    submit_btn = driver.find_element(By.CLASS_NAME, 'AnimatedForm__submitButton')

    #click on submit button to login
    submit_btn.click()

    #sleep to see result
    time.sleep(17)
#print any exceptions such as element not found error
except Exception as e:
    print(e)
    print('driver closing on error')
#close the browser whether or not an error occurred
finally:
    driver.quit()

The key thing to note with the above code is that it is wrapped in a try/except block for error handling. Printing out errors is very useful when working with Selenium because it lets you find exactly why your code has a bug. The majority of bugs you will run into come from not using the right selector to find an element.
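
If you find yourself repeating that error handling in every script, one option is to move it into a small context manager. The sketch below is my own wrapper, not part of Selenium: browser_session is a made-up name, and make_driver is any callable that returns a driver, such as webdriver.Chrome.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(make_driver):
    """Create a driver, hand it to the caller, and always quit it afterwards."""
    driver = make_driver()
    try:
        yield driver
    except Exception as error:
        # Print the error so a bad selector is easy to spot, then re-raise it
        print(f"driver closing on error: {error}")
        raise
    finally:
        # Runs on success and on failure, so no browser is left open
        driver.quit()
```

Usage would look like "with browser_session(webdriver.Chrome) as driver:" followed by the same steps as in the Reddit example.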

Another thing that is rarely encountered but initially caused me problems is dealing with iframes in Selenium. You have to explicitly switch to the iframe, otherwise you won't be able to interact with the elements located within the iframe. This is what the frame element is used for.
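
The other half of working with iframes is switching back out, since the driver stays inside the frame until you tell it otherwise. A minimal sketch of one way to handle both directions follows. The helper name inside_frame is invented here, while switch_to.frame and switch_to.default_content are the Selenium calls doing the work.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame):
    """Switch into an iframe for the duration of a with-block, then switch back."""
    driver.switch_to.frame(frame)
    try:
        yield driver
    finally:
        # Return to the top-level page so later lookups see the main document
        driver.switch_to.default_content()
```

Inside "with inside_frame(driver, frame):" you can locate the login fields as usual, and once the block ends your element lookups target the main page again.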

Advanced Selenium concepts

We've covered the basics and those will be all you need for 90% of use cases, but below I'll go over some other useful features of Selenium that I've used after running into various real world problems.

Options and Capabilities

The Options and capabilities objects can be passed as arguments to Selenium when you create a browser instance. Some useful features include running in headless mode, loading browser extensions, preventing browser notifications, choosing the size of the browser window, and running Selenium from behind a proxy.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.proxy import Proxy, ProxyType

#copy the defaults so the shared capabilities dictionary is not modified
capabilities_chrome = webdriver.DesiredCapabilities.CHROME.copy()
option = Options()

#use proxy server with Selenium (replace the placeholders with your proxy's host:port)
prox = Proxy()
prox.proxy_type = ProxyType.MANUAL
prox.http_proxy = '00.000.0000'
prox.ssl_proxy =  '000.000.0000'
#add proxy using chrome capabilities
prox.add_to_capabilities(capabilities_chrome)

#run selenium in headless mode to save resources
option.add_argument("--headless")
#disable infobar showing chrome is being run by automated software
option.add_argument("--disable-infobars")
#start browser window at full size of screen
option.add_argument("--start-maximized")
#add ublock ad blocker to Selenium browser instance
path_to_ublock = 'file path'
option.add_extension(path_to_ublock)

#block notifications in window, like those asking for location
#(a value of 1 allows notifications, 2 blocks them)
option.add_experimental_option("prefs", {
    "profile.default_content_setting_values.notifications": 2
})

#desired_capabilities is the Selenium 3 way of passing capabilities
driver = webdriver.Chrome(options=option, desired_capabilities=capabilities_chrome)

For a full list of options and capabilities you can visit the following links:

Use browser extensions

In some cases it might be useful to load your Selenium browser instance with an extension like an ad blocker. Doing this is fairly simple: download the extension itself and then pass the CRX file path to Selenium using code similar to this:

from selenium import webdriver

file_path = 'path_to_file.crx'
options = webdriver.ChromeOptions()
options.add_extension(file_path)
#the options keyword replaces the deprecated chrome_options
driver = webdriver.Chrome(options=options)

Action Builders and Action Chains

Action chains are for when you need more fine-tuned control over low-level browser actions like mouse movement, mouse button actions, key presses, double clicks, and context menu interaction, such as testing custom right-click features.

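from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
actions = ActionChains(driver)

#offsets in pixels from the current mouse position (placeholder values)
x_value, y_value = 100, 50
actions.move_by_offset(x_value, y_value)
actions.click()
#perform all chained actions added so far
actions.perform()

#other potential actions (each one is queued until perform() is called)

actions.double_click()

#right click
actions.context_click()

#source_element and target_element stand for WebElements you have already located
actions.drag_and_drop(source_element, target_element)

#hold down and release a modifier key
actions.key_down(Keys.CONTROL)
actions.key_up(Keys.CONTROL)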
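
Because each ActionChains method returns the chain itself, the calls can also be written as one expression. Here is a minimal sketch of a hover-then-click helper. The function name hover_and_click is made up, and it expects an ActionChains instance plus two elements you have already located.

```python
def hover_and_click(actions, menu, item):
    """Hover over a menu element, then click an item that the hover reveals."""
    # Nothing happens in the browser until perform() runs at the end of the chain
    actions.move_to_element(menu).click(item).perform()
```

You would call it as hover_and_click(ActionChains(driver), menu_element, item_element).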

Storing cookies

If you are working with a website that requires being logged in, storing session cookies can make things easier so you don't have to log in every time you run Selenium in a new instance. To do this you simply have to export and save the cookies and then add them to Selenium the next time you run an instance.

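import pickle
from selenium import webdriver

driver = webdriver.Chrome()

#store cookies from current session
with open('cookie_file.pkl', 'wb') as cookie_file:
    pickle.dump(driver.get_cookies(), cookie_file)

#load cookies from a previous session
#the browser must already be on the cookies' domain before add_cookie is called
with open('cookie_file.pkl', 'rb') as cookie_file:
    cookies = pickle.load(cookie_file)
for cookie in cookies:
    driver.add_cookie(cookie)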
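
One detail worth knowing is that Selenium only accepts a cookie for the domain the browser currently has loaded, so you need to open the site before restoring cookies. The sketch below wraps both steps in two helpers. The names save_cookies and load_cookies and the url parameter are my own choices for illustration.

```python
import pickle


def save_cookies(driver, path):
    """Write the current session's cookies to a pickle file."""
    with open(path, "wb") as cookie_file:
        pickle.dump(driver.get_cookies(), cookie_file)


def load_cookies(driver, path, url):
    """Open url, restore cookies saved by save_cookies, and reload the page."""
    # add_cookie only works for the domain that is currently loaded
    driver.get(url)
    with open(path, "rb") as cookie_file:
        cookies = pickle.load(cookie_file)
    for cookie in cookies:
        driver.add_cookie(cookie)
    # Reload so the page is requested again with the restored cookies
    driver.refresh()
    return len(cookies)
```

A typical flow would be to log in once, call save_cookies(driver, 'cookie_file.pkl'), and on later runs call load_cookies(driver, 'cookie_file.pkl', 'https://reddit.com') instead of logging in again. Only unpickle cookie files you created yourself, since pickle can execute code from untrusted files.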

File Uploads

Selenium can't interact with the native file explorer that opens when you click on a file upload button in the browser. Instead, we grab the element for uploading the file, type the entire file path as input, and then click on the submit button:

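from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()

#go to page where you want to upload file (the URL must include its scheme)
driver.get('https://page-with-file-upload.com')

file_upload = driver.find_element(By.ID, 'file-upload')
#type the full path of the file into the file input element
file_upload.send_keys('file path as string')

submit_button = driver.find_element(By.ID, 'button')
submit_button.click()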
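
Since the browser generally expects the entire path, it can help to resolve and check the path in Python before handing it to Selenium. The helper below is only a sketch. upload_file is a hypothetical name, and file_input is the file input element you located with Selenium.

```python
import os


def upload_file(file_input, path):
    """Send a local file's absolute path to a file input element."""
    absolute_path = os.path.abspath(path)
    # Fail early with a clear error instead of a less obvious browser-side one
    if not os.path.isfile(absolute_path):
        raise FileNotFoundError(absolute_path)
    file_input.send_keys(absolute_path)
    return absolute_path
```

With the example above you would call upload_file(file_upload, 'report.pdf') and then click the submit button as before.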

Conclusion

Hopefully this tutorial gave you a decent understanding of what Selenium is capable of. Pretty much anything you can do manually with a web browser can also be done with Selenium, so the only limit is your imagination.

If you have any questions about Selenium or requests for future tutorials leave a comment below!
