The selenium module lets you control web browsers from Python. You can drive Chrome, Firefox, Chrome Mobile and many other browsers directly from Python code. This tutorial assumes you know the basics of Python.
Install
You need two things: the selenium module and a web driver for your browser. Use the Python package manager pip to install the module.
pip3 install selenium
Each browser has its own web driver. You can find a list of drivers on the Selenium website.
For Chrome you need chromedriver. For Firefox you need geckodriver. The version of the driver needs to match the browser version on your PC.
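If Selenium cannot find the driver executable, starting the browser typically fails with a "No such file or directory" error. As a rough pre-flight check, the sketch below looks for the driver names mentioned above on your PATH. The helper name find_driver is made up for this example, and it assumes the drivers are meant to be found through PATH rather than passed to Selenium explicitly.

```python
import shutil


def find_driver(name):
    """Return the full path of a driver executable on PATH, or None if missing."""
    return shutil.which(name)


if __name__ == "__main__":
    # Driver names taken from the text above; extend the tuple for other browsers.
    for name in ("geckodriver", "chromedriver"):
        path = find_driver(name)
        print(f"{name}: {path or 'not found on PATH'}")
```

If a driver is reported as missing, download it and place it in a directory that is listed in PATH.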
Selenium
After the driver and module are installed, you can fire up a browser. The first thing to do is import webdriver from the selenium module, along with the time module.
Then start a browser instance:
#!/usr/bin/python3
from selenium import webdriver
import time
browser = webdriver.Firefox()
Get the webpage:
#!/usr/bin/python3
browser.get("https://twitter.com")
Putting it together, the complete script is:
#!/usr/bin/python3
from selenium import webdriver
import time

# start web browser
browser = webdriver.Firefox()

# load the page and give it a moment to render
browser.get("https://twitter.com")
time.sleep(2)

# get source code
html = browser.page_source
print(html)

# close web browser and end the driver session
browser.quit()
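A fixed two-second pause is either too long or too short on most pages. One possible alternative is to poll for a condition and stop as soon as it holds. The sketch below is a hand-rolled helper, not part of Selenium: the name wait_for and the "main" selector are assumptions for illustration. Selenium also ships its own WebDriverWait class in selenium.webdriver.support.ui for the same purpose.

```python
import time


def wait_for(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


def main():
    # Imported here so wait_for() itself has no Selenium dependency.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    browser = webdriver.Firefox()
    try:
        browser.get("https://twitter.com")
        # "main" is an assumed selector; use one that exists on your page.
        found = wait_for(lambda: browser.find_elements(By.CSS_SELECTOR, "main"))
        print(browser.page_source if found else "timed out waiting for the page")
    finally:
        browser.quit()
```

Call main() to try it against a real browser. The try/finally block makes sure the browser is shut down even if loading the page raises an error.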
The full script opens Firefox, loads the page and prints its HTML to the terminal. You can interact with the browser just as you would normally do: click elements, type, scroll and much more.
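As a minimal sketch of that kind of interaction, the function below types into a form field, submits it and scrolls to the bottom of the page. The function name search_and_scroll, the URL https://example.org/search and the field name "q" are placeholders invented for this example; replace them with values from the page you are automating.

```python
def search_and_scroll(browser, by, locator, text):
    """Type text into one element, submit its form, then scroll to the bottom."""
    box = browser.find_element(by, locator)
    box.clear()
    box.send_keys(text)
    box.submit()
    # Scrolling is done by running a small piece of JavaScript in the page.
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    return browser.title


def main():
    # Imported here so the helper above can be tried without a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    browser = webdriver.Firefox()
    try:
        browser.get("https://example.org/search")  # placeholder URL
        print(search_and_scroll(browser, By.NAME, "q", "selenium"))
    finally:
        browser.quit()
```

Call main() to run it in a real browser. Clicking works the same way: look up an element with find_element and call its click() method.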
Top comments (2)
Nice quick intro to Selenium! I like the brevity.
It didn't occur to me that I can run Selenium from Python. Derp!
If anybody else gets the error "FileNotFoundError: [Errno 2] No such file or directory: 'geckodriver'", there's a nice StackOverflow answer that explains how to get the geckodriver binary for your OS.
Any best alternative please