Paulo Mota

How to use Selenium on linux

Developers are always looking for ways to reuse their code, so I'm going to share a Selenium setup that you can reuse in your own projects on both Windows and Linux.

First of all, install the libraries used in this project:

pip install selenium beautifulsoup4 tqdm pandas chromedriver-autoinstaller
I really like Selenium because many sites render their content with JavaScript, so most of the time you need to wait for the page to load completely before you start scraping.

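After that, run these commands on Linux in the same folder as your project (the .py file):

wget https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip
unzip chromedriver_linux64.zip

This downloads and unpacks the Chrome web driver for Linux. The driver version has to match the Chrome version installed on your machine, and 2.41 is an old release, so check which one your browser needs. For the full Chrome setup on Ubuntu, see https://trendoceans.com/how-to-install-and-setup-selenium-with-google-chrome-on-ubuntu/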

Let's code!

Imports

The library that lets you choose your browser
from selenium import webdriver

Useful for calling time.sleep() to pause for a few seconds on a page
import time

Handy for extracting information from HTML
from bs4 import BeautifulSoup

I use this to show a progress bar in for loops
import tqdm

I use pandas to create DataFrames and export the scraped data to CSV
import pandas as pd

You can use these to wait for a specific element to load
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
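
To show what an explicit wait does compared with a fixed time.sleep(), here is a minimal standard-library sketch of the polling idea. It is not Selenium's implementation: `wait_until`, its parameters, and the fake `page_loaded` condition are hypothetical names used only for illustration. Selenium's WebDriverWait(driver, timeout).until(...) applies the same idea to page elements.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() every `poll` seconds until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition was not met in %.1f seconds" % timeout)
        time.sleep(poll)


# Stand-in for a page that finishes loading on the third check.
checks = {"count": 0}


def page_loaded():
    checks["count"] += 1
    return ["h3 title"] if checks["count"] >= 3 else []


titles_found = wait_until(page_loaded, timeout=2.0, poll=0.01)
print(titles_found)
```

With a real driver, the condition could be something like `lambda: driver.find_elements(By.TAG_NAME, 'h3')`, which returns an empty list until the elements exist.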

Manages drop-down components on a web page
from selenium.webdriver.support.ui import Select

Operating system commands
import os

Downloads the web driver into your project automatically (I use it only on Windows)
import chromedriver_autoinstaller

You will use this to check which operating system your bot is running on (Windows or Linux)
import platform

Implementation

Check your operating system!
OP_SYSTEM = platform.system()
print(OP_SYSTEM)
if OP_SYSTEM.lower() == 'windows':
    chromedriver_autoinstaller.install()

Create a folder to receive your downloads
folder = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'data')
os.makedirs(folder, exist_ok=True)

Set Google Chrome options
options = webdriver.ChromeOptions()

Define download settings
Set a specific folder for files downloaded through Selenium (the default is the Downloads folder)
prefs = {
    "download.default_directory": folder,
    "download.prompt_for_download": False,
    "download.directory_upgrade": True
}

options.add_experimental_option('prefs', prefs)
This option hides the browser. To see the browser, comment out the line below.
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--allow-running-insecure-content")
options.add_argument("--window-size=1920,1080")
options.add_argument("--disable-extensions")
options.add_argument("--proxy-server='direct://'")
options.add_argument("--proxy-bypass-list=*")
options.add_argument("--start-maximized")
options.add_argument('--disable-gpu')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--ignore-certificate-errors')
options.add_experimental_option('excludeSwitches', ['enable-logging'])

Reduce the Selenium log output in the console (much cleaner!)
options.add_argument('--log-level=3')
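
Rather than commenting the --headless line in and out, the flags can be built from a parameter. This is a small sketch assuming a hypothetical helper named `build_chrome_args`; the flags themselves are a subset of the ones listed above.

```python
def build_chrome_args(headless=True, window_size=(1920, 1080)):
    """Return the Chrome command-line flags as a list of strings."""
    width, height = window_size
    args = [
        "--no-sandbox",
        "--disable-gpu",
        "--disable-dev-shm-usage",
        "--window-size=%d,%d" % (width, height),
        "--log-level=3",
    ]
    if headless:
        # Hide the browser window; pass headless=False to watch it run.
        args.insert(0, "--headless")
    return args


print(build_chrome_args(headless=False))
```

Each flag can then be applied with `for arg in build_chrome_args(): options.add_argument(arg)`.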
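
Choose the web driver according to your system (Windows or Linux)

from selenium.webdriver.chrome.service import Service

if OP_SYSTEM.lower() == 'windows':
    driver = webdriver.Chrome(options=options)
else:
    # Selenium 4 takes the driver path through a Service object
    driver_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'chromedriver')
    driver = webdriver.Chrome(service=Service(driver_path), options=options)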
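
The choice between the two branches can be isolated in a small function, which makes it easy to check without opening a browser. This is a sketch under my own assumptions: `chromedriver_path` is a hypothetical helper, and it assumes the unzipped chromedriver sits next to the script, as in the wget step.

```python
import os
import platform


def chromedriver_path(base_dir, system=None):
    """Return the driver path to use, or None when the autoinstaller handles it.

    `system` defaults to platform.system(), for example 'Windows' or 'Linux'.
    """
    system = (system or platform.system()).lower()
    if system == "windows":
        return None  # on Windows the autoinstaller takes care of the driver
    return os.path.join(base_dir, "chromedriver")


print(chromedriver_path("/home/user/project", system="Linux"))
```

A None result means no explicit path is needed; any other value can be passed on to the driver setup.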

driver.get("https://google.com")

Your scraping code starts here. Everything above is the default setup I like to use in my scripts!
search_box = driver.find_element(By.NAME, 'q')
search_box.send_keys('What is Python?')
search_click = driver.find_element(By.NAME, 'btnK')
search_click.submit()
time.sleep(2)
titles = driver.find_elements(By.TAG_NAME, 'h3')
for title in titles:
    print(title.text)
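
The titles printed by the loop can also be saved to the data folder. Here is a minimal sketch that uses the standard-library csv module so it runs without extra installs; the function name `save_titles` and the `title` column are my own assumptions. With pandas, the equivalent call would be `pd.DataFrame({'title': titles}).to_csv(path, index=False)`.

```python
import csv
import os


def save_titles(titles, folder, filename="titles.csv"):
    """Write one title per row to folder/filename and return the file path."""
    os.makedirs(folder, exist_ok=True)  # create the folder if it is missing
    path = os.path.join(folder, filename)
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["title"])  # header row
        for title in titles:
            writer.writerow([title])
    return path
```

In the script above it could be called as `save_titles([t.text for t in driver.find_elements(By.TAG_NAME, 'h3')], folder)`.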

It is good practice to end the browser process so it doesn't keep consuming resources
driver.close()
driver.quit()
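
If the scraping code raises an exception, the lines above are never reached and the browser process stays alive. One way to guard against that is a context manager that always calls quit(). This is a minimal sketch: `managed_driver` and `FakeDriver` are hypothetical names, and the fake class only stands in for a real web driver so the example runs without a browser.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(driver):
    """Yield the driver and always call driver.quit() afterwards."""
    try:
        yield driver
    finally:
        driver.quit()  # runs even when the scraping code raises


class FakeDriver:
    """Stand-in for a Selenium driver, used only for this demonstration."""

    def __init__(self):
        self.quit_called = False

    def quit(self):
        self.quit_called = True


fake = FakeDriver()
try:
    with managed_driver(fake) as driver:
        raise RuntimeError("scraping failed")
except RuntimeError:
    pass

print(fake.quit_called)
```

In a real script, the scraping code would go inside the `with` block in place of the raised error.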

Good Hacking!

Add me on LinkedIn!
