What is Selenium? Why do we use Selenium for Automation?

Introduction to Selenium:

Automation:

Automation is the use of technology to perform tasks with minimal or no human intervention.

Selenium:

Selenium is an open-source automation framework. (Open source means the source code is publicly available to use, modify, and redistribute under its license, which for Selenium is the Apache License 2.0. No paid license is needed to download or use it.)
It was created by Jason Huggins in 2004. The original tool was written in JavaScript.
Selenium can be used with various programming languages like Python, JavaScript, C#, Java, PHP and so on.
With Selenium we can validate or verify the behavior of web applications.
Selenium is predominantly used in automation testing.
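
As a small illustration of validating a web application, the sketch below checks that a page shows an expected title. The function name page_title_matches is hypothetical; driver.get() and driver.title are part of the Selenium WebDriver API, and the driver object is assumed to be created elsewhere.

```python
def page_title_matches(driver, url, expected_title):
    """Open `url` in the given WebDriver and compare the page title.

    `driver` is assumed to be an already started Selenium WebDriver
    (or any object exposing the same `get` method and `title` attribute).
    """
    driver.get(url)  # navigate the browser to the page
    return driver.title == expected_title  # verify what the page reports
```

With a real browser this would typically be called as page_title_matches(webdriver.Chrome(), "https://example.com", "Example Domain"), followed by driver.quit() to close the browser.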

Advantages of Selenium:

It is an open-source framework.
It works on multiple operating systems, such as Windows, macOS, and Linux / Unix.
It works with multiple web browsers, such as Microsoft Edge, Chrome, Firefox, Safari, and Opera.
You can integrate Selenium with testing frameworks such as Pytest and Python Behave.
The Selenium library itself adds little CPU and RAM overhead; most of the resource usage comes from the browser it drives.

Disadvantages of Selenium:
It supports only web-based applications.
You need programming knowledge (Python or another supported language) as well as knowledge of the framework itself, so the learning curve is steep.
CAPTCHA and SMS-based OTP verification cannot be automated with Selenium alone.
There is no dedicated vendor support, so you rely on community resources such as forums and documentation, and finding a solution to a specific problem can take considerable effort.
Writing and maintaining test scripts takes considerable time.

Selenium Architecture

The architecture of Selenium is composed of:
Selenium IDE
IDE - Integrated Development Environment
It is a web-browser extension: you download and install it for your particular browser and then start working with it.
One of its major advantages is that it can record your interactions with a page and play them back as an automated test.
People generally do not use the IDE; they prefer to write Selenium scripts to do the same job.

Selenium Remote Control
It is an outdated and deprecated technology that is no longer used.
It has been replaced by WebDriver, which is much better and easier to use.

Selenium WebDriver:
It is the most important component of the Selenium architecture.
It provides a programming interface between the language and the web-browser.
It is composed of:
Selenium Client Library:
It consists of the language bindings, that is, the commands you use when writing your automation scripts.
They follow the W3C WebDriver specification and communicate with the browser driver over HTTP.
They are wrappers that send the script's commands over the network for execution in the web browser.

Selenium API
It defines the set of functions and rules through which your programming language (Python) communicates with Selenium.
It lets you automate the browser without needing to understand what happens in the background.

JSON wire protocol
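The commands that you write are converted into JSON, which is then transmitted to the browser driver so that they can be executed in the web browser.
The JSON data is sent to the browser driver over HTTP, and the driver replies with a JSON response.
Selenium 4 replaces the legacy JSON Wire Protocol with the W3C WebDriver protocol, which follows the same JSON-over-HTTP pattern and is implemented by the drivers of the major browsers.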
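
To make the JSON-over-HTTP idea concrete, here is a sketch of the requests a client library would send for two common commands. The endpoint paths and body fields follow the W3C WebDriver specification; the helper names and the session id are made up for illustration, and nothing is actually sent over the network.

```python
import json


def navigate_request(session_id, url):
    """Describe the HTTP request for the 'Navigate To' command."""
    return {
        "method": "POST",
        "path": f"/session/{session_id}/url",
        "body": json.dumps({"url": url}),
    }


def find_element_request(session_id, xpath):
    """Describe the HTTP request for 'Find Element' using an XPath locator."""
    return {
        "method": "POST",
        "path": f"/session/{session_id}/element",
        "body": json.dumps({"using": "xpath", "value": xpath}),
    }


# "abc123" is a placeholder; a real session id is returned by the driver
# when the session is created.
request = find_element_request("abc123", "//input[@id='fname']")
print(request["method"], request["path"])
print(request["body"])
```

In normal use you never build these requests yourself; the client library does it whenever you call a method such as driver.get() or driver.find_element().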

Browser Drivers
It acts as a bridge between the Selenium script, its libraries, and the browser.
It helps us run Selenium commands on the web browser.
Each browser has its own driver:
geckodriver (Firefox): https://github.com/mozilla/geckodriver/releases/tag/v0.34.0
ChromeDriver (Chrome): https://googlechromelabs.github.io/chrome-for-testing/
Edge WebDriver (Microsoft Edge): https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/?form=MA13LH

Browser:
Any supported web browser, such as Chrome, Firefox, Edge, or Safari, in which the commands are finally executed.

Selenium Grid:
It is used to run tests in parallel on multiple machines running the same or different browsers; the machines may also be in different geographical locations.
We can run multiple test cases simultaneously.
It uses a hub-and-node architecture (historically described as master-slave): the hub receives test requests and distributes them to the nodes that run the browsers.
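
The sketch below shows one way the same check could be run on several browsers at once through a Grid. It assumes a Grid hub is already running at http://localhost:4444 (an assumption, adjust to your setup); the function names are hypothetical, while webdriver.Remote and the browser Options classes come from the Selenium Python bindings.

```python
from concurrent.futures import ThreadPoolExecutor

GRID_URL = "http://localhost:4444"  # assumed address of a running Grid hub


def remote_driver(browser_name):
    """Ask the Grid for a browser session of the given type."""
    # Imported here so the helpers below can be reused with other factories.
    from selenium import webdriver

    option_classes = {
        "chrome": webdriver.ChromeOptions,
        "firefox": webdriver.FirefoxOptions,
        "edge": webdriver.EdgeOptions,
    }
    options = option_classes[browser_name]()
    return webdriver.Remote(command_executor=GRID_URL, options=options)


def title_on(browser_name, url, driver_factory=remote_driver):
    """Open `url` in one browser and return (browser_name, page title)."""
    driver = driver_factory(browser_name)
    try:
        driver.get(url)
        return browser_name, driver.title
    finally:
        driver.quit()  # always release the session back to the Grid


def run_on_all(browsers, url, driver_factory=remote_driver):
    """Run the title check on every browser in parallel threads."""
    with ThreadPoolExecutor(max_workers=len(browsers)) as pool:
        results = pool.map(lambda name: title_on(name, url, driver_factory), browsers)
        return dict(results)
```

A call such as run_on_all(["chrome", "firefox"], "https://example.com") would return a dictionary mapping each browser name to the title it saw, provided the Grid has nodes for those browsers.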

PyCharm:
It is a popular IDE for Python.
https://www.jetbrains.com/pycharm/download/?section=windows

Commands to install selenium and webdriver-manager:

pip install selenium
pip install webdriver-manager
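
Once both packages are installed, a first script might look like the sketch below. It assumes Chrome is installed on the machine; the helper names are hypothetical. Recent Selenium releases (4.6 and later) can also locate a driver on their own, in which case webdriver.Chrome() without a Service is enough.

```python
def make_chrome_driver():
    """Start Chrome with a driver binary downloaded by webdriver-manager."""
    # Imported inside the function so this file can be loaded even
    # before the two packages above have been installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    # install() downloads a matching chromedriver and returns its path.
    service = Service(ChromeDriverManager().install())
    return webdriver.Chrome(service=service)


def fetch_title(url):
    """Open `url` in Chrome, return the page title, then close the browser."""
    driver = make_chrome_driver()
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()
```

Calling fetch_title("https://example.com") opens a Chrome window, loads the page, and returns its title.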

XPATH in Selenium:

XPath (XML Path Language) is a query language for navigating the HTML structure of a web page. Selenium supports it as a locator strategy, so XPath expressions can be used to find elements on a page.

Syntax of XPath:

Xpath = //tagname[@attribute='value']
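
A minimal sketch of how this syntax is used from Python: one helper fills in the template, and another passes the result to Selenium. The helper names are hypothetical; driver.find_element and By.XPATH are part of the Selenium API. The sketch assumes the value contains no single quote.

```python
def xpath_by_attribute(tag, attribute, value):
    """Fill in the //tagname[@attribute='value'] template."""
    return f"//{tag}[@{attribute}='{value}']"


def find_by_attribute(driver, tag, attribute, value):
    """Locate the first matching element with an already started driver."""
    from selenium.webdriver.common.by import By  # needs the selenium package

    return driver.find_element(By.XPATH, xpath_by_attribute(tag, attribute, value))


print(xpath_by_attribute("input", "id", "fname"))
```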

Types of XPath:

Absolute XPath:
Absolute XPath is the direct way of finding an element: it spells out the full path from the root of the document. Its major drawback is that any change anywhere along the element's path makes the XPath fail.
It starts from the root tag (html).
It always begins with a single forward slash (/).
An absolute XPath looks like:

/html/body/header/nav/div/div/div[2]/div[1]/ul/li[2]/a

Relative XPath:

With a relative XPath, the path can begin anywhere in the HTML DOM structure. The expression starts with a double forward slash (//), which means the element is searched for anywhere on the web page.
It saves us from writing a long XPath.
Syntax:

Xpath = //input[@id="fname"]

Xpath = //*[@id="navbarDropdown"]/text
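
The difference in robustness between the two styles can be sketched without a browser, using Python's built-in xml.etree.ElementTree. This module supports only a subset of XPath and evaluates paths relative to the root element, so the absolute path is written as ./body/... rather than /html/body/.... The markup and ids are invented for the example.

```python
import xml.etree.ElementTree as ET

ORIGINAL = """
<html><body>
  <header><nav><ul>
    <li><a id="home" href="/">Home</a></li>
    <li><a id="docs" href="/docs">Docs</a></li>
  </ul></nav></header>
</body></html>
"""

# Same page after a developer wraps the header in an extra <div>.
CHANGED = """
<html><body>
  <div class="wrapper">
    <header><nav><ul>
      <li><a id="home" href="/">Home</a></li>
      <li><a id="docs" href="/docs">Docs</a></li>
    </ul></nav></header>
  </div>
</body></html>
"""

ABSOLUTE = "./body/header/nav/ul/li[2]/a"  # every step from the root
RELATIVE = ".//a[@id='docs']"              # search anywhere by attribute


def locate(markup, path):
    """Return the text of the first element matching `path`, or None."""
    element = ET.fromstring(markup).find(path)
    return None if element is None else element.text


print(locate(ORIGINAL, ABSOLUTE))  # Docs
print(locate(CHANGED, ABSOLUTE))   # None, the extra <div> broke the path
print(locate(CHANGED, RELATIVE))   # Docs
```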

Functions in XPATH:

contains():
The XPath contains() function is useful when part of an attribute's value changes dynamically. It matches elements whose attribute value (or text) contains the given partial string.

//tag_name[contains(@attribute, 'value of the attribute')]

text():
This function locates an element on a web page by its exact text value. It is mainly used for elements that contain text, such as p, a, and label.

//tag_name[text()="Text of the element"]

starts-with():
This function finds elements whose attribute value starts with a specific character or sequence of characters. It is again useful on dynamic web pages.

//tagname[starts-with(@attribute, 'first_part_of_the_attribute_value')]
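
The three functions can be wrapped in small helpers that build the expression strings, as in the sketch below. The helper names and sample values are hypothetical, and the values are assumed to contain no single quote. Each resulting string can be passed to driver.find_element(By.XPATH, ...) in a Selenium script.

```python
def xpath_contains(tag, attribute, fragment):
    """Match elements whose attribute value contains `fragment`."""
    return f"//{tag}[contains(@{attribute}, '{fragment}')]"


def xpath_text(tag, text):
    """Match elements whose text is exactly `text`."""
    return f"//{tag}[text()='{text}']"


def xpath_starts_with(tag, attribute, prefix):
    """Match elements whose attribute value begins with `prefix`."""
    return f"//{tag}[starts-with(@{attribute}, '{prefix}')]"


# Example: an id such as "user_48213" whose numeric part changes on each load.
print(xpath_contains("input", "id", "user_"))
print(xpath_text("a", "Sign in"))
print(xpath_starts_with("button", "class", "btn-"))
```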