Overview
Selenium WebDriver is one of the most widely used tools for automating web applications, and it is by far the most important component of the Selenium suite. It provides different drivers for different browsers and supports multiple programming languages. This article describes the Selenium WebDriver architecture and explains how Selenium works internally.
As the picture above depicts, the Selenium architecture has four components:
1. Selenium Client Library
2. JSON Wire Protocol over HTTP
3. Browser Drivers
4. Browsers
We will try to understand each of these four components briefly:
1. Selenium Client Library
Selenium provides client libraries (language bindings) for multiple languages such as Java, Ruby, Python, etc.
This means that we can write our test code in any of these languages in our IDE.
After we trigger the test, each Selenium command written in the code is converted into JSON format behind the scenes.
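To make the conversion step concrete, here is a minimal sketch in Python of how a client library might turn one command into an HTTP method, a path, and a JSON body. The COMMANDS table, the build_request function, and the session id "abc123" are hypothetical names used only for illustration; real client libraries are considerably more involved.

```python
import json

# Hypothetical lookup table: client-side command name -> (HTTP method, path template).
# The paths follow the general shape of WebDriver endpoints.
COMMANDS = {
    "get": ("POST", "/session/{session_id}/url"),
    "title": ("GET", "/session/{session_id}/title"),
    "quit": ("DELETE", "/session/{session_id}"),
}


def build_request(command, session_id, params=None):
    """Translate one client-side command into an HTTP method, path, and JSON body."""
    method, template = COMMANDS[command]
    path = template.format(session_id=session_id)
    # Only commands that carry parameters need a JSON body.
    body = json.dumps(params) if params is not None else None
    return method, path, body


# "abc123" is a made-up session id used only for this illustration.
print(build_request("get", "abc123", {"url": "https://example.com"}))
```

The point of the sketch is that the test author only calls a method such as get, while the library decides which endpoint and which JSON payload correspond to it.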
2. JSON Wire Protocol over HTTP
JSON stands for JavaScript Object Notation. It is a lightweight data-interchange format used to transfer data between a server and a client on the web. The JSON Wire Protocol acts as a mediator between the client libraries and the browser drivers. Because the server does not understand the programming language the test is written in, the protocol relies on serialization (converting object data to JSON format) and deserialization (converting JSON format back to an object). The JSON Wire Protocol is defined as a set of REST-style endpoints that work over HTTP.
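The serialization and deserialization steps can be sketched with Python's standard json module. The FindElementCommand class and the two helper functions are hypothetical and exist only for this illustration; the "using" and "value" field names mirror the shape of a find-element request.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class FindElementCommand:
    """Hypothetical in-memory representation of a 'find element' command."""
    using: str  # locator strategy, e.g. "css selector"
    value: str  # the locator itself


def serialize(command):
    """Object -> JSON text that can travel in an HTTP request body."""
    return json.dumps(asdict(command))


def deserialize(text):
    """JSON text -> object, as the receiving side would do."""
    return FindElementCommand(**json.loads(text))


wire_text = serialize(FindElementCommand(using="css selector", value="#login"))
restored = deserialize(wire_text)
print(wire_text)
print(restored)
```

Neither side needs to know which language the other is written in; both only need to agree on the JSON text exchanged between them.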
3. Browser Drivers
Browser drivers are used to communicate with browsers. Each browser has its own specific driver (for example, ChromeDriver for Chrome and GeckoDriver for Firefox), and download links for these drivers are available from Selenium’s official website.
The generated JSON is sent to the browser driver, which acts as a server, over HTTP.
The browser driver receives the response back from the browser and sends it as a JSON response back to the client.
We set the path of the downloaded driver in our code so that the Selenium script can connect to the browser driver. This step is mandatory in older Selenium versions; from Selenium 4.6 onwards, Selenium Manager can locate the driver automatically when no path is set.
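The request and response exchange with the driver can be imitated using only Python's standard library. The sketch below is a toy stand-in, not a real browser driver: ToyDriverHandler, start_toy_driver, and send_command are hypothetical names, and the handler simply echoes the command it received instead of controlling a browser.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class ToyDriverHandler(BaseHTTPRequestHandler):
    """Stand-in for a browser driver: accepts JSON over HTTP, replies with JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        command = json.loads(self.rfile.read(length) or b"{}")
        # A real driver would forward the command to the browser at this point.
        reply = json.dumps({"value": {"path": self.path, "params": command}})
        payload = reply.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the example output quiet


def start_toy_driver():
    """Start the toy driver on a free local port and return (server, base_url)."""
    server = HTTPServer(("127.0.0.1", 0), ToyDriverHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_port}"


def send_command(base_url, path, params):
    """Client side: POST a JSON command and decode the JSON reply."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(params).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


server, base_url = start_toy_driver()
try:
    # "abc123" is a made-up session id used only for this illustration.
    print(send_command(base_url, "/session/abc123/url", {"url": "https://example.com"}))
finally:
    server.shutdown()
    server.server_close()
```

The client never touches a browser here; it only speaks HTTP and JSON to a local server, which is the role the browser driver plays in the real architecture.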
4. Browsers
Selenium supports multiple browsers such as Firefox, Chrome, IE, Safari, etc.
Real browsers receive requests from their browser driver and operate on the application elements according to each request. The browser returns the result to its browser driver, and the browser driver in turn sends that response back to the client in JSON format over HTTP.
This is how Selenium WebDriver has traditionally worked internally. Selenium 4, however, has migrated from the JSON Wire Protocol to the W3C WebDriver protocol.
W3C WebDriver is a standard published by the World Wide Web Consortium (W3C), the body that develops standards for the web. Browsers like Firefox, Chrome, IE, Safari, etc. and their executables (drivers) were already following the W3C protocol, but the Selenium API alone did not fully follow it in earlier versions of Selenium. From Selenium 4 onwards, the JSON Wire Protocol is no longer used and is replaced with the W3C protocol. Hence the Selenium API, the drivers, and the browsers all follow the same W3C protocol, so they can communicate directly without translating between two protocols.
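One visible difference between the two protocols is the body of the request that starts a new session. The sketch below builds both shapes side by side; the two function names are hypothetical, and only the minimal browserName capability is shown.

```python
import json


def legacy_new_session_payload(browser_name):
    """New-session body in the older JSON Wire Protocol style."""
    return {"desiredCapabilities": {"browserName": browser_name}}


def w3c_new_session_payload(browser_name):
    """New-session body in the W3C WebDriver style."""
    return {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}


print(json.dumps(legacy_new_session_payload("firefox")))
print(json.dumps(w3c_new_session_payload("firefox")))
```

When the client and the driver both use the W3C shape, neither side has to convert one form into the other.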
W3C Advantages
Automated tests will run more consistently between different browsers and devices since they all implement the same standard.
The protocol was developed with stability in mind, so test cases written in Selenium are expected to be more stable as a result.
The protocol's actions support richer interactions such as multi-touch actions, pressing two keys at the same time, and zooming, among others.
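As an illustration of pressing two keys at the same time, the sketch below builds the kind of JSON body the W3C protocol uses for key actions, holding Control while pressing "a". The key_chord_actions function and the "keyboard" source id are hypothetical names chosen for this example; a client library would normally build this payload for you.

```python
import json

CONTROL = "\uE009"  # code point the W3C WebDriver specification assigns to Control


def key_chord_actions(modifier, key):
    """Build a W3C-style actions body that holds `modifier` while pressing `key`."""
    return {
        "actions": [
            {
                "type": "key",
                "id": "keyboard",  # arbitrary name for this input source
                "actions": [
                    {"type": "keyDown", "value": modifier},
                    {"type": "keyDown", "value": key},
                    {"type": "keyUp", "value": key},
                    {"type": "keyUp", "value": modifier},
                ],
            }
        ]
    }


payload = key_chord_actions(CONTROL, "a")
print(json.dumps(payload, indent=2))
```

The order of the entries matters: the modifier goes down first and comes up last, so both keys are held at the same moment.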
Conclusion
In a nutshell, in layman's terms, the code we write in our IDE does not interact with the browser directly. In the middle there is a server called the browser driver, which is responsible for automating the browser.