Most software applications built today are written to run as web applications in the browser. In this age of highly interactive and responsive software processes, where many organizations use some form of Agile methodology, automated checks in testing are becoming a must-have. Selenium is possibly the most widely used open source solution for automated testing of web-based applications. Although used primarily for UI testing, Selenium at its core is a browser user-agent library.
Over time, Selenium has grown and matured considerably with the introduction of Selenium IDE, Selenium RC, Selenium WebDriver and Selenium Grid. Selenium WebDriver has now become a World Wide Web Consortium (W3C) Recommendation, which means that it is now officially standardized and endorsed by the W3C. You can read about the detailed changes due to this here.
Some of the key features of Selenium are mentioned below:
– Selenium is an open source project, which means that it is free to use
– Selenium IDE can record and play back automation steps and generate code in C#, Java, Python and Ruby
– Selenium Grid is used to run tests in parallel on multiple machines with multiple browsers and operating systems
– Selenium supports the following programming languages:
– Selenium can run on the following operating systems:
Windows, Linux, macOS, Android, iOS
– Selenium scripts can run on the following internet browsers:
Google Chrome, Mozilla Firefox, Internet Explorer, Microsoft Edge, Opera, Safari
– Selenium can be integrated with other tools, libraries and frameworks such as TestNG, JUnit, Maven, Gradle, Ant, Jenkins and Docker
– Selenium WebDriver does not require server installation as it can interact directly with the browsers.
SELENIUM WEBDRIVER ARCHITECTURE
The Selenium WebDriver Architecture follows the popular Client-Server architecture and consists mainly of four components:
- Selenium Client and Language Bindings
- JSON Wire Protocol over HTTP
- Browser Drivers
- Web Browsers
The diagram below shows the Selenium WebDriver architecture and its components in detail:
Selenium Client and Language Bindings
JSON Wire Protocol over HTTP
Browser Drivers
The browser drivers are servers that implement the JSON Wire Protocol. They know how to convert Selenium commands into each browser's proprietary native API calls without revealing the internal logic of the browser's functionality. The browser drivers used along with the Selenium client libraries are ChromeDriver, FirefoxDriver, EdgeDriver, SafariDriver, OperaDriver, HTMLUnitDriver and GhostDriver. Each browser driver acts as a server: it receives HTTP requests from the Selenium client in the form of URLs (with a JSON payload where needed) and sends HTTP responses back, thereby implementing the client-server architecture through the JSON Wire Protocol.
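To illustrate what "a server that converts commands" means, here is a minimal sketch of the kind of route table a driver might keep. The HTTP methods and paths are a handful of endpoints from the WebDriver protocol; the class `DriverRouteSketch`, the command labels and the regex-based matching are hypothetical choices made for this illustration, not how any real driver is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class DriverRouteSketch {

    // Route table: "METHOD path" pattern -> illustrative command label.
    private static final Map<Pattern, String> ROUTES = new LinkedHashMap<>();

    static {
        ROUTES.put(Pattern.compile("POST /session"), "newSession");
        ROUTES.put(Pattern.compile("DELETE /session/[^/]+"), "deleteSession");
        ROUTES.put(Pattern.compile("POST /session/[^/]+/url"), "navigateTo");
        ROUTES.put(Pattern.compile("GET /session/[^/]+/url"), "getCurrentUrl");
        ROUTES.put(Pattern.compile("GET /session/[^/]+/title"), "getTitle");
    }

    // Returns the command label for an incoming request, or "unknownCommand"
    // when no route matches the whole request line.
    public static String resolve(String method, String path) {
        String requestLine = method + " " + path;
        for (Map.Entry<Pattern, String> route : ROUTES.entrySet()) {
            if (route.getKey().matcher(requestLine).matches()) {
                return route.getValue();
            }
        }
        return "unknownCommand";
    }

    public static void main(String[] args) {
        System.out.println(resolve("POST", "/session/abc123/url"));
        System.out.println(resolve("GET", "/session/abc123/title"));
        System.out.println(resolve("GET", "/not-a-webdriver-path"));
    }
}
```

A real driver would go one step further and translate the resolved command into calls on the browser's own automation interface; that part is browser-specific and is left out here.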
Web Browsers
Web browsers are software programs that allow users to locate, access and display web pages and other content created using HTML (Hyper Text Markup Language) and XML (Extensible Markup Language). All Selenium commands are executed in the web browsers (Chrome, Firefox, Edge, Safari, Opera and Internet Explorer) through their respective browser drivers, which act as middlemen.
Let’s now understand the flow through an example.
Suppose you write the Selenium code below (using its Java binding) in an IDE (Integrated Development Environment) of your choice.
WebDriver driver = new ChromeDriver();
driver.get("https://www.google.com");
Once you run this code, the Chrome browser is launched and navigates to the Google home page. Internally, every statement of the code is converted into an HTTP request, that is, a URL plus a JSON payload where needed, as defined by the JSON Wire Protocol over HTTP. These requests are then passed to the browser driver. In the above case, the Java client library converts the Java statements to JSON format and communicates with the ChromeDriver browser driver executable. The URL will look like the one below:
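As a rough sketch of such a request, the snippet below builds the endpoint and JSON body that a client library would send for the navigation statement. The path shape `/session/{sessionId}/url` and the `url` field come from the WebDriver protocol; the base address `http://localhost:9515` (ChromeDriver's usual default port), the session id and the class `NavigateRequestSketch` are placeholder assumptions for illustration, not values taken from a real run.

```java
public class NavigateRequestSketch {

    // Hypothetical helper: quotes a value as a JSON string, escaping
    // backslashes and double quotes only (enough for simple URLs).
    public static String jsonString(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Endpoint of the "navigate to" command for a given session.
    public static String endpoint(String driverBaseUrl, String sessionId) {
        return driverBaseUrl + "/session/" + sessionId + "/url";
    }

    // JSON payload carrying the page the browser should open.
    public static String body(String targetUrl) {
        return "{\"url\": " + jsonString(targetUrl) + "}";
    }

    public static void main(String[] args) {
        String base = "http://localhost:9515";    // assumed driver address
        String sessionId = "example-session-id";  // placeholder, issued by the driver in practice
        System.out.println("POST " + endpoint(base, sessionId));
        System.out.println(body("https://www.google.com"));
    }
}
```

In a real session the client library first asks the driver to create a session and then reuses the returned session id in every later request.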
Every browser driver uses an HTTP server to receive the HTTP requests. Once a request reaches the browser driver, the driver passes it on to the respective web browser through that browser's native interface, and the Selenium command is executed in the browser. For an HTTP POST request, an action is performed in the browser; for an HTTP GET request, the response is generated at the browser end and sent back to the browser driver. The browser driver then sends the response over HTTP, in the JSON Wire Protocol format, to the client library running in your IDE.
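The round trip described above can be sketched with nothing but the JDK. In the example below, a toy HTTP server stands in for the browser driver and a plain HTTP client stands in for the Selenium client library. No browser is involved: the server only mimics the request and response shape, and the class `ToyDriverRoundTrip`, the session id `demo` and the canned replies are all assumptions made for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class ToyDriverRoundTrip {

    // Stand-in for a browser driver: listens on a free local port and answers
    // POST (an action) with an empty value and anything else with a canned value.
    static HttpServer startToyDriver() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/session", exchange -> {
            exchange.getRequestBody().readAllBytes(); // drain the request payload
            String reply = "POST".equals(exchange.getRequestMethod())
                    ? "{\"value\": null}"
                    : "{\"value\": \"Example title\"}";
            byte[] bytes = reply.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });
        server.start();
        return server;
    }

    // Stand-in for the client library: sends one command and returns the reply body.
    static String send(HttpClient client, String method, String url, String json) throws Exception {
        HttpRequest.BodyPublisher body = (json == null)
                ? HttpRequest.BodyPublishers.noBody()
                : HttpRequest.BodyPublishers.ofString(json);
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                .method(method, body)
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Runs one POST (navigate) and one GET (read title) against the toy driver.
    public static String[] runDemo() {
        try {
            HttpServer driver = startToyDriver();
            try {
                String base = "http://127.0.0.1:" + driver.getAddress().getPort() + "/session/demo";
                HttpClient client = HttpClient.newHttpClient();
                String actionReply = send(client, "POST", base + "/url",
                        "{\"url\": \"https://www.google.com\"}");
                String queryReply = send(client, "GET", base + "/title", null);
                return new String[] {actionReply, queryReply};
            } finally {
                driver.stop(0);
            }
        } catch (Exception e) {
            throw new IllegalStateException("toy round trip failed", e);
        }
    }

    public static void main(String[] args) {
        for (String reply : runDemo()) {
            System.out.println(reply);
        }
    }
}
```

The POST call models a command that changes browser state and returns nothing of interest, while the GET call models a query whose answer travels back through the driver to the caller.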