This document provides an overview of WebDriver, covering:
- WebDriver as a W3C specification for remotely controlling user agents through a REST API.
- The WebDriver architecture, including drivers, bindings, and frameworks.
- The history and evolution of Selenium up to the current WebDriver standard.
4. 4
Scope of this talk
[Diagram: Selenium, the “brand”, covers the Selenium Language Bindings (also known as “Selenium WebDriver”), Selenium IDE, and Selenium Grid. Arrows labeled “use” / “uses” link these to the W3C WebDriver Specification, which is what we will focus on here in this talk.]
5. 5
WebDriver is …
a W3C Specification
https://www.w3.org/TR/webdriver/
“WebDriver is a remote control interface that enables introspection and control of user agents. It provides a platform and language-neutral wire protocol as a way for out-of-process programs to remotely instruct the behavior of web browsers.”
In other words:
- remote control interface: REST API
- user agents: e.g. Web Browsers
- wire protocol: JSON over HTTP
- out-of-process programs: e.g. Selenium
- web browsers: … or even Mobile / Desktop Apps
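To make "JSON over HTTP" concrete, here is a minimal Python sketch that renders the raw HTTP request a client might send for one command (Navigate To). The helper name render_http_request, the host and port, and the session id "abc123" are hypothetical placeholders for illustration; real clients typically send additional headers.

```python
import json


def render_http_request(method, path, body=None, host="localhost:9515"):
    """Render a WebDriver command as raw HTTP/1.1 request text (illustration only)."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    payload = ""
    if body is not None:
        payload = json.dumps(body)
        lines.append("Content-Type: application/json; charset=utf-8")
        lines.append(f"Content-Length: {len(payload.encode('utf-8'))}")
    # A blank line separates the headers from the (optional) JSON body.
    return "\r\n".join(lines) + "\r\n\r\n" + payload


# "abc123" is a made-up session id, used only for illustration.
NAVIGATE_EXAMPLE = render_http_request(
    "POST", "/session/abc123/url", {"url": "https://github.com/login"}
)
```

Printing NAVIGATE_EXAMPLE shows a plain-text request line, a few headers, and a one-line JSON body, which is all an out-of-process program needs to produce in order to instruct the browser.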
6. 6
W3C WebDriver Specification - Editors
Simon Mavi Stewart (@shs96c)
David Burns (@AutomatedTester)
Browser Testing and Tools Working Group
https://www.w3.org/testing/browser/
7. 7
WebDriver in your stack
Layers (left to right): Browser, Driver, Bindings, Framework. Driver and Bindings communicate via JSON over HTTP (the WebDriver REST API).
Driver (implemented by browser vendors): emulates user, controls browser, exposes REST API, manages sessions
Bindings: language-specific WebDriver SDK, driver client
Framework: higher-level API, test assertions, run / reports, config / tags etc.
Examples (browser; driver; bindings; framework):
Firefox; geckodriver; Selenium (Java); JUnit
Chrome; chromedriver; Selenium (Ruby), Watir; RSpec
WebdriverIO or Nightwatch.js; Mocha
Selenium (Python), SeleniumBase; Pytest
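The layering on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: FakeDriver is a hypothetical in-process stand-in for a real driver such as chromedriver (no HTTP involved), MiniBindings is a hypothetical toy client and not the Selenium API, and test_navigation plays the role of the framework layer.

```python
class FakeDriver:
    """Stand-in for a real driver: manages sessions and 'controls' a pretend browser."""

    def __init__(self):
        self.sessions = {}

    def handle(self, method, path, body=None):
        parts = path.strip("/").split("/")
        if method == "POST" and parts == ["session"]:
            session_id = f"session-{len(self.sessions) + 1}"
            self.sessions[session_id] = {"url": "about:blank"}
            return {"value": {"sessionId": session_id, "capabilities": {}}}
        session = self.sessions[parts[1]]
        if method == "POST" and parts[2:] == ["url"]:
            session["url"] = body["url"]
            return {"value": None}
        if method == "GET" and parts[2:] == ["url"]:
            return {"value": session["url"]}
        raise ValueError(f"unsupported command: {method} {path}")


class MiniBindings:
    """Language-specific client: turns method calls into wire-protocol commands."""

    def __init__(self, transport):
        # transport is any callable(method, path, body) returning the response dict.
        self.transport = transport
        response = transport("POST", "/session", {"capabilities": {}})
        self.session_id = response["value"]["sessionId"]

    def get(self, url):
        self.transport("POST", f"/session/{self.session_id}/url", {"url": url})

    @property
    def current_url(self):
        return self.transport("GET", f"/session/{self.session_id}/url")["value"]


def test_navigation():
    """Framework layer: the kind of assertion a runner such as Pytest would execute."""
    browser = MiniBindings(FakeDriver().handle)
    browser.get("https://github.com/login")
    assert browser.current_url == "https://github.com/login"
```

Because the bindings only depend on a transport callable, the fake driver could be swapped for a function that sends real HTTP requests, which is the "mix and match" property the protocol is meant to provide.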
8. 8
History: Selenium, WebDriver, W3C
2004: Selenium created (Jason Huggins, Paul Hammant). Components shown: Selenium Core, Selenium RC, Se IDE, Browser.
2006-7: WebDriver created (Simon Stewart). Components shown: WebDriver, Browser.
Feb 2008: https://twitter.com/shs96c/status/1060117993898807301
2008-9: Selenium & WebDriver merge. Components shown: Selenium Core, Selenium RC, WebDriver, Se IDE, Browser.
2011: Selenium 2 released.
W3C specification work.
2016: Selenium 3 released. Components shown: Selenium (bindings), JSON Wire Protocol, WebDriver (drivers), Browser.
2018: W3C Recommendation. Components shown: Selenium IDE (new), Selenium (bindings), W3C Protocol, WebDriver (W3C drivers), Browser.
(Selenium Grid not shown for simplicity)
9. 9
Method URI Template Command
POST /session New Session
DELETE /session/{session id} Delete Session
GET /status Status
GET /session/{session id}/timeouts Get Timeouts
POST /session/{session id}/timeouts Set Timeouts
POST /session/{session id}/url Navigate To
GET /session/{session id}/url Get Current URL
POST /session/{session id}/back Back
POST /session/{session id}/forward Forward
POST /session/{session id}/refresh Refresh
GET /session/{session id}/title Get Title
GET /session/{session id}/window Get Window Handle
DELETE /session/{session id}/window Close Window
POST /session/{session id}/window Switch To Window
GET /session/{session id}/window/handles Get Window Handles
POST /session/{session id}/window/new New Window
POST /session/{session id}/frame Switch To Frame
POST /session/{session id}/frame/parent Switch To Parent Frame
GET /session/{session id}/window/rect Get Window Rect
POST /session/{session id}/window/rect Set Window Rect
POST /session/{session id}/window/maximize Maximize Window
POST /session/{session id}/window/minimize Minimize Window
POST /session/{session id}/window/fullscreen Fullscreen Window
GET /session/{session id}/element/active Get Active Element
POST /session/{session id}/element Find Element
POST /session/{session id}/elements Find Elements
POST /session/{session id}/element/{element id}/element Find Element From Element
POST /session/{session id}/element/{element id}/elements Find Elements From Element
GET /session/{session id}/element/{element id}/selected Is Element Selected
GET /session/{session id}/element/{element id}/attribute/{name} Get Element Attribute
GET /session/{session id}/element/{element id}/property/{name} Get Element Property
GET /session/{session id}/element/{element id}/css/{property name} Get Element CSS Value
GET /session/{session id}/element/{element id}/text Get Element Text
GET /session/{session id}/element/{element id}/name Get Element Tag Name
GET /session/{session id}/element/{element id}/rect Get Element Rect
GET /session/{session id}/element/{element id}/enabled Is Element Enabled
POST /session/{session id}/element/{element id}/click Element Click
POST /session/{session id}/element/{element id}/clear Element Clear
POST /session/{session id}/element/{element id}/value Element Send Keys
GET /session/{session id}/source Get Page Source
POST /session/{session id}/execute/sync Execute Script
POST /session/{session id}/execute/async Execute Async Script
GET /session/{session id}/cookie Get All Cookies
GET /session/{session id}/cookie/{name} Get Named Cookie
POST /session/{session id}/cookie Add Cookie
DELETE /session/{session id}/cookie/{name} Delete Cookie
DELETE /session/{session id}/cookie Delete All Cookies
POST /session/{session id}/actions Perform Actions
DELETE /session/{session id}/actions Release Actions
POST /session/{session id}/alert/dismiss Dismiss Alert
POST /session/{session id}/alert/accept Accept Alert
GET /session/{session id}/alert/text Get Alert Text
POST /session/{session id}/alert/text Send Alert Text
GET /session/{session id}/screenshot Take Screenshot
GET /session/{session id}/element/{element id}/screenshot Take Element Screenshot
WebDriver Commands, grouped in table order: Session Management, Navigation, Title, Window, Frame, Window Size, Get / Find Element(s), Find Element(s) from Element, Element State, Element Actions, Page Source, Execute Script, Cookies, Keyboard / Mouse / Touch, Alert / Dialog, Screenshot
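A client library can treat the table above as data. The following Python sketch keeps a small subset of the rows in a dictionary and expands the URI template for a given command. The names COMMANDS and resolve are hypothetical, and only a handful of commands are included for brevity.

```python
import urllib.parse

# A subset of the command table: command name -> (HTTP method, URI template).
COMMANDS = {
    "New Session": ("POST", "/session"),
    "Delete Session": ("DELETE", "/session/{session id}"),
    "Navigate To": ("POST", "/session/{session id}/url"),
    "Get Title": ("GET", "/session/{session id}/title"),
    "Find Element": ("POST", "/session/{session id}/element"),
    "Element Send Keys": ("POST", "/session/{session id}/element/{element id}/value"),
    "Get Element Attribute": (
        "GET",
        "/session/{session id}/element/{element id}/attribute/{name}",
    ),
}


def resolve(command, **params):
    """Return (method, path) for a command, filling in the template placeholders.

    Keyword names use underscores where the template uses spaces,
    e.g. session_id fills "{session id}".
    """
    method, template = COMMANDS[command]
    path = template
    for key, value in params.items():
        placeholder = "{" + key.replace("_", " ") + "}"
        path = path.replace(placeholder, urllib.parse.quote(str(value), safe=""))
    if "{" in path:
        raise ValueError(f"missing parameter for {command}: {path}")
    return method, path
```

For example, resolve("Element Send Keys", session_id="abc", element_id="e1") yields ("POST", "/session/abc/element/e1/value").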
10. 10
Demo: Browser Remote Control with cURL
First, start chromedriver, default port: 9515
curl -d '{"desiredCapabilities":{"browserName":"Chrome"}}' -X POST \
  http://localhost:9515/session
curl -d '{"url":"https://github.com/login"}' -X POST \
  http://localhost:9515/session/{session id}/url
curl -d '{"using":"css selector","value":"#login_field"}' -X POST \
  http://localhost:9515/session/{session id}/element
curl -d '{"value":["hello"]}' -X POST \
  http://localhost:9515/session/{session id}/element/{element id}/value
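The same four steps can be sketched in Python using only the standard library. This is a hedged sketch, not the demo as presented: the helper names (build_request, send, run_demo) are hypothetical, and the payloads follow the W3C form as I understand it ("capabilities" with "alwaysMatch" for New Session, "text" for Element Send Keys), which differs from the older-style payloads in the cURL commands. Which form your driver accepts depends on its version, so treat that as an assumption.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9515"  # chromedriver's default port, as in the demo
# Key under which the W3C specification returns an element reference.
ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def build_request(method, path, body=None):
    """Build (but do not send) the HTTP request for one WebDriver command."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


def send(method, path, body=None):
    """Send a command and return the "value" field of the JSON response."""
    with urllib.request.urlopen(build_request(method, path, body)) as response:
        return json.loads(response.read().decode("utf-8"))["value"]


def run_demo():
    """Needs a running chromedriver; deliberately not called automatically."""
    session = send(
        "POST", "/session", {"capabilities": {"alwaysMatch": {"browserName": "chrome"}}}
    )
    session_id = session["sessionId"]
    try:
        send("POST", f"/session/{session_id}/url", {"url": "https://github.com/login"})
        element = send(
            "POST",
            f"/session/{session_id}/element",
            {"using": "css selector", "value": "#login_field"},
        )
        element_id = element[ELEMENT_KEY]
        send("POST", f"/session/{session_id}/element/{element_id}/value", {"text": "hello"})
    finally:
        # Always delete the session so the browser window is closed.
        send("DELETE", f"/session/{session_id}")
```

With chromedriver running, calling run_demo() should open Chrome, load the login page, type into the login field, and then close the session.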
11. 11
Example Framework - Karate
Open Source Test Automation Framework
https://github.com/intuit/karate
• API Testing
• API Mocking
• API Perf-Testing
@KarateDS
https://tinyurl.com/karatejp (Takanori Suzuki)
12. 12
W3C WebDriver support in Karate (Alpha)
https://tinyurl.com/karatedriver
Since Dec 2018
Layers (left to right): Browser, Driver, Bindings, Framework. Driver and Bindings communicate via JSON over HTTP (the WebDriver REST API).
Browser and Driver: Chrome with chromedriver; Windows App with WinAppDriver (from Microsoft)
Bindings: REST HTTP Client and W3C WebDriver Adapter, in Karate Core (Java)
Framework: Karate Script (Gherkin)
14. 14
Drivers
Target Driver
Chrome chromedriver https://sites.google.com/a/chromium.org/chromedriver/home
Firefox geckodriver https://github.com/mozilla/geckodriver
Safari safaridriver (Mac) https://webkit.org/blog/6900/webdriver-support-in-safari-10/
MS Edge MicrosoftWebDriver (Win 10) https://docs.microsoft.com/en-us/microsoft-edge/webdriver
(Windows Apps) WinAppDriver (Win 10) https://github.com/Microsoft/WinAppDriver
Internet Explorer IEDriverServer (Win) https://github.com/SeleniumHQ/selenium/wiki/InternetExplorerDriver
IEDriverServer is the only driver maintained by the Selenium team (planned end of support: Jul 19).
Jim Evans (@jimevansmusic) is also known for the Selenium + WebDriver compliance tests: http://webdriver-herald.herokuapp.com
21. 21
References (1 of 2)
WebDriver GitHub: https://github.com/w3c/webdriver
Selenium WebDriver (New Documentation): https://seleniumhq.github.io/docs/wd.html
The Architecture of Open Source Applications, Selenium WebDriver: https://www.aosabook.org/en/selenium.html
Selenium 1 / Remote Control (RC): https://www.seleniumhq.org/docs/05_selenium_rc.jsp
Selenium History: https://www.seleniumhq.org/docs/01_introducing_selenium.jsp#selenium-history
Selenium History: https://www.seleniumhq.org/about/history.jsp
GTAC 2007: Huggins & Stewart - Selenium-RC Vs WebDriver: https://www.youtube.com/watch?v=Vlz-WmcrBL8
Happy 10th Birthday, Selenium (by Paul Hammant): https://www.thoughtworks.com/insights/blog/happy-10th-birthday-selenium
The Faces Behind Selenium: https://smartbear.com/blog/test-and-monitor/the-faces-behind-selenium/
Selenium User Meetup 2008 at Google (lightning talks, venue of photo on slide #7): https://youtu.be/EDb8yOM3Vpw
Summary of the above event (Matt Raible’s blog): https://raibledesigns.com/rd/entry/last_night_s_selenium_users
Selenium Contributors (old site): https://www.seleniumhq.org/about/contributors.jsp
WebDriver W3C Draft Announcement: https://www.w3.org/News/2012.html#entry-9496
22. 22
References (2 of 2)
Selenium 2 Announcement: https://seleniumhq.wordpress.com/2011/07/08/selenium-2-0/
Selenium 2 to 3 changes: https://saucelabs.com/blog/move-over-selenium-2-why-its-time-to-upgrade-to-selenium-3
Selenium 3 Announcement: https://seleniumhq.wordpress.com/2016/10/13/selenium-3-0-out-now/
Selenium 4 changes: https://www.datio.com/qa/selenium-4-is-coming/
Selenium State of the Union (Se Conf Chicago 18): https://youtu.be/Qlt5YUGmN1Y
Selenium 4 Project Board (GitHub): https://github.com/SeleniumHQ/selenium/projects/2
Sauce Labs Selenium 4 FAQ: https://saucelabs.com/blog/frequently-asked-questions-about-selenium-4
Selenium and WebDriverIO: https://medium.com/@specktackle/selenium-and-webdriverio-a-historical-overview-6f8fbf94b418
Evaluating Cypress and TestCafe: https://medium.com/yld-engineering-blog/evaluating-cypress-and-testcafe-for-end-to-end-testing-fcd0303d2103
Comparison of E2E Testing Tools: https://blog.scottlogic.com/2018/01/08/pros-cons-e2e-testing-tools.html
Selenium Atoms: https://firefox-source-docs.mozilla.org/testing/marionette/marionette/SeleniumAtoms.html
Watir History: http://watir.com/history/
Editor's notes
A W3C Recommendation means that it is a standard.
The official definition as it appears in the spec is shown here.
I break it down to explain what it means in other words on the right.
Many others contributed via the working group. Refer to the link provided at the bottom.
This slide has animations.
Please note that C# / .NET is not included for simplicity!
These are just examples and not meant to be comprehensive.
Things like the Selenium Grid and IDE are not covered here.
The components of a typical WebDriver-based test framework are shown, along with their relationships to each other.
Note how the critical work of controlling the browser is the responsibility of the browser vendors.
The Selenium project can focus on the client-library and language-bindings.
JSON over HTTP is “platform neutral”, which means you can mix and match the left side and the right side.
Note how you would need a framework to handle typical “test automation” responsibilities such as assertions and reporting.
There are different options you can choose from for each programming language.
Note that some of the bindings (WebdriverIO, Nightwatch.js, etc.) do not depend on Selenium and directly implement the W3C JSON protocol.
This slide is heavily animated and tries to tell the story of Selenium and how we got to where we are today with the W3C specification.
Shinya Kasatani, who is Japanese, is the creator of the original Selenium IDE. I don’t think he is active now. He is 5th from the left.
Jason Huggins is on the extreme left. Simon needs no pointing out. Paul Hammant is 4th from the right.
This map omits the Selenium Grid for simplicity.
Selenium has been around for a long time! So many people have contributed in different forms.
The released version of this presentation includes a big list of references at the end for those who would like to read more about all this.
In particular, see: https://www.thoughtworks.com/insights/blog/happy-10th-birthday-selenium
These are all the REST API operations that the specification defines. I have tried to group them into different types of actions.
It is clear that you have many ways to control a browser and emulate a user.
Demo time! You can easily try this at home.
I personally think HTTP + REST and JSON is one of the simplest technologies around, which is why it is so effective and popular at the same time.
You will see how, using simple commands and some JSON “POST” requests, you can control Google Chrome.
At the end we should even be able to enter text into a login field via remote control.
Before moving on, I’d like to introduce the framework which I’ve been working on. Some of you may be using it already.
It is a unique framework that supports a 3-in-1 feature set of API functional tests, mocks and even performance tests.
It has a lot of users now, and one nice thing about it is that it does not need much programming experience.
Special thanks to Suzuki Takanori-san, who has created a very detailed presentation on it. I am providing the link for your reference.
Karate has implemented a W3C WebDriver client, which is still in Alpha / experimental status.
We do not recommend that you use this in production yet, but if you can contribute, that would be great!
In the demo I am about to show (a continuation from the last slide animation), we will automate a Windows application, which shows how flexible and powerful the WebDriver specification is.
Some of the things you expect from a framework are reporting and logging, so that you can troubleshoot things easily when needed.
It is also good to be able to step through, debug and even replay actions or test steps.
Using the same concept, we should even be able to automate mobile applications.
(Depending on time, we will also demo parallel execution of a cross-browser test. This shows that things like parallel testing are the responsibility of the framework.)
This slide is self-explanatory.
Jim Evans is just one example of the many who work hard in their own time on Selenium, as Simon mentioned in his keynote yesterday.
Please continue to support everyone! Be patient and contribute wherever you can.
A lot of hard work goes into implementing the spec (by the browser vendors) and validating that they are compliant.
And the Selenium team has done a great job influencing the browser teams to do the right thing, and in a timely manner.
Here we see that the ChromeDriver team has just checked in the change for defaulting to W3C mode, something that the team had been pushing for for a long time.
This just happened last week.
John Jansen is a manager on the Microsoft Edge team.
So, good news: WebDriver support will remain possible for Edge in the future!
Please note that this is not a complete list.
But it gives an idea of the difference between Frameworks and Bindings, and shows that there are multiple options.
You would have heard about these from Simon’s keynote, but this is just a summary.
TODO: add / revise in real time.