Prerequisites: JavaScript training (other programming languages can be used with Selenium, but this tutorial uses Node.js)
Foreword: This is meant to be an introduction to automation and Selenium. Many of the steps are repetitive by design, both to help you memorize commands and to get you thinking about ways to improve your code and testing strategy.
Selenium is a set of tools that automates a browser. Selenium has been around for over 15 years. It’s evolved a lot over time.
The very first version of Selenium required a tool called Selenium RC. The tool worked by opening a web page that installed a core JavaScript library that could receive commands. That page would then open the web page under test, interact with it, and send information back. However, this approach was limited by the reach of JavaScript in the browser, which made some things impossible, like working with file upload controls or native pop-ups.
Selenium Server is a companion piece to Selenium RC. It allows remote machines to connect with the server which then can be used to control or drive the browser on various server hardware. This enables Selenium based test code to be run on multiple machines with potentially different configurations, operating systems, browsers, browser versions, plug-ins, and any number of other possible variations.
Selenium Integrated Development Environment (IDE) is a plug-in for Mozilla Firefox that lets you record your actions on a page and then play them back. It produces commands that are understood by Selenium RC. It can be used to document defects and to record exploratory testing, but it is not suited for large-scale automated web testing. It is outdated now and support seems limited, so use it with caution.
Selenium WebDriver is also known as Selenium 2.0. It is the result of merging Selenium with an open source project named WebDriver, which was created by a Google engineer who admired the Selenium project but needed to get around those JavaScript limitations. WebDriver lets you program interactions with the browser directly instead of working inside the browser's JavaScript sandbox. The merge kept the mature API inherited from Selenium RC.
Selenium Grid is a smart proxy server that allows for communicating with a cluster of machines. This allows for parallel execution of tests. It also enables tests to be run against environments with a variety of specific configurations. It has been used commercially by companies to offer test execution environments as a service.
You can think of Selenium as an umbrella project for a range of tools and libraries that enable and support the automation of web browsers.
Fifteen years is quite a long time so you may encounter terms or solutions that are no longer relevant.
The Current State of Selenium
Selenium RC - has been deprecated. This is also sometimes referred to as Selenium 1. It is still supported by Selenium 2, but you should definitely be using Selenium 2 (Selenium WebDriver) or higher.
Selenium 3.0 - the current release. It has removed the Selenium Core functionality. The Selenium IDE Firefox plugin doesn't seem to have been updated to work with Selenium 3.0.
Selenium Server and Selenium Grid - as far as server side solutions go, Selenium Server and Selenium Grid have now merged into a single product called: Selenium 2.0+ Standalone Server
Install Node: If you've never used Node before, don't fear. Node is a JavaScript runtime that allows you to write JavaScript applications and run them locally as well as on a server. Let's make sure we have Node installed:
Open your terminal and type in where node
(PC) or which node
(Mac), then press Enter. If you already have Node, the terminal will print its location; if you don't, it will print an error message. If you don't have Node, use the following instructions to install it: Windows or Mac
Once you have Node installed, you can check which version you are running by entering node -v
into the command line. Ensure that you are running version 6.x.x or higher. If not, update your version by following the instructions on the pages linked to in Step 1, above.
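If you would rather check the version from a script than read it by eye, the comparison can be sketched as below. This is only an illustration: the file name check-node.js and the helper majorVersion are made up for this example, and the minimum of 6 is taken from the step above.

```javascript
// check-node.js (hypothetical file name)
// Minimal sketch: compare the running Node version with the tutorial's minimum.
const MIN_MAJOR = 6;

// Takes a version string such as "18.17.0" and returns the major number (18).
function majorVersion(versionString) {
  return parseInt(versionString.split(".")[0], 10);
}

// process.versions.node holds the version without the leading "v".
const major = majorVersion(process.versions.node);

if (major >= MIN_MAJOR) {
  console.log("Node " + process.versions.node + " is new enough.");
} else {
  console.log("Node " + process.versions.node + " is too old; please update.");
}
```

Save it as check-node.js and run it with node check-node.js.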
Now we want to make a directory or folder to store the code that we will be working on during this tutorial. You can call this directory whatever you want. I am going to call it Selenium Basics. From the command line enter mkdir selenium-basics
Once you create the directory you can change your current directory to this new directory by entering cd selenium-basics
Since we have Node installed we also have npm (the Node package manager), which lets you install Node-based modules. We want to install one of those open source modules, the one for Selenium. From the command line, in your selenium-basics directory, enter npm install selenium-webdriver
npm will download the package and print some warnings. These warnings appear only because we haven't yet made our folder a package (there is no package.json). What it did do was install a number of Node modules.
We can see the Node modules that we downloaded by using ls node_modules
(Mac) or dir node_modules
(PC).
Now that we have the package installed, we need to install the browser-specific WebDriver. Let's start with the Chrome version. We'll install it globally so you don't need to do it over and over again. Enter npm install chromedriver -g
The -g flag installs the package globally. You need to run this command as an administrator or root. On a Mac, prepend sudo to the command, so it becomes sudo npm install chromedriver -g
On a PC, open an administrative command prompt window. There are several ways to do this; one is to search for Command Prompt in the Start menu and select Run as Administrator.
Once we have it installed we can enter chromedriver --version
just to make sure that it is working and is really installed.
I have created an app for us to use for testing. Please download the entire folder here and save it somewhere that you can remember, like on your desktop.
Open the app in any browser by pressing CTRL-O and opening the index.html file located within the folder. You can also just double-click the file and choose which browser to open it with from there.
Test out the webpage a bit to see how things work. Add a new Invitee, Edit, etc. As you can see there is a little bit of interactivity going on here, and we can automate it!
Copy the URL (your URL will be unique to your computer)
Go back over to your terminal or command line. Before we start, enter npm install chromedriver
again to make sure you are still up to date. Node offers what is known as a REPL (read-eval-print loop). It lets you build a program up as you go and works well for this sort of exploratory coding. To start it, just enter node
First we need to load the module we installed, so we will use JavaScript in the REPL to do that. The package is named selenium-webdriver and we will store it in a variable named selenium. Enter const selenium = require("selenium-webdriver");
The package uses the builder design pattern to let you configure exactly the driver you want. You call build on the builder and it returns the right version. We will call our variable driver and ask the builder for what we want. Right now we just want Chrome. Enter const driver = new selenium.Builder().forBrowser("chrome").build();
(if this is giving you an error message please see section 1 of troubleshooting for known Windows issues at the end of this tutorial)
Now we want to set up a URL variable and paste in the URL that we copied in step 4. Enter const url = "http://yourURLthatYouCopied";
replacing the placeholder with your own URL.
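If you prefer not to copy the URL from the browser, Node can build the same kind of file URL from a path. The sketch below assumes Node 10.12 or newer (where url.pathToFileURL is available) and a hypothetical location for the downloaded folder. The folder name rsvp-app is invented here, so substitute your own path.

```javascript
const os = require("os");
const path = require("path");
const { pathToFileURL } = require("url");

// Hypothetical location: change this to wherever you saved the downloaded folder.
const indexPath = path.join(os.homedir(), "Desktop", "rsvp-app", "index.html");

// pathToFileURL adds the file:// prefix and escapes characters such as spaces.
const fileUrl = pathToFileURL(indexPath).href;
console.log(fileUrl);
```

The printed value can be pasted in as the url string in the step above.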
Now we load the webpage by entering driver.get(url);
This will print a whole bunch of information, and it should also have opened an instance of Chrome (for Windows users this instance may already have been opened in step 7). This instance will now show the web page that we will be using for testing. You can see a message stating that “Chrome is being controlled by automated test software.”
To close this instance of the webpage and Chrome, enter driver.close();
To get out of the REPL, press CTRL+d. You can exit at any time: the REPL keeps a history of what you typed, so you can press the up arrow to recall earlier lines instead of typing them all again. Why not close everything out and take a break? :)
Get REPL back up and running by opening your command line and navigating to the selenium-basics folder. After you are in the selenium-basics folder, type in node
[ENTER]
Let's open up our page again with const selenium = require("selenium-webdriver");
Then we build our driver with const driver = new selenium.Builder().forBrowser("chrome").build();
(remember to see section 1 of troubleshooting guide if having issues)
Then we get our URL: open the app with CTRL-O from the browser and select the app to open (from the previous step 4). Copy the URL, then set it: const url = "paste the url";
Get URL: driver.get(url);
Now the page has been launched. Let's try to automate this page. First let's try to type something into the Invite Someone text field from our code. Take a look at this field in Chrome dev tools: right-click on the field > Inspect.
You want to be in the Elements tab, find the line of code that corresponds to the element (you can tell it's the right one when the element is highlighted)
In order to get hold of this element we need a way to tell WebDriver where to find it. There are a few ways of doing this. WebDriver understands XPath queries, which are a way of navigating a tree of XML elements. HTML is not strictly XML, but the browser parses it into the same kind of element tree, so XPath works on our HTML document as well. We will go through more XPath later in this course. The Elements tab has a little built-in tool that we can use to test our query.
If you press Command-F (Mac) or Control-F (Windows) a search bar will appear. The search field understands XPath.
XPath allows you to be very explicit about the elements that you are looking for. Let's build a really explicit path. To represent the root you start with a single slash. Type /html
into the search bar. Notice that the <html> tag is now highlighted.
The body is the next container in the path, so we will add /body
to the path in the search field. The next container is /div then /header
then /form
and finally the first input, /input
The full query is now /html/body/div/header/form/input and the input field should be highlighted.
So now we know that our XPath query matches our input field. Let's copy the XPath to the clipboard so that we can feed it to our driver. Copy the XPath and then you can close the dev tools window. Go back to REPL.
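The query we just built is nothing more than the list of containers joined by slashes. As a small sketch (the variable names are only for illustration), this prints each intermediate query in the order we typed them into the search bar:

```javascript
// Containers from the root of the document down to the input field.
const containers = ["html", "body", "div", "header", "form", "input"];

let xpath = "";
for (const tag of containers) {
  // Each step appends one more container to the path.
  xpath += "/" + tag;
  console.log(xpath);
}
// The last line printed is /html/body/div/header/form/input
```

Because every container appears in the query, renaming or moving any one of them changes the path.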
The selenium module exports a way to get to these locators. Typically you import it like this: Enter this in REPL: const By = selenium.By;
(this will give us a shorthand for using this later)
There is a method on driver called findElement that takes what is known as a locator, and there is a static method on the By class named xpath that builds a locator from our query (paste the XPath that we found earlier in dev tools): const field = driver.findElement(By.xpath('/html/body/div/header/form/input'));
By.xpath returns the locator that findElement needs, and the element that findElement finds is stored in a variable (field).
Now we can actually send a series of keystrokes to that element. field.sendKeys("I am a Tester at Ultra");
We found our element to interact with, but our locator was way too specific. Any change to the page could potentially break our ability to access that element. Relying on detailed knowledge of the page's layout structure is an automation anti-pattern. You want to be as specific as you can be, but not so rigid as to require the entire structure of the document to remain unchanged. If you fall into that trap, any change to the layout of the page will break the locator strategy. This “How do I be specific about an element without relying on the page structure” problem is common. We want to be able to specify the exact element that we are talking about.
A typical solution is to use an identifier. In HTML this is the id attribute. In CSS we can reference it with the id selector, and in JavaScript we could use document.getElementById. So the good news is that, most of the time, the page you are trying to automate is already set up to have easily locatable elements, because others have needed to access things either for style or interactivity. Selenium allows you to leverage that work.
Look at the Elements tab again and find this text input. There isn't an ID on the input tag but there is one on the form tag.
We will go back to REPL and create a new variable const form = driver.findElement(By.id("registrar"));
Now we have a form element. We can't just send keys to our form though. This is a good time to look up some documentation. Do a Google search for Selenium WebDriver API and add JS to narrow things down. Click on this link https://seleniumhq.github.io/selenium/docs/api/javascript/index.html then on the left side click Modules and then selenium-webdriver
If you scroll down the list you will find the Exported Class, WebElementPromise. Click on that link to open the WebElementPromise page. If you scroll down this page you will see that there are some Instance Methods. Find the Method, getTagName() and click on it to open it.
This method takes no arguments and returns a promise. The promise resolves to the tag name as a string. So let's get the tag name for the form. Go back to REPL and enter form.getTagName().then(name => console.log("Tag name is " + name));
Whenever a method returns a promise, there is a then method on it. The tag name is passed to our callback as name, and we use arrow function syntax to print it with console.log. The output is "Tag name is form", so there is our value.
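If promises are new to you, the pattern can be tried without a browser. In the sketch below, fakeGetTagName is a made-up stand-in for form.getTagName() that simply resolves to the string "form". The second half shows the same logic written with async/await, which assumes Node 7.6 or newer.

```javascript
// Hypothetical stand-in for form.getTagName(): returns a promise of a string.
function fakeGetTagName() {
  return Promise.resolve("form");
}

// Style used in this tutorial: pass a callback to then().
fakeGetTagName().then(name => console.log("Tag name is " + name));

// Equivalent async/await style: await pauses until the promise resolves.
async function printTagName() {
  const name = await fakeGetTagName();
  console.log("Tag name is " + name);
  return name;
}

printTagName();
```

Both styles print the same line. The then style is convenient in the REPL because it fits on one line.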
Let's try another method. Go back to the selenium-webdriver documentation and scroll down the methods page to find getRect. Click on this method to open it.
Enter form.getRect().then(rect => console.dir(rect));
into REPL. We are using dir instead of log so the output is formatted more nicely. It prints some useful information about the form's position and size.
We still haven't accessed the input field of the form. We could try adding an id to the source code, but we usually don't have access to it, and even when we do we often aren't allowed to alter it. Besides that, changing the page under test just to suit the automation is generally a bad practice. So we need a different way to solve this. There is another locator strategy that we can use to find this element, which has no id.
Go back to the documentation webpage. Under selenium-webdriver, click on By and the By documentation page should open. Then scroll down to the Static Functions section. Click on the By.name function to open its documentation. Let's use this function. Form inputs usually have a name attribute.
In REPL enter driver.findElement(By.name("name")).sendKeys("found by name");
This finds the element by name (the element's name is “name”, as you can see in the Elements tab of dev tools). Instead of storing the element in a variable, we can use method chaining: we call the next method, sendKeys, directly on the result of findElement.
What if we wanted to get this element by its tag name? Go back to the documentation page for By. In the Static Functions section click on By.tagName to open the documentation. This function is deprecated, but the documentation points to a replacement that we can use, By.css. Unfortunately there is more than one input on the page, so the tag name alone won't work unless we find the parent element (the form) first. So let's find the form element again like we did before. Enter into REPL: driver.findElement(By.id("registrar")).findElement(By.css("input")).clear();
This uses method chaining to get the input tag (the first input tag under the parent element with the id registrar), and clear then removes the text left in the field by the previous steps.
Let's try it again, but this time send some text to the box. Enter into REPL: driver.findElement(By.id("registrar")).findElement(By.css("input")).sendKeys("Found form and then found input by css");
Locator best practices: 1) Concise 2) Non-ambiguous 3) Reliable
Keep your locator strategies concise: the shorter the better, and the less you have to specify the better. You also need to keep them unambiguous. A locator can match multiple elements, so make sure that you are specifying the one that you intend. But take care that they remain reliable. You don't want to be so specific that things are fragile. The page is going to change and you want your locator to survive the updates.
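To compare the strategies side by side, here is a sketch of the three ways we have reached the same input so far. The object name locate and its property names are invented for this illustration. driver and By are passed in as arguments, so you would call these with the driver and By variables from your REPL session.

```javascript
// Three routes to the same input field, written as functions of driver and By.
const locate = {
  // Fragile: breaks as soon as any container in the layout changes.
  absoluteXpath: (driver, By) =>
    driver.findElement(By.xpath("/html/body/div/header/form/input")),

  // Concise: works only while a single element on the page has this name.
  byName: (driver, By) => driver.findElement(By.name("name")),

  // Scoped: find the form by id first, then search only inside it.
  scoped: (driver, By) =>
    driver.findElement(By.id("registrar")).findElement(By.name("name")),
};
```

For example, locate.scoped(driver, By).sendKeys("Your Name"); types into the field while depending only on the form's id and the input's name.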
Let's interact with the RSVP form and add some invitees. If you have closed out REPL here is a reminder of how to set up your workspace for testing. Enter the following into REPL one line at a time
const selenium = require("selenium-webdriver");
const By = selenium.By;
require("chromedriver"); // Windows only
const driver = new selenium.Builder().forBrowser("chrome").build();
const url = "your test url";
driver.get(url);
First we want to add a name to the input field and then click the submit button. Let's create a variable for the form first. Enter into REPL const form = driver.findElement(By.id("registrar"));
or you can try finding the form another way using the selenium-webdriver documentation.
Now we can use our form variable and search within it for our field, which is called “name”. Enter into REPL form.findElement(By.name("name")).sendKeys("Your Name");
Next we need to figure out how to click the Submit button, so let's inspect the button element. Inside our form, the Submit button is the only button, so we can find it by its tag. Enter into REPL form.findElement(By.css("button")).click();
We use method chaining so we don't have to store the button in a variable, and then we just use the click method to click it.
Every element also has a method called submit, which submits the form directly, so we didn't really need to find the button. Try submitting another name. Enter form.findElement(By.name("name")).sendKeys("Any Name You Want");
and then form.submit();
In fact, you don't even need to call submit on the form itself: calling it on an element inside the form submits that form. Let's try submitting another name. Enter form.findElement(By.name("name")).sendKeys("Any Name You Want");
and then submit the form from the input element form.findElement(By.name("name")).submit();
You should have three Invitees now.
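Typing the same two lines for every invitee gets repetitive, which is a hint that the steps could be wrapped in a function. The helper below is a sketch only: addInvitee is a name made up for this example, and it takes driver and By as arguments so that it relies on nothing beyond the findElement, sendKeys, and submit calls used above.

```javascript
// Hypothetical helper that bundles the steps we just typed by hand.
function addInvitee(driver, By, name) {
  // Locate the form by its id, then the input inside it by its name.
  const form = driver.findElement(By.id("registrar"));
  const field = form.findElement(By.name("name"));

  // sendKeys returns a promise, so submit only after the typing has finished.
  return field.sendKeys(name).then(() => form.submit());
}
```

In the REPL you could then enter addInvitee(driver, By, "Another Name"); once per invitee.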
In some instances you may want to retrieve values from the page. Let’s say we wanted to run a test that determines if the header changed to the right thing. Let’s retrieve the text in the header (RSVP), h1. We can grab the value that is in the h1 of this page. Let's look at the documentation for selenium again. Under selenium-webdriver> WebElement you can find the getText method. https://seleniumhq.github.io/selenium/docs/api/javascript/module/selenium-webdriver/index_exports_WebElement.html
Enter into REPL: driver.findElement(By.css("h1")).getText().then(txt => console.log(txt));
Because this method returns a promise, we call then on it. The promise resolves to a string, so we just print that out to the console.
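Printing the text is a first step. A test would compare it with the value we expect. A minimal sketch of that comparison follows. headerIs is a hypothetical helper name, and driver and By are again passed in as arguments.

```javascript
// Hypothetical check: resolves to true when the h1 text equals the expected text.
function headerIs(driver, By, expected) {
  return driver
    .findElement(By.css("h1"))
    .getText()
    .then(text => text === expected);
}
```

In the REPL you could enter headerIs(driver, By, "RSVP").then(ok => console.log(ok ? "PASS" : "FAIL")); to report the result.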
Section 1: I found some Windows issues while executing step 7 of Selenium WebDriver 3.0: Creating a new automated test. Here are a few things to try. If you are getting a path not found error, try entering require("chromedriver");
before the code in step 6. This will require you to exit the REPL first by using CTRL+d. If the error persists, go back to the command line (CTRL+d) and enter each of these commands in order.
npm install selenium-webdriver
npm install chromedriver -g
npm install chromedriver --save-dev
Go back into the Node REPL by entering node and then enter require("chromedriver");
Continue from step 6.
Happy Testing