Pseudo Test Driven Development with Selenium and Page Object Model for developers and QA

2 Aug

A thought came to me today and I did a quick search online. There doesn’t appear to be much articles about test driven development with respect to Selenium. And what info there is is too brief (e.g. presentation slides) or just high level and developer process oriented. Both those approaches just seemed common sense and didn’t cover the heart of the matter. So I’ll write about it from my perspective giving some details from both the developer and QA/tester sides on how to use Selenium effectively for both parties, and not reiterate (as much as possible) what’s already covered in other online articles on this subject (at the time of this post). What I cover should be useful reference to both developers and QA/testers alike, and perhaps even other groups. Though pardon the long descriptive post to get at the heart of the content, got to clearly describe it all…Here me out til the end before you voice any disagreements or consider this post a crap of a rant.

First, to the reader, I will just assume you generally know what Test Driven Development (TDD), unit testing, (user) acceptance testing, Acceptance Testing Driven Development (ATDD), Selenium, and Page Object Model (POM) are. If not, read up on those topics beforehand.

Let’s start from TDD and unit testing. TDD is usually associated with unit testing and then may expand from there. Unit tests cover testing input & output (I/O) of individual software components/modules that perform (ideally) a simple function. The next tier up is integration tests that cover interactions between the units (i.e. components/modules) and the respective overall I/O between the units as a system. When the integration test covers all the components that form the system the it may be considered a system integration test. The final tier is the user interface level where we create UI automation tests that are often also (or then extended to create) acceptance tests.

So with all that in mind, how does Selenium (and similar UI automation tools) fit in with test driven development? Now this question will only apply to the front end codebase as Selenium can only test for that. In terms of back end code, we do nothing in terms of Selenium and have to wait until back end dependencies are addressed/fulfilled.

For the front end, we should treat building the UI itself as a collection of units, whether they are visible in the UI or not. For the javascript portions (sometimes not visible), you can cover that with javascript unit testing tools. You can also cover it to some extent with Selenium/WebDriver by using it’s JavascriptExecutor feature to execute javascript on the page. That feature can also be useful for integration/system test of javascript when all the javascript + CSS + HTML code is combined to render the page UI.

In terms of the visible portion of the UI units – for simplicity, let’s consider them as widgets, headers, footers, components that fit together to form the page. As such, you should test with Selenium the same way you do unit testing. For each UI unit, you have Selenium code that will test the functionality of that unit in terms of I/O. Input would be clicks, mouse overs, typing text, etc. Output would be things like what element states change, what elements get hidden/unhidden, what events get triggered, do you go to another page (from link/button/menu, form submission, etc.).

Ok, but you need to use Selenium with a test framework or build one out so you don’t duplicate code over and over again for such unit-like tests for Selenium. True. Then you might ask, isn’t this overkill and we should use Selenium at the higher level and when UI is ready? We’ll cover that a bit later. In terms of framework, yes we need one. I leave it to the reader to figure out what framework is needed, since each organization’s needs are different.

With Selenium base unit testing idea covered, integration testing with Selenium is similar, just expanding to interactions between visual units and the I/O resulting from those interactions. And system integration testing = traditional UI automation followed by automated acceptance testing.

Now, with all that kind of covered, how should you then implement those different levels of testing with Selenium? Well, I see 2 approaches:

* You be crazy and actually implement unit test like coverage of your UI components using some Selenium test framework. For QA, think of this a really small & short Selenium tests that take less than a minute to run. But we can end up with many Selenium tests in the end (UI unit tests + UI integration tests + normal QA Selenium tests). This approach is fine if you don’t mind large test suite to run and have the bandwidth to write and maintain the tests.

* You don’t write Selenium tests at the unit and integration test level, but you develop your Selenium framework (especially when using page object model) with the unit test and integration test level mentality so that you can be able to write better Selenium tests and do more test coverage efficiently. This is the approach I advocate.

Now, for my advocated approach you might say, can you clarify? Let’s begin by referring to the page object model. A page object represents all the functionality on a page and all its visual elements. So all the logical things you can do on the page, you model with methods/functions, and all the UI elements are translated as element locators for Selenium (property members of the page object class).

Ok, so you may or may not already know that so then what? Page object model is great, problem is not everyone uses it to maximum efficiency. For those of you who do (or near it) congratulations. For those who don’t, you got work to do. What do I mean by maximum efficiency? I’ve witnessed that some people and organizations tend to have a golden path, happy path, basic sanity test coverage mentality when building out test automation. So they build into the Selenium framework only the methods and locators they need at the given time they are writing tests to cover some portion of the page only. Meaning a lot of the functionality of the page is left out and filled in over time as more tests are written to cover them, if that ever happens. This choice makes sense for resourcing constraints, and not writing unused test/framework code but is a terrible design choice, at least in my humble opinion.

The proper TDD way to do it is you build out the page object as you are building out the UI itself. In an ideal paired programming or agile environment where team members collaborate and you don’t have “waterfall” development process, the (front end) developer builds the UI and simultaneously, a QA member will build out the matching page objects for the UI. Alternatively, the developer could also be the one building both the UI and the Selenium page objects. I don’t advocate building the page objects after the UI is ready, you just can’t keep up that way, unless of course you follow the route I dislike where you only implement portions of the page to cover only what you need at the time.

But you can’t build page objects without a working UI or can you?! Wrong if you thought no. Before you get a working UI, you usually have mockups and wireframes and other specifications that you can go by. The good specifications also cover the interactions of the page/site flow and interactions between elements. Using that info, you can define the page object’s model: what methods should it offer, what element locators it should have, how the locators might be utilized by the methods. Filling in the actual logic to make the methods work and what the locator values actually are can be done as the UI gets implemented, piece by piece. Without really thinking about it, in this way, you are essentially building a custom (page object Selenium) API for consumption by Selenium tests (as the clients of the API). Another way to look at it is you are also building a UI consuming client that consumes the UI (acting as a server in a client-server application model).

With my analogies in mind, things get more interesting. Unless you have a really smart QA/tester (who is well versed in those analogy types) and/or one who is developer-oriented, you may have problems with the QA/tester building out a maximally efficient Selenium framework (or the page objects) that mimick what you can consider an API or a client of something. Reasoning is they lack the technical experience or mindset to model the page objects properly – you can certainly get by with the deficient model, it just won’t provide as optimal a test coverage as you can actually get. In a sense, working on the Selenium framework and page objects are really meant for QA people in the roles of QA architects and Software Development Engineers in Test (SDET), while writing the tests that utilize the page objects can be delegated to the other QA roles of QA automation engineer (a bit vague a role/term to me), QA tester, software QA engineer, etc. The roles of QA architect and SDET of course are for QAs who have developer type background, which surprisingly most QAs tend to lack – some may have developer skills and even engineering or Computer Science degrees but lack the depth of knowledge desired compared to a developer. In essence, by developer type background we mean a QA who is capable of being a developer and doing application development (not just working with Selenium and test frameworks) but chooses to focus on the testing aspect rather than the core application development aspect, hence the SDET name is quite literally correct.

Now, if the organization lacks SDETs and QA architects and doesn’t have the luck (or resources) to attract such people, you can make up for that loss (where possible) by shifting your developers to take on some (but not all) of the testing work. With good object oriented programming and software development experience, developers can build out the page objects that the less experienced QAs can consume by creating actual automated tests cases. Because once you have a fairly readable and easy to interpret page object model (i.e. designed well), less technical people can easily write scripts, etc. that call the page object methods and pass in the needed test data, something which quite some postings online talk about. When calling on your developers to help out in this area, I would say that those who are software architects, design APIs (or maybe consume them too), develop client-server software, or are good with designing classes would be best at designing your Selenium page objects.

On a related note, implementing page objects are best done by those with a development background, but defining the page object method signatures (the API or input/output) can be done by other members like User Interface/Experience folks or the business analyst, or QA etc. basically someone with good modeling skills. A bad developer can easily define a bad page object model.

Before we continue, let’s briefly come back to the thought of why QAs with less technical experience create deficient page objects. Reason 1 – they may lack the architect/developer mindset to properly define method behavior and parameterize methods. Meaning methods use default static test data and/or cover a specific case only and isn’t generalized to handle different cases. They may add more methods for other cases and at some point figure out the y need to refactor and simplify for code reuse. Bear in mind that all the presently possible cases to cover are already defined by the UI (and hence the matching page object model), so if one knows all the possible cases, when writing a method for at least one of the cases, who in their right mind would not build the method in a general way to handle all cases and parameterize it? Given time constraints, no one said you could not leave the other cases unimplemented (do nothing actions or throw exception about not being implemented yet), only handling what is most important at first. This way you leave a well designed framework with minimal refactoring later on, easier to fill in the missing code later on too. Reason 2 – lack of element locator definition skills. To achieve good page object modelling, especially for complex javascript library based web applications, and also when developers don’t cater to QA ideal requirements (more on that later), you often need to resort to XPath and CSS selectors to achieve locator templating and parameterization for effective modeling and miniming lines of code from many static value locators as the alternative. And due to CSS limitations around node text “contains” matching and parent/child/sibling node axis traversal, you often have to go with XPath. Many QAs only have basic knowledge of XPath and CSS and fail to construct good XPaths or CSS that are templatized and can be used with variables and/or chained with a prefix, suffix, infix, to define the final value of a locator at runtime. Developers tend to be better at working with XPath and CSS, since they work with it in the web applications. WebDriver also indirectly offers an object oriented way to use a WebElement to traverse back up to its parent/ancestor or down to its children, etc. though it’s not well documented and requires some technical knowledge to make use of. Reason 3 – they may lack experience with javascript as in programming in javascript. WebDriver and even Selenium RC exposed a javascript execution interface to execute arbitrary javascript code. Which comes in handy for browser limitation workarounds, missing functionality that Selenium doesn’t offer, or for extra (lower level) test coverage by executing javascript code. Without js knowledge QAs either have to search online for js tricks or consult their developers to get some js code to use. Some people advise against executing js code for testing, but there is one area it comes in handy too, to fetch test data that you’d otherwise have to find other means to such as “screen scraping” the UI or pulling from the database or external API when it is already offered in the web application via a javascript API or hidden DOM elements on the page.

Now let’s briefly go over why have your developers help out with the page objects and even Selenium tests? Because they have more technical experience that comes in handy for the modeling and heavy programming portions of Selenium page object and framework setup. And also you force the developers to eat their own dog food (i.e. consume their web application for testing) and they see how horrible locating elements can be with XPath and CSS – it can be done, but sometimes the string is long, a bit messy, and complex though still technically “beautiful” when well defined. And we know with developers tending to be lazy (like using jQuery over direct DOM/document object) they may want to finally make their apps use unique IDs, names, classes, or other attributes to avoid using XPath and CSS for locators.

And to finally conclude it, why build out page objects together with the UI development? If ou didn’t get it from reading so far, it’s because this keeps the functionality in sync between what’s available in the app and what you are/can/will be testing. The keeping in sync part also ensures any changes to the UI causes a timely matching update to the page object code (and updates to any affected tests that consume those page objects). Doing this also ensures proper test coverage from the start. By building out the page object early on, you already build out the infrastructure to support test coverage whenever you get around to adding it. To clarify, you need to cover A, B, and C but you only have time for A right now. If you build out the page object to support B & C, you may have no tests now, but adding a test for them later is easy, simply chain together calls to the right page object methods with the right test data, do some test runs and fix as needed and you’re good. Without page object build out early, you then have to go back and add them in then write tests which mean more work later on just to save on work now. Do you want to pay later or pay now? One analogy to this is when you build houses (and perhaps anything else) you build from a good foundation (bottom up, not top down starting with the roof first). And you complete the house as a whole. Not build in pieces with the house still missing rooms/walls (hard to live in). You build it fully and hopefully perfect from the start so you don’t later have to say upgrade/trade up (e.g. switch test frameworks) or do some remodeling or rebuild the house later on (e.g. refactoring or framework/page object code redesign). With the complete page objects in place, you also know what test coverage you are missing based on unused code. Whatever page object methods and locators exist but are not used means at some point you need to write tests to cover it. So you don’t later have to do assessments on what test coverage you may be missing. If you run code coverage tools, if you can run them against the Selenium framework code, that could also give you such stats. It’s also good to build out page objects early on with the UI since the project is fresh in everyone’s mind so you know what functionality there is and what to cover. When things get deferred you forget over time. Plus when you work on related things, even for page object code, doing them together at once is faster and better designed than do a little now and come back to it for the rest later when you might forget some stuff. In my experience, I’ve found that writing missing tests from existing well defined methods is a whole lot easier and faster than having to create the missing page object code later on before writing the missing tests as well.

And some final words, while I talk about Selenium and page objects, this approach can apply to other types of frameworks without page objects (just use object oriented or functional programming design) and other types of applications (desktop, mobile, APIs, etc.). For the QAs who are less technical/developer-oriented, hopefully this post has given you insight on how to do better with Selenium. For developers, hopefully this post has given you insight on how to help QA out.

Writing Selenium automation and tests in and of itself is writing software. Hence, just as with software, you want to do TDD with it too, and I’ve described how I would do it in a TDDish way.

Update 01/29/2014: Came across a similarly useful post:

Update 06/11/2014: came across a post that used the build a house analogy with regards to testing:


One Response to “Pseudo Test Driven Development with Selenium and Page Object Model for developers and QA”

  1. Abhay Bharti August 18, 2014 at 12:07 pm #

    Very informative article..

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Tea in Eighteenth-Century Britain

History of Tea Project at Queen Mary University of London

Phimosis - A Simple Cure (That is working for me)

No nonsense, no adverts, no sales pitch, just honest information.

Glass Onion Blog

Cheat sheets, post-its and random notes from the desk of a programmer

Abode QA

A Hub For Testing Minds...

The Test Therapist

Performance & Security Testing Blog


a programmer's hub

Let's Not Crash and Burn

it makes your brain tingle


“Incinerate Ignorance”

Anastasia Writes

politics, engineering, parenting, relevant things over coffee.

One Software Tester

Trying To Make Sense Of The World, One Test At A Time

the morning paper

a random walk through Computer Science research, by Adrian Colyer

RoboSim (Robot Simulator)

Visualize and Simulate the Robotics concepts such as Localization, Path Planning, P.I.D Controller


open notebook

a happy knockout mouse.

my journey into computer science

Perl 6 Advent Calendar

Something cool about Perl 6 every day


Inspire and spread the power of collaboration

Niraj Bhatt - Architect's Blog

Ruminations on .NET, Architecture & Design

Pete Zybrick

Bell Labs to Big Data

%d bloggers like this: