Developing Selenium tests with proper abstraction

11 Jul

This post is about page object modeling of Selenium tests, but it also applies to general Selenium test development when you are not using the page object model. In the latter case, you may be using object oriented programming/design (OOP/D), functional (but not necessarily object oriented) programming/design (FP/D), or neither. My advice: if you are not using OOP/D, at least use FP/D.

In this post, I’m ranting from my experience about how some people do not use page object modeling properly (with some examples) and how it should be used. Pardon the long blocks of text & sparse use of formatting; it is a rant after all. I may try to pretty up the formatting in the future.

While one doesn’t have to adopt the page object model fully, there are some guiding principles one should follow. Others may have differing opinions. This article may be a nice read for those of you who have experienced the same frustrations I have, and/or a nice reference for those not so experienced with programming, particularly OOP/D and/or FP/D. From my perspective, FP/D is the same as OOP/D with the exception that you don’t organize your code into class objects with methods. Instead you just have a set of functions (the equivalent of methods) to do things. In both cases, the methods/functions are parameterized and abstract away the low level details to describe a higher level view of functionality. Hopefully, in the larger community of Selenium testers, there are not many who actually do the things I rant about in this post. For those who do, I’m assuming they’re either novice programmers or used to the Selenium IDE and the old school way of writing tests as singular test commands executed one after another.

Things you can choose not to adopt:

  • Page objects returning other pages. This is a nice-to-have, but you can manage without it. You just have to manually instantiate the correct page objects in the needed flow/sequence, rather than expect the starting page object to return the next page object for you to continue the test with. Even when a page object returns another page object, you would still want to check that the returned page is correct (in the correct state). So it isn’t too much to expect the test writer to know which page to check for, and they can therefore also instantiate that page manually when the page object doesn’t return it. This is one feature our team chose not to adopt. Why, I have no idea, as I’m not the lead architect who spec’d out our framework. In OOP/D, class objects are not required to return other class objects, though they may. They could simply return basic types (int, string, boolean, etc.), so you can do the same with page objects. There are also related posts out there about not wanting to return page objects from a page object.
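
As a rough sketch of the two styles described in the bullet above (not taken from our framework): `Session` is a hypothetical stand-in for the shared browser session, and `SearchPage` and `ResultsPage` are made-up pages. Either way, the test still checks that it ended up on the right page.

```java
// Hypothetical stand-in for the shared browser session (no real browser involved).
class Session {
    String currentPage = "search";
    String query = "";
}

class ResultsPage {
    private final Session session;

    ResultsPage(Session session) { this.session = session; }

    // State checks the test should call no matter how the page object was obtained.
    boolean isLoaded() { return session.currentPage.equals("results"); }

    boolean isShowingResultsFor(String query) {
        return isLoaded() && session.query.equals(query);
    }
}

class SearchPage {
    private final Session session;

    SearchPage(Session session) { this.session = session; }

    // Style without page returns: perform the action only.
    // The test instantiates ResultsPage itself afterwards.
    void search(String query) {
        session.query = query;
        session.currentPage = "results";
    }

    // Style with page returns: perform the action and hand back the next page.
    ResultsPage searchAndOpenResults(String query) {
        search(query);
        return new ResultsPage(session);
    }
}
```

In a test, `new SearchPage(session).search("cookies")` followed by `new ResultsPage(session)` gets you to the same place as `searchAndOpenResults("cookies")`. The second form only saves the instantiation.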

Things you should adopt:

  • Abstraction of page behavior from test behavior. Page behavior covers UI element locators, actions that can be performed on the page, and UI/page/element state. Test behavior covers the assertions performed, comparison of data, and sequences of actions across pages. Keeping the two separate allows for good, clean, modular test design that’s easier to read & maintain.
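
A minimal sketch of that split, under some loud assumptions: `Browser` is a hypothetical interface standing in for the real driver, `FakeBrowser` is an in-memory fake so the example runs without Selenium, and the login page, its locators, and the greeting text are all invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a browser driver so the sketch runs without Selenium.
interface Browser {
    void type(String locator, String text);
    void click(String locator);
    String text(String locator);
}

// In-memory fake: clicking the login button copies the typed user name into a greeting.
class FakeBrowser implements Browser {
    private final Map<String, String> fields = new HashMap<>();
    private final Map<String, String> texts = new HashMap<>();

    public void type(String locator, String text) { fields.put(locator, text); }

    public void click(String locator) {
        if (locator.equals("id=login-button")) {
            texts.put("css=.greeting", "Welcome, " + fields.getOrDefault("id=username", ""));
        }
    }

    public String text(String locator) { return texts.getOrDefault(locator, ""); }
}

// Page behavior: private locators, actions, state and data extraction.
class LoginPage {
    private static final String USERNAME_FIELD = "id=username";
    private static final String PASSWORD_FIELD = "id=password";
    private static final String LOGIN_BUTTON = "id=login-button";
    private static final String GREETING = "css=.greeting";

    private final Browser browser;

    LoginPage(Browser browser) { this.browser = browser; }

    void logIn(String user, String password) {
        browser.type(USERNAME_FIELD, user);
        browser.type(PASSWORD_FIELD, password);
        browser.click(LOGIN_BUTTON);
    }

    String getGreeting() { return browser.text(GREETING); }

    boolean isLoggedIn() { return !getGreeting().isEmpty(); }
}

// Test behavior: the sequence of actions, the expected data, and the assertions.
class LoginTest {
    static void userSeesGreetingAfterLogin() {
        LoginPage page = new LoginPage(new FakeBrowser());
        page.logIn("alice", "secret");
        if (!page.isLoggedIn()) throw new AssertionError("expected to be logged in");
        if (!"Welcome, alice".equals(page.getGreeting())) throw new AssertionError("unexpected greeting");
    }
}
```

Note that the test never sees a locator. If the greeting moves to a different element, only `LoginPage` changes.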

Problems observed with improper adoption of page object model:

  • UI element locators of a page object made public and directly accessed by tests, either to perform assertions against (text of the locator, DOM/element attributes of the locator) or to invoke actions against (e.g. click, type, select). It might be easy to do, but it makes tests aware of internal page UI and structure, which is not good abstraction or test modeling. The result is more brittle tests, with code all over the place to update when the UI changes. UI element locators should always be private (or protected) to the page object, and must be accessed only by methods of the page object or associated classes. If you need to access some locator (to extract data or manipulate it) and find no method to do so, it means you need to create a proper method for it. In Java, you’d get warnings about private members that aren’t used by any method of the class. It’s actually ok to leave private locators defined with no methods that use them yet, as you can fill those in later, though it is good practice to do it all at once (but who has the time for the best case scenario anyway?).
  • Convenience methods added to a page object. Page objects are meant to expose the behavior of the page (or whatever the page object represents, as it can be a group of pages or a section of a page). As such, the page object should only contain methods that define particular individual logical actions on the page (which may or may not reference other page objects). You should not create “convenience” methods that call multiple methods (which may or may not involve other page objects) just so that you have a single simple method to call in a test case instead of writing out a group of method calls & class object instantiations. It makes sense to have convenience methods, but they should go in some common or base test case class that test cases are derived/extended from. They don’t belong in the page object, unless they are actually logical as a behavioral action of the page. These are easy to spot when the method name or its implementation indicates something like doThisThatFromPointAToGoToPointB().
  • Improper abstraction or definition of page object methods. For example, hard coding values into a page object method. This leans toward making the method a convenience method so that you can get through some step to test something else. But you may later have (or should have) tests that invoke the method with other data to test other use cases, so the data should never be hard coded. If you want some hard coding, add a separate convenience method (in the common/base test case class) that calls the parameterized page object method with the predefined static input/data. Another example is methods that are too granular, low level, and atomic, like clickThis() and clickThat(). They’re easy to write, and it’s the old fashioned Selenium IDE way of automating tests one command at a time. But ideally you want to abstract up to something more functional or logical, such as doThis() or goThroughSomeFlow(), which performs the clickThis() and clickThat() actions at the lower level. If you keep it too granular and low level, that defeats the purpose of OOP/D with page objects; you might as well go back to the Selenium IDE style of executing singular Selenium commands and chaining them up to build a test.
  • Adding test data into page objects as if it were locators. For example, defining static variables for text on a page, and later using them to perform assertions or to check that the text exists on the page. If the variable/text is used like a locator (or to define one), then it is fine. But when you see it used like test data, it belongs in a common/base test case class (or the test case itself) as a static variable rather than in a page object. Being used like test data usually means it appears in an assertion as the “expected” value rather than the “actual” value. If it’s treated as the “actual” value then that’s ok, because then it is behaving like a locator that you’re extracting text from.
  • Not using enums and constants where the programming language used for the automated tests offers them. This doesn’t directly apply to page objects, but it does indirectly in how page object methods are used (their parameter arguments or their internal implementations). I’ve seen people use integer values like 0, 1, 2, or perhaps strings, to specify which option path to branch on (via an if/else or switch case statement). While that works, it makes the test design more brittle to updates and makes the meaning of the integer value harder to interpret. Strings are less of an issue, but they are still brittle should the string value change. You can’t avoid strings and integers in the actual implementation, as some UI elements (or non-UI data items) are identified by string identifiers, by index, or by integer value. But it helps to use constants and enums at the interface/API level, where you define the page object method signature (parameter arguments and return data type). Internally within the method, you use the enum/constant to look up/translate to the proper string or integer value (e.g. SIDES.FRONT translates to “front” or COOKIE.OATMEAL translates to 2). This way, the test cases don’t need knowledge of the actual low level developer-centric values (of the web application or HTML source). They just need to know at a high level what logic is being handled, like which side of an object (front or back) or what type of cookie (oatmeal or peanut butter, without caring what index position or value an oatmeal cookie has in the web application), which is part of abstraction with OOP/D. All the low level logic/values are handled within the page object methods.
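
Here is one way the enum and hard coding points above could look in Java. Treat it as a sketch under assumptions: `CommandLog` is a hypothetical stand-in for the browser that only records the low level commands it is given, and the page, locators, and command strings are invented. The enum-to-value mappings follow the examples in the text (`FRONT` to "front", `OATMEAL` to 2).

```java
import java.util.ArrayList;
import java.util.List;

// Tests pass these around; only the page object knows the low level values.
enum Side {
    FRONT("front"), BACK("back");

    private final String domValue;
    Side(String domValue) { this.domValue = domValue; }
    String domValue() { return domValue; }
}

enum Cookie {
    CHOCOLATE_CHIP(0), PEANUT_BUTTER(1), OATMEAL(2);

    private final int optionIndex;
    Cookie(int optionIndex) { this.optionIndex = optionIndex; }
    int optionIndex() { return optionIndex; }
}

// Hypothetical stand-in for the browser: it only records low level commands.
class CommandLog {
    private final List<String> commands = new ArrayList<>();
    void send(String command) { commands.add(command); }
    List<String> commands() { return commands; }
}

class CookieOrderPage {
    private static final String COOKIE_DROPDOWN = "id=cookie-type";
    private static final String QUANTITY_FIELD = "id=quantity";
    private static final String SIDE_BUTTON = "css=button[data-side='%s']";

    private final CommandLog browser;

    CookieOrderPage(CommandLog browser) { this.browser = browser; }

    // Parameterized: no cookie type or quantity is hard coded in the page object.
    void orderCookies(Cookie cookie, int quantity) {
        browser.send("select " + COOKIE_DROPDOWN + " index=" + cookie.optionIndex());
        browser.send("type " + QUANTITY_FIELD + " " + quantity);
    }

    void showSide(Side side) {
        browser.send("click " + String.format(SIDE_BUTTON, side.domValue()));
    }
}

// Convenience methods with fixed data live in the base test class, not the page object.
class BaseCookieTest {
    protected final CommandLog browser = new CommandLog();
    protected final CookieOrderPage orderPage = new CookieOrderPage(browser);

    protected void orderADozenOatmealCookies() {
        orderPage.orderCookies(Cookie.OATMEAL, 12);
    }
}
```

If the application later moves oatmeal to a different dropdown position, only the `Cookie` enum changes and every test calling `orderCookies(Cookie.OATMEAL, ...)` stays as it is.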

Proper page object modeling

Here are some of my thoughts on what properly modeling a page object should be like. I didn’t do an extensive search on page objects online, but the example presented by the Selenium project is a little too bare/minimalistic to guide a novice tester not fully acquainted with OOP/D. Hopefully, what I present here is a bit more thorough in clarifying page object modeling with respect to OOP/D.

Any and all page objects should contain a set of the following types of methods that define the overall behavior of the page:

  • State reporting methods like isSomethingInSomeState(), isPageInSomeState(). Examples can be isColorSelected(“red”), isFontSelected(“Times New Roman”) or amIOnThisStepInWizard(2). You can call such methods from a test case to assert against the expected state, with the return value of the method being the actual value that you compare against the expected one. Alternatively, you might call the method to check the state and then do the appropriate if/else/switch statement branching, such that if in state A do this, if in state B do that. This is a type of method I’ve noticed some people miss. They don’t create any such methods, because we generally aim to automate the happy/basic path test cases and don’t plan for robust regression testing that should involve state checking. The happy path tests generally assume state and indirectly test state via other actions (luckily that does happen; otherwise, there would be critical missed test coverage). Improper usage I’ve seen in this area is where, instead of having a proper state checking method, people simply check an attribute or the text of a (publicly exposed) locator as a way to check state.
  • Read (or data extraction) methods like getThisFromThat() or getThisOfThat(), where we extract data from the page via the page object to then assert against. The extracted data is the actual value to assert against some expected value. An example method could be something like getPriceOfCookies(COOKIES.CHOCOLATE_CHIP), where the page shows a table of cookies and their prices, and COOKIES.CHOCOLATE_CHIP is an enum constant value defining chocolate chip cookies. The method internally uses the enum to figure out which locator to extract the price value from. Improper use I’ve seen in this area is where people expose locators publicly and extract data off them rather than abstract that within a read method.
  • Write (or action) methods like doThis(), doThat(), doTheseMultipleThings(), where the multiple things method could invoke the multiple things in sequence (serially) or concurrently in parallel. This is the type of method people are most comfortable creating: methods that do certain actions or groups of actions on a page, like filling in form fields, clicking buttons & links, selecting menu items that involve mouse over dropdown menus, etc. The only bad usage I’ve seen here is the set of problems reported earlier (e.g. too low level like clickThis(), hard coded data, convenience methods).
  • Parameterization of locators and methods. Parameterizing methods means they should not have hard coded values. All input should be taken as parameter arguments to the method, or from page class member constants/variables. The only things sort of hard coded in a method might be template constructs used to form a locator or test data, but nothing truly hard coded. Parameterizing locators means that, where applicable, you don’t define locators by a static ID/name/XPath/CSS/etc. With true OOP/D, you’ll notice that certain locators form a logical grouping, like icons on a page for the colors red, blue, and green, or a dropdown menu of font names. If you carefully inspect the HTML/DOM source, you’ll notice that these locators share common attributes. In essence, you can define them with a regular expression (or string or positional index matching) type of locator using XPath or CSS selectors. You can then templatize/parameterize the locator as a static expression with some dynamic variable inserted into it somewhere (in front, at the end, in the middle, etc.) to form the actual locator at runtime.

    The alternatives to parameterizing locators are (1) offering fixed test coverage (e.g. we’ll only test so much with this statically defined subset of color or font locators, and we won’t test all possible colors or fonts with data driven testing), or (2) using a static set of locators defined individually as scalar variables or as a set within an array/dictionary/hash/map. Alternative 1 doesn’t scale to full data driven test coverage in the future. Alternative 2 makes it a hassle to maintain a large set of locators and expands your lines of code unnecessarily. The trade-off for locator parameterization is that your locators can become more complex (for those not familiar with complex XPath and CSS) and possibly slower, as you sometimes have to use XPath or CSS instead of locating by ID, name, etc.

    Parameterization also allows for better page behavior modeling, as you can offer something like isColorSelected(“color name as string or enum constant”) vs isRedSelected(), isBlueSelected(), etc. You can still offer isColorSelected(value) with alternative 2 (an array/map of locators), but it ends up being a large if/else or switch case block that filters through all possible values in the set, rather than a call of one or a few lines that picks the correct locator by dynamically inserting the passed in value into the parameterized locator (template). Alternative 1 is, in my opinion, not desirable because you limit your framework’s extensibility and its future test coverage (via data driven testing). And say the set of colors or fonts gets switched around in the app under test: you have to keep updating them in the framework. If it was built well with parameterization, you won’t have to worry about that. You just update the test case data to match what the app offers, and the framework remains the same. In good test design, we want to avoid updating the test framework and prefer to update test cases (flow and/or data), as the latter matches manual test use case expectations (what does the test framework have to do with testing, with respect to use cases and manual testing?). I do know that one reason some people don’t use parameterized locators may be that they aren’t familiar enough with XPath and CSS selectors to handle the cases where it is hard to parameterize without them, in which case you would be stuck with alternative 1 or 2.
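
To illustrate the parameterized locator idea from the last bullet, here is a small sketch. It assumes an invented color palette whose icons share an XPath shape, and it uses `FakePalette`, a hypothetical in-memory stand-in for the browser, so that it runs without Selenium. The template and names are illustrative, not from any real application.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the browser: remembers which locator was clicked last.
class FakePalette {
    private final Set<String> selected = new HashSet<>();

    void click(String locator) {
        selected.clear();
        selected.add(locator);
    }

    boolean isMarkedSelected(String locator) { return selected.contains(locator); }
}

class StylePage {
    // One locator template for the whole group of color icons.
    // The %s slot is filled in at runtime with the color name.
    private static final String COLOR_ICON_TEMPLATE = "//div[@id='palette']/span[@title='%s']";

    private final FakePalette browser;

    StylePage(FakePalette browser) { this.browser = browser; }

    private String colorIcon(String colorName) {
        return String.format(COLOR_ICON_TEMPLATE, colorName);
    }

    // Write (action) method, parameterized by color.
    void selectColor(String colorName) {
        browser.click(colorIcon(colorName));
    }

    // State reporting method: one method instead of isRedSelected(), isBlueSelected(), ...
    boolean isColorSelected(String colorName) {
        return browser.isMarkedSelected(colorIcon(colorName));
    }
}
```

A data driven test can then loop over whatever colors the test data lists, calling `selectColor(name)` and asserting `isColorSelected(name)`, without the page object growing a locator or a method per color.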

To conclude, this may or may not be a lot to expect of a QA person who works with Selenium, particularly with page objects and/or a functional test framework. But writing (real) test automation is itself software development, so technically those who write proper Selenium automated tests should be “Software Development Engineers in Test” (SDET) rather than Software Quality Assurance (QA) Engineers, since they are writing software for testing.

For those who still don’t (fully) understand what I’m talking about after reading this post, you need to learn more about OOP/D as well as how to define functions and methods in programming.


5 Responses to “Developing Selenium tests with proper abstraction”

  1. Brian August 8, 2013 at 2:18 am #


    Ignorant question…I am learning Selenium and realize that in the marketplace it is mostly used with Java, although it works with many languages. Nonetheless, I want to go with the market flow and since I do not know Java, I am wondering should I learn Java or JavaScript. If both, where do I start to learn – Java or JavaScript?


    • autumnator August 8, 2013 at 5:13 am #

      Well, what do you know? Selenium is just a tool/API that can be used with any language. What’s most important is knowing the methodology of automated testing (actions to perform, element locators to define, assertions against data or state to make, etc.). You then combine that with whatever language and test framework platform is to be used (TestNG, JUnit, etc.).

      Yes, Java is most popular. Javascript not so much as a language to drive/control Selenium, however, it is very useful to execute Javascript in the browser with Selenium, so knowing Javascript can be useful in that area for workarounds to Selenium limitations and for enhanced test coverage using Javascript.

      As for where to learn, simply do the same as learning any programming language. Take classes, read books, or research & read online tutorials, documentation, books, articles, blog posts. Whatever works for you. Classes may be best option but that’s the least free/cheapest option. Books you can borrow from the local library.

  2. Rodrigo J Martin August 12, 2013 at 6:22 pm #

    Excellent article!

  3. jvsantoshsarma September 25, 2013 at 12:44 pm #

    Well written. Worth reading.

  4. jvsantoshsarma September 25, 2013 at 12:53 pm #

    Reblogged this on Learn WebDriver.
