Tag Archives: Page Objects

A page object representing both desktop and mobile views or website and mobile app

2 Mar

It just occurred to me today, from reading a Selenium forum post, that a past blog post of mine can also apply to the following cases:

  • you have both desktop and mobile views for a website or web app (e.g. responsive design, etc.)
  • you have a website and a native mobile app that offers similar service/functionality

In both of these cases there is shared logic: the user/app/site workflow and the element locators. Even if the actual locator values differ, the “logical” representation is the same (e.g. a login button on both the website and the mobile app).

Using the techniques defined in that other post, you can share locator references/variables and share page object methods and have a single page object represent both mobile & web versions. For sharing of methods with branching logic, instead of checking which A/B flow to go through, you check what platform you’re on and go to appropriate branch logic based on that.

On some specifics: I haven’t tested this type of implementation before, but in theory it should work out even for cases where you have a website and a native mobile app:

  • You just have to use the right driver instance in the page object methods (or instantiate the page object with the right driver instance), etc. based on the platform (e.g. Appium/ios-driver/etc. vs Selenium WebDriver).
  • For shared locators, XPath may work best, using the multi-valued locator approach with pipes “|”. That is because I know Appium (at least) supports XPath-based locators; I’m not sure about CSS selectors. So I’m hoping/assuming Appium supports the multi-valued XPath functionality. I don’t know about other tools like ios-driver. If this doesn’t work out, you would need separate locator references/variables.
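
As a rough illustration of the idea, here is a minimal Java sketch of one page object serving several platforms. Everything in it is an assumption made for illustration: the `Platform` enum, the `UiDriver` stand-in interface (in a real suite this would be a Selenium WebDriver or Appium driver instance), the locator values, and the extra menu step on small screens. Like the approach itself, it is untested against a real app.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical platform flag; not part of Selenium or Appium.
enum Platform { DESKTOP_WEB, MOBILE_WEB, NATIVE_APP }

// Stand-in for the small slice of a driver this page object needs.
interface UiDriver {
    void click(String locator);
    void type(String locator, String text);
}

class LoginPage {
    // One logical locator with a separate (made-up) value per platform.
    private static final Map<Platform, String> LOGIN_BUTTON = new EnumMap<>(Platform.class);
    static {
        LOGIN_BUTTON.put(Platform.DESKTOP_WEB, "//button[@id='login']");
        LOGIN_BUTTON.put(Platform.MOBILE_WEB, "//a[@class='login-link']");
        LOGIN_BUTTON.put(Platform.NATIVE_APP, "//*[@name='Login']");
    }

    // Multi-valued XPath: the union operator "|" matches whichever node exists.
    private static final String USERNAME_FIELD = "//input[@id='user'] | //*[@name='Username']";
    private static final String PASSWORD_FIELD = "//input[@id='pass'] | //*[@name='Password']";
    private static final String MENU_BUTTON = "//*[@id='menu'] | //*[@name='Menu']";

    private final UiDriver driver;
    private final Platform platform;

    LoginPage(UiDriver driver, Platform platform) {
        this.driver = driver;
        this.platform = platform;
    }

    // Shared method; branches on platform instead of on an A/B flow.
    void login(String user, String password) {
        if (platform != Platform.DESKTOP_WEB) {
            // Assumed: small screens hide the login form behind a menu.
            driver.click(MENU_BUTTON);
        }
        driver.type(USERNAME_FIELD, user);
        driver.type(PASSWORD_FIELD, password);
        driver.click(LOGIN_BUTTON.get(platform));
    }
}
```

A test would construct `LoginPage` with the driver and platform it is running against and call `login(...)` the same way everywhere.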

Some of you might not agree, but this is one way to do it with code reuse at the cost of some manageable complexity. The alternative is to have separate page objects with separate methods and locators. That keeps things simpler but gives you extra files, extra code, and some redundancy when some of that code and those locators are similar. In the end, of course, choose whatever works best for you.

Selenium page objects beyond pages like a cart object?

23 Sep

The Selenium page object is a design pattern that helps you model test code better. But one doesn’t have to follow the full guidelines of the design pattern.

Some people have used it to model parts of pages as well (headers, footers, navigation, templates, widgets, etc.). But perhaps it can be useful for more than that. Some people might already have done this; I couldn’t find much by searching, though I may not know what or how to search for specifically. If people have already done this, they haven’t widely publicized it.

What I have found so far is this: https://github.com/cheezy/page-object/wiki/Indexed-Properties

It’s an interesting piece to review. I had this similar thought in mind recently and decided to blog about it:

A shopping cart page doesn’t really do much. It contains cart items and offers a visual call to action (a button click) that takes you to checkout, along with the standard site template actions (header/footer/navigation: login, logout, links to other areas of the site).

The core functionality in the shopping cart page really belongs to the cart items and what you can do with them. So in my mind, having the shopping cart page object manipulate cart item actions doesn’t seem quite appropriate for object oriented design.

For example, this would be how you might typically implement the cart page in basic page object model:

cartPage.updateQuantity(cartItemIndex, quantity);
//obviously access cart item by index, text string name of item, or by some unique ID

However, perhaps you can extract the cart items from the page object and manipulate them individually, either as a collection or set of related WebElements (name, quantity field, remove button, etc.) or, for more advanced usage, as an encapsulated cart item object model of its own.

Both modeling options are presented here below (since I find it hard to list code in a “basic” WordPress blog)
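
A rough Java sketch of what the two options might look like follows. It is illustrative only: the `Element` interface is a stand-in for Selenium’s WebElement, and the class and method names (`CartItemElements`, `CartItem`, `findItem`) are hypothetical. The hard part, locating and grouping the elements for each item, is deliberately left out; the cart page here simply receives the items already built.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a located element (WebElement in real Selenium code).
interface Element {
    String text();
    void setValue(String value);
    void click();
}

// Option 1: a plain container grouping the related elements of one cart item.
class CartItemElements {
    final Element name;
    final Element quantityField;
    final Element removeButton;

    CartItemElements(Element name, Element quantityField, Element removeButton) {
        this.name = name;
        this.quantityField = quantityField;
        this.removeButton = removeButton;
    }
}

// Option 2: the cart item modeled like a small page object with its own actions.
class CartItem {
    private final Element name;
    private final Element quantityField;
    private final Element removeButton;

    CartItem(Element name, Element quantityField, Element removeButton) {
        this.name = name;
        this.quantityField = quantityField;
        this.removeButton = removeButton;
    }

    String getName() { return name.text(); }

    void updateQuantity(int quantity) { quantityField.setValue(String.valueOf(quantity)); }

    void remove() { removeButton.click(); }
}

class CartPage {
    private final List<CartItem> items;

    // A real page object would locate the items on the page; omitted in this sketch.
    CartPage(List<CartItem> items) { this.items = new ArrayList<>(items); }

    List<CartItem> getCartItems() { return new ArrayList<>(items); }

    CartItem findItem(String itemName) {
        for (CartItem item : items) {
            if (item.getName().equals(itemName)) {
                return item;
            }
        }
        throw new IllegalArgumentException("No cart item named " + itemName);
    }
}
```

With the second option a test reads as `cartPage.findItem("Mug").updateQuantity(3)` rather than passing an index into the cart page.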


I would note that modeling cart items this way, while more object oriented, can make implementing the cart page object more complicated, particularly the getCartItems method and the associated procedure to locate and group the related cart item elements together. Often the web application’s UI will not make the related elements easy to associate with each other and uniquely identify, especially on a cart page with N cart items.

Usually that requires sophisticated use of CSS and XPath patterns to locate and relate the set of elements for N cart items on the page. So this whole approach is not something novice page object model and Selenium users can easily tackle. It takes time and skill to do, but it is worth trying out.

In the long run, I feel this type of approach is more maintainable, scalable, and makes the tests more readable. It just requires more thought in architectural design and more work upfront to implement. However, the complexity to implement could be reduced if you can get the developers to make the element locator values easily defined w/o resorting to custom CSS and XPath, and make it work for N cart items, and X related cart item elements (e.g. item color, item description, item this, item that, for every cart item)

If you ask me personally which cart item object model I prefer, it is the latter one that resembles a page object, rather than the one that is simply a container of WebElements for a cart item.

What are your thoughts on modeling things or objects on a page like a page object? A cart item is the one that tends to come to mind, but there are others that can be thought of as objects yet are not widgets, headers, footers, or navigation. Some other possible examples include a search result, a category listing, etc.

Also, please do inform me if you come across other articles about using page objects for things like cart items, search results, etc. where we’re not working with a page but some other object per se.

Pseudo Test Driven Development with Selenium and Page Object Model for developers and QA

2 Aug

A thought came to me today and I did a quick search online. There don’t appear to be many articles about test driven development with respect to Selenium. What info there is is either too brief (e.g. presentation slides) or just high level and developer-process oriented. Both kinds seemed like common sense and didn’t cover the heart of the matter. So I’ll write about it from my perspective, giving some details from both the developer and QA/tester sides on how to use Selenium effectively for both parties, and not reiterating (as much as possible) what’s already covered in other online articles on this subject (at the time of this post). What I cover should be a useful reference for developers and QA/testers alike, and perhaps other groups too. Pardon the long, descriptive post; I have to describe it all clearly to get at the heart of the content. Hear me out until the end before you voice any disagreement or dismiss this post as a rant.

First, to the reader, I will just assume you generally know what Test Driven Development (TDD), unit testing, (user) acceptance testing, Acceptance Testing Driven Development (ATDD), Selenium, and Page Object Model (POM) are. If not, read up on those topics beforehand.

Let’s start from TDD and unit testing. TDD is usually associated with unit testing and then may expand from there. Unit tests cover testing input & output (I/O) of individual software components/modules that perform (ideally) a simple function. The next tier up is integration tests that cover interactions between the units (i.e. components/modules) and the respective overall I/O between the units as a system. When the integration test covers all the components that form the system, then it may be considered a system integration test. The final tier is the user interface level where we create UI automation tests that are often also (or then extended to create) acceptance tests.

So with all that in mind, how does Selenium (and similar UI automation tools) fit in with test driven development? Now this question will only apply to the front end codebase as Selenium can only test for that. In terms of back end code, we do nothing in terms of Selenium and have to wait until back end dependencies are addressed/fulfilled.

For the front end, we should treat building the UI itself as a collection of units, whether they are visible in the UI or not. For the javascript portions (sometimes not visible), you can cover that with javascript unit testing tools. You can also cover it to some extent with Selenium/WebDriver by using its JavascriptExecutor feature to execute javascript on the page. That feature can also be useful for integration/system tests of javascript when all the javascript + CSS + HTML code is combined to render the page UI.

In terms of the visible portion of the UI units – for simplicity, let’s consider them as widgets, headers, footers, components that fit together to form the page. As such, you should test with Selenium the same way you do unit testing. For each UI unit, you have Selenium code that will test the functionality of that unit in terms of I/O. Input would be clicks, mouse overs, typing text, etc. Output would be things like what element states change, what elements get hidden/unhidden, what events get triggered, do you go to another page (from link/button/menu, form submission, etc.).

Ok, but you need to use Selenium with a test framework or build one out so you don’t duplicate code over and over again for such unit-like tests for Selenium. True. Then you might ask, isn’t this overkill and we should use Selenium at the higher level and when UI is ready? We’ll cover that a bit later. In terms of framework, yes we need one. I leave it to the reader to figure out what framework is needed, since each organization’s needs are different.

With Selenium base unit testing idea covered, integration testing with Selenium is similar, just expanding to interactions between visual units and the I/O resulting from those interactions. And system integration testing = traditional UI automation followed by automated acceptance testing.

Now, with all that kind of covered, how should you then implement those different levels of testing with Selenium? Well, I see 2 approaches:

* You be crazy and actually implement unit-test-like coverage of your UI components using some Selenium test framework. For QA, think of these as really small & short Selenium tests that take less than a minute to run. But we can end up with many Selenium tests in the end (UI unit tests + UI integration tests + normal QA Selenium tests). This approach is fine if you don’t mind a large test suite to run and have the bandwidth to write and maintain the tests.

* You don’t write Selenium tests at the unit and integration test level, but you develop your Selenium framework (especially when using page object model) with the unit test and integration test level mentality so that you can be able to write better Selenium tests and do more test coverage efficiently. This is the approach I advocate.

Now, for my advocated approach you might say, can you clarify? Let’s begin by referring to the page object model. A page object represents all the functionality on a page and all its visual elements. So all the logical things you can do on the page, you model with methods/functions, and all the UI elements are translated as element locators for Selenium (property members of the page object class).

Ok, so you may or may not already know that so then what? Page object model is great, problem is not everyone uses it to maximum efficiency. For those of you who do (or near it) congratulations. For those who don’t, you got work to do. What do I mean by maximum efficiency? I’ve witnessed that some people and organizations tend to have a golden path, happy path, basic sanity test coverage mentality when building out test automation. So they build into the Selenium framework only the methods and locators they need at the given time they are writing tests to cover some portion of the page only. Meaning a lot of the functionality of the page is left out and filled in over time as more tests are written to cover them, if that ever happens. This choice makes sense for resourcing constraints, and not writing unused test/framework code but is a terrible design choice, at least in my humble opinion.

The proper TDD way to do it is you build out the page object as you are building out the UI itself. In an ideal paired programming or agile environment where team members collaborate and you don’t have “waterfall” development process, the (front end) developer builds the UI and simultaneously, a QA member will build out the matching page objects for the UI. Alternatively, the developer could also be the one building both the UI and the Selenium page objects. I don’t advocate building the page objects after the UI is ready, you just can’t keep up that way, unless of course you follow the route I dislike where you only implement portions of the page to cover only what you need at the time.

But you can’t build page objects without a working UI or can you?! Wrong if you thought no. Before you get a working UI, you usually have mockups and wireframes and other specifications that you can go by. The good specifications also cover the interactions of the page/site flow and interactions between elements. Using that info, you can define the page object’s model: what methods should it offer, what element locators it should have, how the locators might be utilized by the methods. Filling in the actual logic to make the methods work and what the locator values actually are can be done as the UI gets implemented, piece by piece. Without really thinking about it, in this way, you are essentially building a custom (page object Selenium) API for consumption by Selenium tests (as the clients of the API). Another way to look at it is you are also building a UI consuming client that consumes the UI (acting as a server in a client-server application model).
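
To make that concrete, here is a minimal Java sketch of a page object drafted from a wireframe while the UI is still being built. The page (`CheckoutPage`), its methods, the `ShippingOption` enum, the `UiDriver` stand-in interface, and the locator strings are all hypothetical. The point is only that the method signatures exist from the start, and the parts with no UI behind them yet fail loudly instead of silently doing nothing.

```java
// Stand-in for the small slice of a driver this page object needs.
interface UiDriver {
    void type(String locator, String text);
    void click(String locator);
}

// Hypothetical option taken from the wireframe.
enum ShippingOption { STANDARD, EXPRESS }

class CheckoutPage {
    // Locators named from the spec; values are filled in as the UI lands.
    private static final String STREET_FIELD = "id=street";
    private static final String CITY_FIELD = "id=city";

    private final UiDriver driver;

    CheckoutPage(UiDriver driver) {
        this.driver = driver;
    }

    // Implemented: this part of the UI already exists.
    void enterShippingAddress(String street, String city) {
        driver.type(STREET_FIELD, street);
        driver.type(CITY_FIELD, city);
    }

    // Specified but not built yet: the signature is fixed, the body comes later.
    void chooseShipping(ShippingOption option) {
        throw new UnsupportedOperationException("chooseShipping: UI not built yet");
    }

    void applyPromoCode(String code) {
        throw new UnsupportedOperationException("applyPromoCode: UI not built yet");
    }

    void placeOrder() {
        throw new UnsupportedOperationException("placeOrder: UI not built yet");
    }
}
```

Tests can already be written against the full signature set; the ones that hit an unbuilt method fail with a clear message until the body is filled in.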

With my analogies in mind, things get more interesting. Unless you have a really smart QA/tester (who is well versed in those analogy types) and/or one who is developer-oriented, you may have problems with the QA/tester building out a maximally efficient Selenium framework (or the page objects) that mimics what you can consider an API or a client of something. The reason is that they lack the technical experience or mindset to model the page objects properly. You can certainly get by with the deficient model; it just won’t provide as optimal a test coverage as you can actually get. In a sense, working on the Selenium framework and page objects is really meant for QA people in the roles of QA architects and Software Development Engineers in Test (SDET), while writing the tests that utilize the page objects can be delegated to the other QA roles of QA automation engineer (a bit vague a role/term to me), QA tester, software QA engineer, etc. The roles of QA architect and SDET of course are for QAs who have a developer type background, which surprisingly most QAs tend to lack. Some may have developer skills and even engineering or Computer Science degrees but lack the depth of knowledge desired compared to a developer. In essence, by developer type background we mean a QA who is capable of being a developer and doing application development (not just working with Selenium and test frameworks) but chooses to focus on the testing aspect rather than the core application development aspect, hence the SDET name is quite literally correct.

Now, if the organization lacks SDETs and QA architects and doesn’t have the luck (or resources) to attract such people, you can make up for that loss (where possible) by shifting your developers to take on some (but not all) of the testing work. With good object oriented programming and software development experience, developers can build out the page objects that the less experienced QAs can consume by creating actual automated test cases. Because once you have a fairly readable and easy to interpret page object model (i.e. designed well), less technical people can easily write scripts, etc. that call the page object methods and pass in the needed test data, something quite a few postings online talk about. When calling on your developers to help out in this area, I would say that those who are software architects, design APIs (or maybe consume them too), develop client-server software, or are good with designing classes would be best at designing your Selenium page objects.

On a related note, implementing page objects is best done by those with a development background, but defining the page object method signatures (the API or input/output) can be done by other members like User Interface/Experience folks, the business analyst, or QA; basically someone with good modeling skills. A bad developer can easily define a bad page object model.

Before we continue, let’s briefly come back to the thought of why QAs with less technical experience create deficient page objects.

Reason 1 – they may lack the architect/developer mindset to properly define method behavior and parameterize methods. Meaning their methods use default static test data and/or cover a specific case only and aren’t generalized to handle different cases. They may add more methods for other cases and at some point figure out they need to refactor and simplify for code reuse. Bear in mind that all the presently possible cases to cover are already defined by the UI (and hence the matching page object model), so if one knows all the possible cases, when writing a method for at least one of the cases, who in their right mind would not build the method in a general way to handle all cases and parameterize it? Given time constraints, no one said you could not leave the other cases unimplemented (do-nothing actions, or throwing an exception about not being implemented yet), only handling what is most important at first. This way you leave a well designed framework that needs minimal refactoring later on, and it is easier to fill in the missing code later on too.

Reason 2 – lack of element locator definition skills. To achieve good page object modelling, especially for complex javascript library based web applications, and also when developers don’t cater to QA ideal requirements (more on that later), you often need to resort to XPath and CSS selectors to achieve locator templating and parameterization for effective modeling, and to minimize the lines of code you would otherwise need for many static value locators. And due to CSS limitations around node text “contains” matching and parent/child/sibling node axis traversal, you often have to go with XPath. Many QAs only have basic knowledge of XPath and CSS and fail to construct good XPaths or CSS selectors that are templatized and can be used with variables and/or chained with a prefix, suffix, or infix to define the final value of a locator at runtime. Developers tend to be better at working with XPath and CSS, since they work with them in the web applications. WebDriver also indirectly offers an object oriented way to use a WebElement to traverse back up to its parent/ancestor or down to its children, etc., though it’s not well documented and requires some technical knowledge to make use of.

Reason 3 – they may lack experience with javascript, as in programming in javascript. WebDriver and even Selenium RC expose a javascript execution interface to execute arbitrary javascript code. That comes in handy for browser limitation workarounds, missing functionality that Selenium doesn’t offer, or for extra (lower level) test coverage by executing javascript code. Without js knowledge, QAs either have to search online for js tricks or consult their developers to get some js code to use. Some people advise against executing js code for testing, but there is one area where it comes in handy too: fetching test data that is already offered in the web application via a javascript API or hidden DOM elements on the page, which you’d otherwise have to obtain by other means such as “screen scraping” the UI or pulling from the database or an external API.
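
As a small illustration of templated and chained locators, here is a Java sketch that builds XPath strings at runtime from a row prefix plus an element suffix. The markup it assumes (table rows with class `cart-item`, an input named `quantity`, a button with a `remove` class) and the `CartLocators` class are hypothetical; only the string-building technique is the point. The sketch rejects item names containing an apostrophe rather than attempting XPath quote escaping.

```java
import java.util.Locale;

final class CartLocators {
    // Row templates (the "prefix"): by visible item name, or by 1-based position.
    private static final String ROW_BY_NAME = "//tr[td[contains(normalize-space(.), '%s')]]";
    private static final String ROW_BY_INDEX = "(//tr[@class='cart-item'])[%d]";

    // Element suffixes, chained onto whichever row prefix was chosen.
    private static final String QUANTITY_SUFFIX = "//input[@name='quantity']";
    private static final String REMOVE_SUFFIX = "//button[contains(@class, 'remove')]";

    private CartLocators() { }

    static String rowByName(String itemName) {
        if (itemName.indexOf('\'') >= 0) {
            // Quote escaping in XPath 1.0 needs concat(); out of scope for this sketch.
            throw new IllegalArgumentException("Item names with apostrophes are not supported");
        }
        return String.format(Locale.ROOT, ROW_BY_NAME, itemName);
    }

    static String rowByIndex(int oneBasedIndex) {
        return String.format(Locale.ROOT, ROW_BY_INDEX, oneBasedIndex);
    }

    static String quantityField(String rowLocator) {
        return rowLocator + QUANTITY_SUFFIX;
    }

    static String removeButton(String rowLocator) {
        return rowLocator + REMOVE_SUFFIX;
    }
}
```

For example, `CartLocators.quantityField(CartLocators.rowByName("Mug"))` yields `//tr[td[contains(normalize-space(.), 'Mug')]]//input[@name='quantity']`, so two templates and two suffixes cover every row and element combination instead of one static locator per cart item.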

Now let’s briefly go over why have your developers help out with the page objects and even Selenium tests? Because they have more technical experience that comes in handy for the modeling and heavy programming portions of Selenium page object and framework setup. And also you force the developers to eat their own dog food (i.e. consume their web application for testing) and they see how horrible locating elements can be with XPath and CSS – it can be done, but sometimes the string is long, a bit messy, and complex though still technically “beautiful” when well defined. And we know with developers tending to be lazy (like using jQuery over direct DOM/document object) they may want to finally make their apps use unique IDs, names, classes, or other attributes to avoid using XPath and CSS for locators.

And to finally conclude it, why build out page objects together with the UI development? If you didn’t get it from reading so far, it’s because this keeps the functionality in sync between what’s available in the app and what you are/can/will be testing. The keeping in sync part also ensures any changes to the UI cause a timely matching update to the page object code (and updates to any affected tests that consume those page objects). Doing this also ensures proper test coverage from the start. By building out the page object early on, you already build out the infrastructure to support test coverage whenever you get around to adding it. To clarify, say you need to cover A, B, and C but you only have time for A right now. If you build out the page object to support B & C, you may have no tests for them now, but adding a test for them later is easy: simply chain together calls to the right page object methods with the right test data, do some test runs, fix as needed, and you’re good. Without building out the page object early, you have to go back and add the methods in and then write the tests, which means more work later on just to save on work now. Do you want to pay later or pay now?

One analogy to this is when you build houses (and perhaps anything else): you build from a good foundation (bottom up, not top down starting with the roof first). And you complete the house as a whole. You don’t build in pieces with the house still missing rooms/walls (hard to live in). You build it fully and hopefully perfect from the start so you don’t later have to, say, upgrade/trade up (e.g. switch test frameworks), do some remodeling, or rebuild the house (e.g. refactoring or framework/page object code redesign).

With the complete page objects in place, you also know what test coverage you are missing based on unused code. Whatever page object methods and locators exist but are not used means at some point you need to write tests to cover them. So you don’t later have to do assessments on what test coverage you may be missing. If you run code coverage tools, and you can run them against the Selenium framework code, that could also give you such stats. It’s also good to build out page objects early on with the UI since the project is fresh in everyone’s mind, so you know what functionality there is and what to cover. When things get deferred you forget over time. Plus, when you work on related things, even for page object code, doing them together at once is faster and gives a better design than doing a little now and coming back for the rest later, when you might forget some stuff. In my experience, I’ve found that writing missing tests from existing well defined methods is a whole lot easier and faster than having to create the missing page object code later on before writing the missing tests as well.

And some final words, while I talk about Selenium and page objects, this approach can apply to other types of frameworks without page objects (just use object oriented or functional programming design) and other types of applications (desktop, mobile, APIs, etc.). For the QAs who are less technical/developer-oriented, hopefully this post has given you insight on how to do better with Selenium. For developers, hopefully this post has given you insight on how to help QA out.

Writing Selenium automation and tests in and of itself is writing software. Hence, just as with software, you want to do TDD with it too, and I’ve described how I would do it in a TDDish way.

Update 01/29/2014: Came across a similarly useful post: http://simplythetest.tumblr.com/post/73164598956/ui-tests-as-unit-tests

Update 06/11/2014: came across a post that used the build a house analogy with regards to testing: http://www.softwaretestingtricks.com/2007/01/unit-testing-versus-functional-tests.html

Developing Selenium tests with proper abstraction

11 Jul

This post is with respect to page object modeling of Selenium tests but also applies to general Selenium test development when not using page object model. In the latter case, one may be using object oriented programming/design (OOP/D) or functional (but not necessarily object oriented) programming/design (FP/D), or not. I would advise if not using OOP/D, at least use FP/D.

In this post, I’m ranting from my experience, how some people do not use page object modeling properly (with some examples) and how it should be used. Pardon the long blocks of text & sparse use of formatting, it is a rant after all, I may try to pretty up the formatting in the future.

While one doesn’t have to adopt the page object model fully, there are some guiding principles one should follow. Others may have differing opinions. This article may be a nice read for some of you who have experienced the frustration of what I see, and/or a nice reference for those not so experienced with programming, particularly OOP/D and/or FP/D. From my perspective FP/D is the same as OOP/D with the exception that you don’t organize your code into class objects with methods; instead you just have a set of functions (like methods) to do things. In both cases, the methods/functions are parameterized and abstract away the low level details to describe a higher level view of functionality. Hopefully, in the larger community of Selenium testers, there are not many who actually do the things I rant about in this post. For those who actually do, I’m assuming they’re either novice programmers or that they’ve been used to the Selenium IDE and the old school way of writing tests as singular test command executions.

Things you can choose not to adopt:

  • Page objects returning other pages. This is nice to have, but you can manage without it; you just have to manually instantiate the correct page objects in the needed flow/sequence, rather than expect the starting page object to return the next page object in sequence for you to continue the test with. Even when a page object returns page objects, you still want to check that the page returned is correct (and in the correct state). So it isn’t too much to expect the test writer to know which page to check for, and as such they can also instantiate the correct page manually if the page object doesn’t return it. This is one feature our team chose not to adopt; why, I have no idea, as I’m not the lead architect who spec’d out our framework. In OOP/D, class objects are not required to return other class objects, but they may. They could simply return basic types (int, string, boolean, etc.). So you can do the same with page objects. Here are some related posts about not wanting to return page objects from a page object: http://watirmelon.com/2012/05/29/page-objects-returning-page-objects/ and http://sellotapetest.blogspot.co.uk/2012/04/navigation-in-page-objects-returning.html.
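
The two styles can be compared in a short Java sketch. The `Browser` class is a toy stand-in for real browser state (a real suite would hold a WebDriver), and `SignInPage`, `DashboardPage`, and their methods are hypothetical names. Either way, the test still checks that it landed on the right page.

```java
// Toy stand-in for browser state; a real suite would hold a WebDriver here.
class Browser {
    String title = "Sign in";

    void submitSignIn() { title = "Dashboard"; }
}

class DashboardPage {
    private final Browser browser;

    DashboardPage(Browser browser) { this.browser = browser; }

    // The state check the test should run whichever style is used.
    boolean isLoaded() { return "Dashboard".equals(browser.title); }
}

class SignInPage {
    private final Browser browser;

    SignInPage(Browser browser) { this.browser = browser; }

    // Style 1: the action returns nothing; the test instantiates the next page itself.
    void signIn(String user, String password) {
        browser.submitSignIn();
    }

    // Style 2: the action returns the page object it expects to land on.
    DashboardPage signInExpectingDashboard(String user, String password) {
        signIn(user, password);
        return new DashboardPage(browser);
    }
}
```

In the first style the test writes `new DashboardPage(browser)` after calling `signIn(...)`; in the second it receives the dashboard object from the call. The `isLoaded()` check is needed in both.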

Things you should adopt:

  • abstraction of page behavior from test behavior. Page behaviors being UI element locators, actions that can be performed on page, UI/page/element state. Test behavior being assertions performed, comparison of data, sequences of actions across pages. Following this allows for good, clean, modular test design that’s easier to read & maintain.
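
A minimal Java sketch of that split follows. The page (`RegistrationPage`), the `UiDriver` stand-in interface, the locator strings, and the error text are all made up for illustration. The page object owns the locators and actions and hands back the actual value; the test owns the expected value and the assertion.

```java
// Stand-in for the small slice of a driver this page object needs.
interface UiDriver {
    void type(String locator, String text);
    void click(String locator);
    String textOf(String locator);
}

// Page behavior: locators, actions, and page state. No assertions here.
class RegistrationPage {
    private static final String EMAIL_FIELD = "id=email";
    private static final String SUBMIT_BUTTON = "id=submit";
    private static final String ERROR_BANNER = "css=.error";

    private final UiDriver driver;

    RegistrationPage(UiDriver driver) { this.driver = driver; }

    void register(String email) {
        driver.type(EMAIL_FIELD, email);
        driver.click(SUBMIT_BUTTON);
    }

    // Returns the "actual" value; what it should equal is the test's business.
    String getErrorMessage() {
        return driver.textOf(ERROR_BANNER);
    }
}

// Test behavior: test data, sequencing, and the comparison.
class RegistrationTest {
    // Expected text is test data, so it lives with the test, not the page object.
    static final String EXPECTED_ERROR = "Please enter a valid email address";

    static void invalidEmailShowsError(UiDriver driver) {
        RegistrationPage page = new RegistrationPage(driver);
        page.register("not-an-email");
        String actual = page.getErrorMessage();
        if (!EXPECTED_ERROR.equals(actual)) {
            throw new AssertionError("expected <" + EXPECTED_ERROR + "> but was <" + actual + ">");
        }
    }
}
```

In a real suite the comparison would use the assertion helpers of whatever test framework is in place; a plain `AssertionError` keeps the sketch dependency-free.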

Problems observed with improper adoption of page object model:

  • UI element locators of page object made public and directly accessed by test to perform assertions against (text of locator, DOM/element attributes of locator), or to invoke actions against (e.g. click, type, select). It might be easy to do so, but then it makes tests aware of internal page UI and structure, which is not good abstraction and modeling of test. More brittle tests and code all over the place to update for changes. UI element locators should always be made private to the page object (or protected), and must be accessed only by methods of the page object or associated classes. If you need to access some locator (extract data or manipulate it) and find no method to do so, it means you need to create proper methods for it. In Java, you’d get warnings about private members that don’t get used by some method of the class. It’s actually ok though to leave private member locators defined with no methods that manipulate them as you can fill them in later on, though it is good practice to do it all at once (but who has the time to do the best case scenario anyways?).
  • Convenience methods added to page object. Page objects are meant to expose behavior of the page (or whatever the page represents as it can be a group of pages or section of a page). As such, the page object should contain methods that define particular individual logical actions on the page only (which may or may not reference other page objects). You should not create “convenience” methods that call multiple methods (which may or may not involve other page objects) just so that you have a single simple method to call in a test case to do a group of things and not have to write out a group of method calls & class object instantiations in the test cases. It makes sense to have convenience methods, but they should go in some common or base test case class that test cases are derived/extended from. They don’t belong in page object, unless it is actually logical as a behavioral action of the page. These are easy to notice when the method name or the implementation within the method indicates something like doThisThatFromPointAToGoToPointB().
  • Improper abstraction or defining of page object methods. For example, hard coding values to a page object method. This leans toward making the method a convenience method so that you can get through some step to test something else. Because you may later or should have tests that invoke the method with other data to test other use cases, therefore, the data used should never be hard coded. If you desire some hard coding, use/add as a separate additional convenience method (in common/base test case class) that calls the parameterized abstracted page object method in question with the predefined static input/data. Another example, is too granular, low level, atomic methods like clickThis(), clickThat(), because it is easy to do and is the old fashioned and Selenium IDE way to automate tests by per command basis. But ideally you want to abstract it up higher to something more functional or logical such as doThis() or goThroughSomeFlow() that actually perform the actions of clickThis() and clickThat() at the lower level. If you keep it too granular and low level, that defeats the purpose of OOP/D with page objects, as going so low level, you might as well go back to Selenium IDE style of executing singular Selenium commands and chaining them up to build a test.
  • Adding test data into page objects the same way as locators. For example, defining static variables for text on a page, and later using them to perform assertions or to check that the text exists on the page. If the variable/text is used like a locator (or to define one), then it is fine. But when you see it used like test data, it belongs in a common/base test case class (or the test case itself) as a static variable rather than in a page object. Being used like test data usually means it is used in an assertion as the “expected” value rather than the “actual” value. If it’s treated as the “actual” value then that’s ok, because then it is behaving like a locator that you’re extracting text from.
  • Not using enums and constants where available in the programming language used for the automated tests. This doesn’t directly apply to page objects, but does indirectly in terms of usage with page object methods (their parameter arguments or the internal method implementations). I’ve seen people use integer values like 0, 1, 2, or perhaps strings, to specify which option path to branch to (via an if/else or switch case statement). While that works, it makes the test design more brittle to updates, and the meaning of the integer value is harder to interpret. Strings are less of an issue but are still brittle should the string value change. One can’t avoid strings and integers in the actual implementation, as some UI elements (or non-UI items that are some type of data) will be based on string identifiers or identified by index or integer value. But it helps to use constants and enums at the interface/API level, where you define the page object method signature (parameter arguments and return data type). Internally within the method, you use the enum/constant to look up/translate to the proper string or integer value (e.g. SIDES.FRONT translates to “front” or COOKIE.OATMEAL translates to 2). This way, the test cases don’t need knowledge of the actual low level developer-centric values (of the web application or HTML source). They just need to know at a high level what logic is being handled, like which side of an object (front or back) or what type of cookie (oatmeal or peanut butter, without caring what index position or value an oatmeal cookie has in the web application), which is part of abstraction with OOP/D. All the low level logic/values are handled within the page object methods.
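
To make the last few points a bit more concrete, here is a rough sketch in Python (method names adapted to Python naming style). It is only an illustration under assumptions: FakeDriver is a made-up stand-in for a real WebDriver so the example runs on its own, and OrderPage, BaseOrderTest, the locator strings, and the index 3 for peanut butter are all hypothetical. Only the “front” and 2 translations come from the examples above. The enums are what the test cases see, the raw values stay inside the page object, and the hard coded convenience helper lives in the base test class.

```python
from enum import Enum


class Side(Enum):
    FRONT = "FRONT"
    BACK = "BACK"


class Cookie(Enum):
    OATMEAL = "OATMEAL"
    PEANUT_BUTTER = "PEANUT_BUTTER"


class FakeDriver:
    """Hypothetical stand-in for a WebDriver; it only records what was requested."""

    def __init__(self):
        self.log = []

    def click(self, locator):
        self.log.append(("click", locator))

    def select_by_index(self, locator, index):
        self.log.append(("select_by_index", locator, index))


class OrderPage:
    """Hypothetical page object: enums in the signatures, raw values kept inside."""

    # Low level, developer-centric values never leave the page object.
    _SIDE_VALUES = {Side.FRONT: "front", Side.BACK: "back"}
    _COOKIE_INDEXES = {Cookie.OATMEAL: 2, Cookie.PEANUT_BUTTER: 3}  # 3 is assumed

    def __init__(self, driver):
        self._driver = driver

    def choose_side(self, side: Side) -> None:
        value = self._SIDE_VALUES[side]
        self._driver.click(f"css=input[name='side'][value='{value}']")

    def choose_cookie(self, cookie: Cookie) -> None:
        self._driver.select_by_index("id=cookie-menu", self._COOKIE_INDEXES[cookie])


class BaseOrderTest:
    """Convenience helpers belong with the test cases, not in the page object."""

    def order_default_cookie(self, page: OrderPage) -> None:
        # The static data is fixed here; the page object methods stay parameterized.
        page.choose_side(Side.FRONT)
        page.choose_cookie(Cookie.OATMEAL)
```

A test case derived from BaseOrderTest can call order_default_cookie() to get through the step quickly, while other tests call choose_side() and choose_cookie() directly with whatever enum values they want to cover.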

Proper page object modeling

Here are some of my thoughts on what properly modeling a page object should be like. I didn’t do an extensive search on page objects online, but the example presented by the Selenium project is a little too bare/minimalistic to guide a novice tester not fully acquainted with OOP/D. Hopefully, what I present here is a bit more thorough in clarifying page object modeling with respect to OOP/D.

Any and all page objects should contain a set of the following types of methods that define the overall behavior of the page:

  • state reporting methods like isSomethingInSomeState(), isPageInSomeState(). Examples can be isColorSelected(“red”), isFontSelected(“Times New Roman”) or amIOnThisStepInwizard(2). With such methods, you can call from a test case to assert against the expected state, the return value of the method being the actual value that you compare against the expected. Alternatively, you might call the method to check the state and then perform the appropriate if/else/switch statement branching, such that if in state A do this, if in state B do that. This is a type of method I’ve noticed some people miss. They don’t create any such methods, because we generally aim for automating the happy/basic path test cases and don’t plan for robust regression testing that should involve state checking. The happy path tests generally assume state and indirectly test state via other actions (luckily that does happen; otherwise, there would be critical missed test coverage). Improper usage I’ve seen in this area is where, instead of having a proper state checking method, people simply check an attribute or the text of a locator (exposed publicly) as a way to check state.
  • read (or data extraction) methods like getThisFromThat() or getThisOfThat(), where we extract data on the page via the page object to then assert against. The extracted data is the actual value to assert against some expected value. An example method could be something like getPriceOfCookies(COOKIES.CHOCOLATE_CHIP), where the page shows a table of cookies and their prices, and COOKIES.CHOCOLATE_CHIP is an enum constant value defining chocolate chip cookies. The method internally uses the enum to figure out what locator to extract the price value from. Improper use I’ve seen in this area is where people expose locators publicly and extract data off them rather than abstracting that within a read method.
  • write (or action) methods like doThis(), doThat(), doTheseMultipleThings(), where the multiple things method could invoke the multiple things in sequence (serially) or concurrently in parallel. This is the simplest type of method, and the one people are most comfortable creating, to do certain actions or groups of actions on a page like filling in form fields, clicking buttons & links, selecting menu items that involve mouse over dropdown menus, etc. The only issues of bad usage I’ve seen here are the problems reported earlier (e.g. too low level like clickThis(), hard coded data, convenience methods).
  • parameterization of locators and methods. By parameterization for methods, we mean they should not have hard coded values. All input should be taken from parameter arguments to the method and from page class member property constants/variables. Anything sort of hard coded in the method might be template constructs to form a locator or test data, but nothing truly hard coded. By parameterization for locators, we mean that, where applicable, you don’t statically define locators by a static ID/name/XPath/CSS/etc. With true OOP/D, you’ll notice that certain locators form a logical grouping, like icons on the page for red, blue, and green colors, or a dropdown menu of font names. If you carefully inspect the HTML/DOM source, you’ll notice that these locators share common attributes. In essence, you can define them with a regular expression (or string or positional index matching) type of locator using XPath or CSS selectors. You can then templatize/parameterize the locator as a static expression with some dynamic variable inserted into it somewhere (in front, at the end, in the middle, etc.) to form the actual locator at runtime. The alternatives to parameterizing locators are (1) offering fixed test coverage (e.g. we’ll only test so much with this statically defined subset of color or font locators, we won’t test all possible colors or fonts with data driven testing), or (2) using a static set of locators defined individually as scalar variables or as a set within an array/dictionary/hash/map. Alternative 1 doesn’t offer future scalability toward full data driven test coverage. Alternative 2 makes it a hassle to maintain a large set of locators and expands your lines of code unnecessarily. The trade-off for locator parameterization is that your locators can become more complex (for those not well familiar with complex XPath and CSS) and possibly slower, as you sometimes have to use XPath or CSS instead of locating by ID, name, etc. In return, it allows for better page behavior modeling, as you can offer something like isColorSelected(“color name as string or enum constant”) vs isRedSelected(), isBlueSelected(), etc. You can still take the isColorSelected(value) approach with alternative 2 (array/map of locators), but it ends up being a large block of if/else or switch case statements to filter through all possible values in the set, rather than a single call, or a few lines, that builds the correct locator by dynamically inserting/injecting the passed in value into the parameterized locator (template). Last, alternative 1 for locators is in my opinion not desirable, as you limit your framework’s extensibility and its expanded test coverage (via data driven testing) in the future. Say the set of colors or fonts were switched around in the app that you’re testing: you would have to keep updating them in the framework, whereas if it was built well with parameterization, you wouldn’t have to worry about it. You just update the test case data to match what the app offers, and the framework remains the same. In good test design, we want to avoid updating the test framework and prefer to update test cases (flow and/or data), as the latter matches manual test use case expectations (what does the test framework have to do with testing, with respect to use cases and manual testing?). I do know that one reason some people don’t use parameterized locators may be that they aren’t familiar enough with XPath and CSS selectors to utilize the functionality in cases where it is hard to parameterize without them, in which case you would be stuck with alternative 1 or 2.
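
As a rough illustration of the four method types above, here is a minimal sketch in Python (method names adapted to Python naming style). Everything in it is an assumption made for the example: FakeDriver is a dict-backed stand-in for a real WebDriver so the code runs on its own, and ShopPage, the XPath templates, the “selected” CSS class convention, and the demo data are hypothetical. The point is just the shape: one locator template per logical group, with the passed in value injected at runtime.

```python
from enum import Enum


class Cookie(Enum):
    CHOCOLATE_CHIP = "Chocolate Chip"
    OATMEAL = "Oatmeal"


class FakeDriver:
    """Hypothetical stand-in for a WebDriver, backed by a dict of locator -> element data."""

    def __init__(self, elements):
        self._elements = elements

    def get_text(self, locator):
        return self._elements[locator]["text"]

    def get_attribute(self, locator, name):
        return self._elements[locator].get(name, "")

    def click(self, locator):
        # Simulate the app: clicking an icon selects it and deselects the other icons.
        for loc, element in self._elements.items():
            if "class" in element:
                element["class"] = "icon selected" if loc == locator else "icon"


class ShopPage:
    """Hypothetical page object built on parameterized locator templates."""

    # A static expression with one dynamic slot, filled in at runtime.
    _COLOR_ICON = "//div[@id='colors']//img[@alt='{color}']"
    _COOKIE_PRICE = "//table[@id='cookies']//tr[td[1]='{name}']/td[2]"

    def __init__(self, driver):
        self._driver = driver

    def select_color(self, color: str) -> None:
        """Write (action) method."""
        self._driver.click(self._COLOR_ICON.format(color=color))

    def is_color_selected(self, color: str) -> bool:
        """State reporting method; the locator itself is never exposed to the test."""
        locator = self._COLOR_ICON.format(color=color)
        return "selected" in self._driver.get_attribute(locator, "class").split()

    def get_price_of_cookies(self, cookie: Cookie) -> str:
        """Read method; the enum value is translated to the table row text."""
        return self._driver.get_text(self._COOKIE_PRICE.format(name=cookie.value))


def build_demo_page() -> ShopPage:
    """Build a page over made-up element data, with blue initially selected."""
    return ShopPage(FakeDriver({
        "//div[@id='colors']//img[@alt='red']": {"class": "icon"},
        "//div[@id='colors']//img[@alt='blue']": {"class": "icon selected"},
        "//table[@id='cookies']//tr[td[1]='Chocolate Chip']/td[2]": {"text": "$1.50"},
    }))
```

With this shape, covering a new color or cookie in a data driven test means adding a row of test data, not another isRedSelected() style method or another locator variable in the framework.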

To conclude, this may or may not be a lot to expect of a QA person who works with Selenium, particularly with page objects and/or a functional test framework. But writing (real) test automation itself is software development, so technically those who work with proper Selenium automated tests should be “Software Development Engineers in Test” (SDET), not really Software Quality Assurance (QA) Engineers since you’re writing software for testing.

For those who still don’t (fully) understand what I’m talking about after reading this post, you need to learn more about OOP/D as well as how to define functions and methods in programming.
