Tag Archives: Automation

A page object representing both desktop and mobile views or website and mobile app

2 Mar

It just occurred to me today, from reading a Selenium forum post, that a past blog post of mine can also apply to the following cases:

  • you have both desktop and mobile views for a website or web app (e.g. responsive design, etc.)
  • you have a website and a native mobile app that offers similar service/functionality

In both of these cases there is shared logic: the user/app/site workflow and the element locators. Even if the actual locator values differ, the “logical” representation is the same (e.g. the login button on both the website and the mobile app).

Using the techniques defined in that other post, you can share locator references/variables and page object methods, and have a single page object represent both the mobile & web versions. For sharing methods with branching logic, instead of checking which A/B flow to go through, you check what platform you’re on and take the appropriate branch based on that.

On some specifics: I haven’t tested this type of implementation before, but in theory it should work even for cases where you have a website and a native mobile app:

  • You just have to use the right driver instance in the page object methods (or instantiate the page object with the right driver instance), etc. based on the platform (e.g. Appium/ios-driver/etc. vs Selenium WebDriver).
  • For shared locators, XPath may work best, using the multi-valued locator approach with pipes “|”. That is because I know Appium (at least) supports XPath-based locators; I’m not sure about CSS selectors. So I’m hoping/assuming Appium supports the multi-valued XPath functionality. I don’t know about the other tools like ios-driver, etc. If this doesn’t work out, you would need separate locator references/variables.

Some of you might not agree, but this is one way to do it, with code reuse at the cost of some manageable complexity. The alternative is to have separate page objects with separate methods and locators. That keeps things simpler but gives you extra files, extra code, and some redundancy when some of that code and those locators are similar. In the end, of course, choose whatever works best for you.
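To make the idea a bit more concrete, here is a minimal, untested-against-real-devices sketch in Python. The locator values, the `LOCATORS` table, the `LoginPage` class and the `combined_xpath` helper are all hypothetical names made up for illustration. The only driver calls assumed are `find_element("xpath", ...)` (Selenium and Appium Python clients) and `hide_keyboard()` (Appium Python client). Whether a piped XPath actually works against Appium is still the open question noted above.

```python
from dataclasses import dataclass

# Hypothetical locator table: one logical name, one XPath per platform.
LOCATORS = {
    "login_button": {
        "web": "//button[@id='login']",
        "ios": "//XCUIElementTypeButton[@name='Login']",
    },
    "username": {
        "web": "//input[@name='username']",
        "ios": "//XCUIElementTypeTextField[@name='Username']",
    },
}


def combined_xpath(name):
    """Join every platform's XPath for a logical element with the '|' union operator."""
    return " | ".join(LOCATORS[name].values())


@dataclass
class LoginPage:
    driver: object  # a Selenium or Appium driver instance, supplied by the caller
    platform: str   # "web" or "ios" in this sketch

    def _find(self, name):
        # Look up the locator value for the current platform.
        return self.driver.find_element("xpath", LOCATORS[name][self.platform])

    def login(self, username):
        self._find("username").send_keys(username)
        if self.platform == "ios":
            # Platform branch instead of an A/B-flow branch:
            # dismiss the on-screen keyboard before tapping the button.
            self.driver.hide_keyboard()
        self._find("login_button").click()
```

You would instantiate it as `LoginPage(selenium_driver, "web")` or `LoginPage(appium_driver, "ios")`. The per-platform table is the fallback option; `combined_xpath("login_button")` shows what the single shared, piped locator would look like if the driver supports it.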

To dockerize your tests or not?

16 Sep

I recently started working with Docker. Docker is all the rage these days, with people trying to deploy and run their systems/applications as Docker containers, and figuring out how to test these new deployments.

As a matter of fact, even Selenium is made available in Docker containers for test infrastructure:



It got me thinking: should we as test automation specialists take things further and also dockerize our tests (and test framework, tools, etc.) as a package? Something like pulling down a Docker container that has your whole test environment preconfigured (Java, Maven, Eclipse, JUnit/TestNG, etc.) so you can easily run functional/black box/regression/integration/API/UI tests, etc., rather than setting up your localhost environment with the right software, pulling down test code from source control and then running it, or using a fat, bloated virtual machine with all this preconfigured.

I made a post to StackOverflow for this but have yet to receive any feedback. Some folks upvoted my post though. Do you as my blog reader have any thoughts you’d like to share here or in my SO post?

Update 11/21/2015: Well, it’s nice to see an instance where Robot Framework is being dockerized. Searching online for “docker Robot Framework” brings up a few more results.

Update 02/12/2016: Came across these posts about Docker + (Ruby-based) BDD: “Dockerizing BDD : Presentation at #BDDX15 Conference London” and “Dockerizing Cucumber-BDD and Ruby Friends”.

To re-invent the wheel or not? And testing frameworks…

16 Sep

I was reading a slide presentation today from another blog post: http://www.slideshare.net/abagmar/automate-across-platform-os-technologies-with-taas. It brought back memories of earlier work I did defining a test automation framework: http://yadiytaf.sourceforge.net. I still think my YADIYTAF specification is a nice read, although I never did create a reference implementation of the spec. I ended up prototyping what I needed at the time against the existing Robot Framework (RF). From that effort, I ended up expanding the functionality/feature set of RF by extending its interoperability with Java, .NET, Perl, and PHP, implementing servers in those languages for its remote library server interface. So RF is a lot more powerful now than when I first evaluated it years back.

In the end, it makes me think: should one re-invent the wheel at times, or see what’s already available and build upon that? I guess it depends on what works for you.

Polymorphic security measure could be a pain for Selenium if running tests with that enabled

7 Oct

Came across this site today:


Interesting technology. But if the site to be tested uses real time obfuscation of the (form) element IDs, names, classes, etc. that could be a pain to automate tests against since you’d have to get a handle on the element location to perform the automation. Perhaps one would need or use a backdoor option to disable the obfuscation when running automation to test the site normally.

Either that, or perhaps utilize an API provided by that solution that lets you map the original element identifiers to the obfuscated ones in real time, for use with Selenium at run time.

It would be interesting to hear from anyone who has automated testing of a site that uses such technology.

Thoughts on a Selenium interactive exploratory test and debug tool

24 Jan

I came across a nice Selenium-based tool recently called SWD Recorder. You can find more about it here or watch this video. For me, the topic to discuss today is how it could be extended to offer features that I had been planning to build into a tool of my own using the Selenium API. Since a similar tool already exists now, I can build off of it rather than create one from scratch. But in case I don’t get to it soon, or never do, I’m blogging my thoughts on what I would build upon this current tool.

The tool is great, but I had been looking for specific features which don’t necessarily align with this tool’s original goal, and thus I’d fork it for my own purposes. In general, I’ve been looking to build a cross browser Selenium interactive debugging tool that has the following features:

  • A Selenium IDE or Selenium Builder like command executor, but cross browser compatible and running as a separate tool that drives the browser via the Selenium APIs rather than as a browser extension. Basically the tool provides a set of standard API commands to select from, the user supplies the locator value, and then you click execute/run to execute the selected command against the specified locator. I’m not looking for the option to dynamically build a list of commands to execute as a test sequence/scenario/case, or to save that to disk as a test case/suite file, as Selenium IDE/Builder offers. Rather, it would be for exploratory test/debug sessions, to see if certain Selenium commands work against a specified element in a given browser, etc. Those additional options that are not of interest to me could be added by other folks, for a type of Selenium IDE/Builder that actually works across browsers (not just for running tests but for recording actions or for manually defining/creating the action list/set). Examples of commands to execute: click, sendKeys/type, getText, getAttribute, drag & drop.
  • Expand the previous bullet to also offer a list of javascript emulated/simulated action commands to execute (not the native actions/interactions API from Selenium but pure javascript events & emulation of actions). This would be a nice addition that, I think, even Selenium IDE/Builder doesn’t offer. It serves as a way to see whether javascript workarounds for some Selenium commands will work against a given browser or not. Without such a feature, the only way to test this is to actually wrap & execute the needed javascript code within your test framework, which requires trial runs of some test case with setup & teardown that take time, or to use a Python/Ruby shell to execute it, which requires you to type it up or load a pre-written library. Why waste all that time when you can test it out quickly in a special debugger tool like this? Examples of javascript wrapped commands: drag & drop, mouse click, mouse over, mouse up, mouse down, scroll mouse wheel. See this blog post about javascript workarounds for ideas.
  • Add alternate options for finding & testing locators. For example, buttons to inject Selector Detector, Selector Gadget, and Super Selector, among other similar tools, without the need to set up bookmarklets. Enhance the tool’s current WebElement explorer with an option (by default or not) to generate a matching CSS selector (if applicable) for the XPath that is generated. In this case, for Windows we’d have to have a pre-compiled binary of the CSSify Python script, while on other OSes we could perhaps just run the script natively. Or better yet, we instead screen scrape the CSSify public page/service to call the underlying “web service” (an HTTP POST call) to translate the XPath to CSS without having to shell execute a Python script. This collection of goodies would then finally give Selenium users a cross browser element locator finder & tester tool to find elements (their XPath and/or CSS) and then modify the default XPath/CSS value and see if it still works, etc. A tool comparable to FirePath and Firebug. Granted, it won’t be a perfect equivalent, since at this point it doesn’t quite show the HTML DOM source relative to the inspected element like Firebug & FirePath do.
  • Add alternate debugging options across browsers. For example, have a button to inject & load FirebugLite without having to set up a bookmarklet.
  • Add (sort of) cross browser javascript error collector functionality for debugging with Selenium. Have a button to inject the javascript error collection code snippet, and button(s) to check/display/retrieve the collected javascript errors (error count and the specific list of messages). Or, instead of button(s) to check errors, they could be automatically dumped out in the tool’s display after the injection whenever an error occurs. This functionality of course would not catch javascript errors during page load (only after page load) and would not persist across pages; you have to manually inject on every page desired, unless we enhance the tool by modifying the Selenium source code (.NET binding?) to auto inject on every page load. Now this might seem kind of pointless, but it’s sort of a cross browser solution as well as a nice alternative to IE’s not so nice developer console or alert dialogs for javascript errors. It’s also a way to evaluate, in an interactive debugging mode, how well such a solution would work before you actually implement the same in your test framework.
  • Have a tab section where you can inject arbitrary javascript source files (via HTTP URL) into the current page. It’s a lot easier than having to manually write the javascript code snippet that injects a script element with src set to the HTTP URL, and then execute that snippet with a Selenium command in, say, an interactive Python/Ruby shell. Just paste the URL into the GUI tool’s text field, then click inject script.
  • Following on from the previous bullet, it would also be nice to have a section for a cross browser Selenium javascript console, where you can execute any desired javascript code snippet via Selenium WebDriver’s JavascriptExecutor. You paste or type the code snippet in a textarea field and click execute. Any return value is cast to a String and dumped back in a results area for the user to see in the tool. This provides a javascript console equivalent to the browser’s native developer tools, but one in which the code is executed by Selenium rather than directly by the browser. It would be a nice way to test out whether certain javascript can be executed, or whether it works well with Selenium, before you actually code it into your test framework or test. I have done this, and still do, over a Python interactive shell, but it’s simpler to do over a GUI tool, especially for novice users.
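For reference, the script injection, error collector, and javascript console ideas from the bullets above boil down to a few `execute_script` calls. Below is a rough Python sketch of what the tool’s buttons would do behind the scenes. The function names and the `window.__jsErrors` variable are my own made-up names for illustration; the only Selenium call assumed is the Python binding’s `driver.execute_script(...)`. As noted, the collector only sees errors raised after it has been injected, and has to be re-injected on every page.

```python
import json


def inject_script_js(url):
    """Build a snippet that appends a <script src=url> element to the page."""
    return (
        "var s = document.createElement('script');"
        "s.src = " + json.dumps(url) + ";"
        "document.head.appendChild(s);"
    )


# Snippet that starts recording uncaught javascript errors on the current page.
ERROR_COLLECTOR_JS = (
    "window.__jsErrors = window.__jsErrors || [];"
    "window.addEventListener('error', function (e) {"
    "  window.__jsErrors.push(String(e.message));"
    "});"
)

# Snippet that hands the recorded error messages back to Selenium.
READ_ERRORS_JS = "return window.__jsErrors || [];"


def inject_script(driver, url):
    """The 'inject script' button: load a javascript file by URL."""
    driver.execute_script(inject_script_js(url))


def start_error_collection(driver):
    """The 'inject error collector' button."""
    driver.execute_script(ERROR_COLLECTOR_JS)


def collected_errors(driver):
    """The 'check errors' button: list of messages collected so far."""
    return list(driver.execute_script(READ_ERRORS_JS))


def run_console_snippet(driver, snippet):
    """The javascript console: run a snippet and show its return value as a string."""
    return str(driver.execute_script(snippet))
```

From an interactive shell, `inject_script(driver, some_url)` followed by `run_console_snippet(driver, "return document.title;")` is roughly the manual version of what the GUI would offer with a text field and a button.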

To summarize all the features mentioned above, I’m looking to have an interactive, GUI-based Selenium exploratory test & debugging tool. One in which you can test out code snippets, locators, and Selenium commands cross browser before you actually code them into a test framework. Others might prefer the direct approach, but I personally prefer to test & debug interactively first, as it is a lot more insightful and faster than putting everything in the framework or a throwaway test script (which also takes much more time to execute through the test flow), or having to set a breakpoint in a debugger and debug from that point. With such a tool, you can easily combine manual & automated steps in one session to see how things work. Such a tool is essentially a GUI version of what I talk about in previous posts:

How to debug test and try selenium code with interactive shells

Using selenium with interactive interpreter shells

A selenium IDE alternative for other browsers and another record playback method

Update 01/26/2014:

It recently occurred to me that it would be nice to be able to inject jQuery and/or Sizzle into the page under test via Selenium with such a tool, using some buttons to click, to be able to test out whether you can locate elements with jQuery syntax or with the non-standard CSS provided by Sizzle. In the case of jQuery, this is of course only needed if the site under test doesn’t already use jQuery.

Update 04/14/2014:

I recently came across a similar tool; this one is not .NET based but Java, distributed as a JAR file. Love it when the community comes up with new supporting 3rd party tools.


Robot Framework Test Automation book review

9 Dec

I recently volunteered to review a book about Robot Framework (RF) in exchange for an eBook copy. As a user & fan of RF, a previous technical subject matter reviewer for Packt publishing (publisher of the book), and one who loves free stuff, I volunteered to review the book to see what it had to offer. Here’s my review, reposted from my Amazon review below. And first, here’s a link to the book:


I think some of the reviewers may have been a bit harsh in their reviews, so I’ll be one of the few to provide a balanced assessment. However, do note that I’ve read most of the book but skimmed over parts of it and have not looked at the accompanying source code that you have to download from the internet. Therefore, the review is not a complete review of the whole content of the book including the source code.

This appears to possibly be the first book about Robot Framework (excluding the framework’s user & quick start guides), which I think is a positive thing regardless of the quality of the book. As a whole, the book is nicely written for someone not familiar with Robot Framework (RF) and for non-technical people. It provides a good introduction to RF, delves into some of the capabilities and benefits of RF, and is a good stepping stone to using & learning more about RF. It is a good bridge or companion to the RF project’s existing User Guide and Quick Start guide, which provide more technical detail and information not covered by this book. For more advanced coverage of RF, look elsewhere; hopefully there will be a book to cover that in the future.

Having said that, the book does have its downsides. As pointed out by others, there are some typos in the current/first edition of the book; not a whole lot, but a few here and there from what I’ve seen. Understandable, but it’s a pity these were missed before publication. The organization of the book is decent but could be improved, and the chapter titles don’t reflect the actual topics of the chapters well, at least not what a technical user would expect to see based on the title alone. Only by reading the chapter summary at the beginning of each chapter or in the Preface do you see what the chapter is really about (compared to its title).

I also found the book a bit lacking in some areas in terms of subject matter or content. The section about Data Driven testing and Behavior Driven Development (BDD) testing could have benefited from elaboration with some actual test case examples and/or code implementation (in the book, not as external accompanying source code samples). The BDD coverage, I felt, was too brief. In the same area, the book mentions “DSL” without actually defining the abbreviation (DSL = Domain Specific Language), which is bad form in technical writing, as we shouldn’t expect the reader to already know abbreviations. The book also mentions RF generically without mentioning which version of RF is being covered at the time of publication/writing.

I may be incorrect in my assumption, but the author also might not be active in, or well informed about, the RF user and developer community. I say this because while the book does mention some common & useful test libraries, there are some omissions, whether by choice or ignorance. For example: AutoItLibrary (for desktop UI testing with the free AutoIt tool), ranorex-robot-library (for desktop UI testing with the commercial Ranorex tool), SimpleSikuli (a Java test library alternative to the Sikuli integration that was covered in the book), and SSH library. Another thing the author missed is that the Remote Library interface/API of RF provides for much more than what was mentioned in the book. In particular, it can be used to interface RF to other languages, tools, and platforms not natively supported by RF (Python/Jython/IronPython), and as such to execute tests in those areas as well. For example: pure/native Java (as opposed to through Jython), pure .NET (as opposed to via IronPython), Ruby, Perl, PHP, and more. Furthermore, the book mentions that only the Python and Ruby versions of the (generic) remote server are implemented for users to make use of for remote libraries, and that for anything else one would have to create it themselves, though it is not necessarily hard to do so. But in reality, as of this review (and for at least about 1-2 years before it), there have been other (generic) remote library server implementations already available (Java, .NET/C#, Perl, PHP, Clojure, node.js), so it is really easy to use remote libraries on other platforms too. The author should also have pointed out that, for more technical information and whatever is not covered by the book, the reader can look to the RF user guide and to discussion with the RF user/developer community via the online Google Groups, and provided the links to them.

So given all this, to an (or a more) advanced/technical user, the book is not very helpful, and for what you can get for free with online searching, the RF user/developer community discussion group, and the framework’s existing user guide, the price of the book may also seem extravagant. But I think it is a useful book that can be used to help try and convince upper management, non-technical business stakeholders to adopt RF, etc.

I’d like to end the review with some positive points at least. Unless one is well familiar with RF, there are some useful things that can be learned, or that one can be reminded of, from this book. For me, I hadn’t noticed that RF offered a randomization feature when executing test cases and test suites. And the coverage of the test configuration/data files like variable files was a helpful reminder to me. Last, I like the fact (at least as stated in the book) that Packt publishing has a “Packt Open Source Royalty Scheme, by which Packt gives a royalty to each Open Source project about whose software a book is sold”. That would mean RF should get monetary donations from Packt for each copy of this book bought by someone. I wonder what happens in the case of refunds/returns by customers though. Does that get removed from the donation amount, or does Packt still pass it on to the OSS project for the “initial” sale?

Maybe a future edition of the book will be improved. I also look forward to seeing an advanced version of the book covering RF for more advanced/technical readers.

Selenium WebDriver – extracting an image on page by use of take screenshot and cropping

4 Dec

Normally, if we wanted to get an image off the web page under test, we’d download it externally using the URL extracted from the image element’s source attribute. Unfortunately, in today’s AJAX based web apps, that doesn’t always work. Sometimes, elements that look like images to the user are rendered by javascript into DOM elements that are not “image” elements (composed of sprites, etc. on the server side).

In such cases, you have to go an alternate route…

The technique to do it is explained here:


and an example implementation in Java here:


Side comment 1 – it would be nice if others in the community can contribute other language implementations to the Java source code example above so others can use it. Python, Ruby, C# perhaps? And a sprinkle of PHP, Perl…I may contribute code snippets when I have time.
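In that spirit, here is a minimal Python sketch of the general screenshot-and-crop idea. To be clear, it is not a port of the Java example; it is my own rough take. It assumes the third-party Pillow library for the cropping, and the Selenium Python binding’s `get_screenshot_as_png()`, `element.location` and `element.size`. The `scale` parameter is an assumption on my part to account for screenshots whose pixel size doesn’t match the page’s coordinate system (e.g. high-DPI displays), and the sketch assumes the element is inside the captured area with no scroll offset to correct for.

```python
from io import BytesIO


def crop_box(location, size, scale=1.0):
    """Return (left, top, right, bottom) in screenshot pixels for an element.

    location is a dict with "x" and "y"; size is a dict with "width" and
    "height", as returned by a Selenium WebElement in the Python binding.
    """
    left = int(location["x"] * scale)
    top = int(location["y"] * scale)
    right = int((location["x"] + size["width"]) * scale)
    bottom = int((location["y"] + size["height"]) * scale)
    return (left, top, right, bottom)


def element_image(driver, element, scale=1.0):
    """Take a screenshot and crop it down to the given element."""
    from PIL import Image  # Pillow (third-party), imported lazily

    screenshot = Image.open(BytesIO(driver.get_screenshot_as_png()))
    return screenshot.crop(crop_box(element.location, element.size, scale))
```

Something like `element_image(driver, driver.find_element("id", "chart")).save("chart.png")` would then write the cropped image to disk, where `"chart"` is just a placeholder element id.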

After you’ve obtained the image, you can do whatever you want with it. Binary/MD5/SHA-1 hash compare the image against a known benchmark one?

For my case, that’s not sufficient, as the image isn’t always a single image but a collection formed as one and dynamically generated. So we do some flexible fuzzy logic image comparison with our in-house image comparison solution.

And on that topic, side comment 2:

I’ve noticed that taking a screenshot and then cropping to the desired element location & size isn’t 100% consistent across browsers. So I’m blogging this for reference to others and to see if anyone else has the same issue. The problem here is that different browsers crop with slightly different coordinates and/or width/height, ending up with cropped images that differ slightly across browsers. Now I haven’t tested across all browsers, but my findings reveal that Safari provides the most exact desired cropping, while Firefox and IE have slight variations that are not desirable. I didn’t test Chrome, nor mobile Safari with Appium.

With that in mind, you definitely can’t do a hash/binary compare of the cropped images across browsers due to the differences, unless you have a different benchmark image per browser. So with a single benchmark image, you’d need some flexible fuzzy logic image comparison technique. There are multiple options here, but none readily available, other than maybe ones like Sikuli. The in-house solution we use is based off ImageMagick and does the dirty work of calculating the parameters for the image comparison for you, wrapped in a nice REST API.
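For the simple exact-match case with one benchmark per browser, a hash compare could look something like the sketch below. It is only an illustration using Python’s standard `hashlib`. The function names and the idea of keeping a dict of per-browser digests are my own assumptions, and it is not the fuzzy comparison our in-house solution does.

```python
import hashlib


def image_digest(image_bytes):
    """SHA-1 hex digest of the raw image file bytes."""
    return hashlib.sha1(image_bytes).hexdigest()


def matches_baseline(image_bytes, baselines, browser):
    """Exact-match check against the benchmark digest recorded for this browser.

    baselines maps a browser name to the digest of its benchmark image.
    A browser with no recorded benchmark never matches.
    """
    expected = baselines.get(browser)
    return expected is not None and image_digest(image_bytes) == expected
```

Any single-pixel difference changes the digest, which is exactly why this only works with a separate benchmark per browser and an image that is rendered identically every time.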

Does anyone have experiences they can share around the topics discussed above?
