Tag Archives: Selenium

Awesome Stuff – Selenium, RobotFramework, test automation, etc.

24 Feb

A while back I came across the “awesome” series of listings of (good?) resources for a given category (awesome go/java/ruby/python/etc.).

Well, it’s nice that it has expanded to other areas beyond programming languages, and per se to my original blog content, some areas I and my readers are interested in that I thought to share (if you haven’t already come across them):

I’m contributing to these as I can. And we all should, but at the same time, in keeping with the motto “awesome”, entries in the list should have good reason to be there, not just “yet another related thing to put on the list”, should be something special, otherwise, it will end up a bloated list of (spam-y) stuff.


A page object representing both desktop and mobile views or website and mobile app

2 Mar

It just occurred to me today, from reading a Selenium forum post, that a past blog post of mine can also apply to the following cases:

  • you have both desktop and mobile views for a website or web app (e.g. responsive design, etc.)
  • you have a website and a native mobile app that offers similar service/functionality

and as such in both these cases there is shared logic (user/app/site workflow, element locators – even if the actual values differ, but the “logical” representation is the same such as login button on both web site and mobile app, etc.).

Using the techniques defined in that other post, you can share locator references/variables and share page object methods and have a single page object represent both mobile & web versions. For sharing of methods with branching logic, instead of checking which A/B flow to go through, you check what platform you’re on and go to appropriate branch logic based on that.

Also on some specifics, I haven’t tested this type of implementation before, but in theory should work out even for cases where you have website and a native mobile app:

  • You just have to use the right driver instance in the page object methods (or instantiate the page object with the right driver instance), etc. based on the platform (e.g. Appium/ios-driver/etc. vs Selenium WebDriver).
  • For shared locators, XPath may work best, using the multi-valued locator approach with pipes “|”. That is because I know Appium (at least) supports XPath based locators. Not sure about CSS selectors. So, I’m hoping/assuming Appium supports the multi-valued XPath functionality. Don’t know about the other tools lke ios-driver, etc. If this doesn’t work out, you would need separate locator references/variables then.

Some of you might not agree, but this is one way to do it with code reuse via some manageable complexity. The alternative is to have separate page objects with separate methods and locators. That keeps things simpler but give you extra files, extra code, and some redundancy when some of that code and locators are kind of similar. In the end of course, choose whatever works best for you.

To dockerize your tests or not?

16 Sep

I recently started working with Docker. And Docker is the rage these days, with people trying to deploy and run their systems/applications as Docker containers. And how to test these new deployments, etc.

As a matter of fact, even Selenium is made available in Docker containers for test infrastructure:



It got me thinking, should we as test automation specialists take things further and also dockerize our tests (and test framework, tools, etc.) as a package? Something like pull down a docker container that has all your test environment preconfigured (Java, maven, Eclipse, JUnit/TestNG, etc.) for you to easily run functional/black blox/regression/integration/API/UI tests, etc. rather than set up your localhost environment with the right software and pull down test code from source control then run, or use a fat bloated virtual machine with all this preconfigured.

I made a post to StackOverflow for this but have yet to receive any feedback. Some folks upvoted my post though. Do you as my blog reader have any thoughts you’d like to share here or in my SO post?

Update 11/21/2015: Well, it’s nice to see an instance where Robot Framework is being dockerized. Searching online for “docker Robot Framework” brings up a few more results.

Update 02/12/2016: Came across these posts about Docker + (Ruby-based) BDD: Dockerizing BDD : Presentation at #BDDX15 Conference LondonDockerizing Cucumber-BDD and Ruby Friends


Integrating AutoIt, Sikuli, and other tools with Selenium when running tests in Selenium Grid

22 Jan

A true integration for tools like AutoIt, Sikuli, etc. with Selenium would be when they are able to run under Selenium Grid configuration just like Selenium. That would probably first require that they can run over WebDriver API, JSONWireProtocol as with AutoItDriverServer. But when Grid automatically selects which node to run your Selenium tests with, would it not do the same for AutoIt, except if you specifically request a certain capability set that let’s you know which exact node the test will go to. But anyhow, it might get confusing how to map/correlate which grid node runs the Selenium part of test vs the AutoIt/Sikuli part, etc. So maybe having AutoIt/Sikuli run under Grid configuration isn’t the whole answer either.

Regardless if AutoIt/Sikuli ran under Grid configuration or not, you could still deploy your Selenium tests that use AutoIt/Sikuli, etc. under Grid. I did blog about this a while back but never gave concrete examples. In StackOverflow tradition, people seem to want specific examples. So I’ve finally worked up a simple illustrative example.

To have AutoIt/Sikuli work with Selenium while it’s running under grid mode, you need to find a way to determine the node host that runs the Selenium tests at runtime, so that you can then call/run AutoIt/Sikuli code on the same target node host, so that everything runs on the same correct machine (rather than the dilemma novices encounter where Selenium executes on the remote node but AutoIt/Sikuli runs on the localhost, causing mismatch and test fails).

The easiest example solution is to extract out the node host information from the Selenium Grid API by providing the WebDriver session ID that you first need to extract. Once you have that information, you can then use various ways to remotely execute AutoIt, Sikuli against the node host that you’ve extracted. All this is presented in this Github gist:


of all the various ways in the sample gist, I personally prefer calling AutoIt, Sikuli over WebDriver API, where possible rather than to resort to using PSExec.exe, SSH, etc. I’d go for web service route as alternative to WebDriver API. Sticking with PSExec.exe type option as last resort. But you are free to pick whichever option works best for you.


WebDriver API and JSONWireProtocol is not just for web and mobile applications testing, it can be for desktop too!

19 Jan

And so, here’s a blog post about just that. I’ve found that the WebDriver API in general, with regards to common location strategies of ID, name, class, tag, and kind of XPath, along with element manipulations–such as click, type/sendKeys, get/set text, get attributes, mouse operations, key up/down, taking screenshots, finding elements & validating properties (enabled, visible, selected)–all that being common across web, mobile, and desktop.

Selenium/WebDriver started off for web applications. Then came along Appium, ios-driver, etc. which expanded it to the mobile space, first iOS then Android, and now somewhat even Windows Phone.

But there has been very little in the desktop area. The first for it was Appium for Mac. And we’ve seen nothing else since. Though if you consider unofficial Selenium/WebDriver-like APIs, then maybe there’sa few more: Twin and sikuli-remote-control. But now, I’ve worked out some proof of concept prototypes (based off the old Appium Python implementation) to give you more options to desktop automation using WebDriver API and for Windows, not Mac!


Check them out. Pretty interesting. The AutoIt one has a Selenium integration demo showing automation of AutoIt and Selenium together against websites using 2 drivers – a “web” driver and an AutoIt driver. They’re not Selenium Grid compatible yet, something to look into for the future. But at least they work for remote deployment. Though there is a way to make it work unofficially with Selenium Grid deployment of Selenium tests (but where these desktop UI servers are just not officially part of Grid as nodes) – this solution will be presented in a future blog post.

And last, I had wanted to complete with a three’s company big bang, but I couldn’t get Sikuli working just yet, maybe in the future…


So try them out, submit feedback, send pull requests with enhancements and bug fixes, etc.! 😉

P.S. all this made possible by the great work of others. Like the Appium team for the server base I used. And if/when I work on the Java server version – the Selenium team or ios-driver team, and .NET server version – Jim Evans for his Strontium server implementation. As well as the work of those who build great free or open source tools like AutoIt, Sikuli, AutoPy.

Update 6/5/2015: since my post, I found a new solution that’s better and more recent than Twin for desktop Selenium-style UI automation – Winium.


Polymorphic security measure could be a pain for Selenium if running tests with that enabled

7 Oct

Came across this site today:


Interesting technology. But if the site to be tested uses real time obfuscation of the (form) element IDs, names, classes, etc. that could be a pain to automate tests against since you’d have to get a handle on the element location to perform the automation. Perhaps one would need or use a backdoor option to disable the obfuscation when running automation to test the site normally.

Either that or utilize an API provided by that solution perhaps that lets you real time map original element location identifiers to the obfuscated ones to use with Selenium at run time.

Be interesting to hear of anyone automate testing of a site that uses such technology.


Selenium page object modeling challenge – are you up for it?

11 Feb

Having seen some good & bad page object model code & especially after having read Clean Code book, I sure wish there was some book, presentation, blog post, etc. that went over in more detail about the page object model beyond the simple examples presented in various online articles and the official reference example.

Something that presents case studies of good implementations, bad ones, complex ones (ones that you’d expect novices would never implement or have trouble implementing), and examples/tips on how to fix bad implementations into good ones. For bad/complex ones, would be nice to cover things like:

  • shared page widgets, global (or somewhat global) headers, footers, etc.
  • templated/master/skeleton pages where other pages & sites inherit from (similar flow & UI but where the final sites may differ from one another in minor aspects on some flow or UI element implementation/definition like CSS/JS specification)

if such good one presentations already exist, please point me to them. If not, I urge my fellow Selenium bloggers (especially those that follow my blog), fellow Selenium users/contributors/developers, my blog followers/readers to try and take on this challenge (find such a presentation or come up with one to present online, at some meetup, a code camp, or upcoming official Selenium conference).

I would think such a presentation will greatly benefit the Selenium user community, particularly the novice users who lack the initial software development/architectural design skills to appropriately model page objects with the right level of abstraction.

I do understand such a task may not be easy as it requires finding/sharing code that might be confidential or intellectual property (corporate code) unless reworked as generalized code that could be shared publicly. Unless of course we can find some good open source Selenium test code that happens to be bad or complex (or good) implementations to show as case studies. We need to find and present such code in a central location (as presentation, etc.) because it’s labor intensive to scour the internet OSS codebase for Selenium code examples as case studies (for each person to do individually).

What are your thoughts on this? My ideal thoughts are we need a Clean Code type book for Selenium that covers page objects in detail. A sort of “Clean Selenium Page Objects” or “Clean Selenium Code” book.

On that note, maybe you might have code you can share to use for such a case study presentation? Or maybe you could try to get your organization to approve a generalized version of your Selenium page object code (not all of it but some example excerpts) to use for a public case study?

Anastasia Writes

politics, engineering, parenting, relevant things over coffee.

One Software Tester

Trying To Make Sense Of The World, One Test At A Time

the morning paper

an interesting/influential/important paper from the world of CS every weekday morning, as selected by Adrian Colyer

RoboSim (Robot Simulator)

Visualize and Simulate the Robotics concepts such as Localization, Path Planning, P.I.D Controller


open notebook

a happy knockout mouse.

my journey into computer science

Perl 6 Advent Calendar

Something cool about Perl 6 every day


Inspire and spread the power of collaboration

Niraj Bhatt - Architect's Blog

Ruminations on .NET, Architecture & Design

Pete Zybrick

Bell Labs to Big Data

Seek Nuance

Python, technology, Seattle, careers, life, et cetera...


New Era of Test Automation

Der Flounder

Seldom updated, occasionally insightful.

The 4T - Trail, Tram, Trolley, Train

Exploring Portland with the 4T

Midnight Musings

Thoughts on making art

Automation Guide

The More You Learn The More You Play...!

The Performance Engineer



Thoughts related to software development