Saturday 13 December 2014

Selenium adventures - How to automate finding pictures of cute cats.

Blah blah automation. Blah blah Selenium. Apparently automated testing is what all the cool kids are doing these days. I'm not naive enough to believe that automated testing is some kind of magic spell that, once cast, makes the software test itself and suddenly reveal the location of all its bugs. But, having experienced first-hand the pain of long, drawn-out manual regression testing, if ANYTHING helps ease even a small amount of that pain - I want to know about it!

A few months ago I started wondering what Selenium is and how I could find out more about this automated testing magic. After a bit of searching I realised I wasn't the only one looking for answers; a lot of other people were already asking similar questions. I started my search on Quora. I like Quora, it's familiar. I frequently read Quora on the train while commuting to pass the time.

Quora directed me to the Selenium WebDriver documentation, so I switched from reading Quora on my daily commute to reading Selenium documentation instead. It introduced a lot of new concepts, like the difference between asserting and verifying, and crazy things called XPaths. I was still pretty baffled even after reading the documentation. I needed tutorials.

I found some videos on YouTube of an Indian-sounding guy explaining Selenium using Java. I tried to copy what he was doing as best I could. Eclipse (the IDE he was using in his videos) was horrid. I spent so much time working out how to "set up" Eclipse just to get to the point where it would open a web browser on its own. Eclipse forced me to learn what environment variables are and how to set them in Windows.

I persevered through the frustration and was eventually able to write this.

My first ever Selenium script. It opened a browser, went to Google and searched for cute cats. I like cats.

By this point I was aware that Selenium WebDriver could be driven by lots of different programming languages. Now the thing is, I'm not a programmer. Selenium needs a language, and I wanted to be sure that I chose wisely before filling my head with copious amounts of this kind of nonsense.

The next language I decided to investigate was PHP. I chose to look at PHP because I test software written in PHP for my day job. The first obstacle I encountered was that PHP is a server-side scripting language and I didn't have a server. Luckily I share the bus to the station every day with two PHP developers. They enlightened me and told me all about XAMPP. After a bit of faffing around I had a server. After a bit more faffing around I had written this.

Comparing PHP and Java: the PHP script was faster to create because I could type it into Notepad++ instead of fighting against Eclipse. With Java the output from the test was shown in real time; with PHP the output was only shown once the test had finished. PHP needed a server to run, Java didn't. Java compile errors might as well have been written in alien, and getting Java to compile was especially painful. With PHP it was ever so slightly easier to understand why things were broken.

Python was next on my list, but as I quickly discovered, one does not simply start writing Selenium scripts. I had to get my head around Python first. I had managed to download and install something called Python. It was on my computer somewhere, but I had no idea how to even start using it. So I started reading a book called Invent Your Own Computer Games with Python. My general interest levels in games and gaming are much higher than in non-gaming topics, and I think this really helped hold my attention. I would copy Python from the book to make hangman or noughts and crosses, then mess around adding my own extra bits to see what they did.

At one point I wrote code that had a bug in it. I couldn't see the bug in the code, as I didn't really know what I was looking at, but I could sure as hell see it when I tested what I'd written. I was eventually able to work out that four white spaces were in the wrong place, which had made all the logic go wonky. I hadn't really realised until that point just how fragile code can be and how easy it is to accidentally write bugs.
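My actual buggy code is long gone, so here is a made-up Python example of the same class of mistake. The only difference between the broken function and the fixed one is where four white spaces have pushed the final `return`:

```python
# Hypothetical example (not my original bug): four spaces in the
# wrong place put 'return False' inside the loop, so the broken
# version gives up after checking only the first letter.
def contains_vowel_buggy(word):
    for letter in word:
        if letter in "aeiou":
            return True
        return False   # bug: indented into the loop body

def contains_vowel_fixed(word):
    for letter in word:
        if letter in "aeiou":
            return True
    return False       # correct: runs only after the whole loop finishes

print(contains_vowel_buggy("sky blue"))   # False - wrong!
print(contains_vowel_fixed("sky blue"))   # True
```

Both functions contain exactly the same statements; the indentation alone decides the logic.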

It took a bit more Googling to work out how to install the Selenium bindings for Python. Having never really had any need to type things into a command line before, that took a bit of working out too. But by trial, error and possibly a small amount of luck, I was able to tell Python "python -m ensurepip --upgrade" and "python -m pip install selenium". About an hour later, after staring at various error messages and pleading with Google for help, I managed to write my first ever Selenium script in Python and get it to run.
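That first Python script was only a few lines long. It went roughly like this - a sketch from memory, assuming the selenium package is installed and a browser driver (e.g. geckodriver for Firefox) is on the PATH. Note that `find_element_by_name` is the 2014-era call; Selenium 4 replaced it with `find_element(By.NAME, "q")`:

```python
# A sketch of that first script: open a browser, go to Google,
# and search for cute cats. The imports live inside the function
# so the sketch can be read and loaded without selenium installed.
def search_for_cute_cats():
    from selenium import webdriver                     # pip install selenium
    from selenium.webdriver.common.keys import Keys

    driver = webdriver.Firefox()                       # opens a real browser window
    try:
        driver.get("https://www.google.com")
        search_box = driver.find_element_by_name("q")  # 2014-era locator API
        search_box.send_keys("cute cats", Keys.RETURN)
        print(driver.title)                            # page title after searching
    finally:
        driver.quit()                                  # always close the browser
```

Calling `search_for_cute_cats()` opens Firefox, runs the search and closes the browser again.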

I still have a long way to go on my Selenium adventure but I'm in a better place now than I was a few months ago.

I think, subconsciously, I've already decided that I want to continue learning Python. It held my attention much longer than Java or PHP, and I especially like the lack of semi-colons at the end of every line. I think the more Python I learn, the more Selenium will fall into place. I just have to keep chipping away slowly at online tutorials until it starts to look like an elephant.

Monday 17 November 2014

When software goes bad, it can go very, very, very bad

I saw this article recently about how a website owned by a company called eDreams nearly charged a lady £23 billion for a return flight.

Closer inspection revealed that it wasn't the flight that caused the problem, it was actually the return baggage check-in cost.

How could such a massive error make it onto the eDreams website unspotted?

Maybe they didn't have any automated tests that could have spotted that the return baggage cost was far too large.

Maybe the bug was an edge case that only occurred under very specific circumstances, which this customer accidentally stumbled upon.

Maybe the testers were outsourced and just didn't care.

Maybe the testers didn't have control of their staging environment and were trying to test a moving target as their developers kept updating it every 20 minutes.

Maybe they didn't have a staging environment at all.

Maybe the testers were not given enough time to test.

Maybe their testing department found the bug, but didn't have the power to stop the build from being released.

Maybe the data the testers tested with was different to live data.

Maybe the customer followed a different path through the software to the testers.

Maybe none of the testers were looking at the return baggage check-in cost, because that area had not changed recently.

Maybe the website simply wasn't tested at all.

There are potentially hundreds of reasons this bug could have been missed.

What damage has the bug caused? I would say massive damage to the company's reputation. The £23 billion ticket made headline news! As a customer I would be wary of buying anything from their website. As a software tester, this does not sound like a company I would EVER want to work for.

About a month after the appearance of this bug on their website, British Airways and Iberia decided to withdraw their fares from three of their websites. This, in turn, wiped 59% off the eDreams share value. How long now until this company folds?

When software goes bad, it can go very, very, very bad. Some people don't realise just how bad this can be. It's doubtful that there will be a happy ending for eDreams. Their story is a true software testing nightmare.

Thursday 13 November 2014

"Why the problems occurred?" See if you can guess!

This article was shared with me today. I especially loved the addition of the speech bubbles to stock clip-art. I found it very special and utterly hilarious. It really made me giggle!

See if you can guess the answer to this one....

Wednesday 12 November 2014

Tips for staying happy and sane while testing software - Tip #2

Don't lie, ever. Not even a little lie, not even once. Software does not lie. Software will not cover up for you. Testers that lie ALWAYS get bitten in the ass.

When I worked with small teams of testers, sometimes a tester would get through their work for the day far too easily and far too efficiently, ask no questions, and all of their tests would mysteriously pass. As a team lead, situations like this scared me a lot. It was my head on the block if tests were skipped and bugs missed, not theirs. I used to protect myself by covertly stealth-testing random samples of suspicious, "too good to be true" looking work. Liars frequently didn't get offered any more work.

As a tester, I feel it is essential to build a reputation for speaking the truth, the whole truth and nothing but the truth about the software under test. By telling the truth all the time about the software's behaviour, a large team can start to spot patterns in the big picture together. I've seen really good things happen when large test teams share honest observations between themselves. One tester says something like "I think I just saw something strange. I'm not sure why and I can't force it to happen again. I'm certain I saw it though." and another tester pipes up "Yeah, I saw something similar on Wednesday. What were you doing when it happened?", and before you know it they have worked together to totally nail the bug and both of them can reproduce the issue on demand.

I once saw a guy sacked for passing a large set of test cases over the space of a week, all of which required a flat bed scanner. He was sacked because the company didn't own a flat bed scanner.

Speak the truth, don't lie.

Monday 10 November 2014

Why new features are a bit like rainbows (and how to find solid gold bugs at the end of them).

I've been doing a fair bit of thinking recently about testing new features in software. New features are special. They are all shiny and new, which means it's very unlikely they will have been tested before. It can be insanely difficult to measure something new, especially if there is nothing similar against which it can be measured. Over the years I have seen many new ideas and designs translated into software, and many test teams trying their best to test them. I would say that the number one cause of friction between departments, rivers of tester tears and silly quantities of overtime is adding new features to software.

So why is it so hard to test something new? Let's try to understand the testing process for a new feature. Testing a new feature generally involves some kind of document that describes the feature and tells everyone what it should do. At a very basic level, most people that don't test (and even some newbie testers) will imagine testing a new feature as something like this.

The picture above makes testing a new feature look easy. Compare the software against the description of what it should do, look at the good bits and raise bugs for everything which is bad. But something incredibly important is missing from this picture. As testers, we must keep in mind that the feature description is written by humans. Writing a feature description that is all-encompassing, explains every microscopic detail and is 100% flawless is just as impossible as testing software in an all-encompassing way, covering an infinite number of paths in microscopic detail. It can't be done.

So, as testers, we need to give consideration to the designer's intention. Once we start thinking about the designer's intentions, as well as what the feature description says and what the software actually does, the picture starts to look a bit like this...

Seven distinct possibilities have been identified when testing the new feature. I've numbered these outcomes on the picture above.

Let's imagine we have a new feature. This feature is a clickable button in the software labelled 'P'. When the tester starts testing, they end up in one of the coloured blobs on the picture. This is where things can start to get a little confusing, so please forgive me, but I'm going to colour my text to match the blobs in the picture.

1) The designer does not intend this button to print when clicked. The feature description says the button should print when clicked. When the tester clicks the button, it doesn't print.

2) The designer does not intend this button to print when clicked. The feature description says nothing about the button printing. When the tester clicks the button, it prints.

3) The designer intends the button to print, but doesn't say so in the feature description. When the tester clicks the button, it doesn't print.

4) The designer does not intend the button to print, but the feature description says it should print. When the tester clicks the button, it prints.

5) The designer intends the button to print, says so in the feature description but when the tester clicks the button it does not print.

6) The designer intends the button to print but does not say so in the feature description. When the tester clicks the button, it prints.

7) The designer intends the button to print. The feature description says the button should print when it's clicked. The tester clicks the button and it prints.
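If, like me, you find truth tables easier to follow than blobs, the seven outcomes are just combinations of three yes/no facts. This is purely my own Python sketch of the mapping; the missing eighth combination (not intended, not documented, doesn't print) is the empty space outside all the blobs, where nothing happens and no-one minds:

```python
from itertools import product

# Each outcome is a combination of three yes/no facts about the 'P' button:
#   intends - the designer intends the button to print
#   spec    - the feature description says it should print
#   prints  - clicking the button actually prints
outcomes = {
    (False, True,  False): 1,
    (False, False, True):  2,
    (True,  False, False): 3,
    (False, True,  True):  4,
    (True,  True,  False): 5,
    (True,  False, True):  6,
    (True,  True,  True):  7,
}

for intends, spec, prints in product([False, True], repeat=3):
    n = outcomes.get((intends, spec, prints))
    label = "outcome %s" % n if n else "nothing happens, no-one notices"
    print(intends, spec, prints, "->", label)
```

Three yes/no facts give eight combinations, and seven of them land the tester somewhere on the picture.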

Straight away, we've identified there is a lot more going on in the testing process than just comparing software to a feature description.

So what can happen in all these different circumstances? Hmmmm, my guess would be something like this...

1) Tester raises bug, the designer may see the bug and close it straight away. (chance of tester tears as tester thinks "the designers are closing my bugs for no reason")

2) Tester must question the designer: "is this button supposed to print?". Some lesser testers may fail to ask this question (chance of friction as the designer thinks "did the tester not read the feature description properly?", regardless of whether or not the feature description contained the answer)

3) Tester doesn't see a problem, as nothing in the description indicates a problem (chance of friction, with additional tester tears, as only the designer can see the bug. The designer may change the description in the middle of testing, or start raising their own bugs. When designers raise bugs they sometimes forget to mention critical information, and testers don't take ownership of them. Such bugs can slip through the normal process net and slowly rot in the bug database, as no-one really knows what to do with them.)

4) Unless the test team has developed psychic powers, only the designer will see the bug. (chance of friction and tester tears. The worst scenario is that someone blames the test team for missing a bug they had no chance of ever seeing)

5) Bug raised, no further questions required. Sadness is deflected from the test team on to the programmers instead.

6) Bug raised that says the software does not match feature description (chance of friction when designer tells tester their bug is "as designed")

7) Everyone is happy!

When you look at the big picture and see that only 1 of the 7 possible outcomes results in happiness for testers, designers and programmers, you start to realise that there are a lot of things that can make the team sad. Maybe if more people were aware of what was actually happening, they would be able to avoid some of the associated problems.

I know that when I test a new feature, the testing I do takes everything mentioned above into consideration, and then something special happens. The special bit happens in my head when I start applying my imagination, creativity, logic and common sense. I suddenly start seeing bugs that no-one else can see! If you were to ask most people to describe a rainbow, they would say that a rainbow is red, orange, yellow, green, blue, indigo and violet. They describe what they can see with their eyes. However, a rainbow is more than just visible light. It has infra-red at one end and ultra-violet at the other, and you can't see these bits of the rainbow with eyes alone. Our picture of testing a new feature now looks something like this...

Let me give you some examples of some things that happen when I test.

I'm looking at something right in the middle (number 7 - the designer intends it to happen, the feature description describes it happening and it happens in the software), however this "feature" is making the software flash in such a way that it's hurting my eyes and I think we could be sued for inducing epileptic seizures.

I see the button is green text on a red background and think this won't be a fun experience for someone with colour blindness.

I see the Japanese version of the software contains an image of a Hong Kong flag and think people in Japan won't like this, just as a Confederate flag would be bad in the American version or a swastika in the European version.

I see a pink elephant's trunk in a children's game that is HIGHLY inappropriate and think no way will this get past PEGI/ESRB or any other age ratings board of classification.

In a browser-based game, I think I might be able to skip from level 1 to level 100 if I start changing numbers in the URL query string.
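That last trick is less magic than it sounds: a query string is just text, and anyone can edit it in the address bar. Here is a sketch using Python's standard library; the game URL and the 'level' parameter are entirely made up:

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# A made-up game URL. If the server blindly trusts the 'level'
# parameter, editing it skips the player straight to level 100.
url = "https://example.com/game/play?level=1&player=bob"

parts = urlsplit(url)
query = parse_qs(parts.query)          # {'level': ['1'], 'player': ['bob']}
query["level"] = ["100"]               # the "cheat": rewrite one value
tampered = urlunsplit(parts._replace(query=urlencode(query, doseq=True)))

print(tampered)   # https://example.com/game/play?level=100&player=bob
```

If the game state really does live in the URL, this one-line edit is a solid gold bug.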

And this, my friends, is the secret to finding solid gold bugs at the end of a new feature rainbow. If you can learn to think independently about the thing that you're testing, you will start seeing the infra-red and ultra-violet bugs that no-one else can see or imagine.

Wednesday 5 November 2014

Tips for staying happy and sane while testing software - Tip #1

Realise that testing the quality of something and making decisions about the quality of something are totally different things. Don't get upset if someone goes through your precious bug database setting all your bugs to WNF ('will not fix').

Novice testers can struggle to accept this concept. They tend to moan a lot, develop a negative attitude and start mumbling nonsense like 'what is the point of writing bugs if they won't fix them'. Understand that as a tester your job is done once the information is reported. Recognise that you cannot personally be held responsible for decisions about quality that are made by others.

If you do have the misfortune of existing within a forsaken organisation where blaming testers for the existence of bugs is deeply ingrained in the working culture then, firstly, you have my deepest sympathy. Secondly, you can always show the powers that be your WNF'ed bug report and politely point out that the issue was previously raised (aka the 'I told you so' manoeuvre).

Don't moan, test more.

Sunday 2 November 2014

How many tests could a tester test if a tester could test Tetris?

Given that games and software testing are two of my favourite things, it was only a matter of time before I stumbled upon a game that claims to "test your testing skills". I know what you're thinking: who would even try to make a game about testing a game? I'm really not making this up, it exists. Here is the link if you don't believe me.

Software that tells someone how good a job they are doing of testing the software - this was too good to miss. I had to download it and take a look.

The way testing Tetris works is that instead of playing for score, the person testing plays for code coverage. Think of functions and statements in the code like Xbox achievements or PlayStation trophies: 100% coverage and the game is won!

This screenshot was taken approximately 5 seconds after I rage quit, because I suddenly realised what this "game" was doing and why that was bad. The test-how-well-you-can-test-Tetris game is in fact a really good example of how not to be fooled by metrics when testing software. The company which made it sells products that report code coverage metrics. They really want us to believe that test coverage = quality. Bottom line: if something sounds too good to be true, it usually is.

What if we were to scale this Tetris example up to a much larger, more complex game? Hmmm, maybe something like Skyrim. Yes, Skyrim is a good example. If we played a similar test coverage game with Skyrim and tried to gain 100% coverage, how long would it take? A year? Two years? Ten years? A hundred years? Personally, I didn't work on Skyrim so I can only guess at how it was tested, but I am pretty certain that 100% coverage would have taken too long and cost far too much money to achieve. 100% coverage would have bankrupted Bethesda Game Studios.

But Skyrim was still released, and was widely praised as being a fantastic game. 100% test coverage wasn't critical to the success of the project. If the people playing the game found some obscure bugs in an obscure part of the code (which many did), it wasn't the end of the world. They would just release a patch and the bugs would be gone (hopefully).

But there is still one more extremely important lesson that software testers can take from the sheer ridiculousness of the "test how you test Tetris" example. I'm going to call out the elephant in the room by saying "test coverage simply does not equal quality". Please don't ever let anyone trick you into believing it does. Test coverage only measures how many lines of code were allowed to run. If a computer could then tell a human whether each of those lines of code was correctly implemented or not, every software tester in the world would be out of a job. The bottom line is that code with 100% test coverage can still have bugs.
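To make that concrete, here is a small made-up Python example. The test below executes every line of `add_tax`, so a coverage tool would happily report 100%, yet the function is wrong:

```python
# This (made-up) function should add 20% tax, but the programmer
# typed 1.02 instead of 1.20.
def add_tax(price):
    return round(price * 1.02, 2)   # bug: should be price * 1.20

# This test runs every line of add_tax, so coverage reports 100%.
# It only checks that *something* comes back, not the right thing.
def test_add_tax_runs():
    assert add_tax(10.00) is not None

test_add_tax_runs()                  # passes!
print(add_tax(10.00))                # 10.2 - clearly not 20% tax
```

The coverage metric is telling the truth (every line ran); it just isn't telling you anything about whether those lines are correct.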

Saturday 1 November 2014

Scary Pumpkin

The scariest pumpkin I've seen this Halloween.

The essence of testing

For some reason, the world of software testing is a minefield when it comes to terminology. The International Software Testing Qualifications Board (ISTQB) currently has a 50-page PDF document explaining testing terminology, which can be downloaded from their website. When I first learnt how to test computer games, I was not taken to one side and told to memorise any of these terms before I could start testing. I'm fairly certain that someone with zero experience of testing software could learn all these terms, but even once they knew the lingo it probably would not enable them to perform meaningful tests.

When I worked in games testing, I was once told by a freshly hired analyst that I should carry out pair-wise testing on the project I was working on. I had no idea what pair-wise testing was at the time. The analyst started trying to explain pair-wise testing to me, but found himself struggling and told me to "just create an account at hexawise.com and that will generate test cases for you". I followed this advice because I was really excited to see what pair-wise testing was, especially as I had been told it would solve a lot of problems. About 30 minutes later I was very disappointed to learn that pair-wise testing was not something new or special. The test team already did pair-wise testing; it was just that nobody had ever called it pair-wise testing before. Admittedly we worked out the combinations to test using an Excel spreadsheet rather than a site like hexawise.com, but in essence it was the same thing.
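For the curious, here is a toy sketch of what pair-wise generation actually does, whether it's done with Hexawise, a spreadsheet or a few lines of Python. The parameters and values below are invented, and real tools use much smarter algorithms than this greedy loop:

```python
from itertools import combinations, product

# Invented test parameters. Exhaustive testing needs 3 * 2 * 3 = 18
# combinations; pair-wise only needs every PAIR of values to appear
# together at least once, which a much smaller suite can achieve.
params = {
    "browser": ["Chrome", "Firefox", "IE"],
    "os":      ["Windows", "Mac"],
    "account": ["guest", "member", "admin"],
}

names = list(params)
all_rows = list(product(*params.values()))

def pairs(row):
    """All (parameter, value) pairs exercised by one test case."""
    return {((names[i], row[i]), (names[j], row[j]))
            for i, j in combinations(range(len(row)), 2)}

# Greedy: keep picking the test case that covers the most
# still-uncovered pairs until every pair has been seen.
uncovered = set().union(*(pairs(r) for r in all_rows))
suite = []
while uncovered:
    best = max(all_rows, key=lambda r: len(pairs(r) & uncovered))
    suite.append(best)
    uncovered -= pairs(best)

print(len(all_rows), "combinations reduced to", len(suite), "pair-wise tests")
for row in suite:
    print(dict(zip(names, row)))
```

The spreadsheet version we used was doing exactly this by hand: ticking off value pairs until every pair had been seen in at least one test case.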

I believe that in certain situations complicated test terminology can hinder more than it helps. This is especially true when working with new testers that are still learning the ropes. If I asked an experienced software tester "What is testing?" I might hear words like manual, automated, blackbox, whitebox, greybox, exploratory, compliance, acceptance, agile. But what is the actual essence of testing? How can it be described in simple, plain English to someone that doesn't speak the ISTQB lingo?

For me, the essence of testing is simply "collecting evidence to prove the circumstances where something will not work"

And this is why testing is hard. A software tester needs to be able to think of every circumstance in which the thing being tested will not work. This requires intelligence, imagination, numeracy, problem-solving skills, a curious mind and the ability to learn new things very quickly. Razor-sharp eyes are also very useful.

Test post 1

This is a test (did you really think it would be anything else?).