Thursday 12 November 2015

If you have to automate IE10, avoid the Selenium 64-bit IE Driver at all costs.

When it comes to testing anything in a browser, Internet Explorer tends to have the reputation of being the black sheep of the browser family. Anyone with any experience of testing knows that there is a greater chance of something being broken in Internet Explorer than in any other browser. Let's face it, IE doesn't have a great track record. As software testers, we all remember the pain of having to support IE8, IE7 and IE6. We also remember the moments when support for certain versions of IE was dropped, along with the subsequent wave of euphoria upon realising we no longer had to test in them. But I digress, new versions of Internet Explorer come along, like buses, to replace the older versions, which we eventually drop.

Internet Explorer 10 is currently a supported browser for some software that I test. I follow the widely accepted practice of writing automated tests to do the repetitive grunt testing work so that I have more time to test the complex bits (that can't be automated) by hand.

I've been running automated tests in IE10 happily using the 32-bit version of Selenium Internet Explorer Driver for quite some time. Until this morning. This morning, everything broke.

Well, I say broke; what actually happened was that all the tests that used to take 5-10 seconds each to run suddenly started taking 2-3 minutes each to run! I watched some of these tests running and I saw that the IE driver was mysteriously typing text into all the text boxes very, very slowly. Its speed was comparable to an asthmatic snail.

So what changed? Well, a bit of investigation revealed that someone else had 'upgraded' the test suite to use the Selenium 64-bit Internet Explorer Driver from the usual 32-bit driver.

But why would this cause everything to break so horrifically in IE10?

Well, in IE there is a manager process that looks after the top level window, then there are separate content processes that look after rendering the HTML inside the browser.

Before IE10 came along, the manager process and the content processes both used the same number of bits.

So if you ran a 32-bit version of IE you got a 32-bit manager process to look after the top level window and you got 32-bit content processes to render the HTML.

Likewise, if you ran a 64-bit version of IE you got a 64-bit manager process to look after the top level window and you got 64-bit content processes to render the HTML.

Then IE10 came along and changed everything because it could. In 64-bit IE10 the manager process was 64-bit (as you would expect) but the content processes, well they weren't 64-bit any more. That would be too logical and sensible. The content processes remained 32-bit. I think the reason they didn't change the content processes to 64-bit was to try to keep IE10 compatible with all the existing browser plug-ins.

Anyway, part of IE10 (the manager process that controls the top level window) is 64-bit and the rest of it (the content processes that render the HTML) is 32-bit. Now this might seem a tiny bit crazy because on Windows a 32-bit executable can't load a 64-bit DLL and, vice versa, a 64-bit executable can't load a 32-bit DLL. This is the very reason why there were separate 32-bit and 64-bit versions of IE in the first place!

So what was actually happening to my tests when they were using the 64-bit Selenium Internet Explorer driver?

The tests were sending key presses to the browser. The sending of a key press is done using a hook. The IE Driver sends a 'key down' message, followed by the name of the key, followed by a 'key up' message. It does this for each key press. Because the way these messages are sent is asynchronous, the driver has to wait to make sure that the 'key down' message is processed first so that the key presses don't happen out of order. The driver does this by listening for the 'key down' message to be processed before continuing.

In 64-bit IE10 the hook can be attached to the top level manager process (because that part is 64-bit) but the hook then fails to attach to the content process (because that part is 32-bit).  

So the 64-bit manager process sends a key press, then listens to hear whether or not the 'key down' message was received by the 32-bit content process. But because the 32-bit content process can't load a 64-bit DLL, it never responds to say "Yeah I've dealt with the 'key down' you sent". Which means the manager process times out waiting for the content process to respond. This time-out takes about 5 seconds and is triggered for every single key press.

The resulting effect is that the IE driver types 1 key every 5 seconds. So if your test data contains fancy long words like "inexplicably" it's going to take a whole minute to type that string in. You know your automated tests are seriously broken when a human can perform the same test in less time than it takes the test script.  

This issue is at the heart of the Selenium 64-bit Internet Explorer Driver and is certainly never, ever going to be fixed, especially given that Microsoft intend to discontinue all support for legacy versions of IE from January 12th 2016.

Fortunately, the workaround in my situation was simply to roll back to using the 32-bit version of the IE Driver.
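If it helps anyone in the same situation, here is a minimal sketch of pointing a Python test back at the 32-bit driver. The driver path and element id below are just examples, not taken from my actual suite.

    # Point Selenium at the 32-bit IEDriverServer.exe rather than the 64-bit one.
    # The path and the element id are placeholders for illustration only.
    from selenium import webdriver

    driver = webdriver.Ie(executable_path=r"C:\WebDriver\Win32\IEDriverServer.exe")
    driver.get("http://www.example.com")
    driver.find_element_by_id("some-text-box").send_keys("inexplicably")  # typed at normal speed again
    driver.quit()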

Beware the Selenium 64-bit Internet Explorer Driver. Apparently it can't handle taking screenshots either, for exactly the same 32-bit trying to use 64-bit reason.

Tuesday 20 October 2015

Automating bacon sandwiches

I've recently been lucky enough to be involved with a new software development project from the very start. One of the advantages of being the first Test Engineer on the project was that I was able to help implement and set up test automation on the project from the very beginning. Frequently software development projects see test automation as an after-thought and try to implement it later, when the software is already quite advanced. This results in automation efforts that are always trying to 'catch up' with development, which can significantly increase the amount of time-consuming manual testing required.

I have recently been reading Experiences of Test Automation by Dorothy Graham and Mark Fewster and found this book to be fantastic. It contains many case studies and lets the reader share the experience of how other teams handled test automation. It explains not only what went well but also what went wrong.

Some of the test automation challenges we have already faced on my new project include:

  • Ensuring automated testing is included as part of each user story and completed for every release.
  • Ensuring that each automated test runs independently of other automated tests so that when a test fails it can be run alone and the failure observed in isolation.
  • Challenges surrounding running automated tests in the cloud in different browsers.
  • Challenges about what should be an automated unit test and what should be an automated UI test, and avoiding duplication of effort between each level of automated testing.
  • Challenges involving moving automated test code between repos.
  • Keeping the test suite as "unbrittle" as possible to ensure test failures are worthy of the time spent investigating and debugging the tests.

It's fair to say that test automation on any project is a full time job which requires a significant amount of effort to implement and maintain. Automated tests are code and as such they should be subject to the same rules already applied to development code, e.g. stored under revision control, code reviews for pull requests, etc.

Every Friday morning in our office a company-wide bacon sandwich order is placed. Yes, I know this sounds awesome. It really is awesome.

The process for this bacon sandwich order is as follows: An email is sent with a link to a form where orders for bacon sandwiches are collected. The cut off time for placing an order is 9am. The list of sandwich orders is emailed to a local sandwich shop which then starts preparing the sandwiches. One person who has placed an order is then chosen at random (using a random number generated at https://www.random.org/) to collect the sandwiches. A second email is sent with the name of the person who is collecting the sandwiches that morning. Everyone takes their sandwich money over to their desk and pays. The sandwich collector then goes and picks up the sandwiches, which usually arrive around 10.00am.

When deciding which tests to automate, one criterion commonly used is to identify simple repetitive tasks that are performed often. This morning while completing my bacon sandwich order form I realised that this was a relatively simple task that I repeat each Friday morning. As so much test automation activity had been going on recently on my project, I decided I was going to attempt to automate placing my bacon sandwich order in the simplest way possible.

I always order the same sandwich (bacon and egg on ciabatta). I looked back through past emails with the link to the web form and saw that it was rare for the url for ordering the sandwiches to change. Because I wanted to automate this task really quickly (I only had 15 minutes until the cut off time), I knew from experience the fastest way to do this would be using Python and WebDriver.

So this is what I did:

1) Downloaded and installed Python 3.5.0 from https://www.python.org/downloads/

2) Added Python to the PATH environment variable. This was quite easy to do: I just went to the Advanced tab of System Properties on my PC. Then I clicked the Environment Variables button, edited "PATH" and typed ;C:\Python27 on the end of the string.

3) Opened a Git Bash terminal window and changed directory to /c/Python27

4) Installed Selenium by typing "python -m pip install selenium". (Note: Pip is the package manager that Python uses to install and manage packages. The "-m" stands for "module", not "magic".)

5) Opened IDLE (Python's Integrated Development Environment) from my Start menu.

6) In IDLE, selected File > New

7) Wrote a basic script to open a browser and load a web page (a sketch of it follows this list).

8) Then saved this file as bacon.py inside the folder c:/Python27
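That first script was nothing more than a smoke test along these lines (a minimal sketch, assuming the default Firefox driver):

    # bacon.py - first smoke test: open Firefox and load a page.
    from selenium import webdriver

    driver = webdriver.Firefox()
    driver.get("http://www.python.org")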

I tested this basic script by typing "python bacon.py" into the Git Bash terminal window. What happened then was a Firefox window opened up and loaded http://www.python.org.

Excellent! I now had a very basic browser automation set up and running on my PC. I set about writing the script which was going to order my bacon sandwich.

The first thing I did was modify the url in my script to open the url of the bacon sandwich order form. Our actual order form is a public URL so for security reasons (we don't want the internet ordering a billion bacon sandwiches through our order form next Friday) I am going to use http://www.bacon.com as the url in my example to protect the identity of the actual sandwich ordering form.

The next thing that the script needed to do was click on the text input box for name and type in my name.

The name input box's element on the ordering page had a unique id, so the easiest way to locate it was by that id.

I added this to my script....
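Something along these lines, where the id value is a placeholder rather than the real id from our form:

    # Locate the name input box by its id and type my name into it.
    # "name" is a stand-in id for illustration.
    name_box = driver.find_element_by_id("name")
    name_box.click()
    name_box.send_keys("My Name")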

I saved and ran my script again. It opened Firefox, navigated to the ordering page and typed my name into the input box.

The next question on the form was 'Can you collect the sandwiches today?' and underneath this question there were two radio buttons labelled "yes" and "no".

The "yes" radio button's element looked like:

And the "no" radio button's element looked like:

As the ids were unique, for simplicity I decided my script was going to click on the ID for "no" as I was busy this morning with meetings at 9:30am and 10:30am which would prevent me from collecting the sandwiches.

By the time I finished writing my script it looked like this...
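Roughly like this. All the element ids below are placeholders and the URL is the stand-in mentioned above, but the shape of the script is the same:

    # bacon.py - places my usual Friday bacon sandwich order.
    # Element ids and the URL are placeholders for the real ordering form.
    from selenium import webdriver

    driver = webdriver.Firefox()
    driver.get("http://www.bacon.com")

    # Type my name into the name input box.
    driver.find_element_by_id("name").send_keys("My Name")

    # I can't collect the sandwiches today, so pick the 'no' radio button.
    driver.find_element_by_id("collect-no").click()

    # My usual order.
    driver.find_element_by_id("order").send_keys("Bacon and egg on ciabatta")

    # Submit the order.
    driver.find_element_by_id("submit").click()

    driver.quit()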

I ran my script and approximately 2 seconds later, my order was automatically placed. I really liked how quick and simple it was to implement and run a script that performs a simple task. Now every Friday all I have to do is type "python bacon.py" on the command line to place my sandwich order.

Sometimes it's not necessary to apply layer upon layer of fancy testing frameworks, use complex IDEs and hide code behind abstraction (through the page object model, etc.) to automate simple tasks. Test automation can be simple and still be effective. It is much better for a project to have a small number of simple, curated automated tests than to have no automated testing at all. Don't forget, it's also a really good idea to start writing automated test code at the same time as the application code.

This post was also published on my company's blog, the Scott Logic Blog.

Monday 24 August 2015

How to develop psychic testing powers when dealing with software that has no requirements.

Writing good requirements for software development might seem like an easy task on the surface, but it's actually much harder than many people imagine. The two main challenges that arise when writing requirements are, firstly, that requirements can change frequently and, secondly, that even if you manage to capture a requirement before it changes, it's really easy for someone to completely misunderstand or misinterpret it. Good requirements are absolutely crystal clear with no room for interpretation whatsoever.

So what happens when bad requirements happen to good testers? Unfortunately testing software which has poorly documented requirements is far more common than it should be. It's hard to describe the internal thought processes that take place, but it's kind of similar to invoking psychic powers. You have to know detailed information about all the things you possess no knowledge of.

Let's imagine the very worst case scenario: you have been asked to test something that has absolutely no written requirements. None, nada, rien, nothing, absolutely no documentation whatsoever.

Warning! This kind of testing is pretty risky, so before invoking magic psychic testing powers you should always inform a responsible adult (usually a Project Manager or the Test Manager - should you be lucky enough to have one) about the lack of requirements and associated risks.

Some of the most common risks encountered will be:

* High priority bugs will be found late. This is because by the time the person doing the testing gains decent knowledge of how the software is actually supposed to work, time will have passed and the release date will be closer.

* The number of 'as designed', 'working as intended' or 'not a bug' defects will significantly increase as testers start making educated guesses as to what might be a bug. 

* Product knowledge will probably only exist inside the heads of 1 or 2 knowledgeable people. The workload of these people will increase as testers try to extract this information from them. It's very rare for knowledge holders to be available to answer questions all the time.

* Test automation will either grind to a halt or happen very late. How can you write automated regression tests if you don't know how the product is supposed to work? The simple answer is you can't. 

So once you have told the responsible adult in charge about the risks of testing with no requirements, they may say something along the lines of 'We can't write requirements because no-one knows how it works.' It could be a legacy product you're being asked to test. It could be that the person that created it left the company without writing any kind of documentation. You may even be told 'We simply don't have time to write any requirements'.

What happens now? Don't panic, I'm going to try to guide you through the most efficient, pain-free way to test the unknown. The following approaches can help maximise testing efforts while also giving the illusion that you have developed some kind of psychic testing ability.

At the most basic level any testing carried out on a requirement-less project will fall into two categories.

Category 1 - Obvious things - I'm certain if I do this, the software should do that.

Category 2 - Mystery things - I have no idea what the software is doing, why it is doing it or even if it should be doing it at all. 

An example of a category 1 obvious thing would be a text input box that says 'email address' with a button below it that says 'subscribe to newsletter'. A fairly safe assumption would be that entering a valid email address and clicking the button will subscribe the email address to a newsletter.

A category 2 mystery thing might be an unlabelled text input box with a button below it that says 'Start'. What is being started? What should happen when it starts? How do I know if it actually started? What should be typed into the input box?

A good tester will explore the software and be able to draw from a number of sources to make educated guesses about expected behaviour. The points listed below have all worked for me in the past when I have been expected to test unknown entities.

* Try to test important and critical features first; however, without requirements to work from, it may not be immediately obvious which features are critically important. So start with the obvious functionality, which is basically everything that falls into category 1.

* Break the software down into smaller areas or sections. Keep track of all the obvious things that were tested in each of these areas and what the results were. This information can be used as the starting point to form regression tests.

* While you are breaking the software down into smaller component pieces and testing all the obvious things, questions will come into your mind about the features that fall into mystery category 2. Compile a list of these questions.

* All the time you are doing this, rely on your instincts! If something feels like a bug because it's acting in an unexpected way then it's highly likely it's a bug - even if that bug might turn out to just be a poor design choice. Anything that detracts from the overall quality of the software should be considered a bug.

* Seek answers to the mystery questions. How does the functionality compare to the previous version or to a competitor's product? These insights can give valuable clues as to whether or not something is working correctly. Learn as much as possible about the product's functionality from reliable sources.

* Ask developers how they expect the software to behave. If you don't have any requirements to test against, it's likely your developers didn't have requirements to develop against either, but they should at least be able to tell you what kind of functionality they added.

* Always keep notes while exploring and learning about the software. Document unguessable things once you discover how they are supposed to work. Trust me, it will save a great deal of time later when you have to revisit complex areas and remember what's going on. There is also bonus value in having notes should a new tester join your project and you need to get them up to speed quickly or if you ever find yourself in the situation where you have to hand over your testing work to someone else.

* If in time doubts arise as to whether or not to log a bug, just log the bug. Once it is entered into a defect tracking system people are usually very fast to point out false positives and it only takes a moment to close them down. 

* Try to confirm your test results with anyone that already holds expert knowledge of the product. Remember all your test results are still just assumptions until they are confirmed or denied.

Whatever you do, don't give up or get disheartened. While a lack of well-documented requirements and user stories certainly increases the difficulty level of testing, it certainly doesn't make testing completely impossible. Always do the best you can with the tools and information you have available to you.

Thursday 20 August 2015

Pinteresting Test Automation - JavaScript Edition

It's been a roller-coaster of a month since my last blog post. In the last four weeks I have successfully managed to change jobs and learn JavaScript! I started JavaScript the same way as Python, by completing the free Codecademy course. If you test things and you want to learn basic programming you should definitely give it a try.

Some initial observations made while learning JavaScript:

1) The learning process was much faster than last time. Knowing a first language definitely helps with learning a second. My first Fizzbuzz in Python took 30 minutes, but my first Fizzbuzz in JavaScript took 3 minutes.

2) White space is not an enemy in JavaScript land. Viva the curly bracket!

3) Forgetting semicolons isn't nearly as bad as I thought it would be.

I've also learned absolutely loads of things about test automation with JavaScript in the last couple of weeks, which is the main reason for this blog post (hooray!).

One of the first things I did was install Node.js, which comes with a truly awesome package manager called npm. The package manager made it really easy to try out all of these testing frameworks. Beware if you're on Windows 10, however; some tweaking was required to get it working correctly (Stack Overflow is your friend).

I discovered that there are many different testing frameworks available for writing tests with JavaScript. Actually it's not just testing frameworks, there are many, many JavaScript frameworks in general. Far too many of them. There is a joke among developers that a new JavaScript framework is born every sixteen minutes!

Testing frameworks I encountered and explored were:

* Jasmine

* Mocha

* Chai

* Cucumber.js

* Selenium WebDriver JS

* Nightwatch.js

* Protractor.js

Some of these frameworks are specifically for unit testing, some are for end to end testing. Some depend on each other, some are agnostic and framework free.

I drew a little ASCII diagram to try to visualise them. Each framework is listed left to right in a box with either (u) for unit testing or (e2e) for end to end testing. Each framework box has everything it uses listed underneath it.

These test frameworks increase in complexity from left to right. Jasmine standalone is a simple unit test framework that just requires JavaScript. Protractor is a more complex end to end test framework that requires Jasmine (or Mocha and Chai, or Cucumber) and uses both WebDriver and Node.js.

I had a play around with Jasmine standalone but as this is a unit test framework, I found I had to actually write some JavaScript code before I had anything to run my tests against. Unit tests are usually written by the developers that are developing the application. As a Test Engineer, the tests I need to write are a mixture of acceptance tests, integration tests and end to end tests.

* Acceptance test - Determines if a specification (also known as a user story) has been met.

* Integration test - Determines if a number of smaller units or modules work together.

* End to end test - Follows the flow through the application from the start to the end through all the integrated components and modules.

I looked at Protractor next. Protractor is a testing framework which has been around for a couple of years. I saw that the tests were formatted in a BDD (Behaviour Driven Development, not Beer Driven Development) style.

The syntax Protractor uses is based on expect, along the lines of 'expect something to equal something else', rather than the more familiar verify/assert statements I encountered when I was writing Selenium WebDriver tests in Python. Protractor's main strength is that it was created specifically to test AngularJS applications. It supports element location strategies for Angular specific elements. If you need to test anything created in AngularJS, Protractor is the King of the Hill.

I then moved on to looking at Nightwatch, which felt closer in syntax to the Selenium WebDriver tests I had previously written. Nightwatch is newer than Protractor, making its first appearance on GitHub in February 2014. I found a good tutorial for getting started with Nightwatch which also has a demo on GitHub.

After a bit of playing around with it, I decided I was going to re-write my Python Pinterest test in JavaScript with Nightwatch.

I went through all the Nightwatch asserts and commands and tried to include as many of them as possible in the sample test I wrote.

It was very reassuring to see first-hand that JavaScript and Nightwatch are capable of carrying out all of the tasks possible with Python and Selenium WebDriver.

Anyway, here is the test example I wrote with JavaScript and Nightwatch. One of the main advantages I found of writing within a testing framework was that creating the tests was actually much faster. The amount of text I had to physically type in was less than if I hadn't been using a test framework. Also, instead of faffing around with variable assignments, a lot of the nitty-gritty of what was going on in the background was hidden away from me, allowing me to just focus on writing the test.

Saturday 18 July 2015

A Pinteresting Python Selenium Example

Eight months ago I started my Selenium adventure by learning how to automate finding pictures of cute cats. I chose Python as my weapon of choice due to it being very easy to install, not requiring a server to run and not needing a heavy IDE for development. I have been writing automated UI tests both at home and at work. I found my automated tests not only saved me time carrying out tedious repetitive regression tasks, but also found a range of genuine bugs, from obscure to showstopper!

Once I started writing tests with Selenium I found the more I wrote, the more snippets of code I had available to re-use. Writing tests started becoming much faster when the challenges I was encountering were challenges I had previously solved. I wanted to take all my good code snippets and combine them into a useful test that could be used as an example or reference material. None of the tests I had written at work were suitable as they were not written for software which was available to the general public. So I chose a well-known website and decided I was going to write a really good test with lots of comments so I could keep all my code snippets in one place.

The site I chose to automate was Pinterest and my 'boiler plate' Python Selenium example can be found here on GitHub.
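To give a flavour of the kind of snippet that gets collected and reused (this isn't lifted from the example itself, and the locator is a placeholder):

    # A pattern that turns up again and again: wait explicitly for an element
    # to be visible before interacting with it, instead of sleeping.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Firefox()
    driver.get("https://www.pinterest.com/")

    wait = WebDriverWait(driver, 10)
    # The locator below is a placeholder - real tests use whatever the page provides.
    search_box = wait.until(EC.visibility_of_element_located((By.NAME, "q")))
    search_box.send_keys("cute cats")

    driver.quit()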

While writing the test I discovered something that appeared to be a minor bug. A logged in user was able to enter the email address they used to register for Pinterest into the 'Invite friends' box and invite themselves to join Pinterest again, receiving an invitation email for a site they had already registered for in the process. I would have expected the user's email address to be checked to see if it already belonged to an active account before sending an invite. I guess this just proves there is always value in sitting down and taking the time to write these kinds of tests.

Thursday 25 June 2015

Applying a soft dip heuristic to software testing

Just as different people can hold different political beliefs and not everyone believes the same thing, I think the same can be said of software testing. In the world of testing there isn't a one size fits all 'right answer', it doesn't exist. Lots of people have lots of different ideas and some of these ideas can conflict with each other. The whole manual vs. automation argument is a good example of this. Some people think that automated testing is a silver bullet that will eliminate all bugs. Some people believe that test automation is so expensive in terms of time and effort in relation to the value it returns that it should be used sparingly.

When I think about where my testing beliefs fit into the testing community around me, the principles of context driven testing resonate with my personal experience, like music in my ears. Testing is hard, I know this, I have experienced this first hand and as such I know that there are no absolute best practices. What might be a good approach for one problem could be totally impractical or impossible for a different problem. But I also know there are ways we can make life easier for ourselves.

James Bach seems to really like making up nonsensical acronyms to try to remember good ways to test things. Some examples of nonsensical testing acronyms created by James Bach would be:

HICCUPPSF = History, Image, Comparable Product, Claims, User Expectations, Product, Purpose, Standards and Statutes, Familiar Problems

CRUSSPIC = Capability, Reliability, Usability, Security, Scalability, Performance, Installability, Compatibility

CIDTESTD = Customers, Information, Developer Relations, Team, Equipment & Tools, Schedule, Test Items, Deliverables

DUFFSSCRA = Domain, User, Function, Flow, Stress, Scenario, Claims, Risk, Automatic

If you don't believe me, paste any of those seemingly random strings of capital letters into Google and see what comes back :)

My favourite of all these 'Bachist' acronyms is the SFDIPOT one, because even though at first glance it sounds like utter bollocks, it's the one that has proven the most useful to me and, as a believer in context driven testing, I only care about practices that are actually good in context. I still thought the way he had arranged the letters made it sound like bollocks, so I rearranged it in my head so I could remember it easier. After all this is about what works for me, not what works for Mr Bach.

You say potato, I say potato. You say SFDIPOT, I say SOFTDIP. Soft dip is nice, especially at parties with an assortment of breadsticks and savoury nibbles. What is SOFTDIP?

SOFTDIP = Structure, Operation, Function, Time, Data, Interface, Platform

Each of the words asks questions about the thing being tested. I find that asking these questions helps me to imagine how I am going to test, identify important scenarios before I begin testing and make sure I don't overlook anything really important.

The Softdip questions I ask myself are:

Structure - what is it made of? how was it built? is it modular, can I test it module by module? Does it use memcache? Does it use AJAX?

Operation - how will it be used? what will it be used for? Has anyone actually given any consideration as to why we need this? are there things that some users are more likely to do than others?

Functionality - What are its functions? what kind of error handling does it have? What are the individual things that it does? Does it do anything that is invisible to the user?

Time - Is it sensitive to timing or sequencing? Multiple clicking triggers multiple events? How is it affected by the passage of time? Does it interact with things that have start dates / end dates or expiry dates?

Data - What kind of inputs does it process? What does its output look like? What kinds of modes or states can it be in? What happens if it interacts with good data? What happens if it interacts with bad data? Is it creating, reading, updating and deleting data correctly?

Interface - How does the user interact with it? If it receives input from users, is it possible to inject HTML/SQL? What happens if the user uses the interface in an unexpected way?

Platform - how does it interact with its environment? Does it need to be configured in a special way? Does it depend on third party software, third party APIs, does it use things like Amazon s3 buckets? What is it dependent on? What is dependent on it? Does it have APIs? How does it interact with other things?

Yeah, I know what you're thinking: just another mnemonic. But let me give you an example and show you how this works for me, because if it works for me, who knows, it might work for you.

Once upon a time I was given some work to test: a new feature which, when used correctly, will delete a customer from a piece of software. Sounds simple on the surface, doesn't it? The specification for the feature was just that: we need a way to delete a customer from the software. How would you test that? What would you consider? I know from experience that deleting things is a high risk operation so I wanted to be certain beyond doubt that this new feature would be safe and have no unexpected consequences or side effects. So I worked through SOFTDIP and this is what I came up with.

Structure - I have been provided with a diagram that shows where all client data is stored in the software's database. It seems that information about a customer can be present in up to 27 different database tables. When I test this I will need to make sure that no traces of unwanted data are left behind. Maybe I can write some SQL to help me efficiently check these 27 tables (there's a sketch of this after the walkthrough below).

Operation - The software is legacy and has so much data in it currently that it costs a lot of money to run. As customers migrate to our newer product, their old data is being abandoned in this legacy software and we are still paying for it to be there. This is why the feature is needed. Only admin users will use this feature to remove data, customers and end users must not be able to access this feature. I am going to need to test that non-admin users are definitely unable to access this feature.

Functionality - It deletes existing user data. It must only delete the data it is told to delete and leave all other data intact. It must delete all traces of the data it is told to delete.

Time - What happens if deletion is triggered multiple times? What happens if removing large amounts of data takes too long and the server times out? What happens if the act of removing data is interrupted? What happens if data is partially deleted and an admin user attempts to delete it again?

Data - The software can be in the state of not deleting anything or deleting stuff. Live data is HUGE compared to the amount of data in the staging environment. I will need to prove that this feature functions correctly running on live data. This will most likely involve paying for an Amazon RDS to hold a copy of live data. I need to make sure I know exactly what I'm going to test and how before I request this RDS to minimise costs. It's also possibly going to be a good idea to take a database snapshot of this RDS before testing starts so I can easily restore back to the snapshot if/when tests fail or need to be re-run.

Interface - I have been told there will be 'something' at a URL ending /admin which will allow the admin user to delete account data. I need to make sure that customers are not able to access this URL directly and that only admin users are able to initiate the deletion process. I'm also going to have to make sure that even though this interface won't be seen by customers, it is still fit for purpose. Consideration should still be given to things like asking the user for confirmation before any kind of deletion starts.

Platform - This software does make some requests to third party software to retrieve data; however, if customers are deleted then those requests won't happen, as the software won't know what to ask about. I need to prove that this statement is true. There is a second piece of software that asks the piece of software in test for data: what happens if it asks about a client that has been deleted? I'm going to have to make sure that I also test this scenario.
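As mentioned under Structure, a small script can do the donkey work of checking all 27 tables after a delete. Here is a rough sketch of what I have in mind; the table and column names are invented, and I'm using SQLite purely for illustration:

    # Check that a deleted customer has left no trace behind in any of the
    # tables that can hold their data. Table and column names are invented.
    import sqlite3

    TABLES_WITH_CUSTOMER_DATA = [
        "customers", "orders", "invoices", "addresses",  # ...and so on, up to 27 tables
    ]

    def leftover_rows(connection, customer_id):
        """Return {table: row count} for every table still referencing the customer."""
        leftovers = {}
        for table in TABLES_WITH_CUSTOMER_DATA:
            cursor = connection.execute(
                "SELECT COUNT(*) FROM {0} WHERE customer_id = ?".format(table),
                (customer_id,),
            )
            count = cursor.fetchone()[0]
            if count:
                leftovers[table] = count
        return leftovers

    # After the delete operation has run, this should come back empty:
    # assert leftover_rows(sqlite3.connect("copy_of_live.db"), 42) == {}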

Asking these questions starts to form the basis of my test plan:

* Attempt to access feature as non-admin user, verify non-admin unable to access feature
* Make sure user is asked to confirm before delete operation starts
* Attempt to start the delete operation multiple times for the same customer
* Attempt to start the delete operation multiple times for multiple customers
* Ensure feature works correctly when using live data
* Ensure after running delete operation that all data has been successfully removed
* Carry out a regression test to ensure existing functionality is maintained
* Test the delete operation for a customer which has a really large amount of data
* Verify software no longer makes requests to third party software for deleted customers
* Verify that other software which makes requests to the software in test still functions

So why bother going to all this trouble, all of this preparation before starting testing? Well I'm always happier when I run code changes against a decent test plan. It makes me feel reassured when the time comes to release them into the wilds of the production environment. Every day, a lot of people depend on the software I test and I feel a strong responsibility to them to do the best job that I possibly can. Good testers care deeply about creating good software.

Monday 8 June 2015

Tips for Staying happy and sane while testing software - Tip #3

Assume any information given to you could be made of lies until you have proven it to be true (and seen it to be true with your own eyes).

If information given comes from a non-technical or customer facing source be especially wary. If that source is members of the general public then loud warning klaxons should be immediately sounding in your head!

What happens when someone else sets your baseline or expected behaviour and the thing you are testing does not meet that baseline? Well, firstly you don't have enough information to establish where the error is. Is it a problem with the thing being tested or a problem with the expectation set for the thing being tested? Don't always assume blindly that your oracle is correct!

Without any prior knowledge of a system, if Bob one day tells you that software A and software B share the same database, you might think 'Hey Bob, thanks for sharing that information with me, that's going to make my life much easier now I know that.' If he had worked on the system for many years, would his statement be any more valid than if he had been at the company a week? Possibly. But Bob is still a human and he could make a mistake with the words he used to describe what he was actually trying to communicate to you.

What if what Bob actually meant to say is that software A has a single database table which software B also uses, and there is also a second database where the majority of information used by software B lives? Would that change the way you test the interaction between software A and software B? Of course it would! This revelation would certainly lead to more questions, the first new question possibly being: can both pieces of software write to this shared table? But what would happen if you were a black box tester and only found out Bob's initial statement to you was false while you were in the middle of running some tests based on what he had said?

It could possibly change the way you interact with Bob, repeated swearing and name calling may even happen depending on the situation and levels of rage.

Happy testers realise that questioning everything that isn't clear also includes questioning people. These days, if someone said to me that software A and software B share a database, my initial response would be 'Orly?' Then I would actually take a few measurements to check that this statement was true before embarking on major time-consuming testing adventures. Just like you would check the oil and water in a car before going on a long journey.

To summarise, humans are human, even if they tell you about cake, the cake could always be a lie.

Monday 1 June 2015

bLogjam - A Cryptojelly Commentary

So in the last couple of weeks a new computer security exploit was discovered - hooray! Something we thought was safe, trusted, tried and tested over a very long period of time has turned out to be flawed. It's in the media, the sky is falling and people that use the internet are scared about things they do not understand! Customers are frantically emailing companies to ask if they are safe, and how safe safe actually is. An exploit, how dreadful; hearts were bleeding last time, what is this nonsense?

Well it's been named Logjam and it's pretty interesting as it exploits a cryptographic method that was patented back in 1977.

The cryptographic method exploited by Logjam is called a Diffie-Hellman key exchange. So why does anyone actually care and why does cryptography stuff matter?

Let’s imagine two people that want to share a secret, but they are only able to talk to each other in public where anyone can hear everything they say to each other. That would be a pretty annoying situation, especially as sometimes you really want to share secrets with your best friends without anyone else listening in (I know I do).

So cryptography solves the problem of sharing secrets in public. The simplest explanation of how a Diffie-Hellman key exchange works is to say it is like mixing various colours of paint together. The trick is that it’s easy to mix two kinds of paint together to make a third colour, and it’s very hard to unmix a paint mixture to establish which two colours made that particular shade.

If these two secretive people wanted to share a secret colour they can do this using a selection of colours of paint. They can both agree in public to start with the same colour like yellow (the public key) then secretly pick a second colour that no one else knows (a private key) which will help them secretly share a new secret colour with each other.

So let’s say one person's secret colour is red and the other is blue. Both secret colours get mixed with the public yellow colour (to make orange and green respectively). One person then gives the orange paint to the other and receives green paint. The clever bit is now when they add their own secret colour again, mixing red into green and blue into orange, and the end result is they are both left with the same horrible shade of dirty brown. Possibly not the most aesthetically pleasing colour, but no matter how yucky it looks they now both have the same colour and more importantly, nobody else knows what that final colour is.

Colourful diagram shown above because colourful diagrams always help.

Now imagine that instead of mixing colours, a Diffie-Hellman key exchange mixes numbers together instead using hard sums.
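For anyone who prefers numbers to paint, here is a toy version in Python with deliberately tiny values (real exchanges use enormous primes; these numbers are purely illustrative):

    # Toy Diffie-Hellman key exchange. The numbers are tiny on purpose -
    # real implementations use primes hundreds of digits long.
    p = 23   # public prime modulus, agreed in the open (the shared yellow paint)
    g = 5    # public base

    a = 6    # one person's private key ("red")
    b = 15   # the other person's private key ("blue")

    A = pow(g, a, p)   # first public mixture ("orange") -> 8
    B = pow(g, b, p)   # second public mixture ("green") -> 19

    # Each side mixes their own secret into the mixture they received...
    shared_one = pow(B, a, p)
    shared_two = pow(A, b, p)

    # ...and both end up with the same shared secret (the "dirty brown").
    assert shared_one == shared_two == 2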

So what happened recently was that some people discovered a way to swap what would be the equivalent of the vibrant paint palettes used by this method for crappier paint palettes. People thought they were picking their secret paint from large palettes containing lots and lots of colours, when unfortunately an attacker had switched their palettes with smaller palettes containing a smaller number of colours. And we all know if you only have a choice of red, yellow and blue it’s much easier to work out that the secret mixed colour will be a nasty shade of brown.

The logic of mixing the colours was and still appears to be sound; just no-one imagined until recently there would be a way to switch the palettes around and limit the number of private colour choices. As testers we must always strive to imagine the unimaginable; this is one of the reasons why testing is much harder than it appears to be at face value. There may be right answers and wrong answers, but there are also unknown questions which have yet to be answered. Don't worry though, unlike a mythical bug (Rowhammer), Logjam is really easy to patch and most people won't have to do anything more than upgrade their web browser to the latest version to make it go away forever.

Monday 16 March 2015

Mythical Bugs That Can't Be Patched

When you work in software testing, every now and then you get to hear other people's stories about bugs. Most of these stories will be fairly mundane, something along the lines of "Yeah, I clicked the button and nothing happened". But there will be other times, once you have been working in software testing for a while, when you may get to hear a story about a legendary bug. Legendary bug stories tend to go something like "Yeah, and then if you paste that in to the text box and hold down the shift key, it sends an email to 193,248,2489 customers thanking them for ordering Nickelback's latest album".

I love legendary bug stories. They can serve as mild amusement, shining examples of things that should or should not be done, cautionary tales of woe or even be so far outside of the box that they change the way a tester will think about certain problems or situations. I think all testers must love a good bug story, the ones I go drinking with certainly do :)

The legendary bug story from my last games testing team went something like this. Once upon a time there was a racing game that was about to be released. While testing this racing game, one of the AI controlled cars fell through the track. Replays of the race had to be watched back and scrutinised in microscopic detail to try to ascertain which car had fallen through the track and, more importantly, where it was on the track when it fell through. All the cars had the names of the drivers written just above the wind-shields. While watching the replay footage back, one of the testers noticed a spelling mistake. A driver called 'Bayne' had 'Banye' written on his car. This spelling mistake had been in the game for a long time. The spelling mistake had been missed by everyone, at every level of development, and was also in a whole load of promotional screen-shots and marketing material for the game! This legendary bug would possibly fall into the cautionary tales of woe category. The fact that test caught it very late in the day and saved the company from significant embarrassment pushed the story of 'Bayne' into legendary bug status. Seriously, I'm surprised no one on the team submitted it to The Trenches.

Outside the world of games however, legendary bugs can sometimes be utterly mythical. At Google they have an all-star team of security testers dubbed 'Project Zero'. These people actively hunt out vulnerabilities with the aim of finding the flaws before the bad guys do, so they can be fixed. Well, Project Zero found a new bug last week. Not just any bug, a mythical hardware bug!

The story goes like this. Computers use memory to remember things. There is a type of memory called Dynamic random-access memory (abbreviated to DRAM). DRAM works by storing every bit of data in a separate capacitor. The capacitor can be either charged or discharged and these two states represent 0 or 1. Google's testers found a way to change 0s to 1s or 1s to 0s without accessing them. They found that if you pick two memory locations either side of a third memory location, and bombard these two 'aggressor' locations with requests, the third 'victim' location will just flip from 0 to 1 on its own.

They are calling the exploit Rowhammer and you can read the Project Zero blog post here. The worst thing about this bug is that it is physical in nature. It can't be patched.

There is currently a test for Rowhammer on github.com although in the warnings it does say "Be careful not to run this test on machines that contain important data." So you possibly won't want to try this on your home PC. At least knowledge of this issue is in the public domain now. Knowing about the Rowhammer exploit exists possibly makes it slightly less terrifying. It certainly will be interesting to see if and how anyone takes advantage of it.

Wednesday 4 February 2015

Please don't feed me spaghetti code.

Everything is urgent, everything is critical, rush rush, develop develop, test test, now now now! This is a common theme in both games development and software development. Management pressure always trying to get the product shipped or the next chunk of code released. Everything may appear shiny and happy on the surface but underneath, code becomes a twisted, tangled, distorted mess which starts testing the sanity of anyone that has to interact with it.

I recently found out about the term 'Technical Debt'. I hadn't heard of it before but after reading a description, I realised all software development teams will have encountered this debt in some form or another.

So what is Technical Debt and how does it affect software? Well, cutting corners leads to bad decisions, which in turn lead to problems. Fixing problems takes time, so when corners are cut a technical debt is created. The debt can be paid at some point in the future when the consequences of the bad decisions are fixed, but frequently these debts are ignored. When technical debt is left in place without repaying it, it grows, and accumulates interest as those bad decisions start requiring even more bad decisions to work around them.

Technical debt leads to architectural nightmares made out of spaghetti code. New features gradually start requiring an ever growing number of hacks and workarounds to implement. Before long, the code base starts to look like a really high Jenga tower held together by wishes and tears, in danger of collapsing at any moment. Even making simple changes to the software becomes increasingly challenging as the technical debt grows.

Taking on small amounts of technical debt does seem to be completely unavoidable. But some companies don't know how to manage their technical debt. Even fewer companies know when they should avoid taking on new technical debt or even how to start making repayments. Technical debt is actually really dangerous because it is one of a few things that can kill companies dead.

Continuous regression testing is possibly the easiest way to find problems and identify potential code changes that create technical debt. When these kinds of problems are found at the testing stage a choice can be made between either fixing the problems (paying back some of the debt) or backing out of making the change (completely avoiding any new technical debt).

Reducing technical debt ideally should be part of a company's culture because once it starts building up, it won't remove itself. There are various testing activities that can help identify technical debt however as this debt is created by bad code and poor architectural decisions, testing won't make this debt go away. Only refactoring code, redesigning and recreating can pay back technical debt.

There isn't very much a test team can do on its own to reduce technical debt other than shouting loudly and hoping developers and architects pay attention and listen.

Saturday 31 January 2015

Why APIs are like take-away menus

I recently had to give a presentation to my test team at work about testing APIs. The team covers a wide range of experience levels and varying technical knowledge, so I wanted to try to describe APIs in a way that everyone would understand. The presentation I gave was well received so I wanted to share an excerpt of it here.

I've seen quite a few non-technical people look worried just at the mention of the word API. Possibly due to not understanding what an API actually is.

APIs aren't scary. They are just made up of questions and replies.

API stands for Application Programming Interface but this doesn't tend to mean a lot to non-technical people.

At a very basic level, the term API is used to describe both sides of computer systems that talk to each other using language which is pre-agreed and understood. Software can make requests using APIs in order to receive responses, and this is why APIs are like take-away menus.

Imagine the menu for your favourite take-away restaurant......

That menu is like an API between a customer and the restaurant. It is something that lets both sides communicate with each other.

When a customer phones the take-away and places an order from the menu, imagine they are software that is making a request to an API. When the restaurant delivers the food, imagine this is like the customer receiving a response from the API.

The takeaway menu does not include the recipes and instructions on how to cook the food. The customer does not need to know how to make each dish (because the restaurant takes care of this). The menu simply lets the customer select the food they want to eat from the choices provided by the restaurant.

The recipes and instructions on how to cook the food would be like source code. APIs make it possible for software to share information and interact with each other without having to share all their code.
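To stretch the analogy into code, here is a tiny sketch of a customer 'placing an order' against a completely made-up take-away API. The URL and fields are invented; the point is that the customer only needs the menu (the request and response format), never the kitchen's recipes (the source code).

    # "Ordering from the menu": make a request, receive a response.
    # The endpoint and fields below are invented for illustration.
    import requests

    response = requests.post(
        "https://takeaway.example.com/api/orders",
        json={"dish_number": 42, "quantity": 1},
    )

    print(response.status_code)   # e.g. 201 if the order was accepted
    print(response.json())        # e.g. {"order_id": 1001, "eta_minutes": 30}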

If the restaurant changes their recipes or ingredients, the menu will still work and the customer will still be able to order.

So if the source code changes for a piece of software, the API will still work.

If each dish on the menu is numbered (like at a Chinese take-away) and the restaurant changes the numbers, a customer with an old menu ordering using the old numbers will receive the wrong meal!

Changing the numbers on the menu is like changing an API which is already in use. This should be avoided if at all possible.

Tuesday 6 January 2015

The Name, Shame and Flame API Vulnerability game.

It's now 2015 and we're all living in the future! Our world has become a place where invisible, intangible things (like APIs) have become rather important in our day-to-day lives.

This morning while staring at my PC with sleepy eyes, details of a security vulnerability at www.moonpig.com popped up in one of my social media feeds linking to this article. The post itself was fairly negative towards Moonpig, along the lines of "I will never use their service again because they fail at security. How dare they fail to protect my details!".

However, reading the article I became especially interested in the story for two reasons. Firstly, this was an API flaw and I do a lot of integration testing involving APIs in my day job. We all know making APIs and using APIs is hard. Secondly, the article claimed Moonpig hadn't bothered to fix their API yet!

So I did what any good software tester would do, I attempted to recreate the bug.

Currently, my weapon of choice for querying APIs is the Postman client plugin for Google Chrome. I like Postman because it can query both SOAP and RESTful APIs. It's also really easy to save and share API collections. Authenticating with APIs is fairly straightforward in Postman too, but from reading the article I wouldn't have to do that.

I still couldn't believe Moonpig's API had absolutely no authentication on it whatsoever!?! I decided to press on regardless. If I could experience a dodgy API first hand it would make it easier for me to identify similar dodgy APIs in the future.
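For the record, the kind of check I had in mind is about as simple as API testing gets: call an endpoint that returns customer data without supplying any credentials at all, and expect to be turned away. The URL below is made up; I'm not reproducing the real request here.

    # Any endpoint that serves customer details should refuse a request
    # that carries no credentials. The URL here is a made-up example.
    import requests

    response = requests.get("https://api.example.com/customers/12345")

    assert response.status_code in (401, 403), (
        "Expected the API to demand authentication, got {0}".format(response.status_code)
    )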

Fortunately for Moonpig (and more importantly Moonpig's customers), by the time I was able to try a query, around 11 hours after the details of the vulnerability had been disclosed to the internet, their API was completely and utterly dead. It was no longer spitting out customers' details. It was totally unresponsive and returning a status code of 0. RIP api.moonpig.com

I honestly don't think Moonpig would have fixed this problem had details of the vulnerability not been shared online. The author of the original blog post states he told Moonpig about this problem back in 2013! As harsh as it may sound, in these circumstances I think sharing the vulnerability was definitely the right thing to do as it has forced Moonpig to take corrective action - even if that corrective action appears to be pulling the plug on the API rather than fixing it. Companies like Moonpig have a responsibility to look after our data!

And now the damage is done, with major newspapers picking up on the story and Moonpig's reputation torn to shreds. I think the saddest part of the story is Moonpig ignoring such a bad bug for such a long time.