Monthly Archives: January 2017

A story of exploratory testing – 30 bugs

Me: “This week, we ran 250 test cases.  245 passed and 5 failed.”

Flip the slide.

Me: “Here is our bug find/fix graph. This week, we found 30 bugs and fixed 42. This is the first week where fixes outnumbered finds.”

VP: “Wait, go back one slide.  How come you only had 5 tests fail, but found 30 bugs?”

30 bugs?

On this project, we created test cases based on the written requirements, and could show traceability from requirements down to test results.  These test cases were intended to show that we met the customer’s requirements – and the customers had full access to this data.

In addition to the official test cases, we also ran many exploratory tests. The testers spent time experimenting, not following a direct script.  In terms of finding bugs, this exploratory approach was much more productive than following the script.

One reason: the scripts were written from the requirements, the same document that influenced the design and the code. We should have been surprised if the prepared test cases found any bugs. Professional, smart testers following their noses found many of the issues that would have frustrated customers.

(these events happened a long time ago, on a product where we had an annual release cycle and 3 months of system test – that feels like forever ago)

Fallacies with Metrics

I'm giving a talk at the upcoming STPCON in Phoenix, called Metrics: Choose Wisely.  I'll be featuring some of the content here as a preview.  Please feel free to comment and ask questions here, and by all means, do attend the conference if you can.

At the talk, I'll be providing a methodology for creating software quality metrics that tie into your business goals. Then, I'll pick apart some of my own work by showing various fallacies in using these metrics.  The first fallacy to watch for is survivor bias.

For a quick exercise, think about a medieval castle.  What material are castles made of?

Edinburgh Castle – illustrating that our conception of castles is that they are made of stone.

Yeah, stone castles are what we think about when we think about castles. In fact, most castles were made of timber.  However, today we mostly see the castles that survived for hundreds of years. We only see the stone castles because the wooden castles have burned or rotted away.

For an example where survivor bias may impact conclusions on a metric, consider this chart which shows the priority of open bugs.

Chart showing open bugs by priority

Someone may draw the conclusion that quality is pretty good here.  Only 1% of the bugs are of the highest priority, and the distribution looks normal.  However, the underlying data covers only the bugs that are still open.  This team may, or may not, deliver software that has many high priority bugs, but they fix those bugs quickly. We should look at a distribution for all of the bugs, not just the open bugs.
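To see how the "open bugs only" view can flatter the numbers, here is a minimal sketch. The bug records and priority labels are made up for illustration; the point is that filtering to open bugs and looking at all bugs can yield different distributions.

```python
from collections import Counter

# Hypothetical bug records: (priority, status). Data is illustrative only.
bugs = [
    ("P1", "fixed"), ("P1", "fixed"), ("P1", "open"),
    ("P2", "fixed"), ("P2", "open"), ("P2", "open"),
    ("P3", "open"), ("P3", "open"), ("P3", "open"), ("P3", "fixed"),
]

def priority_distribution(records):
    """Percentage of bugs at each priority level."""
    counts = Counter(priority for priority, _ in records)
    total = sum(counts.values())
    return {p: round(100 * n / total) for p, n in sorted(counts.items())}

# The "survivor" view: only bugs that are still open.
open_only = [b for b in bugs if b[1] == "open"]

print(priority_distribution(open_only))  # what the open-bug chart shows
print(priority_distribution(bugs))       # what actually got delivered
```

In this toy data, the open-bug view undercounts P1 issues precisely because the team fixes them quickly; the all-bugs view tells a different story.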

Another example where survivor bias exists is with customer satisfaction surveys.  Getting a sense of quality from your customers is vital, but you have to remember that the survey results you see come only from the people who completed your survey.  The survivors.  You don't see the results from people who gave up on the survey.  This is why I like to use very short surveys, like the Net Promoter Score.  The shorter the survey, generally the more survivors you have.
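The Net Promoter Score mentioned above reduces the survey to one 0–10 question, "How likely are you to recommend us?". A quick sketch of the standard NPS calculation, with made-up responses:

```python
def net_promoter_score(ratings):
    """NPS: percent of promoters (9-10) minus percent of detractors (0-6).
    Ratings of 7-8 are 'passives' and only affect the denominator."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return round(100 * (promoters - detractors) / len(ratings))

# Illustrative survey responses (fabricated data, not from a real survey).
responses = [10, 9, 9, 8, 7, 7, 6, 5, 3, 10]
print(net_promoter_score(responses))
```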

Manual Tests or Automated Tests? The answer is “yes”.

A line from the new movie Hidden Figures reminds me of an adage that we've developed.  When the question is "Should we do this or that?", the right answer is usually "Yes, do this and that."

The same is true for automated tests or manual tests.  Unless a project is completely a one-time, throwaway effort, it will almost certainly benefit from developing some tests that are repeatable and executed automatically.  On the flip side, any project that has real humans as users should have real humans making sure it works.

The line from the movie was from John Glenn: “You know you can’t trust something that you can’t look in the eyes.”  He was asking Katherine Johnson to double-check the calculations that came from the computer.  The computers did the math faster and with more accuracy than the humans, yet it still takes a smart human to make sure it’s right.


Eliminating biases in A/B testing

A/B testing is a powerful customer-driven quality practice, which allows us to test a variety of implementations and find which works better for customers.  A/B testing provides actual data, instead of the HiPPO (the Highest Paid Person’s Opinion).

The folks at Twitch found that the users in the test cell had higher engagement than the control group. They also found that some of this higher engagement came from factors other than the new experience, which could bias their results.  Factors like the Hawthorne effect and an influx of new users broke the randomness of the experiment.

They adjusted the data to reduce the impact of these effects, and provided a great case study on how they did it.
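One simple adjustment of this kind is to restrict both cells to users who existed before the experiment started, so new users don't skew either side. This sketch is my own illustration of that idea, not Twitch's actual method; the data and field names are invented.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical per-user records: (cohort, established, engagement_hours).
# 'established' means the user joined before the experiment started.
users = [
    ("test" if random.random() < 0.5 else "control",
     random.random() < 0.8,        # assume 80% are pre-existing users
     random.gauss(10, 2))          # fabricated engagement metric
    for _ in range(1000)
]

def avg_engagement(cohort, established_only=False):
    """Mean engagement for a cohort, optionally excluding new users."""
    vals = [e for c, est, e in users
            if c == cohort and (est or not established_only)]
    return mean(vals)

# Naive comparison mixes in new users, whose behavior can differ.
print(avg_engagement("test"), avg_engagement("control"))

# Filtering both cells the same way removes that source of bias.
print(avg_engagement("test", established_only=True),
      avg_engagement("control", established_only=True))
```

The key design point is that whatever filter you apply must be applied identically to both cells, otherwise you trade one bias for another.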