This is a story about how I came to value Exploratory testing. But first, a story about this story: Every Saturday morning, we go through the same ritual. I have a couple cups of coffee, and Jake, my German Shepherd watches my every move, and follows me everywhere, waiting for me to put on my shoes. He knows we are going on a walk.
Today, on our favorite trail, he started to limp a little. Instead of the full 3.5 mile hike, I turned off a side trail to loop around and cut it short. This new trail (new to us) was great. We found a small stream and surprised (and were surprised by) a flock of wild turkeys along the ridge line. If we kept to the standard loop, we would not have found this cool area of the hill.
While walking, I remembered another value of exploration, in software testing. Our test team had naturally evolved a style, where we would perform targeted and purposeful “ad-hoc” testing on a new feature until we were comfortable enough with the functionality. Then, run the official test cases for “score”. Once we had the score, we would fill in the rest of available time trying different ways to break the system. These practices worked for us, and I really didn’t think about it until one day presenting status to the executive team.
The status (recreated here to protect the innocent) for this week showed 327 test cases passing with 5 failures. We also opened 30 new bugs, and received 43 bug fixes. A pretty average week. However, one of the directors asked a question. How could we have only 5 test failures and yet find 30 bugs? His point of view was that our written tests should find all of the bugs, and if we were finding most of the bugs through other means, this pointed to inadequacy of our written tests. I explained how this happens, the written test cases actually find few issues. Most of the bugs are found with the ad-hoc and negative tests, and especially in system with multiple test endpoints (several APIs, several UIs), its prohibitively expensive to script all of the possibilities. This conversation piqued my curiosity, though. Were we behaving optimally? Should we invest more in test case definition?
One thing that I came to realize, our test cases were generated using requirements and design documentation. The developers also used the same requirements and design documents to create the code. Developer testing will generally ensure proper operation, at least for the happy path. So, test cases generated by the test team, from the same original source, tended to have a high pass rate. Problems and bugs with the system had to be found through other means.
After some research, I found that our practices had a name, Exploratory Testing, and ET is used by many organizations in different industries and software types. The exploratory testing approach emphasizes the creative engagement by the tester (as opposed to following a test script), to contemporaneously design and execute tests. The test team was not following the test scripts, but using their experience, creativity, and observations to find new tests to try.
I valued the time we spend on exploratory testing method more than spending more time on written test cases for two reasons. Exploratory testing was far more productive in finding bugs and errors in the code, plus I had more confidence in release readiness based on the testers judgement, rather than the quantitative results of test case execution.
My research into exploratory testing lead to several great resources:
- Exploratory Testing Explained, James Bach (pdf)
- Exploratory Software Testing, James Whittaker (Amazon)
- Lessons Learned in Software Testing, Cem Kaner, James Bach, Brent Pettichord (Amazon)
These resources helped our team refine and improve our practices, by learning new techniques and attacks from the authors. Some of these methods are called tours. Our team got better at testing by studying these new tours.
We even found a tool to help manage exploratory testing sessions, called Session-Based Test Management, which could help put a measurement (i.e. test case count) to the testing effort. This measurement could have eased some the questions raised about the quality of our test cases.
All in all, I’m glad that question about the quality of our test cases came up. Our team learned that we were following industry best practices and we learned how to improve our practices.