
Measurements and Goals

I think this blog post about metrics vs. measurement gets to the heart of the problem with SMART goals. If M is defined to require a metric instead of a measurement, you end up with a lot of extraneous, wasteful busy-work (surveys, presentations, numbers of words/pages/LOCs, etc.) that detracts from actually doing what you want to do. Perhaps REAL goals are a better way to go.

Tester brain vs. end-user brain

As an end user, it is easy to miss bugs, or to shrug bugs off as a mysterious anomaly of the software. For the past several months, I’ve noticed that when I am reading mail in Outlook’s three-pane view, grouped by From, if I drag/drop a group of messages to another folder, I appear to remain in the Inbox folder but it appears empty, and I have to click around more or less at random to get the inbox contents to reappear. (I think the fix is basically to click on another folder and then return to the Inbox.)
(Really, all that description wasn’t necessary, but I can’t help making my bug report descriptive.)

It’s one of those terrible “intermittent problems”: not happening every time, but often enough to be annoying (yet not annoying enough to open a support ticket, and I don’t even know how to do that. With MSFT? With our internal IT people?)

This is easy to do while testing too, given the combination of “not terribly annoying”, “not terribly consistent”, and “easy, instinctive workaround to get back to the thing that I really want to test”, especially if you are working from a pre-defined test script. (In theory, exploratory testers are less likely to ignore this problem.) These bugs are particularly likely to escape from development as well, because the developer’s tests are unlikely to ever hit them – or he’s likely to ignore them because he knows the easy workaround instinctively.

It’s important to deliberately turn on the “tester brain” while testing because our “end user” brain is much more tolerant of poor quality.

Humor in requirements

On a recent nostalgia trip for my first computer and first GUI, I encountered the following example of user-hostile documentation:

Documentation quality varies from good to unclear and insulting: “The Becker BASIC system will help you to learn structured programming: After about the 15th or 20th error message, you’ll learn to be much more careful in your program development.” – BeckerBASIC review

The reviewer interpreted this line as being unclear and insulting… although out of context, it is hard to tell if this characterization is an exaggeration (the manual writer was trying to be humorous, not antagonistic, I think). It is hard to imagine such a line appearing in a serious commercial software product’s documentation today (from Microsoft or Oracle or Sun, for example). It would be more likely, perhaps, in an open-source project or an indie development shop whose brand attributes include “edgy” or “attitude”.

Still, if we were to slavishly follow the maxim: “the manual (documentation) is the oracle”, would we be trying to test that the software produces a sufficient number of error messages on poorly structured code?

I think this is the solution to the puzzle I posed earlier (here and at the CAST 24×7 website): a requirement that is intended to be understood as humorous and therefore only a “requirements jerk” would produce the test case: “create a program with terrible spaghetti code. Expected result: 15-20 errors generated, programmer learns about structured programming.”

Can virtualization make record/playback more useful?

It seems like there are a few big problems with record/playback style GUI automation:

  1. there is a big upfront cost to identify all of the controls
  2. there is a big maintenance cost to update all the control mappings when the app changes
  3. there is a lot of variability about the state of the application and the operating system when you re-run tests that can break the automation

If you are a big fan of exploratory testing, there is a temptation to want to record/playback your test sessions. That’s when item 3 comes to bite you unless you have item 1 done (but to the best of my knowledge, none of the automation record/playback tools really do 1 well for R/P anyway).

Could we mitigate 3 by using virtual machines? Suppose we bring a VM to a completely known state, save a snapshot, and only then start the record/playback tool. Then when we want to regress, we can always start from the same playback point. Everything will be pixel- and byte-exact to the state where we first ran the test.
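As a rough sketch of the idea, the sequence "restore the snapshot, boot the VM, replay the session" could be scripted along these lines. This assumes VirtualBox and its `VBoxManage` command line; the VM name, snapshot name, and the `replay-tool` command are hypothetical placeholders, since no particular record/playback tool is named here. It also assumes the VM is powered off before the restore.

```python
import subprocess
from typing import Callable, List, Sequence


def run_command(cmd: Sequence[str]) -> None:
    """Run one command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


def replay_from_snapshot(
    vm_name: str,
    snapshot_name: str,
    replay_cmd: Sequence[str],
    run: Callable[[Sequence[str]], None] = run_command,
) -> List[List[str]]:
    """Reset a VM to a known snapshot, start it, then replay a recording.

    vm_name, snapshot_name and replay_cmd are illustrative assumptions;
    substitute whatever your own setup uses.
    """
    steps = [
        # Roll the (powered-off) VM back to the known starting state.
        ["VBoxManage", "snapshot", vm_name, "restore", snapshot_name],
        # Boot the VM without opening a window on the host.
        ["VBoxManage", "startvm", vm_name, "--type", "headless"],
        # Hand over to the (hypothetical) playback tool.
        list(replay_cmd),
    ]
    for step in steps:
        run(step)
    return steps
```

Passing a different `run` function (for example, one that only logs the commands) gives a dry run without touching a real VM. Whether playback really is pixel-exact will still depend on things a snapshot does not pin down, such as the clock and the network.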

Is anyone doing this in real applications?

Not being a requirements jerk

There’s a spirited discussion about requirements going on at CAST24×7. Ben Simo visited a store with some requirements posted on the door: “no backpacks, food, strollers, unruly kids or cheap husbands”, etc.

He asks how you would test these requirements? How could you refine them? How do you determine whether bugs need fixing?

The discussion centered largely around how bad these requirements are and how they need refinement. No doubt. But sometimes it’s better not to be a requirements jerk and just get some testing done. It’s a great way to build goodwill.

I wrote:

There are two separate testing problems here – one is related to testing the quality of the requirements, and the other is testing the quality of the implementation (enforcement) of requirements. We have to assume that this is a “waterfall-type” development methodology because such ambiguous requirements already exist. At this point it’s too late for me to provide input into the requirements – the sign is already there on the door! So I can approach this testing problem two ways – one is to be the perverse tester who wants to make a point about what an awful development process this is, and the other is to produce useful tests that the customer cares about.

The perverse tester could easily come up with dozens of test ideas that would show how the requirements are faulty. I could carry in one hand a giant briefcase with food cleverly concealed under a false bottom and a covered thermos of hot drinkable chicken soup in the other. I could come in with my girlfriend (to whom I am not married) and constantly complain about how expensive things are. I could bring in 40 extremely well-behaved school children and have them stand directly in front of the cash register without saying a word. I could go into a dissertation about exclusive vs. inclusive or, and what exactly it means to “be 18 years old or responsible parent”. I could send a 72 year old bachelor in and complain he wasn’t turned away because he’s neither a parent nor exactly 18 years old! I could buy the most clearly defective product on the shelf and then return, demanding a refund – and explain that I’ll buy $200,000 worth of merchandise tomorrow if they exchange it, but otherwise I’m taking my story to the ActionNewsOnYourSideConsumerTeam and they will be swarming the place with cameras tomorrow!

But what does this get me other than a reputation as being hard to work with?

Some of the thinking about the requirements in the comments is useful for guiding my tests… but still, you need to start with the assumption that the requirements are “pretty good” and respect roughly what the product owner wants. So I’d start out with several basic tests: (1) person with large backpack, (2) person with large stroller, (3) couple who’ve been instructed to fight over the price of everything, with the man playing cheap, (4) someone attempting to get a refund, (5) an adult with unruly kids, (6) an adult with well-behaved kids, and maybe some boundary tests with (7) a responsible 17 year old, (8) an unruly 19 year old, (9) someone with a small backpack, (10) someone with covered food.

Then I’d meet with the business owner and report who was let in, how they were treated, whether the enforcer had to think hard about the rules, etc. Maybe ask for clarification about whether they really care whether the cheap one is the husband or the boyfriend or the wife, etc. By doing the obvious tests first and reporting that “yes, in general your requirements are correct” or “no, the staff in your store are paying no attention to the requirements”, I’ve opened up the conversation – I’ve shown that I have value as something more than just a complainer, we’ve got some testing done, and we can start to think about the meaning behind the requirements. I don’t think this wastes any time (we’re going to run those tests anyway), but it builds up a lot of goodwill.

(And as I read it, I think “cheap husbands” is meant to be cute… so perhaps I’d skip the cheap husband test; it seems awfully passive-aggressive to actually test for what is obviously not a real requirement. I doubt this kind of thing happens in software though. Anyone have any examples?)

Tonus Interrogativus

The tonus interrogativus is used in Gregorian Chant to indicate a question.

Software testing is all about questions. Here are some of them.