The ‘amount of badness’: weighting Severity for defect analyses

I’ve previously suggested that a good defect severity scheme should have five values.  I’ve also noted that attributing Severity is one of the most important classifications we make as testers, because so much can be learnt from analyses of defect reports.  Does that mean that when we do these analyses we have to do each one five times, once for each severity level?

Not always, in my view.  Some metrics definitely should be analysed separately by severity level.  I’d recommend this, for example, for analysis of defect rejection rates.   I’m just as interested in how many are rejected and why at the lower severities as at the higher ones, and the reasons may vary significantly across the severity levels.

What, though, about something like defect distribution within the product?  Here I’m looking for those parts of the product that are giving us most trouble – defect clusters – and whether that’s a small quantity of nasty defects or a large quantity of more benign ones is not so important at first.  Both situations should be investigated in more detail; the first task is to identify them, the second is to prioritise the more detailed investigations.  The most obvious three options all have disadvantages.

  1. Use only the overall quantity of defects?  That takes no account of whether they are mostly very bad ones or mostly insignificant, so it shows me the places of interest but doesn’t help me to prioritise my subsequent investigations of them.

  2. Analyse each severity level separately?  In a 5-level scheme, that takes 5 times as long as option 1, and the results will need to be combined before I can see where the most trouble seems to be across all severities.

  3. Analyse only the top two or three severity levels as a single group?  Defects that would have little significance individually might, when there are lots of them in one place, indicate a significant problem area, but I wouldn’t see these clusters.

What I’m really looking for are clusters of trouble and an indication of what I call the ‘amount of badness’ in each one.  The one that I’ll investigate first is the one with the most badness in it.  I can find this in a single pass by using Weighted Defect Quantity (WDQ, if you like TLAs).  To get this, the defects at all Severity levels are combined into a single quantity: the quantity at each level is multiplied by a factor that reflects the increasing impact of defects as their severity level rises, and the results of these calculations are then added together.  This is not in itself a metric, but it simplifies the analysis of the metrics for which it can then be used.

The worse the defect, based on its severity, the higher the weighting it gets.  In the 5-level severity scheme illustrated below, Trivial gets weighting = 2, Minor = 3, Major = 5, Critical = 8, Blocker = 13 (values taken from the Fibonacci sequence).  Then: Weighted Defect Quantity = sum for all severity levels of (number of bugs at that level * weighting for that level).  E.g.:

Severity     Qty of bugs    Weighting    Weighted Level Qty
Blocker           1        *    13     =        13
Critical          2        *     8     =        16
Major            11        *     5     =        55
Minor             8        *     3     =        24
Trivial          14        *     2     =        28
Weighted Defect Quantity:                      136
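
For anyone who prefers to see the arithmetic as code, here is a minimal Python sketch of the calculation above.  The function name, the dictionary of weightings and the example figures simply restate the scheme and the table; none of it comes from any particular tool.

```python
# Minimal sketch of the Weighted Defect Quantity (WDQ) calculation, using the
# Fibonacci-based weightings from the article. Names are illustrative only.

SEVERITY_WEIGHTS = {
    "Blocker": 13,
    "Critical": 8,
    "Major": 5,
    "Minor": 3,
    "Trivial": 2,
}

def weighted_defect_quantity(counts_by_severity):
    """Sum over all severity levels of (quantity at that level * weighting)."""
    return sum(
        qty * SEVERITY_WEIGHTS[severity]
        for severity, qty in counts_by_severity.items()
    )

# The worked example from the table above:
example = {"Blocker": 1, "Critical": 2, "Major": 11, "Minor": 8, "Trivial": 14}
print(weighted_defect_quantity(example))  # 136
```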

WDQ represents the total ‘amount of badness’ that has been detected in whatever was being analysed.  Once it has been calculated, further analyses can be done in a single pass, without regard to the quantities at individual Severity levels, for those metrics where that breakdown doesn’t matter.

The bad news is that I haven’t yet found a defect management tool that will do this for me.  But then, the tools aren’t good enough at other aspects of my favourite metrics either, so this is only one of the reasons why I usually start my regular weekly / monthly analyses with an extract from the defect database that I can then manipulate however I want to.  And if there’s a technically-minded tester or a test-friendly developer who can automate this for me, so much the better.
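
If someone does take on that automation, it can be a very small job.  The sketch below, again in Python, assumes the extract is a CSV export with ‘Component’ and ‘Severity’ columns; both column names, and the file name, are placeholders for whatever your own defect tracker actually produces.

```python
# Sketch: turn a defect-database extract into a WDQ-per-component view, so the
# areas with the most 'badness' surface first. Column and file names are
# placeholders; adjust them to match your own tool's export.

import csv
from collections import Counter

SEVERITY_WEIGHTS = {"Blocker": 13, "Critical": 8, "Major": 5, "Minor": 3, "Trivial": 2}

def wdq_by_component(extract_path):
    """Sum the severity weightings of every defect, grouped by component."""
    totals = Counter()
    with open(extract_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            totals[row["Component"]] += SEVERITY_WEIGHTS.get(row["Severity"], 0)
    return totals

# Components with the highest WDQ are the first candidates for closer investigation.
for component, wdq in wdq_by_component("defects_export.csv").most_common():
    print(f"{component}: {wdq}")
```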

As well as for an initial view of defect distribution by location in the product and by quality characteristics, I use WDQ for defect detection effectiveness and a few other metrics.  And if defect detection effectiveness sounds important, yes it is, so watch this space for more about it!
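
I’ll come back to defect detection effectiveness properly in a later article, but as one more illustration of where WDQ can slot in, here is a sketch that assumes the widely used defect detection percentage formula (defects found by testing divided by all defects, including those found afterwards), with WDQ replacing the raw counts.  The figures are invented for the example.

```python
# Illustration only: a WDQ-weighted version of the common defect detection
# percentage (defects found by testing / all defects, including those found
# after release). The figures are invented; the weightings are as above.

SEVERITY_WEIGHTS = {"Blocker": 13, "Critical": 8, "Major": 5, "Minor": 3, "Trivial": 2}

def wdq(counts_by_severity):
    return sum(qty * SEVERITY_WEIGHTS[sev] for sev, qty in counts_by_severity.items())

found_by_testing = {"Blocker": 1, "Critical": 2, "Major": 11, "Minor": 8, "Trivial": 14}
found_after_release = {"Critical": 1, "Major": 3, "Minor": 2}

effectiveness = wdq(found_by_testing) / (wdq(found_by_testing) + wdq(found_after_release))
print(f"WDQ-weighted defect detection effectiveness: {effectiveness:.0%}")
```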

This article is part of a series. You can check out the next one right now.

Part 5 | Part 7

Author: Richard Taylor

Richard has dedicated more than 40 years of his professional career to the IT business. He has been involved in programming, systems analysis and business analysis. Since 1992 he has specialised in test management. He was one of the first members of ISEB (Information Systems Examination Board). At present he is actively involved in the activities of the ISTQB (International Software Testing Qualifications Board), where he mainly contributes numerous improvements to training materials. Richard is also a very popular lecturer at our training sessions and a regular speaker at international conferences.