Unit regression tests run early in the development process and catch coding errors that break something that used to work (i.e., a regression). A tool like Diffblue Cover can write such unit regression tests automatically.

A very good example of how prevalent regressions are in software came this week, when it emerged that the NHS' COVID-19 app had a bug (now fixed) that caused it to fail to notify some people that they should self-isolate.

Originally, the software worked exactly as it should have, but a subsequent software change designed to improve the app broke it—and no-one noticed until weeks later.

During testing in the Isle of Wight and the London Borough of Newham, the statistical model used by the app was implemented correctly, and users were warned appropriately. But before wider release to the public, the model was enhanced to take into account “infectiousness”—the fact that those with the virus are most infectious when they first start to show symptoms. 

This improved the accuracy of the model, which meant that the risk threshold (the "score" above which a user is likely to be infectious) could be lowered to catch more COVID cases without causing more false positives. A false positive in this context would be telling a user to self-isolate when they did not pose a significant risk of spreading the disease.

The risk threshold should have been lowered to match the new scoring approach. But that change didn't make it into the released app (the Government Health Tech blog post doesn't make clear why).
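To make the failure mode concrete, here is a minimal sketch of the mismatch. Every name and number below is hypothetical and illustrative; none of it comes from the actual NHS app code. The point is only that an improved model can return lower scores overall, so keeping the old threshold silently suppresses notifications that should be sent:

```java
public class RiskCheck {
    // Threshold tuned for the ORIGINAL model's score scale (illustrative value).
    static final double OLD_THRESHOLD = 900.0;

    // Original model: score from exposure duration and signal weight only.
    static double originalScore(double minutes, double weight) {
        return minutes * weight;
    }

    // Improved model: also scales by infectiousness (0.0 to 1.0), so scores
    // come out lower overall; the threshold should be lowered to match.
    static double improvedScore(double minutes, double weight, double infectiousness) {
        return minutes * weight * infectiousness;
    }

    static boolean shouldNotify(double score, double threshold) {
        return score >= threshold;
    }

    public static void main(String[] args) {
        double minutes = 30, weight = 40;  // a genuinely risky contact
        double infectiousness = 0.6;

        // Original model: 30 * 40 = 1200 >= 900, so the user is notified.
        System.out.println(shouldNotify(originalScore(minutes, weight), OLD_THRESHOLD));

        // Improved model with the UN-lowered threshold: 1200 * 0.6 = 720 < 900,
        // so the same risky contact is silently ignored. This is the regression.
        System.out.println(shouldNotify(improvedScore(minutes, weight, infectiousness), OLD_THRESHOLD));
    }
}
```

The behavior change is invisible at the code level: both versions compile and run cleanly, and the bug only shows up in which side of the threshold a given contact lands on.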

How could unit regression tests have helped?

Unit regression tests can help software developers spot this class of error at development time. In this case, a tool like Diffblue Cover would write two tests that trigger both sides of the control flow—the case when a notification should be sent, and the case when the risk threshold isn’t met (and no notification is sent). There are two ways this can help the developer spot a latent problem when a new change is made to the code:

  1. The unit tests fail, signalling some new behavior that didn’t happen previously (when the code worked).
  2. When Cover writes new tests for the new code, comparing the old tests and the new tests would highlight the change in values being returned by the risk function. Even if the previous tests didn’t catch that—unit tests are not foolproof—the diff between tests highlights the behavioral change in the code. Simply flagging that difference can be enough to alert the developer.
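A hand-written equivalent of the two tests described above might look like the sketch below. The threshold value and method names are hypothetical, and the checks are written as plain assertions rather than a test framework to keep the example self-contained; a generated test suite would express the same two branch checks:

```java
public class NotificationRegressionTest {
    // Illustrative threshold and decision logic; not the real app's values.
    static final double RISK_THRESHOLD = 900.0;

    static boolean shouldNotify(double riskScore) {
        return riskScore >= RISK_THRESHOLD;
    }

    // These two checks pin current behavior on BOTH sides of the branch.
    // If a later model change shifts scores so that a contact scoring 1000.0
    // no longer triggers a notification, the first check fails and flags
    // the regression at development time rather than weeks after release.
    public static void main(String[] args) {
        // Case 1: score above the threshold, so a notification is sent.
        if (!shouldNotify(1000.0)) throw new AssertionError("expected notification");

        // Case 2: score below the threshold, so no notification is sent.
        if (shouldNotify(500.0)) throw new AssertionError("expected no notification");

        System.out.println("both regression tests passed");
    }
}
```

Because each branch has its own test, a change that silences notifications can only slip through if both tests are ignored or deleted, not simply overlooked.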

Unit regression testing can improve a development team's ability to spot problems well before they are shipped to users.