We’ve put together an excerpt of our recent webinar, Using Reinforcement Learning to Write Java Unit Tests, presented by Diffblue’s CEO Mathew Lodge and based on a talk he originally gave at the 2020 QCon London event.

Using Reinforcement Learning to Write Java Unit Tests

In a nutshell, this talk looks at the use of an AI technique called reinforcement learning. In the same way that AI can search for the best Go move, AI can also search for the best unit test for your code. To write unit tests, we’re using an approach very similar to the one used by AlphaGo, the Go-playing AI that Google’s DeepMind created.

What does this have to do with DevOps?

Well, Continuous Integration in theory is this fantastic pipeline: you write code, you commit it, unit tests run, you find any bugs and fix them, and then the software continues to flow down the pipeline. If it passes every stage, it eventually goes out into deployment. This is the dream of DevOps.

But in practice, organizations often don’t have enough tests that run early in the pipeline, so errors don’t get caught. CI in DevOps is very similar to the approach used in factories with production lines. The whole idea in a factory is that the earlier you find an error, the faster and cheaper it is to fix, and the same is true of software.

If the error is not found until much later (for example, in integration tests, or maybe even at deployment time or in production), then the time it takes to fix that fault is much greater. First of all, you have to find out what exactly happened, and identify and isolate the root cause. Then a developer who has probably moved on to something else has to stop what they are doing, go back to that piece of code and fix the issue. And then you have to run back through the whole cycle again.

So having tests that can run early, frequently and quickly in that development cycle is very important. 

So what is AI? 

In some ways, you were taught as a developer that the right way to write software is to think carefully, build your algorithm, and not just hack the code together until it works. AI in fact is very much like hacking the code until it works. Some brief background about AI: 

In traditional algorithms, code transforms input to output with something like multiplication or division, and we can write down an algorithm that will do that every single time. However, some problems are much more difficult for us to write algorithms for. Image identification was historically one of these challenges: answering the question “What’s in this picture?” was incredibly difficult for 30-40 years. There were a lot of very smart people trying to solve this problem, entire conferences devoted to finding algorithms, thousands of papers written about how to do image recognition, and many clever theories about what the algorithm might look like, e.g. maybe it should identify edges and shapes, and from there pick out 3D forms. And none of it worked. Absolutely none of it. None of it was even close to being as accurate as a human just looking at a picture and saying, “That’s a cat and a dog.”

AI essentially iterates on a statistical model to figure out how to connect the input to the output. So when you train a machine learning model such as a neural network, what you are doing is iterating on the statistical model and hacking the values of this model (the weights in the neural network) until the network is able to recognize the image and generalize the result, so that when you show it lots of pictures of cats, it will reliably identify the cats in all of those pictures.

Most neural networks today will produce a number that indicates how confident the network is that the item it has identified is in that particular picture. And of course, it will also sometimes get it wrong. But what’s interesting here is what has happened since convolutional neural networks took off a few years ago. This breakthrough technique produced the winning entries in the ImageNet challenge, an image recognition competition built on the ImageNet dataset (a large collection of labeled images associated with researchers at Stanford and Princeton). The best networks now report lower error rates on that benchmark than human beings achieve at identifying images off the internet. They are incredibly good at recognizing things, and versions have been developed for things like looking at x-rays and detecting cancer.
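To make the "confidence number" concrete, here is a minimal sketch in Java. It assumes the network's raw scores are turned into confidences with softmax, which is a common choice for classifiers but is not something the talk specifies. The class name, labels and scores are all made up for illustration.

```java
// Hypothetical sketch: turning a classifier's raw scores into confidences.
// Softmax is assumed here; the talk does not say which method is used.
final class ConfidenceSketch {

    // Converts raw scores into values between 0 and 1 that sum to 1.
    static double[] softmax(double[] scores) {
        // Subtract the largest score first so Math.exp cannot overflow.
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double[] out = new double[scores.length];
        double sum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            out[i] = Math.exp(scores[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    // Returns the index of the label the network is most confident about.
    static int mostConfident(double[] confidences) {
        int best = 0;
        for (int i = 1; i < confidences.length; i++) {
            if (confidences[i] > confidences[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] labels = { "cat", "dog", "car" };   // made-up labels
        double[] rawScores = { 2.0, 1.0, -1.0 };     // made-up network outputs
        double[] confidences = softmax(rawScores);
        int best = mostConfident(confidences);
        System.out.printf("%s (confidence %.2f)%n", labels[best], confidences[best]);
    }
}
```

The network never says "this is definitely a cat." It only ranks the labels it knows about, and the top label can still be wrong even when the number is high.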

AI: Not Artificial, Not Intelligent

So AI has solved a problem that humans just couldn’t get to with hand-written algorithms, and in some ways, it’s badly named: AI is neither artificial nor intelligent. It’s not artificial in the sense that it solves real problems (problems programmers cannot reach). And it’s not intelligent: it’s not a future robot overlord and it’s not sentient. It’s just maths.

What is happening when you’re training a neural network is something called gradient descent. Think about a 3D surface where the height above the ground is what you want to minimize: an error function, or some other kind of objective function, that tells you how close you are to the right answer. In gradient descent, you may not know where the right answer lies on that surface, but you certainly know which direction is down. So if you move in the direction of down and keep repeating that, eventually you will come to a minimum (though not necessarily the lowest point on the whole surface).
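The "keep walking downhill" idea can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the error surface is a made-up bowl shape with its lowest point at (3, -1), and the class name, learning rate and step count are hypothetical. A real neural network does the same thing over millions of weights rather than two coordinates.

```java
// Hypothetical sketch of gradient descent on a made-up, bowl-shaped error surface.
final class GradientDescentSketch {

    // Height of the surface at (x, y). Assumed for illustration; lowest point is (3, -1).
    static double error(double x, double y) {
        return (x - 3) * (x - 3) + (y + 1) * (y + 1);
    }

    // The gradient points uphill, so stepping against it moves us down.
    static double[] gradient(double x, double y) {
        return new double[] { 2 * (x - 3), 2 * (y + 1) };
    }

    // Starts at (startX, startY) and repeatedly takes a small step downhill.
    static double[] descend(double startX, double startY, double learningRate, int steps) {
        double x = startX;
        double y = startY;
        for (int i = 0; i < steps; i++) {
            double[] g = gradient(x, y);
            x -= learningRate * g[0];
            y -= learningRate * g[1];
        }
        return new double[] { x, y };
    }

    public static void main(String[] args) {
        double[] p = descend(0.0, 0.0, 0.1, 200);
        System.out.printf("x=%.4f y=%.4f error=%.8f%n", p[0], p[1], error(p[0], p[1]));
    }
}
```

Note that the loop never needs to know where the minimum is. It only asks "which way is down from here?" at each step, which is why the same procedure works on surfaces nobody can picture.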

In the case of image recognition, that’s essentially what you’re doing when you train the network by showing it a lot of images and saying, “Here’s a picture of a cat. Here’s another picture of a cat. Here’s a third picture of a cat.” As you keep doing that, you “walk” over this landscape to find the right answer. So AI is really great when you can’t brute force the problem. Image recognition is a good example of that, as is playing Go, and so is writing software tests.

Watch the webinar to hear the rest of the talk