Transcript
Hello, everyone. I’m David Rubenstein, editor-in-chief of SD Times, and I would like to welcome you to today’s SD Times live webinar, Write a Year’s Worth of Code in Eight Hours. That’s right, that’s not a typo. While you’re pondering that, I have a couple of quick announcements to make. First, today’s session is being recorded and will be available on demand on the SDTimes.com website for future viewing. Second, the presentation today will be followed by a Q&A, so if you have any questions, just type them into the chat tab on the right-hand side of your screen, and we’ll get to as many as time will allow. So, answering the question: how do you write a year’s worth of code in eight hours? By using AI to automate unit testing. If you’re going to shift left, unit test automation is key. If you have to allocate developers to write those tests, you’re robbing Peter to pay Paul, because time spent on testing is not time spent on developing new features. Today, we’ll look at how Java developers can benefit from using AI automation. Joining us to kick things off is Matthew Richards. He’s the head of product at Diffblue, a company that offers tooling to automate the writing of unit tests. So Matthew, take it away.
Excellent. Thank you very much for having me here today. I’m, as Dave says, head of product at Diffblue. I’ve been here for a couple of years; prior to that I spent 10 years at Cisco leading large, complex Java development teams, and I have a background in C++. So I’m here to talk to you today about how unit tests are so important in helping your organisation move quickly. But, as Dave mentioned, how do we do that in a way where we’re not robbing Peter to pay Paul? How do we make sure we can write enough tests to move quickly, but the cost of writing those tests doesn’t become counterproductive? So we’re going to talk about AI-assisted unit test automation. Step by step, I’m going to start with a recap on unit testing: what is unit testing, and how does it help us to deliver software faster? Then we’ll talk about the two key categories of AI. We all think about AI, we talk about AI, everything these days is sold with AI; but let’s talk about what that really means and which forms of AI are used in different cases.
And I’m going to talk about how Diffblue uses AI to automatically write code: here, writing a year’s worth of unit tests in eight hours. And I’m going to give you a quick overview of the product. We’re going to jump into IntelliJ, I’m going to jump into GitHub, and I’ll show Diffblue Cover in action. And then we’ll have plenty of time for Q&A. If you’ve got questions, please feel free to post them in the panel as we go.
So our customers tell us we are shipping fast, and we want to make it faster.
It is about speed. The quicker you can release, the more often you can release, which means you can respond to your environment quicker. That might be opportunities you need to respond to quickly: you need to release a key feature to win a deal, you need to release a feature to remain competitive, you need to release a bug fix quickly that’s maybe hindering you with a customer. So being able to ship fast is necessary in our modern world to be able to compete. But also, a company that’s able to ship faster is a company that must have good quality practices in place. Shipping fast is a byproduct of knowing that your code quality is high and knowing that you have the procedures, mechanisms and pipelines in place to ensure that you trust the software that you’re shipping. So I’ll talk today about how unit testing helps you move faster, but it’s also about how we can write better quality code and reduce the number of regressions making it into the field.
We all hear about this in terms of shifting left. So, a recap of how unit tests help us shift left, and what shifting left is all about. Bugs cause the delivery of software to slow down for so many reasons. We’ve all been in that situation where a bug has gone into the wild, a customer has found it, and we know that we have a bug but we do not know where that bug was introduced, and we do not know where to fix it. So bugs that make it into the field can take hours, days, weeks, months to track down. I had one bug that took over a year: we ended up having to make debug builds, give them to the customer, and wait for a memory leak to manifest itself. That took a year, and that’s not extreme. We know that happens.
So we need to stop defects from getting into the field, because even though we know we have a defect, we don’t know where that defect has manifested. Once we do find a way to fix that defect, then we have to go through the whole process of bringing it in house, making the repair, testing it, validating it, giving it back to the customer, and then waiting for the customer to have the confidence that the bug has been fixed. So we’re trying to avoid bugs in production; we’re avoiding regressions.
As we shift left, further away from production, the time to find and fix defects reduces. If we can find that bug before it makes it into production, we chop the time it takes to deal with that defect down to hours. So this is done in some sort of integration testing.
By the time of integration testing, though, the developer who wrote the code has already moved on to the next job. That developer has forgotten about the code change; they’ve moved on. And the QA engineer running that integration test maybe catches a bug. But maybe they’re running that test against contributions from 20 or 30 different developers who have all committed code changes that day. How do they know which developer actually introduced the defect?
So integration testing is much better at catching defects than letting your customers do it. But actually knowing who introduced the defect is still a problem. And even when you track down who you think caused that problem, you go to that developer and you say, hey, I need you to stop doing what you’re doing, have a look at this piece of code that you wrote for me yesterday, and fix it. That developer has moved on, though; they may have already written three or four other pieces of code since then. So that developer is context switching: they’re having to stop what they’re doing, go back to what they were doing before, and recall exactly why they were implementing something and what the exact behaviour was that they wanted to implement. And context switching kills velocity.
So we need to go further left. We need to keep the responsibility for finding that bug on the developer’s desktop at the time that they write that code, so the developer doesn’t move on to their next job until they have caught all the regressions and fixed them whilst the code is still in their hands. Bringing the responsibility for testing to the desktop is what unit testing is all about. And unit tests are a great mechanism whereby developers can write those tests and validate that functionality themselves.
So unit testing speeds up your software delivery for so many reasons. Ultimately, the thing that kills productivity more than anything is the huge amount of switching back and forth, the context switching, as we try to understand why bugs exist, who introduced them, and how to fix them.
As well as shifting left, the other thing that unit testing does is give us precise information about regressions. An integration test is testing an end-to-end behaviour. It can test a whole API flow from the front end of your application, all the way through to the database, all the way to remote servers.
So when something goes wrong, you know it has gone wrong, but you don’t know where it has gone wrong and where to fix it. Unit tests are testing every single method precisely. So when a test fails and you find the regression, you know exactly, surgically as we say, where that defect needs to be fixed.
And then finally, because unit tests are describing every single behaviour of every single method in your application, the unit test itself is documentation. A developer can read a unit test before they make their code change and understand exactly what the behaviour of that code is, before they’ve changed it.
So they’re not having to work out what the piece of code they’re about to change does; they don’t have to guess and take risks. If they can just read that unit test, or that suite of unit tests for the method under examination, then they can make that code change with much lower risk. They can move quicker, and it’s less risky because they know the context of the code that they’re working on. We see this all the time with legacy code: you pick up a piece of code that was written five or six years ago, you do not know what it does, and you spend half the time just understanding what it does before you can actually work on it.
So let’s talk about what is good unit testing. So I’ve lived in the unit testing world for long enough to see great unit testing and really poor unit testing.
Great unit tests are tests written by the developer when they make the code change. We’ve talked about this already: keeping accountability with the developer, so that they fix the problem before it makes its way into production. And second, we’ve mentioned that unit tests are granular, they’re surgical: they test every single pathway through each method to validate each individual behaviour. So when something fails, you can say: in this method, this behaviour has changed.
Conversely, a poor unit test is written afterwards by a QA engineer or someone else. As a developer, you finish your code and you throw that piece of code at a QA engineer to write the unit tests. That QA engineer now has all the burden of understanding what you did: they have to understand the code before you made the code change, they have to understand why you made the code change in a particular way. They’re guessing, and they end up being biased about the way you made the code change, and then they write that unit test. So it takes a lot longer for a QA engineer to write that test. And that QA engineer is bugging you, asking you questions constantly: why is it done this way? Why have you made this choice?
And worst of all, the developer loses accountability for the quality of the code. We’ve seen situations where QA is writing unit tests and they end up throwing the code back at a different developer to fix when they find a bug. We want to keep accountability with the developer for the quality of the code. And then second, we’ve already mentioned that integration tests verify end-to-end functionality.
If you only verify end-to-end functionality, you will not know about the corner cases of every single individual behaviour; you’re going to miss things. And ultimately, when the integration test fails, you will not know why it failed and exactly which piece of code needs to be repaired.
So integration tests are dangerous when people think they are great unit tests because they’re written in JUnit: I wrote a JUnit test, therefore it must be a unit test. Integration tests are a tool on top of unit testing. So let’s have a look at a very simple unit test, on the right-hand side. This is a unit test for a method written in Java; the method under test is called increment. It increments a number by one, a very simple method. So we create a Counter object, we call the increment method, and it increments that counter by one.
So this test is testing one very specific behaviour. It’s testing that when you create a Counter object, which starts at zero, the first time you call counter.increment(), that counter should equal one.
It’s very simple, but it’s very granular. And if someone messes up and they regress the behaviour of increment, this will catch it. And increment might be a big, complex piece of business logic, and we might use this individual behaviour thousands of times across your application.
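The slide itself isn’t reproduced in this transcript, but a minimal sketch of the test being described might look like this in JUnit 5 (the Counter class and its getCounter() accessor are assumptions based on the description):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class CounterTest {
    @Test
    void firstIncrementTakesCounterFromZeroToOne() {
        Counter counter = new Counter();        // a new counter starts at zero
        counter.increment();                    // the method under test
        assertEquals(1, counter.getCounter());  // the first increment should yield one
    }
}
```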
Unit testing, though, takes a lot of time to implement. Our research, surveying software developers, tells us that people spend anywhere between 25% and 50% of their time writing unit tests. That’s a lot of time. One quote: “we ended up spending more time writing unit tests than we do actually writing business logic.” So unit testing is not as simple as just doing it. We need to know that the unit testing we’re doing is valuable, we need to know that we’re unit testing the right things and not writing too many unit tests, and we need to make sure that we are balancing unit tests with our ability to deliver features. And this is called the unit test philosophy paradox.
So there’s other things about unit testing that we need to consider as well as the time we’re spending on writing unit tests.
How do you actually know that the unit testing you’re doing is valuable? Here we’ve got a distribution for a typical application. The curve is showing us we have a large quantity of low-complexity code: code that has maybe one or a few pathways through it. And then there’s a small tail of very complex code. This might be your core business logic, a state machine, something that is very complex for a human to write and very complex to unit test.
So we give ourselves targets. Companies use unit test coverage targets to tell a developer: this is how much unit testing I want you to do. So when that developer is writing some code, she can come back and say, yes, I’ve hit my 75% code coverage. Everything’s good. We’re confident that we’ve mitigated the risk, and my project is safe from regressions.
But in this example, we’ve tested to 75% code coverage: 75% of the lines of code have been tested. What about this risky bit at the end? The lowest complexity code is the easiest to test. We’ve not tested that complex functionality. We’ve not tested that big, gnarly state machine that’s at the core of your business function.
And that’s the piece that’s really hard to test, the piece that took developers a long time to get right. So when that thing on the right goes wrong, we’re in trouble.
So code coverage is not the answer to knowing that you’ve got good quality code, and that you’ve reduced risk. It’s more complex than just the number. And I’m going to show you an extreme but real example of this.
Let’s go back to that piece of code we looked at before. So we’re testing our increment method.
Code coverage, as I say, is a measure of how many lines of code have been tested. In this test method, the coverage actually accrues when I call the increment method. It’s at this point that my coverage calculation says: yes, I have covered 100% of the behaviour of this method. It’s at this point that I get the confidence that I have tested the code. But what coverage actually means is not that 100% of the lines of code have been tested; it’s that 100% of the lines of code in this pathway have been called, by the JVM in this case. So 100% of that code has been executed. The test part is actually the assertion: it’s where we look at the result of the JVM running that piece of code and say, is the answer what I expect it to be?
So let’s imagine an extreme but real-world situation where I just delete that assertion altogether. Now, I’ve still got my test, I’m still calling that code, I’m still executing 100% of the code for this behaviour of the method under test. But I’m never asserting what the behaviour should be. I still get the code coverage; SonarQube, JaCoCo, and other coverage tools will all tell me: you’ve got the coverage, you’re safe. But you’re not. This will never catch a regression. And you may say this is extreme, that developers don’t write assertions like that. But what if you’ve got several assertions? What if you’ve got four or five assertions here and you forget one, you miss one? You’re biased towards what you wrote as a developer, and you forget that there’s a corner case. So it’s not only about writing no assertions; it’s about writing incorrect assertions, missing assertions, or being unintentionally biased towards particular behaviours.
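Continuing the hypothetical Counter sketch from earlier, this is what that trap looks like: the test below executes every line of increment, so coverage tools report 100%, yet it can never fail.

```java
import org.junit.jupiter.api.Test;

class CounterCoverageOnlyTest {
    @Test
    void incrementExecutesButVerifiesNothing() {
        Counter counter = new Counter();
        counter.increment();  // 100% of increment's lines get executed here...
        // ...but with no assertion, a regression in increment can never make
        // this test fail. Coverage tools still count the lines as covered.
    }
}
```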
So again, getting coverage, even for the highest complexity code, does not mean that you are going to catch regressions.
So how do we solve this problem? We’re going to change subject slightly here, we’re just going to talk about AI for a couple of minutes.
AI is something that we hear about all the time. AI is going to solve every problem every customer has.
And everyone has this idea of what AI means. So I’m going to explain AI by breaking it down into two different types. I’m going to tell you about the type of AI that you are probably thinking about, and I’m going to tell you about a second type of AI that is used just as much, but that people rarely think about. And this is really important, because it’s actually the second mechanism that we use when we want to use AI to write unit tests. So let’s have a look at the two typical groups of AI: we have supervised and unsupervised methods. We’re going to look at this by comparing two different products from Google. On the left, we’ve got Google Photos. This is the search engine where you type in that you want a picture of a bus, and you get a whole page of pictures of buses.
That is a supervised method. An unsupervised method is, for example, the Google application AlphaGo, which plays the game Go automatically. We’ll talk more about Go in a second.
So, Google Photos: you want to search for a picture of a bus. How does Google know that it’s giving you pictures of buses? Well, first, it has an enormous bank of images that it has crawled the internet for, and it has pulled those images down and run AI offline, meaning before you type in your search, to analyse all of those pictures and group them.
So way before you’ve ever searched for buses, Google will have taken all the pictures and said: these pictures look the same, these other pictures look the same, these other pictures look the same. And maybe there are groups which contain bits of different groups. So maybe your first group is all the pictures of buses, the second group is the pictures of cars, and the third group is pictures where there are cars and buses.
But Google doesn’t know that these are pictures of buses, cars, and cars and buses together; it just knows that the images look similar.
So this is done before you actually make your search; we’re training the AI to group pictures together. We still need to know what a bus is, though. We need a way to tell what is in a picture, and this is called labelling. This is where we discover that some of those pictures in the first category are of buses. So if pictures 1, 3, 5, and 7 are buses, and we know they’re buses, maybe from the name of the image or from some metadata on the webpage we got the image from, we can infer that images 2, 4, 6, and 8 in the same category must also be buses, because they look similar.
So we have trained our application to group these together. And we’ve labelled the pictures so that we know what they are.
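As a toy sketch of that inference step (entirely illustrative, and nothing like Google’s actual system): given clusters of similar-looking images and a few known labels, propagate each cluster’s known label to its unlabelled members.

```java
import java.util.*;

public class LabelPropagation {
    // clusters: groups of image ids the AI judged to look similar.
    // knownLabels: the few ids we could label from filenames or page metadata.
    static Map<Integer, String> propagate(List<Set<Integer>> clusters,
                                          Map<Integer, String> knownLabels) {
        Map<Integer, String> result = new HashMap<>(knownLabels);
        for (Set<Integer> cluster : clusters) {
            cluster.stream()
                   .filter(knownLabels::containsKey)   // find any labelled member...
                   .findFirst()
                   .map(knownLabels::get)
                   .ifPresent(label ->                 // ...and assume the rest of the
                       cluster.forEach(id ->           // cluster shares its label
                           result.putIfAbsent(id, label)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Set<Integer>> clusters = List.of(Set.of(1, 2, 3, 4, 5, 6, 7, 8));
        Map<Integer, String> known = Map.of(1, "bus", 3, "bus", 5, "bus", 7, "bus");
        System.out.println(propagate(clusters, known)); // 2, 4, 6, 8 inferred as "bus"
    }
}
```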
This requires a huge amount of computational effort, because we’re doing all of this in advance, grouping as many combinations together as we can. What if we have some birds in those pictures as well? Then we have to group all the pictures of birds together too. So we’ve got unlimited sets of groupings of pictures here; we’re looking at an infinite set of combinations of data. And the rules are infinite as well, because there are infinite numbers of different things.
Furthermore, a supervised model is always learning. So if you then search for buses the following week, you’re going to get different answers. That’s because the AI has learned: the AI has maybe got better at determining what a bus looks like, but it may also have got worse.
It may also have a bigger data set to analyse: there are more pictures that it has pulled in over that week. So we get a different answer every time. That’s okay with Google Photos; we can cope with that. But it has really catastrophic effects in certain fields: for example, in a highly regulated environment like finance, with AI stock trading, you need to know that the answer you’re getting is going to be the same every time, and you have to justify why the AI made that choice. So determinism is important in some situations and not in others. And then finally, there is bias towards the inputs. What if all the pictures in my catalogue are of London buses, those big red buses? You only get pictures of big red buses. So every time you search Google, you’re never going to see a yellow bus from Tokyo, for example. Supervised models are biased towards what you give them.
So supervised methods work really well in certain situations.
Unsupervised methods are a completely different approach. We can use these together, and often you might find them together. But unsupervised methods are used, and can be chosen, based on the criteria of the problem you’re solving. So, Go: Go is one of the oldest board games in existence, a Chinese board game that was invented two and a half thousand years ago.
This game has got 2.1 times 10 to the 170 possible moves. Chess has got 10 to the 120. So Go is pretty much infinitely more complex. But AlphaGo is an AI that can play Go and beat a world champion.
AlphaGo, when it is playing against that world champion, has never played Go before.
AlphaGo does not do any learning ahead of schedule, that is, ahead of the game. So it’s learning Go the first time that it plays. In reality, there’s a bit of both models going on, but essentially, the first time AlphaGo plays Go with that competitor, it learns, and then it forgets it all when it goes on to the second game.
That may seem crazy. But the reason we can do this is that although AlphaGo essentially has infinite moves, it has very specific rules.
So unlike Google Photos, where there are infinite different things, buses, cats, dogs, frogs, trees, AlphaGo has got a limited set of rules: you can only move a piece in a particular way at a particular time, and there’s a definition of winning and a definition of losing. So what we can do is use those rules to bound those almost limitless moves down to just the moves that are in front of us.
So what AlphaGo will do is it will look at only the next move. It may also look at the next couple of moves. But in principle, it looks at what’s directly in front of it. It evaluates all the opportunities according to the rules and then plays the move it determines is best.
So AlphaGo is solving a limitless problem, but much quicker with lower computational effort, because we break down the problem using the rules. And we only evaluate feasible solutions.
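As a toy illustration of that rule-bounded search idea (this is not AlphaGo’s actual algorithm, which combines neural networks with tree search): enumerate only the legal moves available right now, score each one, and deterministically play the best.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.ToDoubleFunction;

public class GreedySearch<M> {
    // Rather than exploring every conceivable move, we only evaluate the
    // feasible ones the rules put in front of us, then pick the best.
    public Optional<M> chooseMove(List<M> legalMoves, ToDoubleFunction<M> evaluate) {
        return legalMoves.stream()
                         .max(Comparator.comparingDouble(evaluate));
    }
}
```

For the same board and the same evaluation function, this always returns the same move, which is exactly the determinism the next paragraph describes.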
This means, computationally, that you’re going to get deterministic output. So if you reset AlphaGo every time we play, you’re always going to get the same moves for a given board layout. That’s really not a bad thing.
It means that, for example, in those applications where we need to know that the AI is deterministic, we have a solution. And it doesn’t need to keep improving over time by learning. One reason that we don’t train unsupervised methods is that we don’t want to introduce bias.
So, computationally, there is zero bias. Because we’re using those rules, we cannot break the rules. It doesn’t matter what inputs we’ve given it; we’ve given it the rules. So we’re removing that bias, meaning we know we’re going to get a result that follows the rules as closely as possible.
So, two different categories here. I bet all of you are thinking about the left-hand side when you think of AI: you think of training, you think of a large AI engine churning numbers all day long, every day. But what we’re going to talk about now, when we talk about writing unit tests using AI, is an unsupervised method. So why do we use unsupervised methods when writing unit tests with AI? Because there is an essentially infinite number of possible unit tests, because there are essentially infinite ways of writing code.
We can’t evaluate all those possibilities, just as we can’t evaluate the 2.1 times 10 to the 170 potential moves in the game of Go. There are also practical things in the application: there’s disk I/O, there’s network I/O, there are frameworks to deal with. Humans write different code in different ways; different companies have different coding styles.
So what we do is we use what’s called a probabilistic search. We use an unsupervised model where we’re looking at the probability of finding a good unit test based on using the rules. So we take the code in front of us, we apply the rules of Java. And we use that to break down the problem. And we use statistics to then find the right unit test. I’m going to show you exactly how that works now.
So Diffblue Cover is a tool that we produce here in Oxford in the UK. We were spun out of the University of Oxford computer science department, because this is a really hard scientific, mathematical problem to solve. So we’re going to dig into some deep computer science for a couple of minutes to show you how we actually solve this problem.
So here, this is showing you the method that we use to write unit tests with unsupervised reinforcement learning.
So the first thing that we do is we look at your code. And we say we want to write a unit test that exercises all the different pathways and behaviours of this code.
We use the rules of Java, the rules of the syntax: there are only certain combinations of keywords you can use together. We use rules about the frameworks that we know you’ve used; we’ve searched through your configuration, and we’ve learned a lot about your project. And we write an initial guess at what a good unit test would be.
We then run that unit test for the first time and we look at the results that we get from it. We measure how good the unit test is. We have different ways of determining whether tests are good: we look at how much coverage we get, because coverage is an indicator of how much code we’ve executed, but we also look at other qualities of the unit test to determine if it’s actually going to catch regressions in the field. We then change that initial candidate unit test: we predict what a better test might look like using the score we got, and we rerun the test. We run this repeatedly, thousands of times, until we get a test that exercises the behaviour of the code, that can actually catch regressions, and that looks human-like. I’m going to explain why that’s important in a few slides from now.
So this is reinforcement learning: you do something, you measure the success, and if it’s positive, you reinforce that success by moving closer to it. If you’ve made the test worse, you don’t reinforce; you take the test in a different direction to get a better test.
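A highly simplified sketch of that search loop (purely illustrative; Diffblue’s actual implementation is not public in this form):

```java
import java.util.function.Function;
import java.util.function.UnaryOperator;

public class TestSearch<T> {
    /**
     * Iteratively improve a candidate unit test: score it (for example by
     * coverage and assertion quality), keep mutations that score better,
     * and discard the ones that score worse.
     */
    public T search(T initialCandidate,
                    Function<T, Double> score,   // run the test, measure its quality
                    UnaryOperator<T> mutate,     // predict a possibly better variant
                    int iterations) {
        T best = initialCandidate;
        double bestScore = score.apply(best);
        for (int i = 0; i < iterations; i++) {
            T candidate = mutate.apply(best);    // propose a changed test
            double s = score.apply(candidate);   // re-run and re-measure
            if (s > bestScore) {                 // reinforce improvements only
                best = candidate;
                bestScore = s;
            }
        }
        return best;
    }
}
```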
The way we actually use this in production is that you start with your working code base. On the left-hand side here, the developer has their code base, the project that they are working on. They make their code change, and they kick off their pipeline.
When they run the tests, though, they actually run the tests that were written before they made the code change. Diffblue has already written a set of unit tests that describe the behaviour of every piece of your code before you made the code change. So when you run those same tests after your code change, they’ll be able to tell you how your code change changed the behaviour of your application at the most granular level.
So you can then look at those test results and determine did I change the code in the right way? Or have I made a mistake, and I need to fix a regression or an incorrect behaviour?
Once you’re happy that every single behaviour is correct, your code change is approved and we update the working code: we push your code change and our unit tests back to your repo, ready to protect the next developer.
So here’s an example of a Diffblue unit test. This is a unit test that’s testing the uploading of a file to an Amazon S3 bucket. This is one very specific behaviour.
I’ll challenge anyone to spot that a machine wrote this. The only thing that tells you it was written by a machine is that a human would have taken about half an hour to write this test for the first time, because they’re Googling, looking on Stack Overflow for how to unit test uploading a file to an Amazon S3 bucket, and they’re writing this unit test by trial and error, changing the assertions, trying to get them to work. They can end up writing a unit test that doesn’t actually exercise the behaviour, and it would still have taken them half an hour. So you can tell a machine wrote it, because it only took us 1.6 seconds.
So this unit test has been written quickly. It’s human-like, because you need to understand what this test is doing; we talked about how unit tests are documentation that helps you understand the behaviour of your code. And finally, we know this test is exercising the behaviour, because those assertions were written by AI validating every behaviour. There’s no bias from a human here, no mistake of missing an assertion or biasing an assertion; we have written all the assertions that describe all the side effects of this method call. So it’s written more quickly and more accurately.
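The actual generated test isn’t reproduced in this transcript, but a hand-written equivalent of such a test might look roughly like this, as a hedged sketch using Mockito and the AWS SDK for Java v1 (FileUploader, its constructor, and its upload method are hypothetical, not Diffblue’s output):

```java
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.Mockito.*;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.PutObjectResult;
import java.io.File;
import org.junit.jupiter.api.Test;

class FileUploaderTest {
    @Test
    void uploadSendsFileToConfiguredBucket() {
        // Mock the S3 client so the test never touches the network.
        AmazonS3 s3 = mock(AmazonS3.class);
        when(s3.putObject(eq("my-bucket"), eq("report.txt"), any(File.class)))
                .thenReturn(new PutObjectResult());

        // FileUploader is a hypothetical class under test.
        FileUploader uploader = new FileUploader(s3, "my-bucket");
        boolean ok = uploader.upload(new File("report.txt"));

        assertTrue(ok);
        verify(s3).putObject(eq("my-bucket"), eq("report.txt"), any(File.class));
    }
}
```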
In the real world, Goldman Sachs used this on an example project, a small project that they gave this a go with. And they found that we’d written a year’s worth of code in eight hours.
And that’s where the title of this talk comes from: in the real world, our customers are using this to speed up their software development. For them, we wrote enough tests to double their coverage, with high-quality tests that test every single behaviour and are going to catch regressions. We did what would have taken a senior developer a year to do.
So let’s have a look at Diffblue in action. It comes in two different flavours. So we have an IntelliJ plugin for use by the developer on the desktop. And we have a CLI version that is most typically used in a CI pipeline. So we’re going to jump here to IntelliJ, I’m just gonna give you a quick look at this.
We have our plugin installed, and you can see, next to each method of my application (I have a Spring application here), a small science flask, like from chemistry class, with a green plus. We can click on that flask to write tests for this method. There are various pathways through this method, and you can see, once we do this (I prepared one here earlier), we go through and write tests for the different pathways. In this case, we’ve written four tests. So this allows you to write tests as you’re developing on the desktop.
As well as this, we have our CI pipeline tool, using the CLI version. Here I’m showing you GitHub, but this works the same in Jenkins or other build pipeline tools running in your pipeline. It means that we are running consistently, in the same way, for all of your developers. So if you’ve got a team of 100 developers, you know that, because we’re in the pipeline, every developer is getting the same treatment. We’re also able to enforce policy, so you can say: I’m only going to allow this code change through if Diffblue has done these particular activities to the code.
So here you can see that I’ve made a code change: I’ve added a new method to this module, and you can see my code commit. Diffblue has then run the existing tests, and we see they fail. That tells us that my code change has changed the behaviour of the existing code. So I should inspect that and find out why I ended up changing the behaviour of the code when I only added a method.
The other thing is that Diffblue has run and updated the unit tests. A hidden cost of unit testing is not just writing the unit tests; it’s updating them.
As you get more and more coverage, you feel confident that you’re protected, but it costs you more, because you have to keep those unit tests up to date. Diffblue has updated them for me, and we can actually see that Diffblue has made its own commit here.
The developer interacts with the pipeline by looking at the diff between the tests before and after they made the change. And this is where the “diff” in Diffblue comes from: we look at what the tests looked like before I made my code change, and the tests after I made my code change.
So quickly, we can pretend to be a code reviewer here.
I can see how the test behaved previously. We’ve created the objects as required, and we’ve called the method under test to test a particular behaviour. And we see that before I made my code change, we asserted that when I call the method in this condition, the return value is zero. But now, because of my code change, the same conditions lead to a different result. This is telling me that when I call the same method in the same conditions as before, I’m now getting 20 returned. This could be a regression; I could have accidentally refactored something incorrectly and got a different answer. I’ve surgically found that there is a regression and exactly where to fix it: I know the regression is manifesting in the takeFromBalance method, and I can see exactly which part of the behaviour has changed. And Diffblue has just shown me that without my digging through test results; I see the diff between the before and after behaviour.

As well as this, we can see a change to the unit tests by the addition of a unit test. This is a new test for a method called getTransactionHash. Diffblue has written a unit test that calls the method under test and shows me what happens when I do: I can look at the return type and value to see if it is correct or not. So Diffblue is helping me validate new behaviour and catch regressions. As a code reviewer, I start with the unit tests to understand the behaviour that’s actually implemented, then I look back at the code that I’m reviewing to see if it has been done correctly.
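To make that concrete, the before/after change on the generated test might look something like this (illustrative only; takeFromBalance and the 0-to-20 result come from the description above, while the Account class and the input value are assumptions):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class AccountTest {
    @Test
    void takeFromBalanceReturnsExpectedAmount() {
        Account account = new Account();           // hypothetical class under test
        int result = account.takeFromBalance(50);  // same call, same conditions as before
        // Before my change, the generated assertion was: assertEquals(0, result);
        // after my change, the regenerated test asserts:
        assertEquals(20, result); // is this intended, or a regression?
    }
}
```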
So we have the plugin, we have the CLI, and as well as this, we also have a lot of data that we produce. Diffblue is running thousands of times against your code, and we’re finding out a lot of information about it. We intimately understand every single behaviour in your application. So we have an analytics tool that gives you much deeper insights into your code than you’ll have ever had before. We can break down the coverage. We can tell you where you’ve hit a coverage ceiling: why can’t I get any more coverage? It may be because your code is not testable; it’s legacy, and it was not written to be tested. How can I make it more testable? Where am I wasting time? Where are my developers still writing tests for coverage that Diffblue could produce for them?
And we’re helping you understand risk. You feel comfortable: I’ve got my 65% coverage. But finding where the lack of coverage sits, maybe in one area of your code, shows you where you should go and focus your time.
We also have a couple of other tools that are part of the Diffblue Cover product portfolio.
These tools help you to be even more efficient. For us at Diffblue, it takes an hour and 20 minutes to run our own unit tests. We’ve got really high coverage, but it takes a long time to run those tests. Cover Optimize is a tool within Diffblue Cover that helps you speed up your release process by only running the tests that could possibly find a regression. The way we do this is that when you call your Maven or Gradle test command, we have a plugin that will look at the diff in the code, select only the tests that could exercise the code you’ve changed, and run just those tests. There’s no point in running all the others; they will not test the code you’ve changed. For us, that gets the run down from an hour and 20 minutes to more like 15 minutes. And that’s a big deal, because we’re stopping that developer from context switching again, and we’re saving your compute time as well. We run our tests in a cloud environment, where we’re paying a bill for every unit test we run unnecessarily. So you move faster, you can release more features more often, and we’re lowering your cost of production.
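Conceptually, that selection step works something like this (a toy sketch, not Cover Optimize’s actual plugin logic): map each changed class from the diff to the tests that exercise it, and run only those.

```java
import java.util.*;

public class TestSelector {
    // testsByClass maps each production class to the tests that exercise it,
    // for example built from coverage data gathered on a previous run.
    private final Map<String, Set<String>> testsByClass;

    public TestSelector(Map<String, Set<String>> testsByClass) {
        this.testsByClass = testsByClass;
    }

    // Given the classes touched in this diff, return only the tests worth running.
    public Set<String> selectTests(Collection<String> changedClasses) {
        Set<String> selected = new HashSet<>();
        for (String cls : changedClasses) {
            selected.addAll(testsByClass.getOrDefault(cls, Set.of()));
        }
        return selected; // every other test cannot fail because of this change
    }
}
```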
Another problem, which we briefly looked at with our analytics tool, is code that’s not testable. Here we’ve got some code that’s not testable because a private field is not accessible to the unit test. The counter field is private; a unit test cannot write an assertion against this field. Maybe this is because it’s legacy code.
The developer who wrote it wasn’t thinking about unit tests. We need to make this code testable by writing a getter.
Cover Refactor, again, is part of Diffblue Cover. It will automatically give you a code change that adds those getters, and other things, to make your code more testable. This saves your developers all that time spent working out why they can’t write a test for this code. So in this case, we make the code testable, and then either a developer can write the unit test, or you can leave it for us to do that work.
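For example, a minimal sketch based on the description (class and method names are assumptions, tying back to the Counter sketch from earlier): the private field below cannot be asserted against, and adding a getter makes the behaviour observable to a test.

```java
public class Counter {
    private int counter; // private: a unit test cannot assert against this directly

    public void increment() {
        counter++;
    }

    // The refactoring step adds an accessor so tests can observe the state:
    public int getCounter() {
        return counter;
    }
}
```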
So Diffblue Cover is doing a lot of different things. But most importantly, Diffblue Cover is allowing you to do effective unit testing, because unit testing is going to help you deliver software faster. And being able to deliver software faster is an indicator of an organisation that is healthy, has good code quality, and takes its software testing practices really seriously.
We don’t want developers to be writing unit tests that they don’t need to write. Using AI tools can allow you to automate that process, giving you time back to deliver valuable features to your customers.
Diffblue Cover also provides analytics and other tools to help you speed up, so that you can make your software delivery process more efficient and more effective, ultimately allowing you to reduce regressions and keep your customers happy.
So that’s everything from me today. We’ve covered a lot of different material. I hope you’ve learned something about unit testing and the differences between good and bad unit testing, and you’ve explored a bit about AI and how we use AI to write a year’s worth of code in eight hours. So thank you for your time.