SE Radio 633: Itamar Friedman on Automated Testing with Generative AI
Itamar Friedman, the CEO and co-founder of CodiumAI, speaks with host Gregory M. Kapfhammer about how to use generative AI techniques to support automated software testing. Their discussion centers around the design and use of Cover-Agent, an open-source implementation of the automated test augmentation tool described in the Foundations of Software Engineering (FSE) paper entitled "Automated Unit Test Improvement using Large Language Models at Meta" by Alshahwan et al. The episode explores how large-language models (LLMs) can aid testers by automatically generating test cases that increase the code coverage of an existing testing suite. They also investigate other automated testing topics, including how Cover-Agent compares to different LLM-based tools and the strengths and weaknesses of using LLM-based approaches in software testing.
Show Notes
Related Episodes
- SE Radio 603: Rishi Singh on Using GenAI for Test Code Generation
- SE Radio 533: Eddie Aftandilian on Github Copilot
- SE Radio 324: Marc Hoffmann on Code Test Coverage Analysis and Tools
- SE Radio 589: Zac Hatfield-Dodds on Property-Based Testing in Python
Research Papers
Blog Posts
Software Tools
Transcript
Transcript brought to you by IEEE Software magazine and IEEE Computer Society. This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number.
Gregory Kapfhammer 00:00:18 Welcome to Software Engineering Radio. I’m your host Gregory Kapfhammer. Today’s guest is Itamar Friedman. He’s the CEO and Co-founder of CodiumAI. Itamar and his team develop a whole bunch of cool automated testing tools, and they use generative AI techniques like large language models. Itamar, welcome to the show.
Itamar Friedman 00:00:40 Hey, very nice being here. Thank you so much for inviting me.
Gregory Kapfhammer 00:00:44 We’re glad to talk to you today about an open-source tool that you’ve developed at CodiumAI called Cover Agent. Cover Agent is based on a paper that was published at the Foundations of Software Engineering Conference. Let’s start by exploring some of the features that Cover Agent provides and how it works. Are you ready to dive in?
Itamar Friedman 00:01:03 Yeah, of course.
Gregory Kapfhammer 00:01:04 Okay, so at the start I’m going to read a quotation that comes from the documentation for the Cover Agent tool and then we’re going to use this sentence to explore more about how the tool works. So the sentence is as follows, it says “CodiumAI Cover Agent aims to efficiently increase code coverage by automatically generating qualified tests to enhance existing test suites.” So let’s start at a high level. Can you explain what are the testing tasks that Cover Agent automatically performs?
Itamar Friedman 00:01:37 Yeah, so I guess you’re mainly referring to types of testing: is it unit testing, component testing, integration testing, etc. So basically, I think Cover Agent can try to generate all of these, but the sweet spot that we saw is mostly around component testing. If you provide an initial few tests that cover one or more components and you run Cover Agent, it’ll try to generate many more. Actually, it’ll try to generate as many as it can until reaching your criteria, for example until a certain threshold is met, or until a certain number of iterations that you define, because you don’t want to run more than a few iterations since there’s also a cost that comes with it. Basically, what Cover Agent does is take the first few tests that were given as part of the test suite and exploit these as inspiration to generate more. So if you come with integration tests or end-to-end tests of different types, and it could be end-to-end tests in Playwright or Cypress, etc., it’ll try to mimic that. If you come with a few simple component tests, it’ll try to mimic that, etc. So you can aim to generate any type of testing. Having said that, it works best for component testing at the moment, with the current implementation (July 2024).
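The iterative loop Itamar describes, generate candidate tests and keep only the ones that raise coverage, stopping at a coverage threshold or an iteration budget, can be sketched roughly as follows. This is a simplified illustration, not Cover Agent’s actual code, and `generate_candidates` and `run_and_measure` are hypothetical stand-ins for the LLM call and the test-runner-plus-coverage step:

```python
# Sketch of a coverage-driven test-augmentation loop.
# generate_candidates and run_and_measure are hypothetical stand-ins
# for the LLM generation step and the "run suite, measure coverage" step.

def augment_tests(suite, generate_candidates, run_and_measure,
                  desired_coverage=0.9, max_iterations=10):
    """Iteratively add generated tests until the coverage threshold
    is met or the iteration budget is exhausted."""
    coverage = run_and_measure(suite)
    for _ in range(max_iterations):
        if coverage >= desired_coverage:
            break
        for test in generate_candidates(suite):
            trial = suite + [test]
            new_coverage = run_and_measure(trial)
            # Keep a candidate only if it actually increases coverage
            # (a failing test would not raise measured coverage here).
            if new_coverage > coverage:
                suite, coverage = trial, new_coverage
    return suite, coverage
```

The iteration cap matters because, as noted above, each round costs both compute time and LLM calls.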
Gregory Kapfhammer 00:03:02 So you said it works best with component testing. When you say component, is that equivalent to talking about unit testing?
Itamar Friedman 00:03:09 Yes. The reason I prefer the term “component” is because I think it makes it clear that you can do more than four lines of code, you know, more than what you expect from a simple method or something like that, if you’re doing clean code, of course. If you have, for example, a class that could have 500 lines of code with, just as an example, five methods, actually I think you will be happier, more satisfied, running Cover Agent on that, because from what we see technically, and from experimenting with it empirically, it’s within the limits of Cover Agent to generate good tests for that kind of setup. And I think that provides more value. Of course, it’s a matter of taste and also situation. Some people have a better appetite for unit tests of specific methods because, for different reasons, they want them to be fast, they want to verify that each specific method is being highly tested. Others would prefer going one level above, which is a component. In many cases you gain a lot of the properties of unit tests; they’re still fast, they’re still independent and things like that, but you have more opportunities for interesting, you know, behaviors that capture more of what the software is supposed to do. So yes, I believe that component testing is the way to go, and Cover Agent covers that quite nicely. So that’s why it’s our recommendation, but you can go with unit tests as well, considering our terminology, of course.
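To make the unit-versus-component distinction concrete, a component-level test exercises a small class through several methods together, checking a behavior, rather than probing one method in isolation. The class below is a made-up example, not from the episode or from Cover Agent:

```python
# A made-up component: a counter that saturates at a limit.
class BoundedCounter:
    def __init__(self, limit):
        self.limit = limit
        self.value = 0

    def increment(self):
        if self.value < self.limit:
            self.value += 1
        return self.value

    def reset(self):
        self.value = 0

# A component-level test drives several methods in one scenario,
# capturing the behavior (saturate, then reset) rather than
# testing increment() and reset() separately.
def test_counter_saturates_and_resets():
    c = BoundedCounter(limit=2)
    assert c.increment() == 1
    assert c.increment() == 2
    assert c.increment() == 2  # saturates at the limit
    c.reset()
    assert c.increment() == 1
```

Such a test keeps the unit-test properties Itamar mentions (fast, independent) while covering an interaction between methods.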
Gregory Kapfhammer 00:04:42 Okay, thanks for that response. It was really helpful. Now just a moment ago you mentioned this idea of increasing code coverage. Can you talk briefly what is code coverage in the context of Cover Agent, and why is it a good idea to increase code coverage?
Itamar Friedman 00:04:57 Right. So code coverage, in the most simple terminology: consider a test suite, and consider a set of files, methods, classes, packages you want to cover. You want to check how many of the lines, and we’ll come back to this notion, how many of these lines are being covered by the test suite. So, just an example: let’s say you have a test suite including five tests, and you have two files, with each one having two components, and each component has 100 lines. Okay? So, 400 lines. You want to check the coverage, and the coverage report will tell you how many of these 400 lines are being covered by these five tests. It could also give you a breakdown, in most cases, depending on the package that you’re using to create the coverage report. In most cases it would tell you, for each method or each package or each file, the coverage of that specific component, that specific item.
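The arithmetic behind a line-coverage report is simply the union of lines executed by all tests, divided by the total lines under measurement. A minimal sketch, with invented line sets rather than output from a real coverage tool:

```python
# Minimal line-coverage arithmetic: covered lines / total lines.
# The line-number sets below are invented for illustration.

def line_coverage(total_lines, covered_by_tests):
    """total_lines: set of executable line numbers under measurement.
    covered_by_tests: one set of executed line numbers per test."""
    covered = set().union(*covered_by_tests) & total_lines
    return len(covered) / len(total_lines)

# Two files of 100 lines each (200 total), five tests, each
# executing some range of lines (overlaps are counted once).
total = set(range(1, 201))
tests = [set(range(1, 41)), set(range(30, 71)),
         set(range(100, 131)), set(range(120, 161)),
         set(range(1, 21))]
print(f"Suite line coverage: {line_coverage(total, tests):.0%}")
```

Real tools such as coverage.py or JaCoCo gather the executed-line sets by instrumenting the code, and then report this same ratio per file, per method, and overall.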
[...]