The Quality of Auto-Generated Code

“We know how to test whether or not code is correct (at least up to a certain limit). Given enough unit tests and acceptance tests, we can imagine a system for automatically generating code that is correct. Property-based testing might give us some additional ideas about building test suites robust enough to verify that code works properly. But we don’t have methods to test for code that’s “good.” Imagine asking Copilot to write a function that sorts a list. There are lots of ways to sort. Some are pretty good, quicksort for example. Some of them are awful. But a unit test has no way of telling whether a function is implemented using quicksort, permutation sort (which completes in factorial time), sleep sort, or one of the other strange sorting algorithms that Kevlin has been writing about.

Do we care? Well, we care about O(N log N) behavior versus O(N!). But assuming that we have some way to resolve that issue, if we can specify a program’s behavior precisely enough so that we are highly confident that Copilot will write code that’s correct and tolerably performant, do we care about its aesthetics? Do we care whether it’s readable? 40 years ago, we might have cared about the assembly language code generated by a compiler. But today, we don’t, except for a few increasingly rare corner cases that usually involve device drivers or embedded systems. If I write something in C and compile it with gcc, realistically I’m never going to look at the compiler’s output. I don’t need to understand it…”
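
The point about unit tests can be made concrete with a small sketch. Everything in it is illustrative rather than taken from the quoted article: the two sort functions, the `passes_unit_tests` helper, and the `Counted` wrapper are hypothetical names, and the test cases are invented. Both implementations pass the same correctness-only suite; only an extra measurement, here a count of comparisons, tells them apart.

```python
from itertools import permutations


def quicksort(items):
    """Simple quicksort: O(N log N) on average, first element as pivot."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)


def permutation_sort(items):
    """Factorial-time sort: try orderings until one is in order."""
    for candidate in permutations(items):
        if all(a <= b for a, b in zip(candidate, candidate[1:])):
            return list(candidate)


def passes_unit_tests(sort_fn):
    """A correctness-only suite (hypothetical cases): checks output, not method."""
    cases = [[], [1], [3, 1, 2], [5, 5, 1, 4], [2, 1, 2, 1, 0]]
    return all(sort_fn(case) == sorted(case) for case in cases)


class Counted:
    """Wraps a value and counts every comparison made on it."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value

    def __le__(self, other):
        Counted.comparisons += 1
        return self.value <= other.value

    def __ge__(self, other):
        Counted.comparisons += 1
        return self.value >= other.value


def count_comparisons(sort_fn, values):
    """Run sort_fn on wrapped values and report how many comparisons it made."""
    Counted.comparisons = 0
    sort_fn([Counted(v) for v in values])
    return Counted.comparisons


if __name__ == "__main__":
    print(passes_unit_tests(quicksort), passes_unit_tests(permutation_sort))
    worst_case = list(range(8, 0, -1))  # reversed input, 8 elements
    print(count_comparisons(quicksort, worst_case))
    print(count_comparisons(permutation_sort, worst_case))
```

On an eight-element reversed list, the permutation sort has to walk through every one of the 8! orderings before it finds the sorted one, while the quicksort needs only a few dozen comparisons. A test suite that also asserted a bound on the comparison count would be one way, among others, to address the performance question while still saying nothing about readability.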
