This is a very informative article, and a good starting point for understanding the complexities and nuances of integrating these tools into large software projects.
As one commenter notes, we seem to be heading towards a “don't ask, don't tell policy”. I do find that unfortunate, because there is great potential in sharing solutions and ideas more broadly among experienced developers.
I used Claude Code to review a kernel driver patch last week. It found an issue that was staring me in the face, indeed one I would’ve expected the compiler to flag.
This is the fix: https://github.com/PADL/linux/commit/b83b9619eecc02c5e95a1d3...
The bug commit was squashed. I don’t really know what I need to prove here, except that Claude Code helped identify an issue and fix it.
> The copyright status of LLM-generated code is of concern to many developers; if LLM-generated code ends up being subject to somebody's copyright claim, accepting it into the kernel could set the project up for a future SCO-lawsuit scenario.
There is no reason why I can't sue every single developer who has ever used an LLM and published and/or distributed that code, for AGPLv3 violations. They cannot prove to the court that their model did not use AGPLv3 code, as they did not make the model. I can also, independently, sue the creator of the model, for any model that was made outside of China.
No wonder the model makers don't want to disclose who they pirated content from.
If their model reproduces enough of an AGPLv3 codebase near verbatim, and it cannot simply be handwaved away as a phonebook situation, then it is a foregone conclusion that they either ingested the codebase directly, or ingested it indirectly through somebody or something that did (which dooms purely synthetic models, like what Phi does).
I imagine a lot of lawyers are salivating over the chance of bankrupting big tech.
The onus is on you to prove that the code was reproduced and is used by the entity you're claiming violated copyright. Otherwise literally all tools capable of reproduction — printing presses, tape recorders, microphones, cameras, etc — would pose existential copyright risks for everyone who owns one. The tool having the capacity for reproduction doesn't mean you can blindly sue everyone who uses it: you have to show they actually violated copyright law. If the code it generated wasn't a reproduction of the code you have the IP rights for, you don't have a case.
TL;DR: you have not discovered an infinite money glitch in the legal system.
Reviewer burden is going to worsen. Luckily, LLMs are good at writing code that checks code quality.
This is not running a prompt, which is probabilistic and so guarantees nothing. This is having an agent create a self-contained check that becomes part of the codebase and runs in milliseconds. It could do anything: walk the AST of the code looking for one anti-pattern, check code conventions... a linter on steroids.
Building and refining a library of such checks relieves maintainers’ burden and lets submitters check their own code.
I’m not just saying it; it’s worked super well for me. I am always adding checks to my codebase. They enforce architecture (“routes are banned from directly importing the DB, they must go via the service layer”, “no new dependencies”), and they inspect frontend code to find all the fetch calls and hrefs, then flag dead API routes and unlinked pages. With informative error messages, agents can tell when they’ve half-finished (or half-assed) an implementation. My favorite prompt is “keep going until the checks pass”.
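To make that concrete, here is a minimal sketch of the kind of check being described, written in Python and assuming a hypothetical project layout (route handlers under app/routes/, a database module app.db, a service layer app.services). None of these names come from the thread; they are placeholders.

```python
# Hypothetical self-contained architecture check: route modules must not
# import the database layer directly, they have to go through the service
# layer. The app/routes/ path and the app.db / app.services module names
# are assumptions made for this example.
import ast
import pathlib
import sys

ROUTES_DIR = pathlib.Path("app/routes")  # where route handlers are assumed to live
BANNED_MODULE = "app.db"                  # direct DB access is the anti-pattern

def banned_db_imports(path: pathlib.Path) -> list[str]:
    """Walk the AST of one route module and report any direct DB imports."""
    tree = ast.parse(path.read_text(), filename=str(path))
    errors = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if name == BANNED_MODULE or name.startswith(BANNED_MODULE + "."):
                errors.append(
                    f"{path}:{node.lineno}: route imports {name!r} directly; "
                    "use the service layer (app.services) instead"
                )
    return errors

def main() -> int:
    problems = []
    for path in sorted(ROUTES_DIR.rglob("*.py")):
        problems.extend(banned_db_imports(path))
    for problem in problems:
        print(problem)
    return 1 if problems else 0

if __name__ == "__main__":
    sys.exit(main())
```

Wired into CI, a check like this fails fast with an actionable message, which is exactly what lets an agent follow a prompt like “keep going until the checks pass”.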
What kernel reviewers do is complex, but I wonder how much of it can be turned into lore in this way, refined over time to make kernel development even more foolproof as it becomes more complex.
I'm glad open source projects will profit from the huge productivity gains AI promises. I haven't noticed them in any of the software I use, but any day now.
This is a great article discussing the pros and cons of adopting LLM-generated patches in critical projects such as the kernel. Some of the comments add nuanced observations as well; for example, the top comment gives an accurate assessment of the strengths and limitations of LLMs:
> LLMs are particularly effective for language-related tasks - obviously. For example, they can proof-read text, generate high-quality commit messages, or at least provide solid drafts.
> LLMs are not so strong for programming, especially when it comes to creating something totally new. They usually need very limited and specific context to work well.
The big takeaway, regardless of who (or what) generated the code: "...it is the human behind the patch who will ultimately be responsible for its contents." This implies they *need* to understand what the code does and ensure no regressions are introduced.
> As one commenter notes, we seem to be heading towards a “don't ask, don't tell policy”.
Ain't that anticipatory obedience?