- cross-posted to:
- OpenSource@europe.pub
The title of the article is extraordinarily wrong, which makes it clickbait.
There is no “yes to copilot”
It is only a formalization of what Linus said before: all AI is fine, but a human is ultimately responsible.
" AI agents cannot use the legally binding “Signed-off-by” tag, requiring instead a new “Assisted-by” tag for transparency"
The only mention of Copilot was this:
“developers using Copilot or ChatGPT can’t genuinely guarantee the provenance of what they are submitting”
This remains a problem that the new guidelines don’t resolve. Even using AI as a tool and having a human review it still means the code the LLM output could have come from non-GPL sources.
Yeah, that’s also my question. Partially because I am a former-lawyer-turned-software-developer… but, yeah. How are the kernel maintainers supposed to evaluate whether a particular PR contains non-GPL code?
Granted, this was potentially an issue before LLMs too, but nowhere near the scale it will be now.
(In the interests of full disclosure, my legal career had nothing to do with IP law or software licensing - I did public interest law).
If it’s flagged as “assisted by <LLM>” then it’s easy to identify where that code came from. If a commercial LLM is trained on proprietary code, that’s on the AI company, not on the developer who used the LLM to write code. Unless they can somehow prove that the developer had access to said proprietary code and was able to personally exploit it.
If AI companies are claiming “fair use,” and it holds up in court, then there’s no way in hell open-source developers should be held accountable when closed-source snippets magically appear in AI-assisted code.
Granted, I am not a lawyer, and this is not legal advice. I think it’s better to avoid using AI-written code in general. At most use it to generate boilerplate, and maybe add a layer to security audits (not as a replacement for what’s already being done).
But if an LLM regurgitates closed-source code from its training data, I just can’t see any way how that would be the developer’s fault…
Pretty convenient.
This is how copyleft code gets laundered into closed source programs.
All part of the plan.
How would they launder it? Just declare it their own property because a few lines of code look similar? When there’s no established connection between the developers and anyone who has access to the closed-source code?
That makes no sense. Please tell me that wouldn’t hold up in court.
Please tell me that wouldn’t hold up in court.
First tell us how much money you have. Then we’ll be able to predict whether the courts will find in your favor or not
Sad but true…
First of all, who is going to discover the closed-source use of GPL code and bring a lawsuit anyway?
Second, the LLM ingests the code and then spits it back out, with maybe a few changes. That is how it benefits from copyleft code while stripping the license.
Maybe a human could do the same thing, but it would take much longer.
Wait, did you just move the goalposts? I thought the issue we were talking about was open-source developers who use LLM-generated code and unwittingly commit changes that contain allegedly closed-source snippets from the LLM’s training data.
Now you want to talk about LLM training data that uses open-source code, and then closed-source developers commit changes that contain snippets of GPL code? That’s fine. It’s a change of topic, but we can talk about that too.
Just don’t expect what I said before about the previous topic of discussion to apply to the new topic. If we’re talking about something different now, I get to say different things. That’s how it works.
I was responding specifically to this part
But if an LLM regurgitates closed-source code from its training data, I just can’t see any way how that would be the developer’s fault…
showing what would happen when the LLM regurgitates open-source code into closed-source projects.
Sorry if you didn’t like that.
I believe what they’re referring to is the training of models on open source code, which is then used to generate closed source code.
The break in connection you mention means it’s not legally infringement, but now code derived from open source is closed source. Because of the untested nature of the situation, it’s unclear how it would unfold, likely hinging on how the request was formed.
We have similar precedent with reverse engineering, but the non-sentient tool doing it makes it complicated.
That makes sense. I see the problem with that, and I don’t have a good solution for it. It is a divergence of topic though, as we were discussing open-source programmers using LLMs which are potentially trained on closed-source code.
LLMs trained on open-source code is worth its own discussion, but I don’t see how it fits in this thread. The post isn’t about closed-source programmers using LLMs.
Besides, closed-source code developers could’ve been stealing open-source code all along. They don’t really need AI to do that.
Still, training LLMs on open-source code is a questionable practice for that reason, particularly when it comes to training commercial models on GPL code. But it’s probably hard to prove what code was used in their datasets, since it’s closed-source.
I don’t really see it as a divergence from the topic, since it’s the other side of a developer not being responsible for the code the LLM produces, like you were saying.
In any case, it’s not like conversations can’t drift to adjacent topics.
Besides, closed-source code developers could’ve been stealing open-source code all along. They don’t really need AI to do that.
Yes, but that’s the point of laundering something. Before, if you put FOSS code in your commercial product, a human could be deposed in the lawsuit and make it public, and then there are consequences. Now you can openly do so and point at the LLM.
People don’t launder money so they can spend it, they launder money so they can spend it openly.
Regardless, it wasn’t even my comment, I just understood what they were saying and I’ve already replied way out of proportion to how invested I am in the topic.
Yup.
I would also just point out that this doesn’t change the Linux kernel’s legal exposure to infringing submissions from before the advent of LLMs.
The title of the article is extraordinarily wrong, which makes it clickbait.
It’s the pain in the ass with some of those fucking tech/video/showbiz news outlets, and then the rules in some fora where you cannot make “editorialized” post titles, even though it’s so tempting to correct the awful titling.
the LLM output could have come from non-GPL sources
That’s fundamentally not how LLMs work; it’s not a database of code snippets.
“Derivative works”
Even using AI as a tool and having a human review it still means the code the LLM output could have come from non-GPL sources.
I get why they are passing this by though, since you don’t know the provenance of that Stack Overflow snippet, either.
That’s probably why they say “a human is responsible” not “a human must validate it.” I certainly agree that validation is not always possible. And this problem will get worse in time.
AI is here, another tool to use…the correct way. Very reasonable approach from Torvalds.
I don’t have a problem with LLMs as much as the way people use them. My boss has offloaded all of his thinking to LLMs to the point he can’t fix a sentence in a slide deck without using an LLM.
It’s the people that try to use LLMs for things outside their domain of expertise that really cause the problems.
This is a big point. People need to understand that LLMs are more like a fancy graphing calculator: they are very good and can handle many things, but it’s on you to understand why the calculation is meaningful. At a certain point no one wants to see your long division or factorials. We want the results, and for students and professionals to focus on the concept.
I get the metaphor but it’s not a great one for AI in mathematics especially. A statistical word generator is not going to perform reliable math and woe to anyone who acts otherwise.
I would call it an autistic, sycophantic savant with brain damage. It’s able to perform apparently miraculous feats of memory and creativity, but then it’s unable to tell reality from fiction, or whether even the simplest response is valid, and it will likely lie to make itself seem more competent, to please you.
If you have a use for an assistant like that, then great. But a calculator - simple and cheap and reliable - it definitely is not.
It’s the people that try to use LLMs for things outside their domain of expertise that really cause the problems.
That seems too general. I’m a mobile developer, and sometimes I need a simple script outside my knowledge area. I needed to scrape a website recently, not for anything serious, but to save me time. Claude wrote it and it works. It’s probably trash code, but it works and it helped. But you wouldn’t want me using Claude to do important work outside my specific area of focus either, or I’m sure I’d cause problems.
I’m talking about people that are accountants who now think they can create software. Or engineers who think they can now write legal briefs for court.
I’m also a mobile app dev and at my workplace they’re having non-mobile devs submit code to my codebases totally vibed with no understanding behind it. It’s absolutely causing problems, especially for me, who is one of the only lines of defense keeping stuff even remotely maintainable.
So yes basically you’re right. If people only used it to learn and do initial code review passes and other reasonable things we’d be totally fine. But that’s unfortunately not the reality 🙈
It’s absolutely causing problems, especially for me, who is one of the only lines of defense keeping stuff even remotely maintainable.
The next step is: CEO, look at how good these non-mobile devs are, they’re submitting 10x the commits to the mobile repo compared to boraginoru, our mobile dev! We should fire him and just let the backend devs keep vibe coding it!
Very frustrating for sure. Like any tool, it’s up to humans to know when the tool is useful.
Partly a marketing issue.
Companies keep advertising their new AI’s as destroyers of worlds, and something that’s too dangerous to even release.
As with anything else, the average user will have but the most surface-level understanding of the tool.
Clickbait got me. No mention of “Yes Copilot”, which I assumed was a joke anyway.
👆🏻true
I am the c/fuck_ai person, but at this point I have made peace with the fact that we can’t avoid it. I still don’t want it to do artsy stuff (image gen, video gen), or to be blindly used in critical stuff, because humans are the ones that should be doing that or have constant oversight. I think the team’s logic is correct here, because there is no way to know if the code is from an LLM or a human unless something there screams LLM or the contributor explicitly mentions it. Mandating the latter seems like a reasonable move for now.
I consider myself to be more pro-AI than not, but I’m certainly not a zealot and mostly agree with the take that it shouldn’t be used in artistic pursuits. However, I love using AI to help me create art. It can give great critiques, often good advice on how to improve, and is great for rapid experimentation and prototyping. I actually used it this weekend to see what a D&D mini might look like with different color schemes before painting it. I could have done the same with Gimp, but it would have taken much longer for worse results, and it was ultimately just for a brainstorming session. How do you feel about my AI usage from your perspective? I suppose from an energy-conservation perspective, all of it was bad, but I’m more interested in a less trivial take.
Yes the energy consumption is bad. My main gripe about LLM generated art is that it will not be original. It will use its training data from uncredited artworks to generate it. Art usually is made by humans to express something or convey something in a creative way. LLMs fail at that. What LLMs can actually be helpful at is making learning art more accessible to everyone. Art schools or private art classes can be expensive. This lowers the barrier to entry.
As for you using generated art: it might be really beautiful, but it will be very difficult to maintain that style, and even more difficult to convince anyone that it is your style. The artist doesn’t get much recognition with LLM-generated art. Using it as a critic also seems stupid, because LLMs will always try to give an objective view of it rather than a subjective one. Your art won’t trigger an emotion in it, and it might say it is bad or “do this to make it more understandable”; that’s where you lose as an artist.
My mom likes to paint as a hobby. What she does is search for stuff on Pinterest (which is mostly LLM-generated). She uses it as inspiration to do it in her own style and maybe give it some spin. She keeps all of it for herself.
I’m a writer. I got paid to write on a few things here and there, but mostly there are just huge barriers for people without connections.
I plan on using AI to turn my writing into a visual animated format for people to consume. I don’t much care about the style of art, I just want my work to be seen. I can’t afford to pay for artists. If I could, I would. But at least, this would give me an opportunity to show my work without some execs saying no a hundred times.
When I look at the art for cartoons in the 70s/80s, there is so much crap animation with mistakes and duplications, you would think it’s “a.i. slop.” I understand that these were done overseas, pumped out quickly so quality control was overlooked for speed… but it wasn’t the animation I was interested in, it was the stories and characters.
I still think original artists will continue to exist. A.I. is just another tool. People will get bored of the same old stuff and want originality. I really hope it’ll make our lives better in the long run, but we’re just in the weird middle stage of A.I. crawling before running.
I can’t afford to pay for artists
You can afford LLMs right now because all of the LLM companies are losing money on them. If they decide they want to make a profit, they will raise their prices significantly. So you still end up in the same situation. And you don’t have much control over what an LLM spits out, while with manual animation you have total control, or can at least sit with an actual animator to make it look how you envision it.
I plan on using AI to turn my writing into a visual animated format for people to consume.
What makes you think that people will respond the same way, and in the same numbers, to LLM-generated animation as they would if it were crafted by an artist? I reckon the response will be much lower. I see it on YouTube constantly. I watched a video about a topic, then got recommended something related to it from a different channel. Guess what? The script and the animation were so damn similar, and the shit they were spewing wasn’t even true in the end. Everything both channels made was slop. Sure, they spit out more content than conventional methods, got a few thousand views per video, and made decent money on it. But they aren’t gonna sustain that for long if they want audience retention.
Since then I have been more mindful of which videos I click on, even going to the extent of disabling recommendations and watch history.
I have downloaded my own LLM that can be used on my own computer… So the only cost is electricity since I upgraded my computer before the prices went to shit. Newegg even gave me free RAM with the purchase of a motherboard so I lucked out on that. Storage is not an issue too since I got that back in 2024 knowing Trump would fuck everything up.
And no, people might not respond the same way to my work, but then again I’m not taking any work away from anyone else because then it would not even exist. If you want to fund me and the artist for our work, then okay. Show me the money.
One thing I’ve noticed is that I see many more people complain about slop than slop itself. It’s so annoying at this point that it’s making me go in the opposite direction. Hey everyone, slop here… Microsoft slop here… Use Linux Linux Linux. Slop slop slop. Sloppy joes. It’s like candlestick makers complaining to Nikola Tesla.
Another great example of how AI is just wreaking havoc on people’s brains.
- Wants to show an enticing product to execs, doesn’t want to invest in paying an artist
- Realizes they have to have connections, but doesn’t want to network
- Wants recognition of their hard work, hasn’t sought out a community or collaboration, but states “show me the money”
AI will fix everything for me! Slop doesn’t exist! (Ignores the very article we’re in, any platform algorithm feed, the US president shitposting, all the slop that gets presented here.) Go get ’em Nik, don’t let haters stop your brilliance.
A very extreme takeaway, but okay.
my own LLM that can be used on my own computer
May I ask how many billion parameters it has? Because the paradox here is:
- If it is weak, then you will be getting much, much worse results than even the big models the corpos have (we don’t even know how much, tbh), let alone the quality of an actual artist.
- If you have a respectably powerful model, then your PC might cost thousands of dollars (even ignoring the price hikes), which undercuts the excuse of not being able to pay an actual artist.
I’d still be highly sceptical about pull requests with code created by LLMs. Personally, what I’ve noticed is that the author of such a PR doesn’t even read the code, and I have to go through all the slop.
Ya, I’m finding myself being the bad-code generator at work, as I’m scattered across so many things at the moment due to attrition, and AI can do a lot of the boilerplate work. But it’s such a time and energy sink to fully review what it generates, and I’ve found basic things I missed that others catch, which shows the sloppiness. I usually take pride in my code, but I have no attachment to what’s generated, and that’s exposing issues with trying to scale out using this.
Same. There’s reduction in workforce, pressure to move faster, and no good way to do that without sloppiness. I have never been this down on the industry before; it was never great, but now it’s terrible.
A thought I had the other day: LLMs are supposed to make us more productive, say by 20%. Have you won a 20% pay rise since you adopted them? I haven’t.
Increases in productivity go to the owners, not the workers. Even imaginary increases in productivity.
Just fucking stop using it? Wtf? Tell your boss to pound sand! They’re going to blame you when it goes south anyway, so you might as well stay honest.
Did we all forget about Stack Overflow?
People blindly copy/pasted from there all the time.
A couple of years back I got a PR at work that used a block of code that read a CSV, used some stream method to convert it to binary, and then fed it to pandas to make a dataframe. I don’t remember the exact steps it did, but it was just crazy when pd.read_csv existed.
On a hunch I pasted the code into Google and found an exact match on Stack Overflow for a very weird use case on a very early pandas.
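Reconstructing from memory, the detour looked something like the first half of this sketch (file name hypothetical; the exact steps in the PR may have differed):

```python
import io
import pandas as pd

# Roughly the shape of what the PR did: read the file by hand,
# re-encode it to bytes, wrap it in a stream, then hand that to pandas.
with open("data.csv", "r") as f:
    raw = f.read()
df = pd.read_csv(io.BytesIO(raw.encode("utf-8")))

# What it could have been the whole time:
df = pd.read_csv("data.csv")
```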
I’m lucky, and if people send obvious shit at work I can just cc their manager, but I feel for the volunteers at large FOSS projects, or even paid employees.
Yeah people have not understood their code for centuries now
I suspect the answer will be that the very large requests you frequently see with LLM codegen will just be rejected.
Already I see changes broken up and suggested bit by bit, so I presume the same best practice applies.
Copilot? You mean the AI with terms of service that are in bold and explicit: “for entertainment purposes only”?
Which is why it’s in the title and not the article? EntertainBait?
I suppose GitHub Copilot is meant, which is a different thing.
Different how? Isn’t GitHub owned by Microsoft?
There are like 70 copilots
The hell. How can they expect people to understand? They plan to sell 100 things under the same name and pass it off as one big AI, when it is a hundred different, unrelated things?
They’ve never been good at naming things, but now they seem to be going out of their way to be the worst at naming their software. For instance, they named the successor to the already generically named “Remote Desktop” client “Windows App”.
This one is funny. Go google “Windows App” commands. They just fucked sysadmins.
OK, so there are 70-81 Copilots, and GitHub’s is one of them.
Why is GitHub Copilot a different thing in the context of the reply that was being responded to?
Copilot is the harness, Claude and GPT are the models
Copilot is by far the worst harness of all the major players
Yes, I get that: Copilot is like opencode or Cursor, though perhaps with less general access to models.
There was a reply
Copilot? You mean the AI with terms of service that are in bold and explicit: “for entertainment purposes only”?
followed by
I suppose GitHub Copilot is meant, which is a different thing.
I was asking why GitHub Copilot is different in that context.
Different in that it’s not an AI model, it’s just a tool you can use to run AI models like Claude.
see my reply here
Just legal stuff. Making a huge deal of it is dumb
I disagree.
Legal stuff would be “use at your own risk” or “answers may not be correct”.
This is really strong language.
Bad actors submitting garbage code aren’t going to read the documentation anyway, so the kernel should focus on holding human developers accountable rather than trying to police the software they run on their local machines.
“Guns don’t kill people. People kill people”
Torvalds and the maintainers are acknowledging reality: developers are going to use AI tools to code faster, and trying to ban them is like trying to ban a specific brand of keyboard.
The author should elaborate on how exactly AI is like “a specific brand of keyboard”. Last I checked a keyboard only enters what I type, without hallucinating 50 extra pages. And if AI, a tool that generates content, is like “a specific brand of keyboard”, does that mean my brain is also a “specific brand of keyboard”?
I get their point. If you want to create good code by having AI create bad code and then spending twice the time to fix it, feel free to do that. But I’m in favor of a complete ban.
The keyboard thing is sort of a parable: it is as difficult to determine whether code was generated in part by AI as it is to determine what keyboard was used to create it.
The (very obvious) point is that this cannot be enforced. So might as well deal with it upfront.
AI is a useful tool for coding as long as it’s being used properly. The problem isn’t the tool, the problem is the companies who scraped the entire internet, trained LLM models, and then put them behind paywalls with no options to download the weights so that they could be self-hosted. Brazen, unaccountable profiteering off of the goodwill of many open source projects without giving anything back.
If LLMs were community-trained on available, open-source code with weights freely available for anyone to host there wouldn’t be nearly as much animosity against the tech itself. The enemy isn’t the tool, but the ones who built the tool at the expense of everyone and are hogging all the benefits.
There are hundreds of such LLMs with published training sets and weights available in places like HuggingFace. Lots of people run their own LLMs locally; it’s not hard if you have enough VRAM and a bit of patience to wait longer for each reply.
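For anyone curious, here is a minimal sketch of what running one locally can look like, assuming the llama-cpp-python bindings and a GGUF weights file already downloaded from HuggingFace (the model path and prompt below are placeholders):

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Load open weights from a local GGUF file (placeholder path; quantized
# files like this are what you would download from HuggingFace).
llm = Llama(model_path="./models/open-model.Q4_K_M.gguf", n_ctx=4096)

# Run a completion entirely on your own machine: no API, no paywall.
out = llm("Write a C function that reverses a string in place.", max_tokens=256)
print(out["choices"][0]["text"])
```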
Eh, trust me, anti-AI people don’t think this much about it.
Also, there are a lot of open weight models out there that are pretty good
Out of curiosity how much code have you contributed to the Linux kernel?
You’re the one comparing AI and guns/killing people, and then saying their metaphorical comparison isn’t accurate? Lol
Last I checked a keyboard only enters what I type
I’ve had (broken) keyboard “hallucinate” extra keystrokes before, because of stuck keys. Or ignore keypresses. But yeah, that means the keyboard is broken.
Wooting and Razer had a macro function that allowed Counter-Strike players to set up a function to always get a counter-strafe. Valve decided that was a bridge too far and banned “hardware-level” exploits.
So, Valve once banned a keyboard.
Torvalds and the maintainers are acknowledging reality: developers are going to use AI tools to code faster, and trying to ban them is like trying to ban a specific brand of keyboard.
The author should elaborate on how exactly AI is like “a specific brand of keyboard”. Last I checked a keyboard only enters what I type, without hallucinating 50 extra pages. And if AI, a tool that generates content, is like “a specific brand of keyboard”, does that mean my brain is also a “specific brand of keyboard”?
It’s about the heritage of code not being visible from the surface. I don’t know about your brain.
Last I checked a keyboard only enters what I type
I’m assuming the author is talking about mobile keyboards, which have autocomplete and autocorrect.
“Yes to Copilot, no to AI slop”
Pick One
Where does slop start? If you use autocomplete and it is just adding a semicolon or some braces, is it slop? Is producing, character by character, what you would have written yourself slop?
How about using it for debugging?
If you would have written it yourself the same way, why not write it yourself? (And there was autocomplete before the age of LLMs, anyway.)
The big problems start with situations where it doesn’t match what you would have written, but rather what somebody else has written, character by character.
You don’t need AI to autocomplete code. We’ve had autocomplete for over 30 years.
To me, it starts at anything beyond correcting spelling for individual words or adding punctuation. I don’t even want it suggesting quick reply phrases.
Is producing, character by character, what you would have written yourself slop?
Yes.
There’s the rub. When establishing laws and guidelines, every term must be explicitly defined. Lack of specificity in these definitions is where bad-faith actors hide their misdeeds by technically obeying the letter of the law due to its vagueness, while flagrantly violating its spirit.
It’s why today, in the USA, corporations are legally people when it’s convenient and not when it’s not, and the expenditure of money is government-protected “free speech”.
I mean, I don’t use Copilot but a self-hosted Claude at work for debugging and creating templates. I still run through and test it. I’m only doing Crossplane, Kyverno, and Kubernetes infra things, though, and I started without it, so I have an understanding. But now I’m running someone’s Crossplane composition written in Go, and when I asked him about an error he just said “get the AI to fix it”, which was worrying since his last day is next week.
It’s only slop if you accept slop. What I mean is that it can and does generate perfectly fine code. It also generates code that is OK but needs a human touch. It also generates verbose garbage.
It’s only slop if you approve the slop. It’s perfectly fine to let it generate the boilerplate of what you want, and tweak it. If it’s prompted well enough, you get less slop.
Ultimately I am with Linus on this one. The genie is out of the bottle. Use it responsibly.
Ah, the solution that recognizes there’s no way to eliminate AI from the supply chain after it’s already been introduced.
You make it sound as if there was another choice if just people had better principles. Pray tell us, what would you have done, now. Not in the past, now.
That wasn’t my intent. This is me saying, “of course that’s what they’re going to do because there’s nothing else they can do.”
I completely misunderstood you. I’m sorry.
You’re agreeing with the comment you replied to. Why the fuck are you trying to be so smug???
Linux kernel being written by Microsoft’s AI.
Microsoft needs to try to ruin Linux somehow; it can’t just hurt Windows 11 with AI slop code, it needs to expand its efforts to other systems.
which is trained on free and open source code
That will definitely not introduce some weird things when it starts feeding on itself.
I agree. If AI becomes outlawed, it will simply be used without other people knowing about it.
This approach, at least, means that people will label AI-generated code as such.
Maybe. There’s still strong disapproval around it. I can imagine many will still hide it.
Maintainers’ only responsibility is to ensure quality; they shouldn’t have to check for rogue AI submissions.
Tho I still miss consistent fucking weather so year of the netbsd?
Ensuring you don’t approve garbage, whether human- or AI-generated, is part of quality.
Ultimately, the policy legally anchors every single line of AI-generated code
How would that even be possible? Given the state of things:
https://dl.acm.org/doi/10.1145/3543507.3583199
Our results suggest that […] three types of plagiarism widely exist in LMs beyond memorization, […] Given that a majority of LMs’ training data is scraped from the Web without informing content owners, their reiteration of words, phrases, and even core ideas from training sets into generated texts has ethical implications. Their patterns are likely to exacerbate as both the size of LMs and their training data increase, […] Plagiarized content can also contain individuals’ personal and sensitive information.
https://www.theatlantic.com/technology/2026/01/ai-memorization-research/685552/
Four popular large language models—OpenAI’s GPT, Anthropic’s Claude, Google’s Gemini, and xAI’s Grok—have stored large portions of some of the books they’ve been trained on, and can reproduce long excerpts from those books. […] This phenomenon has been called “memorization,” and AI companies have long denied that it happens on a large scale. […] The Stanford study proves that there are such copies in AI models, and it is just the latest of several studies to do so.
The court confirmed that training large language models will generally fall within the scope of application of the text and data mining barriers, […] the court found that the reproduction of the disputed song lyrics in the models does not constitute text and data mining, as text and data mining aims at the evaluation of information such as abstract syntactic regulations, common terms and semantic relationships, whereas the memorisation of the song lyrics at issue exceeds such an evaluation and is therefore not mere text and data mining
https://www.sciencedirect.com/science/article/pii/S2949719123000213#b7
In this work we explored the relationship between discourse quality and memorization for LLMs. We found that the models that consistently output the highest-quality text are also the ones that have the highest memorization rate.
https://arxiv.org/abs/2601.02671
recent work shows that substantial amounts of copyrighted text can be extracted from open-weight models. However, it remains an open question if similar extraction is feasible for production LLMs, given the safety measures […]. We investigate this question […] our work highlights that, even with model- and system-level safeguards, extraction of (in-copyright) training data remains a risk for production LLMs.
How does merely tagging the apparently stolen content make it less problematic, given I’m guessing it still won’t have any attribution of the actual source (which for all we know, might often even be GPL incompatible)?
But I’m not a lawyer, so I guess what do I know. But even from a non-legal angle, what is this road the Linux Foundation seems to embrace of just ignoring the license of projects? Why even have the kernel be GPL then, rather than CC0?
I don’t get it. And the article calling this “pragmatism” seems absurd to me.
That’s not really how copyright law works.
Would you also say that to this lawyer reviewing Co-Pilot in 2026? https://github.com/mastodon/mastodon/issues/38072#issuecomment-4105681567
Disclaimer: this isn’t legal advice.
LLMs themselves being products of copyrighted work isn’t the legal question at issue; it’s the downstream use of that product.
If I use a copyright-infringing work as a part of a new creative work, does that new work infringe copyright by default? Or does the new work need to be judged itself as to the question of infringing a copyrighted work?
And if it is judged as infringing, who is responsible for the damage done? Can I pass the damages back to the original infringing work? Or should I be held responsible for not performing due diligence?
It is though. If you commit copyrighted code that was output by an LLM, you do have to follow the license of that code. If you don’t, that’s copyright infringement.
Even if the code isn’t copyrighted code, then it’s public domain code that can’t be copyrighted:
https://sciactive.com/human-contribution-policy/#More-Information
The Linux kernel is under a copyleft license; it isn’t being copyrighted.
But the policy being discussed isn’t allowing the use of copyrighted code; they’re simply requiring that any code submitted by AI be tagged as such, so that the human using the agent is ultimately responsible for any infringing code, instead of allowing that code to go undisclosed (and even “certified” by the dev submitting it, even if they didn’t write or review it themselves).
Submissions are still subject to copyright law; the law just doesn’t function the way you or OP are suggesting.
Copyleft doesn’t mean it’s not copyrighted. Copyleft is not a legal term. “Copyleft” licenses are enforced through copyright ownership.
Did you read the quotes from the copyright office I linked to? I am going to go ahead and trust the copyright office over you on issues of copyrightability.
Even if this were true, it would only mean that the GNU license is unenforceable, not that the Linux kernel itself is infringing copyright
Unless the code the AI generated is a copy of copyrighted code, of course. Then it would be copyright infringement.
I can cause the AI to spit out code that I own the copyright to, because it was trained on my code too. If someone used that code without including attribution to me (the requirement of the license I release my code under), that would be copyright infringement. Do you understand what I mean?
That would be true even if they didn’t use AI to reproduce it.
The problem being addressed by the Linux foundation isn’t the use of copyrighted work in developer contribution, it’s the assumption that the code was authored by them at all just because it’s submitted in their name and tagged as verified.
Does that make sense?
they’re simply requiring that any code submitted by AI be tagged as such, so that the human using the agent is ultimately responsible for any infringing code, instead of allowing that code to go undisclosed
This makes zero sense, because the article says that this new tagging will replace the legally binding “Signed-off-by” tag. Wouldn’t that old tag already put that responsibility on the person submitting the code?
Also - what will holding the submitter responsible even achieve? If an infringement is detected, the Linux maintainers won’t be able to just pass all the blame to the submitter of that code while keeping it in the codebase - they’ll have to remove the infringing code regardless of who’s responsible for putting it in.
Kinda, but they’re specifically saying that the AI agent cannot itself tag the contribution with the sign-off; like, someone using Claude Code to submit PRs on their behalf. The developer must add the tag themselves, indicating that they at least reviewed and submitted it themselves, and that it wasn’t just an agent going off-prompt or some other shit and submitting it without the developer’s knowledge. This is saying “the dog ate my homework” is not a valid excuse.
The developer can use AI, but they must review the code themselves, and the agent can’t “sign-off” on the code for them.
Also - what will holding the submitter responsible even achieve?
What does holding any individual responsible on a development team do? The Linux project is still responsible for anything they put out in the kernel just like any other project, but individual developers can be removed from the contributing team if they break the rules and put it at risk.
The new rule simply makes the expectations clear.
There are so many reasons not to include any AI generated code.
any resulting bugs or security flaws firmly onto the shoulders of the human submitting it.
Watch Americans and their companies pull some mad gymnastics on apportioning blame for this.
Well yea, it’s the human submitting the code, and using a tool known to be imperfect
Your comment is pretty dumb
At this point it’s 23 to −5 in opinions on that dumb comment, sunshine.
Because obviously the majority is always right.
No point getting upset about this, it’s inevitable. So many FOSS programmers work thanklessly for hours, and now there’s a tool to take loads of that work away; of course they’re going to use it. I know loads of people complain about it, but used responsibly it can take care of so much of the mundane work. I used to spend 10% of my time writing code then 90% debugging it. If I do that 10% and then give it to Claude to go over, I find it just works.
“I used to spend 10% of my time writing code then 90% debugging it”
Skill issue
(Edited to add context)
This is a bad take, which dismisses the amount of labor involved in maintaining widely used software projects.
I was referring (mostly jokingly) to his spending 90% of his time debugging. But you do you.
Time issue
Whatever it is, it doesn’t mean LLMs are a sane or “inevitable” answer.
How is it a time issue if you have percentages?
I mean, OSS projects are not getting the support they need and have to keep up with security, bugs, and features, so using LLMs to speed up development will help.
but used responsibly
That’s like the most incredibly hard part of all of this. Everything is aligned so that you don’t use it responsibly. And it’s really hard to guard against this.
Just a few days ago, I was pairing with a coworker and he was using Claude to do a bunch of stuff. He didn’t check any of it. I thought he was gonna check stuff before pushing stuff… And nope! I said, “Wait, shouldn’t we review the changes to make sure they’re correct?” And he said, “Nah, it’s probably fine. I trust it. Plus, even if it’s wrong, we’ll just blame the AI and we can just fix it later.”
…
Yes, checking the work would have negated all of the “time saved” and he was being a lazy fuck.
People who don’t like coding or engineering use this and they are not interested in using this responsibly.
That’s valid for workers in a capitalist system or for capitalists trying to scam people. But why would someone sign their real name to unchecked AI slop for an open source project? It would risk ruining their reputation for little personal gain.
See also: https://youtu.be/xcq5XYkFJfY?t=705
for why using it “responsibly” is super hard, even if you’re an expert. We’re hardwired to take mental shortcuts, so we might not even realize we’re using heuristics or falling for cognitive biases when fact checking the AI.