Yeah that makes sense. I know people are concerned about recycling AI output into training inputs, but I don’t know that I’m entirely convinced that’s damning.
GIGO.
Yeah, I agree on garbage in, garbage out, but I don’t know that that is what will happen. If I create a library and then use GPT to generate documentation for it, I’m going to review, edit, and enrich that documentation as the owner of the library. I think a great many people are painting this cycle in black and white, implying that any involvement from AI is automatically garbage, and that’s fallacious and inaccurate.
Yes, but for every one like you, there’s at least one who doesn’t and just trusts it to be accurate, or doesn’t proofread it well enough and misses errors. It may not be immediate, but that will have a downward effect on quality over time, which likely then becomes a feedback loop.
No matter how good your photocopier is, a copy of a copy is worse, and it gets worse every time you do it.
The theory behind this is that no ML model is perfect. They will always make some errors. So if the errors they make are included in the training data, then future ML models will learn to repeat the old models’ errors and add new ones of their own.
Over time, ML models will get worse and worse because the quality of the training data will get worse. It’s like a game of Chinese whispers.
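To put rough numbers on that feedback loop, here is a toy sketch. It is not a real training setup: the function name `simulate_generations` and the parameters `base_error` and `unreviewed` are made up for illustration, and the update rule is just one simple assumption about how errors might carry over.

```python
def simulate_generations(base_error=0.05, unreviewed=0.5, generations=10):
    """Toy model of the share of wrong training data after each generation.

    Assumptions (hypothetical, chosen only for illustration):
    - a model reproduces every error already present in its training data
    - it adds fresh errors at rate `base_error` on the part that was correct
    - humans fix a share (1 - unreviewed) of the errors before the output
      is fed back in as training data
    """
    error = 0.0
    history = []
    for _ in range(generations):
        # inherited errors plus new errors on the still-correct portion
        produced = error + (1.0 - error) * base_error
        # only the unreviewed share of errors survives into the next round
        error = produced * unreviewed
        history.append(error)
    return history


# Nobody reviews anything: errors compound every generation.
no_review = simulate_generations(unreviewed=1.0)

# Half of the errors get caught: the error share levels off instead.
half_review = simulate_generations(unreviewed=0.5)
```

Under these assumptions, the unreviewed case keeps climbing toward 100% wrong, while the reviewed case settles at a fixed level. So in this toy model, whether it turns into a photocopy-of-a-photocopy situation depends a lot on how much of the output actually gets checked.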
I think the biggest issue arises in the fact that most new creations and new ideas come from a place of necessity. Maybe someone doesn’t quite know how to do something, so they develop a new take on it. AI removes such instances from the equation and gives you a cookie cutter solution based on code it’s seen before, stifling creativity.
The other issue is garbage in, garbage out. If people just assume that AI code works flawlessly and don’t review it, AI will be reinforced on bad habits.
If AI could actually produce significantly novel code and actually “know” what its code is doing, it would be a different story, but it mostly just rehashes things with maybe some small variations, not all of which work out of the box.
It may be fine for code, because malformed code won’t compile/run.
It’s extremely bad for image generators, where subtle inconsistencies that people don’t notice will amplify.
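On the compile/run point, a quick sketch of what that filter does and doesn't catch. The snippets and the helper name `parses` are invented for the example; `ast.parse` is the standard-library parser, and it only checks syntax.

```python
import ast


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# Hypothetical generated snippets, invented for this example.
snippets = {
    "malformed": "def mean(xs) return sum(xs) / len(xs)",
    "wrong_but_valid": "def mean(xs):\n    return sum(xs) / (len(xs) + 1)\n",
    "correct": "def mean(xs):\n    return sum(xs) / len(xs)\n",
}

# Keep only the snippets that survive the syntax check.
kept = [name for name, source in snippets.items() if parses(source)]
```

The malformed snippet is rejected, but the one with the off-by-one bug passes the check just as easily as the correct one. So a syntax filter of this kind only screens out the obvious breakage, and subtle logic errors can still end up in the data, much like the subtle image inconsistencies.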