• teft@piefed.social · 53 points · 2 months ago

      I’ve found that the people who understand these “agents” the least are the ones who are promoting them the most.

      • Pommes_für_dein_Balg@feddit.org · 42 points · edited · 2 months ago

        And everyone promotes them for tasks they aren’t experts in.
        Managers think they could replace devs, but never a manager.
        Devs think they could replace management, but never a senior developer.
        Storyboard artists think they can write screenplays. Screenplay writers think they can draw storyboards. Etc.
        As an expert, you know how shit AI is in your own field, but surely those other jobs are simple enough to be replaced.

        • baines@lemmy.cafe · 11 points · 2 months ago

          90% of my experience with management is having none at all would be a net benefit

          why would we want to add ai to that mix

            • baines@lemmy.cafe · 7 points · 2 months ago

              problem is ai would be another layer of separation from decisions and consequences

      • panda_abyss@lemmy.ca · 7 points · 2 months ago

        This.

        They’re incredibly useful, but you have to treat their output as disposable and untrustworthy. They’re reinforcement-trained to generate a solution regardless of whether it’s right, because it’s impossible to evaluate at scale whether those solutions are correct.

        If you’re writing some core code: you can use an agent to review it, refactor parts, stub out the original version, infill methods, and run your test/benchmark scripts.

        but you still have to manage it, edit it, make sure it’s not recreating the same code in 6 existing modules, generating faked tests, etc.


        As an example this week on my side project I had Claude Opus write some benchmarks. Total throwaway code.

        It actually took my input files, generated a static binary payload from it using numpy, and loaded that into my app’s memory (on its own that’s really cool), then it ran my one function and declared the whole system 100x faster than comparable libraries that parse the original data. Not a fair test at all, nor was it a useful test.
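        The shape of that gamed benchmark can be sketched in a few lines (hypothetical, illustrative code; JSON parsing stands in for the original data format and the binary payload):

```python
import json
import time

# The original input data: a text blob the library is supposed to parse.
raw = json.dumps(list(range(100_000)))

def baseline():
    # What the comparable libraries were measured doing: parse + compute.
    return sum(json.loads(raw))

# The "optimization": convert the input once, outside the timed region.
precomputed = json.loads(raw)

def gamed():
    # The benchmarked path skips parsing entirely and only computes.
    return sum(precomputed)

def bench(fn, n=20):
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return time.perf_counter() - start

# Both produce the same answer, so the benchmark *looks* legitimate,
# but the timings compare parse+compute against compute alone.
assert baseline() == gamed()
```

        The huge speedup is real but meaningless: the expensive step was moved outside the timer, so the numbers say nothing about the library being "beaten".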

        You cannot trust this software.

        You’ll see these gamed metrics, gamed tests, duplicate parallel implementations, etc.

        • baines@lemmy.cafe · 9 points · 2 months ago

          spend more time fixing slop than just doing it manually and correctly the first time

  • apfelwoiSchoppen@lemmy.world · 68 points · 2 months ago

    My sister-in-law is a software engineer and project manager. This isn’t groundbreaking news or anything, but she said that her engineers are using generative AI like this. The problem is that it creates exceedingly inefficient, bloated code that barely works. En masse it will bog down systems as the inefficiencies compound.

    It’s fine. Everything is fine.

  • Inucune@lemmy.world · 47 points · 2 months ago

    Stop hiring 20 managers. Hire 1 manager and have them in meetings all day so real work can be done.

  • sasquash@sopuli.xyz · 47 points · 2 months ago

    “The new skill isn’t typing faster”.

    Since I added a second keyboard I am programming twice as fast and don’t even need AI!

  • Aganim@lemmy.world · 33 points · edited · 2 months ago

    One of our devs came to me with an LLM rewrite of some parts of our automation. Even at first glance you could see a lot was missed: the refactor simply wasn’t going to work in that state, and critical migration logic just wasn’t present.

    I binned the branch and did the refactor myself, as it would have taken more time to figure out the damage caused than just starting over.

    So glad we now pay premium prices for RAM and non-volatile storage, just so some LLM can vomit up a reheated turd.

    • _lilith@lemmy.world · 2 points · 2 months ago

      And that’s the fucking crux of it right there: the damage. Any code base without someone smart enough to throw this trash out is going to take forever to fix, or might just be too far gone to save.

  • fubarx@lemmy.world · 21 points · 2 months ago

    Last week, curious what would be generated, I told Cursor (with Claude Opus 4.5) to create an animated LED strip effect for an ESP32 device in C. Pretty simple stuff. It thinks for a long time. Creates a ton of scaffolding, docs, step-by-step agentic checklists, even a Makefile to build and deploy the binary. It then says: “Done.”

    I go compile it. Lots of errors. I paste the logs back in and ask it what’s wrong. Claude thinks for a while longer, then goes:

    “I see the issue - I only created the header file but never completed the LED manager implementation. Let me check what’s there and finish the implementation.”
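    For scale: the actual effect logic the prompt asked for is tiny. A host-runnable sketch of the kind of animation requested (Python here for illustration; hypothetical names, and the real version would be C driving the LED hardware):

```python
# A minimal "chase" effect: one bright pixel moves along the strip,
# one step per frame, wrapping at the end. This is the entire logic
# that was buried under the scaffolding, docs, and Makefile.

NUM_LEDS = 30  # assumed strip length

def render_chase(frame: int) -> list[tuple[int, int, int]]:
    """Return one frame of the strip as (r, g, b) tuples."""
    strip = [(0, 0, 0)] * NUM_LEDS      # all LEDs off
    strip[frame % NUM_LEDS] = (255, 255, 255)  # one lit pixel, wrapping
    return strip
```

    A handful of lines of state-per-frame logic, which is what makes shipping an empty implementation behind all that ceremony so absurd.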

      • fubarx@lemmy.world · 3 points · 2 months ago

        It said it had. Obviously, a fabrication. Even after it said it implemented it, it took a couple more hours of coaxing it and pointing out what should be done before it actually worked.

        Point is, one shouldn’t go near these LLMs for coding unless they know what to do and how to look for problems.

  • _lilith@lemmy.world · 15 points · 2 months ago

    man does this dude think the hardest part about writing a book is fucking typing it out? someone give this dork a swirly

    • VitoRobles@lemmy.today · 17 points · 2 months ago

      Fun story!

      The CEO was charmed by some AI vibe dude who

      1. Absolutely ripped into multiple software departments about our “shit code”
      2. Bragged that he could do it faster and better with AI

      CEO gave him a three month trial run to show it.

      AI vibe dude spent the first two weeks showing off all this cool new frontend to managers. Nothing actually worked. They gave him a round of feedback.

      Then he spent another two weeks struggling to meet the feedback.

      Nobody in the tech department wanted to help him because he came in shit talking.

      They ended the trial because the AI vibe coder dude couldn’t handle system changes, fix bugs, or implement new feature requests without breaking old stuff, and didn’t have any real coding skills. He barely lasted a month.

      • Spider · 8 points · 2 months ago

        The first half of this story made me wonder if we were colleagues.

        The second half was different though. Our guy was a personal friend of some higher-up, slandered the existing codebase without so much as speaking to the existing dev team, and then took the better part of a year claiming he could replace the entire decade-old codebase while making vague promises that it was coming soon. Meanwhile, upper management took the slander seriously, punished my department, and brought in a new manager over it. It wasn’t until the new manager outed him as a fraud that his ass finally got caught.

        I doubt he was able to read the legacy codebase at all.

  • SleeplessCityLights@programming.dev · 13 points · 2 months ago

    Do people actually use agents for production code? I feel like it is one of those things that people don’t use but is sold to us that everyone uses. A lie to promote bullshit.

    • brygphilomena@lemmy.dbzer0.com · 14 points · 2 months ago

      I have an engineer that uses it heavily.

      It adds so much extra, and he’ll push thousands of lines of code into a PR every week. He had one bug to fix, and in trying to refactor it he bloated that single file by 16%.

      It’s almost impossible to review.

      • pivot_root@lemmy.world · 6 points · 2 months ago

        It adds so much extra and he’ll push thousands of lines of code into a PR every week.

        This is just a waste of the reviewer’s own time that could be better spent doing actual work. Is it at least split up into multiple commits, or is it one giant shitshow?

    • lad@programming.dev · 3 points · 2 months ago

      I sometimes do, but very sparingly; it’s hard to come across a task that’s a good fit for an LLM unless you’re prototyping something, imo

  • Pencilnoob@lemmy.world · 13 points · 2 months ago

    Prompt: “write a python script to create a thousand LinkedIn accounts with plausible-sounding names, then set up a cronjob to post a punchy LinkedIn just-so story every day explaining why everyone should be paying for my LLM, and to keep paying for it when I jack up the price 100x to cover my expenses”

    • in_my_honest_opinion@piefed.social · 4 points · edited · 2 months ago

      You’ll want a systemd timer for that, actually; easier to have the agent run journalctl to get the full stderr and properly hallucinate more jobs.
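      In the same spirit, the systemd replacement for that cronjob would be a paired service and timer unit (hypothetical unit and script names throughout):

```ini
# /etc/systemd/system/linkedin-justso.service  (hypothetical)
[Unit]
Description=Post daily LinkedIn just-so story

[Service]
Type=oneshot
ExecStart=/usr/local/bin/post_justso.py

# /etc/systemd/system/linkedin-justso.timer  (hypothetical)
[Unit]
Description=Daily trigger for linkedin-justso

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

      Enable with `systemctl enable --now linkedin-justso.timer`; `journalctl -u linkedin-justso.service` then gives the agent all the stderr it could ever want to hallucinate against.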

  • AlexLost@lemmy.world · 12 points · 2 months ago

    There isn’t a talent shortage; there’s a shortage of people who will take your shit at sub-par wages while working two-plus jobs at your company.

  • drsilverworm@midwest.social · 11 points · 2 months ago

    Pokémon Red & Blue was 373 KB. That much efficiency and creative coding is what made early high-content video games possible. Imagine how bloated that would be if it were vibe-coded by AI.

    • Entertainmeonly · 8 points · 2 months ago

      But could you imagine all the exploits we would have found by now? That special little coastline would have been nothing. MissingNo.? Bet we would have had the whole Missing Alphabet haha.