• rbn@sopuli.xyz
    9 upvotes · 5 months ago (edited)

    For converting your spoken words into text, it taps into OpenAI’s Whisper model, an automatic speech recognition system renowned for its accuracy and ability to handle various accents and background noise.

    Have the hardware requirements of Whisper dropped significantly over the last few months? I played around with it in the context of Home Assistant’s Year of the Voice. Despite running it on a four-year-old ThinkPad with 32 GB of RAM and a 4-core (8-thread) i7, Whisper’s accuracy and performance were still not at a point where I’d rely on it for productive use.

    A rather simple sentence like ‘turn the light in the living room on’ worked in maybe 70% of cases, and that was with me sitting right next to the microphone with no background noise. With music playing in the background or other people talking at the same time, it dropped to ~25% accuracy.

    If it now runs just fine on a Raspberry Pi Zero, that would be a massive improvement!

  • fubarx@lemmy.world
    4 upvotes · 5 months ago

    Nice! On-device AI makes so much sense. I bet the next version will have a camera.

  • wizzor@sopuli.xyz
    3 upvotes · 5 months ago

    My bad… But then… What’s the point of having a dedicated piece of LLM hardware? Isn’t that like having a hardware client for email?

  • wizzor@sopuli.xyz
    3 upvotes · 5 months ago

    The models a Raspberry Pi Zero can run are very limited, though. 512 MB of RAM is very, very little for AI models.

    • Redex@lemmy.world
      9 upvotes · 5 months ago

      The LLM isn’t local

      For the actual conversational responses, the project typically utilizes cloud-based large language models accessed via APIs

      • rbn@sopuli.xyz
        16 upvotes · 5 months ago

        Then, from my perspective, there’s little to no value in having a dedicated piece of hardware for it. I’d guess that 99.999% of the target audience for such a thing already has a smartphone with them. What, if not privacy, is the added value of a special chatbot device?

        • Redex@lemmy.world
          4 upvotes · 5 months ago

          I guess it makes it a bit easier to access + it’s a fun project to DIY, not much else.