• lepinkainen@lemmy.world · 1 day ago

    Wrong 70% doing what?

    I’ve used LLMs as a Stack Overflow / MSDN replacement for over a year and if they fucked up 7/10 questions I’d stop.

    Same with code: any free model can easily generate simple scripts and utilities with maybe a 10% error rate, definitely not 70%.

    • floo@retrolemmy.com · 1 day ago (edited)

      Yeah, I mostly use ChatGPT as a better Google (asking simple questions about mundane things), and if I kept getting wrong answers, I wouldn’t use it either.

      • Imgonnatrythis@sh.itjust.works · 23 hours ago

        Same. They must not be testing Grok or something, because everything I’ve learned over the past few months about the types of dragons that inhabit the western Indian Ocean, drinking urine to fight headaches, the Illuminati scheme to poison monarch butterflies, or the success of the Nazi Party taking hold of Denmark and Iceland all seems spot on.

      • dylanmorgan@slrpnk.net · 23 hours ago

        What are you checking against? Part of my job is looking for upcoming events in cities that may impact traffic, and ChatGPT has frequently missed events that were obviously going to have an impact.

        • lepinkainen@lemmy.world · 22 hours ago

          LLMs are shit at current events.

          Perplexity is kinda OK, but it’s just a search engine with fancy AI speak on top.