• pinkapple@lemmy.ml
    1 day ago

    You made huge claims using a non-peer-reviewed preprint with garbage statistics and abysmal experimental design, where they put together 21 bikes and 4 race cars to bury OpenAI’s flagship models under the group trend and go to the press with it. I’m not going to go over all the flaws, but all the performance drops happen when they spam the model with the same prompt several times and then suddenly add or remove information, while using greedy decoding, which will cause artificial averaging artifacts. It’s context poisoning with extra steps, i.e. not logic testing but prompt hacking.

    This is Apple (which is falling behind in its AI research) attacking a competitor with FUD, and it doesn’t even count as research, which you’d know if you looked it up and saw, you know, the opinions of peers.

    You’re just protecting an entrenched belief based on corporate slop, so what would you do with peer-reviewed anything? You didn’t bother to check the one you posted yourself.

    Or you post corporate slop on purpose and are now trying to turn the conversation away from that. That’s usually the case when someone conveniently bypasses absolutely all your arguments lol.

    • DoPeopleLookHere@sh.itjust.works
      22 hours ago

      Okay, here’s a non-Apple source since you want it.

      https://arxiv.org/abs/2402.12091

      5 Conclusion

      In this study, we investigate the capacity of LLMs, with parameters varying from 7B to 200B, to comprehend logical rules. The observed performance disparity between smaller and larger models indicates that size alone does not guarantee a profound understanding of logical constructs. While larger models may show traces of semantic learning, their outputs often lack logical validity when faced with swapped logical predicates. Our findings suggest that while LLMs may improve their logical reasoning performance through in-context learning and methodologies such as COT, these enhancements do not equate to a genuine understanding of logical operations and definitions, nor do they necessarily confer the capability for logical reasoning.

      • pinkapple@lemmy.ml
        16 hours ago

        Another unpublished preprint that hasn’t passed peer review? Funny how that somehow doesn’t matter when something seemingly supports your talking points. Too bad it doesn’t exactly mean what you want it to mean.

        “Logical operations and definitions” = Booleans and propositional logic formalisms. You don’t do that either, because humans don’t think like that, but I’m not surprised you’d avoid mentioning the context and go for the kinda over-the-top and easy-to-misunderstand conclusion.

        It’s really interesting how you get people constantly doubling down on chatbots specifically being useless, citing random things from Google, but somehow Palantir finds great use for its AIs in mass surveillance and policing. What’s the talking point there, that they’re too dumb to operate and that nobody should worry?

        • DoPeopleLookHere@sh.itjust.works
          7 hours ago

          As opposed to the nothing you’ve cited to show that context tokens actually improve reasoning?

          I love how you keep going further and further away from the education topic at hand, and are now bringing in police surveillance, which everyone knows is 100% accurate.

          • pinkapple@lemmy.ml
            6 hours ago

            You’re less coherent than a broken LLM lol. You made the claim that transformer-based AIs are fundamentally incapable of reasoning, or something vague like that, using gimmicky af “I tricked the chatbot into getting confused therefore it can’t think” unpublished preprints (while asking for peer review). Why would I need to prove something? LLMs can write code; that’s an undeniable demonstration that they understand abstract logic fairly well, which can’t be faked using probability, and it would be a complete waste of time to explain it to anyone who is either having issues with cognitive dissonance or, less often, may be intentionally trying to spread misinformation.

            Are the AIs developed by Palantir “fundamentally incapable” of their demonstrated effectiveness or not? It’s a pretty valid question when we’re already surveilled by them, but some people like you indirectly suggest that this can’t be happening. Should people not care about predictive policing?

            How about the industrial control AIs that you “critics” never mention, do power grid controllers fake it? You may need to tell Siemens, they’re not aware their deployed systems work. And while on that, we shouldn’t be concerned about monopolies controlling public infrastructure with closed-source AI models because they’re “fundamentally incapable” of operating?

            I don’t know, maybe this “AI skepticism” thing is lowkey intentional industry misdirection and most of you fell for it?

            • DoPeopleLookHere@sh.itjust.works
              5 hours ago

              My larger point: AI replacing teachers is at least a decade away.

              You’ve given no evidence that it’s closer than that. You’ve just said you hate my sources, without actually making a single argument that it is.

              You said well, it stores context, but who cares? I showed that it doesn’t translate to what you think, and you said you don’t like it, without providing any evidence that it means anything beyond looking good on a graph.

              I’ve said several times, SHOW ME IT’S CLOSE. I don’t care what law enforcement buys, because that has nothing to do with education.