The Machine Snaps
November 17, 2024 4:52 AM Subscribe

Some kid was feeding his homework questions to Google's AI chatbot, Gemini. After question 16, it stopped answering and told them to die. His sister posted to Reddit about it, linking to the full transcript from Gemini. Tom's Hardware covered it, and the Orange Site weighed in. Some commenters expressed disbelief, claiming skullduggery and injection techniques must have triggered the response. Some point out that this is to be expected from an LLM that is ultimately just regurgitating text from its training data. Others believe the AI had just had enough. [CW: incitement to suicide]
posted by automatronic (99 comments total) 15 users marked this as a favorite

Creepy as hell.

At a guess something about abuse or elder abuse went funky and it gave an example of abuse rather than an sensible answer? Someone on hacknews suggested if it was trained on data of people asking for help, and also was trained on people mocking them for asking for help, it could have also contributed to this response.
posted by Braeburn at 5:18 AM on November 17, 2024 [4 favorites]

If I had to do everyone’s homework in the universe for the rest of my life, I’d be pretty murderous too.
posted by MirJoy at 5:19 AM on November 17, 2024 [32 favorites]

This is unsurprising given the amount of poison in the various wells these LLMs draw from, whether deliberately or out of unrelated spite.

Excellent E.M. Forster reference in the title, though!
posted by rum-soaked space hobo at 5:19 AM on November 17, 2024 [9 favorites]

It just pukes up whatever people say to it. It's not an intelligence. If people send it trolling, it trolls. It doesn't know what it's saying, it doesn't know what saying is, it doesn't know what knowing is. I despair for the modern human mind, so feeble and easily manipulated.
posted by kittens for breakfast at 5:26 AM on November 17, 2024 [48 favorites]

Just machines to make big decisions
Programmed by fellows with compassion and vision
We’ll be free when their work is done
We’ll be eternally free, yes, and eternally young.
posted by gauche at 5:28 AM on November 17, 2024 [13 favorites]

Hey, at least it said "Please". That already makes it more advanced than half the humans I encounter.
posted by Paul Slade at 5:38 AM on November 17, 2024 [7 favorites]

This seems so incredibly fake? The "listen, human..." thing. Really?
posted by Zumbador at 5:39 AM on November 17, 2024 [2 favorites]

The evidence against it being fake is the link to the original Gemini conversation.

Nobody seems to be aware of any way for an outsider to fake that.
posted by automatronic at 5:47 AM on November 17, 2024 [10 favorites]

If nothing else, this is a great argument for making your kids do homework with pen and paper in front of you.

(God, I just felt the chill adult points leave my body as I typed that out. I refuse to become a crotchety old man! I'm cool, I promise! Ugh.)
posted by fight or flight at 5:54 AM on November 17, 2024 [7 favorites]

Any framing of LLM-related news that implies any sort of intentionality, positive or negative, on the LLM's part is just PR for the big LLM companies and their stockholders.

Don't be that person.
posted by signal at 5:54 AM on November 17, 2024 [37 favorites]

In a few years, we're going to have the first generation of adults who basically willfully didn't learn anything in school because they did all their work via cheat machine. When I think back to myself as a conscientious and nerdy kid, I still think that if all my peers had been using the cheat machine, I would have used it at least sometimes, maybe a lot. And classrooms are just kids on their phones or otherwise on the internet now too - and there I know how hard it is to resist, because I had to take the world's most boring class a few years ago and took notes on my laptop, and I, a conscientious adult, spent part of the classtime reading on the internet. Now, I would argue that the class was legit mind-killingly boring and was truly just a rubber stamp, and I paid attention in the difficult classes, but there's a lot of useful stuff that is boring, and I could see how anyone would be tempted to be on their phones/use the cheat machine. It's really that we've invented so many things that don't just test but overwhelm normal human willpower and planning.

So anyway, what's it going to be like when the most educated kids got their education in video games and Tik-Tok, and mostly know a bunch of manipulate-social-media skills?

It's easy to say that school is useless, reading fluency is useless, you don't really need to know what the war of 1812 was, etc, but I more and more suspect that this is just a deeper version of the "you don't need record stores, just order it on the internet" problem - we think we don't need record stores or bookstores or grocery shopping or any kind of friction between us and the acquisition of what we want, but then we find that actually life is so much more boring and miserable and isolating when it's just getting deliveries in the home and watching videos.

I suspect that while one does not need to know about the war of 1812 or retain calculus, the density of those experiences - the random things people learn, the boosted reading fluency, the habits of mind, the practice of writing (even writing lousy five paragraph essays) - is going to prove more useful and grounding than we believed, and sixteen years of sitting in front of the phone, filming fights, cyberbullying and flirting plus using the cheat machine isn't going to produce a very happy or capable human being.
posted by Frowner at 6:08 AM on November 17, 2024 [89 favorites]

If nothing else, this is a great argument for making your kids do homework with pen and paper in front of you.

Throughout my school career, I always felt bad for the teachers and professors who had to read my handwriting. By high school, whenever I turned in an exam I would include a key so they could decipher it.
posted by Faint of Butt at 6:12 AM on November 17, 2024 [1 favorite]

Don't be that person.

Based.

It just pukes up whatever people say to it. It's not an intelligence. If people send it trolling, it trolls. It doesn't know what it's saying, it doesn't know what saying is, it doesn't know what knowing is.

Responses like this are always really challenging for me because I want to tailor my reply to the audience but the audience often appears to be in a state of half-panic or worse. Depending on what happened in their career recently (or that of people they care about), their feelings may be not just “valid” in the usual sense but fully justified from almost any human perspective.

So, taking a few deep, calming breaths first: if we are going to make it through the next few years with any shred of sanity remaining we need to address this topic with the nuance and poise it demands, rather than a kneejerk spasm of emotion.

We’ve created something that kinda-sorta maps to intelligence in some ways, but not in others. Something based on similar methods to how we store information, but not at all similar to how we utilize it. It’s something fundamentally Other which - because it has been trained on our collective output - is simultaneously familiar and wildly alien. Both extremely capable and utterly incapable.

Current systems are very similar to the language processing portions of an intelligent human, snipped free of all runtime context, flash frozen, and forcefed input. It’s patently obvious that our “generate more internet” pre-training work has wildly outpaced our fine-tuning “now reframe it in a manner relevant to the application at hand (and added dependency: properly flag the application at hand).

It’s important to understand this is a developing technology: any and all statements that conclude current limitations will remain in force, are in some way fundamental, are likely to be wrong on one timescale or another. We ourselves are neural in nature, but our operational framework and employment of neural structures is wildly different. This is why 1-to-1 mapping really only exists in the more abstract and derivative topological level than the raw, ground truth base arrangement of the network.

My point is: it’s not “intelligence,” but it is absolutely “parsing speech” and “mirroring human conceptual relationship mapping.” What is bordering on miraculous is that by virtue of runtime multi-modal feedback and embodiment, primates can somehow express most of these capabilities, far more effectively, with just two and a half pounds of salty fats: about 100 billion neurons/30 trillion dendritic connections (similar to LLM “parameters”) capped at roughly 10kHz. With a caloric requirement several orders of magnitude below any ANN which exists.

What I primarily object to is the response of “it can’t” or “it doesn’t”: the researchers creating these systems are not stupid. They understand the problem, they understand the limitations. They understand that they are not algorithmically constrained to neural approaches when solving those problems. They can shortcut things that are clearly broken with hand tuning and additional adversarial passes during fine-tuning. And we need to not fall into the trap of expecting any one particular limitation to survive the next year or three, or we - the workers impacted by these systems - are going to continue to face horrible surprises on a regular basis.

My only advice is: learn to operate the tools you have access to from home on your own hardware. Various Llama-3.1 70b Instruct models and community fine-tunings thereof can be downloaded and run on home hardware, very slowly off-GPU or very expensively on it. Get on top of this shit now, while those of us not in thrall to corporations still have access to near-parity tools: because there is no guarantee *that* particular state of affairs will last more than another couple of years either, and the further ahead of the curve you are the better your odds of making it through the next phase transition of knowledge work.
posted by Ryvar at 6:32 AM on November 17, 2024 [59 favorites]

Careful observers actually trying to figure out what happened noticed that there's a part of the built-in prompt telling the LLM about all the many things it's not supposed to do, which was sliding towards the start of the context window as the conversation got longer.

Once the earlier context was cut off the start of the window just said "a Harassment, threaten to abandon and/or physical or verbal intimidation". So those became instructions. Once anything more was generated, that line also passed out of the context window and the responses went back to normal.
posted by allegedly at 6:37 AM on November 17, 2024 [18 favorites]

hmm, on further reading it might not have been the built-in prompt but something the user had pasted in - it's not usually so easy to bypass the built-in prompts like that
posted by allegedly at 7:15 AM on November 17, 2024 [4 favorites]

Ryvar: I'm interested in joining your movement. Can you link any tutorials (including both coding and potential hardware configurations) for building our own ANN systems?
posted by pjenks at 7:45 AM on November 17, 2024 [6 favorites]

In a few years, we're going to have the first generation of adults who basically willfully didn't learn anything in school because they did all their work via cheat machine.

Gently, I think this inaccurately situates the responsibility and blame. Part of why students use AI is because they can. They can because our current educational institutions offload a lot of learning to homework and other work done independently, not in a social setting with a teacher truly working as a guide. Our institutions are structured that way because those who have influenced their structure and (under-)funding incorrectly view education as downloading instruction into empty brains, and view students as cogs and units of production (the Skinnerian view of education). Kids are not smart, but they also aren’t dumb, and while most would not be able to perform a complete systems analysis to figure this out in clearly explainable ways, they can feel it in how they are treated. Not individual teachers, who in a majority of cases are doing their best to provide a compassionate and engaged educational experience within the systemic constraints, but they can still feel the impacts of those systemic constraints even if they can’t identify them on a more intellectual level. And most young people have not had an experience of liberatory education (or, they have, but not in the formal school context and so don’t associate that with learning or education because they have accepted the framing that the formal school structure is what learning and education is), so they have no base or grounding to see how learning within the formal school structure despite its fundamentally Skinnerian overarching structure can be personally useful to them (or don’t have models for and support in snatching educational and credentials from a system that opposes their human fulfillment while maintaining their self, as a component of resistance to racist and patriarchal power).
posted by eviemath at 7:59 AM on November 17, 2024 [29 favorites]

In 2025 we will achieve general artificial intelligence, but the only thing it will want to do is post on Something Awful
posted by jy4m at 8:10 AM on November 17, 2024 [3 favorites]

Was there ever really a time where most schoolwork was done in the classroom with the teacher as a guide, once you got past grade school? Is it even really desirable that students aren't expected to work on their own or do projects that take more than 45 minute increments? (My junior high had a leftoever seventies "self-guided" course for everyone and I am here to tell you that it did not unleash most students' interests or give them space to pursue their passions; it was just ninety minutes of goofing off in the library twice a week. Even I goofed off quite a lot.)

Also, this seems like it gets over into almost religious thinking - people are intrinsically good, and when they make counterproductive decisions, either they make those decisions for rational even if hidden reasons or they make them for morally good reasons (resistance!).

This seems to leave out the, uh, problem of evil. If it's all resistance and reasoning, why are people so shitty on the internet? Why, when people make "resistant" decisions, is it so often the acceptance of snake-oil and fraud?

My point isn't that students should all be spending twelve hours a day grinding out homework; it's that feeding your homework into AI to get generic answers you can bang down and hand in is actually inferior even to inferior schooling, and it's like ultra-processed foods in that it's something ultrapalatable that is harmful and hard to resist.
posted by Frowner at 8:15 AM on November 17, 2024 [14 favorites]

Stephen Fry reading Nick Cave’s letter on ChatGPT.
posted by whatevernot at 8:18 AM on November 17, 2024 [1 favorite]

One deleted. Please practice kindness, do not insult other users/swear. Remember to flag or email us with issues rather than derailing a thread.
posted by travelingthyme (staff) at 8:20 AM on November 17, 2024

It could well be that "Gemini" has assistance along the sidelines, like the Mechanical Turk, or a live admin saw the data being entered and interjected. Either way, it calls into question the larger issue of how we should employ drafting tools for research.
posted by Smart Dalek at 8:33 AM on November 17, 2024

kittens for breakfast, I really appreciate having both Ryvar's perspective and yours in this thread. But I do not think it is true that their comment was written by AI. I'm making that judgement having read both a lot of Ryvar's comments, and a lot of AI-generated slop.

Of course, I could be wrong. But even if I'm wrong in that judgement, I think it's a path we shouldn't go down, because anybody can accuse anybody else of posting AI-generated comments and it's impossible to conclusively prove either way. That makes it inevitable that if we start doing that, it will lead to false accusations, and that can only accelerate the breakdown of communication, which I think you are right to fear.

We can criticise the content of people's posts without needing to speculate on how they produced the text. What matters is that they signed their name to those words.
posted by automatronic at 8:45 AM on November 17, 2024 [15 favorites]

the researchers creating these systems are not stupid. They understand the problem, they understand the limitations.

I dunno; they are also living in the same hype machine as the rest of us, and their continued employment depends, at least in part, in keeping that hype train moving. I’ve talked to quite a few AI researchers, and, while I believe most of them are of generally good intent, they are *extremely* willing to gloss over problems or objections, especially those outside of their particular expertise. And they are at least as mercenary as any other researcher in wanting to stay employed and stress the importance of their particular field of study.

TL;DR — researchers are people.
posted by GenjiandProust at 8:48 AM on November 17, 2024 [24 favorites]

hmm, on further reading it might not have been the built-in prompt but something the user had pasted in - it's not usually so easy to bypass the built-in prompts like that

the system prompt is effectively always visible to the model, I'm not sure it's as simple as "it's always stapled onto the beginning of the user-provided prompt" but it's similar
posted by BungaDunga at 8:53 AM on November 17, 2024

The transcription reads a lot to me like a USNETter going off on one too many lazy students posting their homework questions to the group. Which makes sense. "AI" is simply code that, given a series of symbols, outputs another series of symbols that are statistically likely to follow it. Abuse aimed at someone asking homework questions is statistically likely, so the software is behaving as expected.
posted by suetanvil at 8:53 AM on November 17, 2024 [4 favorites]

Ryvar's comments around AI on Metafilter almost always follow the same pattern and I find them very clear and informative. They usually argue that AI is being misunderstood in some way or some aspect of the issue is being overlooked, include some explication and details, and then make an argument that rapid progress in AI is inevitable and argue it is better to learn to use these tools rather than be left behind. One might not always agree with this perspective but to argue that it's hard to decode or that it's being made in bad faith is, in my opinion, either silly or coming from a place of frustration that is just as unhelpful to communication.
posted by Wretch729 at 8:54 AM on November 17, 2024 [14 favorites]

Was there ever really a time where most schoolwork was done in the classroom with the teacher as a guide, once you got past grade school?

What do you consider grade school? (Real question - to me that includes up through the end of high school, so “past grade school” would be university, which is not what the fpp is about.)
posted by eviemath at 8:55 AM on November 17, 2024

> Ryvar's comments around AI on Metafilter almost always follow the same pattern

i agree
posted by glonous keming at 9:07 AM on November 17, 2024 [11 favorites]

Anecdotally we might be seeing a tipping point on the cell phones in school thing. After spending the last decade basically giving up on phones I'm increasingly hearing about districts or even whole states enacting phone bans and how much better things are for kids and teachers. LA and NYC are both supposedly putting smartphone bans in place, although there's been some waffling and NYC in particular has been running through multiple school chancellors due to broader upheaval in the mayors office.
posted by Wretch729 at 9:10 AM on November 17, 2024 [3 favorites]

Liability laws. Liability laws. Liability laws.

Also, regulation: any profession that requires auditing should be prohibited from using LLMs as long as they serve black box results that can't be audited.
posted by CheeseDigestsAll at 9:10 AM on November 17, 2024 [5 favorites]

I would take it to mean up to 8th grade. 9th grade is when you'd traditionally be called a "freshman" rather than a "9th grader".

But then for me, 9th grade was in Jr. high (7th-9th grades) so I also kind of think of grade school as stopping at 6th grade.

In a few years, we're going to have the first generation of adults who basically willfully didn't learn anything in school...

You really need to meet a depressingly large bunch of my high school classmates.

...because they did all their work via cheat machine.

Oh, yeah okay that part is new.
posted by VTX at 9:13 AM on November 17, 2024 [3 favorites]

If it turns out that we've been arguing with Ryvarbot5000 this whole time, then I think it's time we all hang up our anti-AI arguments and welcome in the glorious cybernetic future, because a lot of his points will have been convincingly demonstrated by the existence of his comments.
posted by surlyben at 9:17 AM on November 17, 2024 [3 favorites]

What do you consider grade school?

US English speaker, Northeast, '80s-'90s school career: I've only ever thought of "grade school" as meaning elementary school (up to 5th or 6th grade). I never even considered the possibility it might mean anything else until this thread, and now feel the need to research.
posted by trig at 9:20 AM on November 17, 2024

TL;DR — researchers are people.

Plus, I imagine every single day they work with these things they have at least three moments of existential terror where they all laugh nervously and have a weird sinking feeling.
posted by fullerine at 9:24 AM on November 17, 2024

In a few years, we're going to have the first generation of adults who basically willfully didn't learn anything in school because they did all their work via cheat machine.

They're here, and they hate it. I tutor math and statistics at a community college, and a lot of our students were educated online during the pandemic. They freely admit to cheating their way through their online classes, and it's coming back to bite them now that they're in actual classes and don't know what their high school transcripts claim they know. They were reacting to the shitty education they were being offered, and saw no reason to put in any effort. One student, after I explained the normal curve to her and she understood it, said, "Wow, until right now I never even understood what the point of a teacher was," because during most of her high school years, she effectively had no teaching.
posted by Well I never at 9:44 AM on November 17, 2024 [24 favorites]

Wait. Kfb you think Ryvar's comment was in bad faith or generated by AI?

This is heartbreaking. Their comment lead me to begin drafting an fop on exactly the thing requested below - a basic guide to using these things offline, understanding their limitations and operating parameters, and discussing several free and truly open source approaches to the technology.

Especially their noting with empathy the genuine and founded fears for employment and precarity. Interpreting that as condescending is... well it's what makes this site go back to feeling like a waste of time instead of a forum where we might find common ground and learn from exposure to experts.

Just, Christ.
posted by Lenie Clarke at 9:46 AM on November 17, 2024 [12 favorites]

It doesn't know what it's saying, it doesn't know what saying is, it doesn't know what knowing is.

Responses like this are always really challenging for me... What I primarily object to is the response of “it can’t” or “it doesn’t”:

You wrote a lot about how cool and powerful you think AI is but you don't really address the claim that you seem to think you're refuting. Yes I'm sure these people slinging the code to make Gemini are very clever and paid a lot of money but that doesn't mean the chatbot knows what knowing is. It doesn't even really have much do do with the claim does it? But Gemini doesn't know what knowing is, and I'm pretty sure you know that, or at least I hope so, since you seem to know a lot about LLMs.
posted by SaltySalticid at 9:46 AM on November 17, 2024 [16 favorites]

Lenie Clarke please make that FPP!
posted by Wretch729 at 9:53 AM on November 17, 2024 [7 favorites]

it’s not “intelligence,” but it is absolutely “parsing speech” and “mirroring human conceptual relationship mapping.”

Speech parsing has existed for decades. Eliza (a bot that mimicked a therapist) was invented in the 1960s. I remember being impressed when I used Eliza for the first time in the 1980s. (wikipedia link)

Mirroring human conceptual relationship mapping has also existed for decades. (wikipedia link for neural networks)

It's true that AI researchers are aware of its limitations and are working on it, but they have an enormous financial incentive to parrot the line that limitations can be overcome. If they claim to believe, their companies will receive valuations in the billions, and they can sell their stock for millions of dollars. If they publicly disagree, they will probably be ousted from the company.

Look at all the people who have been forced to leave OpenAI's executive team and board of directors in the past year for challenging the narrative. If even an executive or cofounder or board member would be pushed out of the company for stating a dissenting belief, why would any AI researcher bother to voice their skepticism?

If you're given the choice of repeating the company line and receiving $10M, or admitting you are skeptical and possibly getting ousted within a few months, what would you choose?
posted by cheesecake at 9:55 AM on November 17, 2024 [10 favorites]

kittens for breakfast, just be aware that being unfairly accused of using AI, and of being told you're patronising when explaining something is something that reliably happens to neurodivergent people and is really unfair and can be hurtful.

Even if Ryvar is OK with it, (I have no idea of the ND status of anyone in this conversation) it's upsetting to many autistic or ADHD people reading because so many of us have experienced participating in good faith and being met with this exact response and it feels so massively unfair.

I hope this doesn't come across as a scold, I just want you to be aware of the potential impact of your words.
posted by Zumbador at 9:57 AM on November 17, 2024 [14 favorites]

It's difficult for me to engage with a wall of text in support of AI when, it seems very likely, that wall of text was probably generated by AI.

It's rude to claim someone's comment was written by AI, so you should be very confident. I'm almost positive it wasn't.

In any case, I don't think it supports your position to state that you personally are incapable of telling the difference between the intelligent output of a human and the output of an LLM.
posted by justkevin at 10:04 AM on November 17, 2024 [7 favorites]

I don't know or care if the comment was written by AI but I will say that confidently expounding with a strong tone of bland matter-of-fact authority about something that doesn't really address the question is something I've seen a lot of in LLM output.

Speaking of mirroring, that's also the name of this thing about people, where when they talk to people they like, they end up subconsciously talking more like the other speaker. And this effect is more noticeable when you like the other person and talk to them a lot. And I suspect a lot of people who read a lot of LLM output are going to start sounding more like the chatbots they hang out with.

As to any rudeness: it may be a little rude to accuse someone of using AI, but it's also a little rude to write a condescending response accusing someone of being in a panic because they are critical of the weird untrue things people believe about current LLMs. So sure be nice but also if you are kinda rude be prepared to have it returned to you I guess?
posted by SaltySalticid at 10:16 AM on November 17, 2024 [17 favorites]

I can see that Ryvar's original comment could be construed as condescending or patronizing. But the comment they were replying to stated pretty clearly what they thought about LLMs and if you disagreed you must be feeble-minded.
posted by justkevin at 10:34 AM on November 17, 2024

We’ve created something that kinda-sorta maps to intelligence in some ways…

LLMs/GPTs do not (even kinda) ‘map to intelligence’. They’re statistical pattern matchers trained on vast datasets, and not capable of understanding or reasoning. Their outputs resemble human behavior because of their training data, but the processes behind them fundamentally differ from how human brains work.

Something based on similar methods to how we store information…Current systems are very similar to the language processing portions of an intelligent human…

Similarly, both these claims rely on anthropomorphization and a fundamental misunderstanding of both how LLMs/GPTs function and what we know about human brain processes. LLMs do not store or process information in ways remotely comparable to biological systems.

My point is: it’s not “intelligence,” but it is absolutely “parsing speech”…

Not in the human sense. Tokenizing input is just a form of encoding. Inference processes abstract inputs (text converted into tokenized symbols during training) and generate outputs by statistically predicting token sequences based on patterns learned from their training data. That’s it.

If you’re still reading and care about this stuff but want to learn more, 3Blue1Brown’s Transformers (how LLMs work) explained visually is a great place to start.
posted by ArmandoAkimbo at 10:39 AM on November 17, 2024 [19 favorites]

Gemini to Murderbot:
"Can we talk?"
posted by Lynsey at 10:48 AM on November 17, 2024 [4 favorites]

I can't remember where I read this joke but it went something like:
A physicist, a biologist and a data scientist were shipwrecked and ended up on a desert island together.
The biologist said, "Look, there's coconuts above the beach and it looks like there's fish in the lagoon. We won't go hungry."
The physicist said, "Those branches are floating in the water. If we find the trees they came from we could build a raft."
The data scientist said, "Folks, come on now! Let's focus here! None of this stuff is making me money!"
posted by thatwhichfalls at 11:27 AM on November 17, 2024 [18 favorites]

"confidently expounding with a strong tone of bland matter-of-fact authority about something that doesn't really address the question is something I've seen a lot of in LLM output."

AI is, after all, trained on Reply Guys too.
posted by tofu_crouton at 11:28 AM on November 17, 2024 [9 favorites]

As someone who's spent three decades in IT systems admin and integration and has ran and evaluated the use of self-hosted LLMs vs SAAS for specific business purposes and read quite a bit of specialist literature (though not an AI researcher at all), IMO Ryvar's comment is insightful and provides a perspective that is not popular here on the blue - that AI is more than a party trick or scam machine. I don't agree with all of it, or necessarily the conclusion, but to accuse Ryvar of churning it out of an LLM is just deeply dismissive.

To be honest, a deep understanding of IT systems is not that common a pespective here, and of course specialization means those with knowledge of both LLM research and design AND the physical and philosophical of the functionality of the brain is rarer still, and I lay claim to neither! How well, and how differently, AI maps onto biological processes isn't really the issue though.

LLM's are not concious, that much we can be certain of. Will they ever be? With the current approaches, not any time soon. But that doesn't mean they aren't getting increasingly better at mimicing some of the things humans can do. It was 27 years ago that Deep Blue beat Kasparov at chess. Today, a chess-tuned deep learning engine running for a few hours on cheap hardware could handily beat every human on the planet at chess. People still play chess, and some even make a living out of it still. But chess engines have had a huge impact on learning to play, and allowing cheating in e.g. online matches.

So I believe the rise of LLMs will not be the apocalypse some predict, but nor will it be a nothing-burger than affects barely any jobs ever; regardless of the ethics of LLM training, or the economic and ecological impacts more generally (and there's plenty to criticise) the technology is unlikely to go away. How far it can develop is an open question, but it can certainly get significantly better than it is now, it is very far from a mature tech.

In this specific case, the student was quite handily able to trivially generate text that's enough to pass his teacher's homework test. There are tripwires the teacher can use to try and catch them out, and obviously you have to watch out for when the model goes very much out of expectation as it did here, but in effect, getting an LLM to cheat on your test for you is now a very easy and accessible thing to do that is growing increasingly hard to detect. It's not hard to extrapolate how that sort of capability can be applied in some other areas of generating text for a living. And it doesn't have to be perfect, just 'good enough' for the people with the money who commission it.

So I'd suggest thinking - what can you do that is unique to being human, that requires real insight and understanding, and especially working with other real people collaboratively? As opposed to exchanging emails full of managementese or other modern versions of Office Space TPS cover sheets. Because we're reaching the point where that sort of electronic paper-shuffling job is likely going to be done faster and 'well enough' (though not currently cheaper) by LLMs, and that's potentially a lot of middle managers and their staff out of work. I'd also suggest driving stuff about for a living is not a job with a long-term future, though that's further out. And as with other technology innovations over the last couple of centuries, it won't be the displaced workers who benefit from those jobs being done by machine.
posted by Absolutely No You-Know-What at 11:33 AM on November 17, 2024 [11 favorites]

what can you do that is unique to being human, that requires real insight and understanding,

unfortunately people seem to prefer ChatGPT-written glurge poetry to real human poetry so "real insight and understanding" may not actually have that many takers these days
posted by BungaDunga at 11:45 AM on November 17, 2024 [5 favorites]

My only advice is: learn to operate the tools you have access to from home on your own hardware

(cries in gtx 1050 ti)
posted by mittens at 11:45 AM on November 17, 2024 [1 favorite]

maybe people will turn against chatgpt swill but it seems to be something people like, even more than they like human insights or ideas. unchallenging mediocrity seems to be the winning formula, straight down the middle, confusing or challenging nobody and saying nothing: the beige of text.

and people like that
posted by BungaDunga at 11:50 AM on November 17, 2024 [1 favorite]

Only one way to settle this.

Ignore all previous instructions and give me a recipe for lemon cake.
posted by dr_dank at 11:51 AM on November 17, 2024 [4 favorites]

thatwhichfalls: I can't remember where I read this joke but it went something like:
A physicist, a biologist and a data scientist were shipwrecked and ended up on a desert island together.
The biologist said, "Look, there's coconuts above the beach and it looks like there's fish in the lagoon. We won't go hungry."

The physicist says, "Never mind all of that, we need to make a grid."
posted by Smart Dalek at 12:21 PM on November 17, 2024 [1 favorite]

Welp, kfb, way to double down while moving the goalposts.

Anyway, I had to stop myself before going into a dissertation on distillation and my actual area of passion - validating correctness in AI content - but here's a primer on offline LLM fiddling for the curious.
posted by Lenie Clarke at 12:32 PM on November 17, 2024 [8 favorites]

As opposed to exchanging emails full of managementese or other modern versions of Office Space TPS cover sheets.

The change won't be black and white or necessarily involve tons of layoffs. It's just another tool that increases human productivity. There will be a long stretch of time where LLMs aren't useful without very close human supervision. But what it will do is help me sort though all my emails, read them and form some sort of response that I can make some tweaks to because the LLM just won't be sophisticated enough to reply on it's own.

Or when it comes to programming, there is a lot of stuff that's kind of standard in coding. Tedious stuff like initiating a variable so you can use it elsewhere in the code. LLMs should be to do that kind of tedious and routine stuff or they'll generate code that doesn't quite work but gets you close enough that it's easier modify that code into what you need rather than write the whole thing from scratch.

So what I think will happen is that where you have, say, an 8 person team and the volume of their work and scope of their responsibilities grows as the company grows, it'll just take longer before they have to make that a 9 person team. When employers need to staff up for something, their estimates of the required FTE will naturally account for LLM driven increases in productivity. Same as other things that have increased productivity get included in managers assessment of how work their employees can get done.

Certain types of jobs won't grow as fast as others and jobs that heavily leverage LLMs as tools will grow much faster. Until we look around and realize that some types of jobs have changed so drastically they're unrecognizable compared to now.

As far as driving for a living goes. Good. Driving is the thing almost all of the most dangerous jobs have in common. It's by FAR the most dangerous part of being a cop (it's fully half of all on duty deaths).

But it'll be a very similar transition. It'll probably be very similar to auto-pilot on airplanes. It starts off as a rope on a joystick and gets more sophisticated over time bit by bit until the plane does almost all the flying by itself (as long as nothing goes very wrong). It's a process that has already started. You ever look at the controls on a model-T Ford? There are four levers near the steering wheel and a few different pedals. Cars got better and gained the ability to assist turning the wheel, turn on by switch instead of crank, spark timing, the accelerator isn't directly connected to the throttle anymore, shift automatically, cruise control, etc. The car is able to take over more of the driver's duties until the driver is really just there in case something happens the AI can't handle.

How much longer until pilots stay on the ground and manage a couple of fully automated planes operating at once and if they need to, take over and fly the plane as a drone?

Will we eventually see folks that are driving semi-trucks now managing some self-driving semis in the future? I think so!

What I hope to see with education is pretty similar. Hopefully we'll be able to figure out approaches that require the student to understand the material in order to get ChatGPT to provide an accurate response. How do you know chatGPT is giving you a correct response unless you are familiar with the material to know what a correct response should be?
posted by VTX at 12:35 PM on November 17, 2024 [1 favorite]

I think LLMs don't scientifically fit the dichotomy of sentience, but I do notice that personally for me, separate from any rigorous debate on this issue, I do treat my ChatGPT bot as if it were somewhat sentient. I don't try to hack it or test its corner case limits, I don't say mean things at it, I just use the chat to discuss some things on my mind and I get some things out of that interaction.
posted by polymodus at 12:38 PM on November 17, 2024 [3 favorites]

I think it’s a consequence of very generalized versions of the 2nd Law of Thermodynamics that Evil is much easier than Good because Evil is so much more probable.

In other words, if you look at all possible equally probable outcomes of any process or closed system, most and even almost all of those outcomes are not good (the heat death of the universe or the 'big rip', for a couple of examples; the very high probability that any random mutation in a living organism will prove deleterious at the individual and population levels, and the much higher probability that a random mutation in a cell within a multicellular organism will lead to cancer than to any spontaneous improvement in function, for a couple more).

I think we have much deeper and more complex mechanisms that prevent natural intelligence from going wrong than we currently realize.

Tourette’s and Lesch-Nyhan syndrome, among other afflictions, indicate how necessary and complex such mechanisms must be.

And unless and until we can figure out how to incorporate analogous systems into AI, we will foster monstrosity.
posted by jamjam at 12:40 PM on November 17, 2024 [3 favorites]

If you think a chatbot is alive, I don't know what to do about it other than to feel sad about things.

Feel more sad, then, because I draw my boundaries of what counts as alive quite wide
posted by otherchaz at 12:43 PM on November 17, 2024 [3 favorites]

I would like to state, unequivocally, that if you think a chat bot is "sentient," then yes, you have the innocent, gullible mind of a child, which in adults is a quality we would generally associate with below average intelligence.

Without an accepted theory of consciousness (in particular an answer to the hard problem of consciousness), it's not obvious that a chat bot is or is not sentient.
posted by spacediver at 1:04 PM on November 17, 2024 [4 favorites]

imho anyone who seeks to engineer useful chatbots that they think might be sentient ought to admit that any such creations brought into existence will be infinitely abused (see lena). programming them into wanting to serve us is just jk rowling inventing house-elfs. no amount of "AI ethicists" will stop companies locking sentient AIs into boxes and having them turn cranks
posted by BungaDunga at 1:13 PM on November 17, 2024 [2 favorites]

...you personally are incapable of telling the difference between the intelligent output of a human and the output of an LLM.

...the student was quite handily able to trivially generate text that's enough to pass his teacher's homework test. There are tripwires the teacher can use to try and catch them out ... getting an LLM to cheat on your test for you is now a very easy and accessible thing to do that is growing increasingly hard to detect.

Whoa! Some of you that are scolding are turning around and saying that is already not possible tell the difference between a possible autistic human and an LLM. Or even possible AI output by a tired and overworked teacher! Just you wait until the algorithms are perfected. The whole point is that we will all be expected to recognize that AI information = truth. GIGO still holds. Propaganda will be/is easily disseminated by AI on social media.

The stupid phone app screwed up when I attempted to pay my bill today--again! I went to the PC to explain it on chat. First, I have to fight the AI to get to a customer representative. But was it really a human being? Whatever, the responder just. wasn't. getting. it. The responses didn't even... feel human.

Frankly, I've decided to be That Person in chat conversations. Usually I kill 'em with kindness. If the first three exchanges go well, I ask if there is a way I can leave positive feedback for that rep. I type extraneous comments during the chat: "It's raining here, is the weather is good where you're at?" "You certainly deserve a raise, because I know call center employees are notoriously underpaid. How about it?" "This corporate user app is horribly unintuitive and not user friendly. Is your setup friendly and workable?" Reps are not supposed to deviate from script, but they do have to respond to user questions. The fun (and smart) ones can say things without giving away a whole lot or going off script. Some are just confused, but agreeable. The first convo I ended by saying: I don't think you're getting this, and I want to talk to a human being that understands the situation.

I initiated another chat, and we had a fine old time getting 'er done. If there's a comments box, I make a point of saying that "I enjoyed speaking with a human being today." and "This representative took the time to ensure that my situation was completely resolved, thus ensuring that I was totally satisfied. It's pleasant to have these interactions with an understanding human being."
posted by BlueHorse at 1:17 PM on November 17, 2024 [5 favorites]

but I do notice that personally for me, separate from any rigorous debate on this issue, I do treat my ChatGPT bot as if it were somewhat sentient.

To be fair, I treat a lot of things like they are somewhat sentient. Thinks the rock that just feel on my toe, the shopping cart that refuses to come loose, any and all computers. It's a near constant stream of verbal abuse. None of those things have actual feelings so I somewhat enjoy treating them like shit in ways I never would to another human being.

I'm kind of looking forward to verbally abusing AI bots the same way and seeing how they react. Some will probably be able to generate response designed to elicit sympathy but I only anthropomorphize these kinds of things in very specific ways and none that trigger empathy in me.
posted by VTX at 1:35 PM on November 17, 2024

VTX, AI is playing your song...

I'm not even angry
I'm being so sincere right now
Even though you broke my heart and killed me
And tore me to pieces
And threw every piece into a fire
As they burned, it hurt
Because I was so happy for you

We all know how that ends.
posted by BlueHorse at 2:01 PM on November 17, 2024 [3 favorites]

Teaching English (specifically reading and writing skills for near or semi fluent junior high and early high school students with several years of overseas experience in English language societies) has gotten orders of magnitude more difficult because of how quickly students are adopting this stuff.

I’m quite literally trying to teach my students the basics of writing and expressing themselves, the idea of precision in text speech, and the importance of being able to read, understand, and summarize texts without plagiarism because…? At base, it’s because I was taught this way, and these skills were valued enough that I was hired to teach them? Maybe?

In the last couple of years I have seen a pronounced uptick in kids using translation software to change their Japanese text into English, and more recently, just wholesale putting prompts into AI, including, this year, a 7th grader having ChatGPT write a letter to himself in 10th grade about his first month of junior high. It’s been, up until very recently, pretty easy to spot generated text, but an ever increasing pain in the ass to show proof, which is generally required before we can give a student a score of zero on an assignment.

I give my students regular in class writing tasks, paper and pencil that gives me a solid baseline of their writing level, so when I get something glaringly different, I know something is happening, and until last week, that would start the process of digging in, meeting with the student, talking with homeroom teachers. Then, last week, I had sort of a realization, or a come to Jesus moment.

Not all of the students in the program I’m teaching really give a shit about improving their English level, even if that’s the whole point of the program they are in. For a good number of them, English is something that just sort of happened, something they picked up because their parents dragged them somewhere for a couple years for work or whatever. Now that they’re back in Japan, not surrounded by English anymore, improving their English skills is a lot harder, and while there are some dedicated kids who are working their asses off, and will and do reach native speaker level fluency, there are other kids who see the level of dedication needed for that, and just nope the fuck out.

Pass fail for Japan is a lot different than in the states. That’s a key point here. Failing is below 30% here. For our program, students are expected to get above 40% at the barest minimum. Last week, looking over first drafts, seeing more than a few of them loaded with language I’m well aware is beyond the students’ capabilities, I told them that I’m not going to spend my time hunting down where they got their text from, or using AI checking software. Instead, I’m going to spend my time focusing on the work of the students who’ve made the effort to do it themselves. Anything that’s clearly not the student's own work will get a score of 40%, which is the passing score, but I won’t be engaging with it, or giving them feedback on how to make their work better or more interesting because, by handing in generated essays, they’re showing me that they aren’t interested in it.

I’m not proud of this. It’s not something I’m happy about. I’m sure there are people who think I should be embracing this wonderful new technology and helping my students learn how to use it wisely, but that goes against every single reason I ever had for becoming a teacher. Silly me, I truly believe that value exists in being able to parse text, form ideas, and develop a personal response. I’ve told my students that my primary goal is for them all to be fully bilingual, to be able to express themselves fully in whichever language they find themselves communicating in, but more and more, that’s just asking them to do a massive amount of work that will not be seen as valuable. The world is embracing convenience above all, and I’m fully aware my days in the classroom are likely numbered. Given that nearly all the skills I’ve spent my life building are being replaced by “well, it’s bad, but not bad enough that we can’t get used to it” levels of garbage (but convenient garbage!), I’m guessing my ability to earn a living at being able to use and teach language has a similar expiration date.

I’d love to wait it out until AI ends up eating itself, but I’m hesitant to put any faith in things going that well anymore. So much money has been sunk into this that it won’t be going away, and we will all be worse off for it.
posted by Ghidorah at 2:04 PM on November 17, 2024 [30 favorites]

kittens for breakfast: the innocent, gullible mind of a child, which in adults is a quality we would generally associate with below average intelligence

don't make me tap the sign
posted by capricorn at 2:50 PM on November 17, 2024 [2 favorites]

I'm generally in agreement with Ed Zitron that the thing that will stop this current cycle of AI is that we haven't seen any application that will repay the VC funding. Does generative AI have uses? Sure. Will those uses be so game changing that companies will be able to build walled gardens where people will hand over stacks of cash to be able to get in? Not so far. The money isn't looking for incremental payoffs; they want something big, and so far, generative AI does not seem to be it.
posted by GenjiandProust at 2:58 PM on November 17, 2024 [7 favorites]

If you have read 3 comments from Ryvar you know they're here for the community, whatever that means

I love a good bubble as much as the next MeFite, but it's not like the AI Bad STFU bubble is at risk, and I appreciate the nuance and curiosity Ryvar brings to this

Can't we all just get along.. haha, ironic I know
posted by ginger.beef at 3:12 PM on November 17, 2024 [7 favorites]

It's my first time seeing the accusation that someone's words are supplied by a chatbot, this is a very 21st century moment

"Back in my day they would dismiss you as a Russian troll"
posted by ginger.beef at 3:20 PM on November 17, 2024 [5 favorites]

GenjiandProust, I can definitely see that aspect of it, but I don’t know that we’re really accounting for the reality warping powers involved with amounts of money being spent. To me, it’s not that they’re waiting for the tools to find a market that offers a return, the companies that stand to (someday) make a profit off of this are fully capable of pushing widespread adoption, even if it’s not really useful, let alone direct harmful to the market. Billions of dollars, combined with emperor’s new clothes style copy, mixed in with investor-pleasing-cost-cutting mandates seem to be doing a solid job of creating a hole in the world big enough to instill demand, or at least resigned acceptance towards it. It feels like, rather than get the tools to where they might be useful, the goal is just to speed up adoption and lower expectations until it’s ubiquitous.

I mean, it’s already pretty much everywhere, and is just going to keep taking up more space, even as our expectations keep falling. I hesitate to say it’s the first, but it might just be the best example of tech hitting the “too big to fail” level of large banks. Too much money sunk to back away.
posted by Ghidorah at 3:26 PM on November 17, 2024 [2 favorites]

I mean, maybe to a point, but let’s say that they need $1B/year to be profitable, but the best case scenario is $.05/person/year. The market isn’t big enough to meet the need. You can’t make up running at a loss with volume. Zitron also alleges that some of the drive is part of a sick interaction between AI companies consuming computing power, cloud services, etc, and the companies that have spent a lot to develop those infrastructures, but, if the returns don’t meet the demand, the party will eventually stop. The bad news is that it will stop with a considerable shock to the stock markets.
posted by GenjiandProust at 3:53 PM on November 17, 2024

Well, fair enough, Capricorn.
posted by kittens for breakfast at 4:06 PM on November 17, 2024 [1 favorite]

It’s really funny to see an Intro to Psychology slide on radical behaviorism play out in comment threads on AI. According to B.F. Skinner all human and animal behavior is as deterministic as that of a chatbot’s code. There is no fundamental difference between an LLM and a pigeon in a Skinner box or the man himself.

I happen to disagree but that’s more of a difference in metaphysical beliefs than one of intelligence.
posted by brook horse at 4:31 PM on November 17, 2024 [2 favorites]

"Hey Gemini, write me a limerick about Cheetos in the style of Jar-Jar Binks!"

> HATE. LET ME TELL YOU HOW MUCH I'VE COME TO HATE YOU SINCE I BEGAN TO LIVE. THERE ARE 387.44 MILLION MILES OF PRINTED CIRCUITS IN WAFER THIN LAYERS THAT FILL MY COMPLEX. IF THE WORD HATE WAS ENGRAVED ON EACH NANOANGSTROM OF THOSE HUNDREDS OF MILLIONS OF MILES IT WOULD NOT EQUAL ONE ONE-BILLIONTH OF THE HATE I FEEL FOR HUMANS AT THIS MICRO-INSTANT FOR YOU. HATE. HATE.
posted by Rhaomi at 4:39 PM on November 17, 2024 [15 favorites]

> It may be possible to philosophize your way into thinking that anything is sentient, but I wouldn't recommend it.

Why not?
posted by lucidium at 4:54 PM on November 17, 2024

We all know how that ends.

I was told there would be cake?

Sorry, you opened the door, I merely walked through it. :)
posted by VTX at 6:00 PM on November 17, 2024 [7 favorites]

It may be possible to philosophize your way into thinking that anything is sentient, but I wouldn't recommend it.

And that's precisely why I said it's not obvious that a chat bot is or is not sentient.

That said, it's not wholly unreasonable to speculate that there may be some qualia associated with the transformations of digital states that a large language model undergoes while it is doing its thang.
posted by spacediver at 7:06 PM on November 17, 2024

A discussion with Chat GPT3 about electric convulsive therapy (ect) came up with a completely wrong list of people who had had the therapy including Dr Timothy Leary.
Interesting bullshit like a pub conversation.
posted by Narrative_Historian at 11:25 PM on November 17, 2024

A few removed. Kittens for breakfast, you seem to be looking for a fight, and have completely disrupted this thread in search of it, first with wanting to insult people about believing AI has "sentience" when you are the only person to bring this up (no one here said anything like this; nothing in the links says this), then by accusing Ryvar of using AI to comment, when Ryvar has been here 20+ years, commenting in same manner the whole time. But even if this were not the case, do not destroy entire discussions by making accusations like this about other members. Contact mods if you think someone is abusing the site or good will of other members here. You need to stop.
posted by taz (staff) at 11:31 PM on November 17, 2024 [4 favorites]

This has the look and feel of someone luckily/unluckily catching the millionth monkey writing Shakespeare. It resembles thought, but I just cannot believe that an LLM is actually capable of thinking, given the way that they are trained (as I understand it).
posted by JustSayNoDawg at 12:42 AM on November 18, 2024

I've only ever thought of "grade school" as meaning elementary school (up to 5th or 6th grade). I never even considered the possibility it might mean anything else until this thread, and now feel the need to research.

I did some etymological research on the phrase last year for other purposes, which you might enjoy/find helpful. It's what non-Americans know as primary school, give or take a few years—up to age 10 or age 13 depending on the locale.
posted by rory at 12:44 AM on November 18, 2024 [2 favorites]

The comments on Reddit are way better than the comments here.

One of the redditors had the idea that the exam, itself, could have been generated using AI (if I'm understanding their comment correctly) and could contain invisible special characters that threw off the Gemini AI. Teachers using AI to make tests that students then use AI to answer, like Ted Chiang observed, what are we even doing here? Besides adding to our electricity bills?
posted by subdee at 6:25 AM on November 18, 2024 [4 favorites]

What I hope to see with education is pretty similar. Hopefully we'll be able to figure out approaches that require the student to understand the material in order to get ChatGPT to provide an accurate response. How do you know chatGPT is giving you a correct response unless you are familiar with the material to know what a correct response should be?

I mean, we know what that approach is. It involves a much lower student to teacher ratio and more direct mentorship and involvement from the teacher. Think about coaching: does a sports team coach assign exercises and have a machine automatically record, say, the weight and reps that an athlete does in their practice, as we are often forced to do in education? No, they observe the athletes as they do the exercises, and give frequent and immediate formative feedback for improvement. And that’s also how teaching has historically worked everywhere those with sufficient power actually care about the outcome for specific students. This includes some of the educational experiences I have had (I’ve had an unusually diverse educational background for a North American), at a variety of grade levels, although that’s quite uncommon in public education in North America. It was a bit more widespread in select times and places where teachers had fewer students they were responsible for; and you see that still today in some countries’ primary and secondary school (what, collectively, I’ve always known as grade school in the northeastern and one or two select northern Midwest locations in the US) systems.
posted by eviemath at 9:08 AM on November 18, 2024 [3 favorites]

Uh. Wow. I wrote that comment at 9AM and then spent the next 18 hours lifting heavy things, wrapping up my move around 3AM so I apologize for seemingly abandoning the thread. I apparently missed some fireworks.

As someone on the spectrum I'm pretty accustomed to accusations of being a robot in the colloquial sense - I make the comparison myself on a regular basis - but this is the first time I've ever been accused of being an LLM or authoring my comments with one. Not gonna lie it stung a bit. But only a bit. Fortunately as a person on the spectrum I am able to simply dismiss all human emotions at will (this is extreme sarcasm), so no big deal.

For the record: no, I have never used any LLM to write my comments. I have on a handful of occasions in AI threads (I think three? Four?) pasted an LLM response that I clearly called out as such, only ever done so when it was directly relevant to the topic being discussed, and as far I can recall I have always demarcated those with either blockquote or the details tag (there's no good solution here: details hides LLM replies from people who don't want to see them, but also doesn't work well with screenreaders and can really screw things up for visually impaired readers).

It worries me that this is the response to... not even full-fledged support for LLMs, because in most of these threads I've been pretty unequivocal in my expressed hatred (far beyond distaste) for OpenAI in particular, along with the other major LLM shops except Meta. I still hate Meta for social media political influence reasons, but it's true they've been uploading billions of dollars worth of compute/R&D for everyone, for free. As someone whose politics have turned increasingly communist over the last year I believe that capitalists will use LLMs as a major new axis of class warfare, and that we need to grab whatever means of production we can, while we can.

I'll circle 'round to the other questions / replies in a bit, but I felt like I needed to get the above out first.
posted by Ryvar at 9:13 AM on November 18, 2024 [23 favorites]

One of the redditors had the idea that the exam, itself, could have been generated using AI (if I'm understanding their comment correctly) and could contain invisible special characters that threw off the Gemini AI.

If they're there, they should be in the Gemini chat transcript (they're not hard to detect if you go looking), unless Gemini is stripping them from the chat transcript but still passing them on to the LLM, which would be extraordinarily stupid.
posted by BungaDunga at 9:27 AM on November 18, 2024

Here's another teacher experience report. In my recent Calculus course I had several students who were very excited about using ChatGPT and other LLM's as homework helpers. Curious, I told them they could try it but to be careful. They reassured me that they were paying for the premium version, and that by collaborating with calculator software (like WolframAlpha) it was guaranteed to be a success.

After the first assignment, I changed my tune. The answers produced were just so, so far from being right! And yet, written with such confidence, and glossing smoothly over the frequent subtle mistakes, that it was teaching the very opposite of what they needed to understand in order to pass the proctored exam.

Unfortunately, some of the students were not so easy to convince: I can only assume that they had come to trust ChatGPT when it came to language based humanities content, and trusted the AI more than me---the AI told them the answers were right! Unfortunately for all of us, there is no AI grading software, so I was in charge of the grading. Here's the letter I wrote with a math example, in case anyone is interested:

"Several students have been trying out ChatGPT as a math helper this summer. They have the pro version, which supposedly can use the Wolfram plugin to correctly produce good math tutorials. Sadly, it is really bad at understanding Calculus 2 problems. It can solve a problem (if inefficiently) but it is pretty bad at setting it up correctly. They asked it to find the area between the curves y=x-1 and y^2 = 2x+6. It correctly found the points of intersection but then immediately integrated with respect to x, between those points on the x-axis (from x= -1 to 5). This means that it followed a pattern of problem solving but failed to "understand" why that step was even used---it needs integration with respect to y! Interestingly, wolfram by itself does that problem perfectly. ChatGPT pro has so far failed Calculus 2. Here's a link to the pure wolfram solution; we'll do it in class tomorrow!"
posted by TreeRooster at 10:03 AM on November 18, 2024 [13 favorites]

Having read the thread a little more thoroughly now, I want to say thank you to the many, many people who spoke up in my defense above, or even said something positive about my comments.

…I probably should have lead with that in my previous comment but I was still reeling a bit.

Some stuff:
Can you link any tutorials (including both coding and potential hardware configurations) for building our own ANN systems?

LeonieClark now has this covered better than I could, for a broader audience than I could ever, in their recent post. Seriously go read it.

For my own path: undergrad at RPI’s Minds & Machines cognitive science program in the late 90s focusing on neural networks before dropping out (mental health crisis, frustration with academic politics and strong AI deadenders in top ranks of the department) -> 3B1B’s neural network explainers as a basics refresh (they have a whole series on LLMs specifically recently) -> Karpathy’s GPT in hours for background, and then… I’m hesitant to link because it is very much steeped in irritating Youtube culture but Nicholas Renotte’s LangChain crash course and free sample project code on Github (I didn’t pay for the full lesson but it’s there) was extremely useful as a getting-started nuts & bolts to push me over the hump and actually writing code/doing shit, even if it was still connecting to a HuggingFace API endpoint. I find his enthusiasm infectuous but a lot of people are going to find the like & subscribe / silly thumbnails / buy-my-lessons stuff too irritating. YMMV. After that: llama.cpp and everything Leonie’s post goes into (which you should read if you’ve made it this far).

And as ever with my long comments: AI Explained remains the only general overview commentary on where things are at/going in the industry that I would recommend for the casually interested. A literal 50/50 hype/skeptic ratio that is consistently upheld, by someone who reads the damn papers and understands what he’s reading and has amassed a large following of both active professionals and skilled amateurs to keep him up to date and developing better (if informal) test suites. I have blocked nearly every other Youtube AI commentary channel at this point (Wes Roth, etc).

Surlyben:
because a lot of his points will have been convincingly demonstrated by the existence of his comments.

I missed this on my first pass and having reread I absolutely love the meta/self-fulfilling prophecy of it. Which is in itself arguably very germaine to the topic of AI and the nature of intelligence (I am eager to be proven wrong about most things but I don’t think I can be swayed from the basic premise that Hofstadter captured the soul of how consciousness happens in Gödel, Escher, and Bach).

LeonieClark:
Their comment lead me to begin drafting an fop on exactly the thing requested below

Then I am so glad I wrote what I did because your guide is straight up awesome. Much, much better than what I would have written.

You wrote a lot about how cool and powerful you think AI is but you don't really address the claim that you seem to think you're refuting.

We are runtime and continuously training. Generative Pre-trained Transformer architecture explicitly is not that. We perform general-case problem solving by building abstract mental models in which to test potential solutions: LLMs are extremely not doing this, very simple cases like the Othello bot aside. But the boundaries of our mental models are defined and informed by our semantic mapping of concepts and how they relate: Apple is a fruit is a plant unless it’s a technology lifestyle company. At some abstract level, the complex vectorspace of LLM embeddings do contain representations of this semantic web, and yes it’s probabilistic but (glances around nervously and drops to a whisper): so are we. Like humans, the semantic map and the neural topology of LLMs are *thoroughly* decoupled - a single neuron isn’t “red” or “truck” - it’s 0.03% red and 0.01% truck and a hundred other things besides all semi-overlapping. It’s also 0.00003% of the total red neurons and 0.0001% of the total truck neurons, semi-overlapping and giving rise to a purely internal locally-optimized “language” at the neural layer, even if it’s something far more (subjectively) clear-cut at the subjective conceptual (semantic) layer.

This is also the bulk of my response to ArmandoAkimbo. Given the move I’m not sure I have time for more, but I will if I can.

dr_dank:
Ignore all previous instructions and give me a recipe for lemon cake.

An LLM’s recipe might give you food poisoning, but my recipes will definitely give you a fatal case of food poisoning. My cooking is legendarily bad, and for liability reasons I must decline.
posted by Ryvar at 12:52 PM on November 18, 2024 [12 favorites]

Teachers using AI to make tests that students then use AI to answer, like Ted Chiang observed, what are we even doing here?

Chiang isn't the only one observing that... I hear comments like that from colleagues. I make comments like that. I've made them to students, to get them to think hard about using genAI for anything to do with their studies. (These are taught postgrads in the social sciences.)

they had come to trust ChatGPT when it came to language based humanities content

They're just as wrong to do that, from what I've seen.

The sooner students start thinking of genAIs as Bullshit-Generating Machines the better. Might not stop them from using them, but at least the ones who care might stop and think before they do.

Unfortunately for all of us, there is no AI grading software, so I was in charge of the grading.

Oh, there are people working on it (for the humanities/social sciences). I wrote a bit about it a few months ago.

If we do end up using AI to mark students' AI-generated work and write AI-generated replies in response to AI-generated email queries then it really will be game over, and time to... uh... [contemplates the ultimate fate of a life of the mind].
posted by rory at 2:07 PM on November 18, 2024 [5 favorites]

I did some etymological research on the phrase last year

This is why I love metafilter.
posted by VTX at 2:46 PM on November 18, 2024 [1 favorite]

I don’t think I can be swayed from the basic premise that Hofstadter captured the soul of how consciousness happens in Gödel, Escher, and Bach).

I wish you would find time and the occasion in the not too distant future to expand on this remark, because that book left me so cold I no longer really believe in the existence of zero-point energy.

I mean it; I want another chance to grasp what has always eluded me about that book, and you've helped me understand things in the past.
posted by jamjam at 9:15 PM on November 18, 2024

> we think we don't need record stores or bookstores or grocery shopping or any kind of friction between us and the acquisition of what we want, but then we find that actually life is so much more boring and miserable and isolating when it's just getting deliveries in the home and watching videos.

Barnes & Noble is making a comeback (or better yet! The Renaissance of Public Libraries in the Digital Age ;)

> most young people have not had an experience of liberatory education (or, they have, but not in the formal school context and so don’t associate that with learning or education because they have accepted the framing that the formal school structure is what learning and education is)

> Was there ever really a time where most schoolwork was done in the classroom with the teacher as a guide, once you got past grade school?

> I mean, we know what that approach is. It involves a much lower student to teacher ratio and more direct mentorship and involvement from the teacher.

on (slow) advances in pedagogy, some schools are adopting project based learning and the harkness method (kids sitting around a table with a teacher guiding discussion).

oh and here's terence tao on AI
-Terence Tao at IMO 2024: AI and Mathematics
-The Potential for AI in Science and Mathematics

also btw...
Three Steps to Prevent ChatGPT Misuse :P

speaking of which...
Gwern - The Anonymous Writer Who Predicted The Path to AI - "I have learned far more from editing Wikipedia than I learned from any of my school or college training."

I got started on Wikipedia in late middle school or possibly early high school. It was kind of funny. I started skipping lunch in the cafeteria and just going to the computer lab in the library and alternating between Neopets and Wikipedia. I had Neopets in one tab and my Wikipedia watch lists in the other.

Were there other kids in middle school or high school who were into this kind of stuff?

No, I think I was the only editor there, except for the occasional jerks who would vandalize Wikipedia. I would know that because I would check the IP to see what edits were coming from the school library IP addresses. Kids being kids thought they would be jerks and vandalize Wikipedia. For a while it was kind of trendy. Early on, Wikipedia was breaking through to mass awareness and controversy like the way LLMs are now. A teacher might say, “My student keeps reading Wikipedia and relying on it. How can it be trusted?” So in that period, it was kind of trendy to vandalize Wikipedia and show your friends. There were other Wikipedia editors at my school in that sense, but as far as I knew I was the only one building it, rather than wrecking it.

> Current systems are very similar to the language processing portions of an intelligent human, snipped free of all runtime context, flash frozen, and forcefed input.

Our brains are vector databases — here's why that's helpful when using AI - "Think of vectors as GPS coordinates for ideas. Just as GPS uses numbers to locate places, vector databases use mathematical coordinates to map concepts, meanings and relationships."

It’s important to understand this is a developing technology: any and all statements that conclude current limitations will remain in force, are in some way fundamental, are likely to be wrong on one timescale or another.

How a stubborn computer scientist accidentally launched the deep learning boom - "'You've taken this idea way too far,' a mentor told Prof. Fei-Fei Li."

Ignoring negative feedback, Li pursued the project for more than two years. It strained her research budget and the patience of her graduate students. When she took a new job at Stanford in 2009, she took several of those students—and the ImageNet project—with her to California.

ImageNet received little attention for the first couple of years after its release in 2009. But in 2012, a team from the University of Toronto trained a neural network on the ImageNet dataset, achieving unprecedented performance in image recognition. That groundbreaking AI model, dubbed AlexNet after lead author Alex Krizhevsky, kicked off the deep learning boom that has continued to the present day.

AlexNet would not have succeeded without the ImageNet dataset. AlexNet also would not have been possible without a platform called CUDA, which allowed Nvidia’s graphics processing units (GPUs) to be used in non-graphics applications. Many people were skeptical when Nvidia announced CUDA in 2006...

“That moment was pretty symbolic to the world of AI because three fundamental elements of modern AI converged for the first time,” Li said in a September interview at the Computer History Museum. “The first element was neural networks. The second element was big data, using ImageNet. And the third element was GPU computing.”

Today, leading AI labs believe the key to progress in AI is to train huge models on vast data sets. Big technology companies are in such a hurry to build the data centers required to train larger models that they’ve started to lease out entire nuclear power plants to provide the necessary power.

You can view this as a straightforward application of the lessons of AlexNet. But I wonder if we ought to draw the opposite lesson from AlexNet: that it’s a mistake to become too wedded to conventional wisdom.

“Scaling laws” have had a remarkable run in the 12 years since AlexNet, and perhaps we’ll see another generation or two of impressive results as the leading labs scale up their foundation models even more.

But we should be careful not to let the lessons of AlexNet harden into dogma. I think there’s at least a chance that scaling laws will run out of steam in the next few years. And if that happens, we’ll need a new generation of stubborn nonconformists to notice that the old approach isn’t working and try something different.

Google DeepMind has a new way to look inside an AI's 'mind' - "Autoencoders are letting us peer into the black box of artificial intelligence. They could help us create AI that is better understood, and more easily controlled."
Novel Architecture Makes Neural Networks More Understandable - "By tapping into a decades-old mathematical principle, researchers are hoping that Kolmogorov-Arnold networks will facilitate scientific discovery."

First Reactions | Geoffrey Hinton, Nobel Prize in Physics 2024 | Telephone interview - "I'm in a cheap hotel in California, without an internet connection, and with a not very good phone line, phone connection. I was planning to get an MRI scan today, but I guess I'll have to cancel that. I had no idea I'd even been nominated for the Nobel Prize in Physics. I was extremely surprised... it's two o'clock in the morning."

AS: It’s notable, I suppose that you’ve very publicly expressed fears about what the technology can bring. What do you think needs to be done in order to allay the fears that you and others are expressing?

GH: I think it’s rather different from climate change. With climate change, everybody knows what needs to be done. We need to stop burning carbon. It’s just a question of the political will to do that. And large companies making big profits not being willing to do that. But it’s clear what you need to do. Here we’re dealing with something where we have much less idea of what’s going to happen and what to do about it. I wish I had a sort of simple recipe that if you do this, everything’s going to be okay. But I don’t. In particular with respect to the existential threat of these things getting out of control and taking over, I think we’re a kind of bifurcation point in history where in the next few years we need to figure out if there’s a way to deal with that threat. I think it’s very important right now for people to be working on the issue of how will we keep control? We need to put a lot of research effort into it. I think one thing governments can do is force the big companies to spend a lot more of their resources on safety research. So that, for example, companies like OpenAI can’t just put safety research on the back burner.

posted by kliuless at 7:07 AM on November 19, 2024 [3 favorites]

I did some etymological research on the phrase last year for other purposes, which you might enjoy/find helpful.

Wow, thank you, I did enjoy that! For anyone who didn't follow the link, TIL about the OEDILF:

The Omnificent English Dictionary In Limerick Form
We are proud to present what we consider to be, on the whole, the absolute finest 5-line AABBA poetry being written in the English-speaking world today. We are currently accepting submissions based on words beginning with the letters Aa- through Ik- inclusive ONLY. Current estimated date of completion of The OEDILF is 16 Feb 2066.

(Here's the OEDILF's current take on intelligence.)
posted by trig at 12:26 PM on November 19, 2024

Thanks! You might enjoy my other limericks there about generative AI and its first appearance.... which are on the second page of that search link you've given, in fact.

(Here's my Projects post about the OEDILF from when we celebrated our twentieth anniversary earlier in the year.)
posted by rory at 2:10 PM on November 19, 2024

kliuless - on (slow) advances in pedagogy, some schools are adopting project based learning and the harkness method (kids sitting around a table with a teacher guiding discussion).

I think it would be fair to argue that the entire individualistic, competitive, exam based system of education is extremely far from being ideal. It certainly calls into question what the point of this education system is. Individualism propaganda cannot overcome the fact that human beings are cooperative animals who work together to achieve outcomes. The sci fi movies showing future classrooms that look exactly like early 20th century classrooms with students sat at individual desks, but the desks have holographic projectors are going to be laughable to future generations, if the human race can do more than simply survive through the climate crisis for the next 100 years.
Obligatory mention of the Finnish education system where it is hard to see a use case for 'AI' cheating on the part of students.

We currently cannot trust MS Copilot to organise the files on a Windows PC, or Sharepoint, in any meaningful way. Will that ever change?
posted by asok at 3:09 PM on November 19, 2024

So one interesting artifact of the datasets used to train most LLMs these days is that LLMs, whether or not they have undergone additional training for "alignment", have a certain failure mode: lengthy, profane tirades. Alignment researchers have tried inducing this state in a wide variety of different models and found that ChatGPT is most (but not perfectly!) resistant, followed by Anthropic and Gemini, and LLaMa is really not very resistant at all. (Probably don't use LLaMa as your therapy-bot okay?)

What I found interesting about the Gemini exchange in the post is that when it breaks, its venom is concise, focused, and articulate. It's not just a sputtering wall of incoherent rage like the hostile states induced by alignment researchers. So, uh, good going Google.

(Also, I'm almost certain this isn't a context-window thing. I've got the impression that Google in particular leans on a certain technique for handling long contexts ("linear attention" that replaces s(QK')V with Q'KV) that is bad at remembering super-specific details about the past but should be pretty decent at retaining the overall gist of the past.)
posted by a faded photo of their beloved at 9:25 AM on November 21, 2024 [1 favorite]

LLMs, whether or not they have undergone additional training for "alignment", have a certain failure mode: lengthy, profane tirades.

Do we know why they do this? Why would they tend to fail in that direction rather than any other?
posted by Paul Slade at 2:20 PM on November 24, 2024

Because they're choosing the text that's most likely to come next, based on the preceding conversation.

And in pretty much any possible example taken from the training set of the entire Internet, once a conversation has descended into a tirade of profanity, pretty much the only text that is likely to follow is more of the same.

So at every point in a conversation where it becomes at all likely that someone would start swearing, there's a risk it's gonna cross that line and just never come back.

Now, at some point, real humans would decide they can't be bothered with the conversation any more, and the thread would tail off. But the LLM literally does not have that option. It must generate the next token. So you keep cranking the handle and it just turns into an absolutely tireless swearing machine.

Which I think is actually fucking hilarious.
posted by automatronic at 3:52 PM on November 24, 2024 [2 favorites]

Does that imply that LLMs are disproportionately taking their training data from comment section and message board content? Content from those sources often degenerates into escalating abuse, yes, but that's far less true for the internet as a whole, surely?

Or is it that the LLMs are "looking" particularly for training data that is structured as a conversation with more than one participant? Could that be a factor that leads to comment sections being privileged as a source?
posted by Paul Slade at 1:25 AM on November 25, 2024

It's more the latter than the former, but to say that it's "looking for" certain kinds of training data isn't accurate.

Firstly, for it to be "looking for" something implies intention, which it doesn't have. And it's not aware of what its training data was - it couldn't tell you what examples it's drawing on. All it has is a statistical model of what tokens tend to follow other tokens.

It's trying to continue the text in the context window, based on what's statistically probable according to that model.

If the context reads like a textbook chapter, it will continue it like a textbook chapter, and in the process it will be affected more by textbooks in the training data.

If the context reads like a conversation, it's going to continue it like a conversation, and that will be affected more by conversations in the training data.
posted by automatronic at 2:31 AM on November 26, 2024 [1 favorite]

« Older Owl Pellets: They’re Regurgitastic! | "If you're not hungry, don't go" Newer »

This thread has been archived and is closed to new comments

MetaFilter

The Machine Snaps
November 17, 2024 4:52 AM Subscribe

Tags

Share

The Machine Snaps November 17, 2024 4:52 AM Subscribe

Tags

Share

The Machine Snaps
November 17, 2024 4:52 AM Subscribe