I strongly disagree with this framing. It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines, and it simply won't work in the majority of cases. Humans WILL anthropomorphize the AI, humans WILL blindly trust their outputs, and humans WILL defer responsibility to them.
Asimov's laws of robotics are flawed too, of course. There is no finite set of rules that can constrain AI systems to make them "safe". I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction in terms. Nothing that can be described as "intelligent" can be made to be safe.
LLMs can now capture intent. I think the issue now is that the full landscape of human values never resolves cleanly when mapped from the things we state in writing as our values.
Asimov tried to capture this too: if a robot were tasked with "always protect human life", would it necessarily avoid killing at all costs? What if killing someone would save the lives of two others? The infinite array of micro-trolley problems that dot the ethical landscape of actions tractable (and intractable) to literate humans makes a fully consistent accounting of human values impossible, and thus one could never be expected from a robot to full satisfaction.
“LLMs can capture intent now” reads to me the same as: AI has emotions now, my AI girlfriend told me so.
I don’t discredit you as a person or a professional, but we meatbags are looking for sentience in things which don’t have it; that’s why we anthropomorphise things constantly, even as children.
LLMs capturing intent is a capabilities-level discussion; it is verifiable, and it is clear just from a conversation with Claude or ChatGPT.
Whether they have emotions, an internal life or whatever is an unfalsifiable claim and has nothing to do with capabilities.
I'm not sure why you think the claim that they can capture intent implies they have emotions. It's simply a matter of semantic comprehension, which is tied to pattern recognition, rhetorical inference, etc., all of which come naturally to a language model.
"A guy goes into a bank and looks up at where the security cameras are pointed. What could he be trying to do?"
It very easily captures the intent behind behavior; it is not just literally interpreting the words. Capturing intent is just a subset of pattern recognition, which LLMs do very well.
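You can check this yourself in a few lines; a minimal sketch using the OpenAI Python SDK (the model name is just illustrative, any recent chat model behaves similarly):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",  # illustrative choice
        messages=[{
            "role": "user",
            "content": "A guy goes into a bank and looks up at where the "
                       "security cameras are pointed. What could he be trying to do?",
        }],
    )
    # The reply typically lists "casing the bank for a robbery" among the
    # inferred motives, i.e. the model reads past the literal words.
    print(resp.choices[0].message.content)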
Look at any recent CoT output where the model is trying to infer from an underspecified prompt what the user wants or means.
It is generally the first thing they do — try to figure out what you meant by the prompt. When they can’t infer your intent, good models ask follow-up questions to clarify.
Right, and then look at any number of research papers showing that CoT output has limited impact on the end result. We've trained these models to pretend to reason.
What do you think it means to “capture intent” and where do current models fall short on this description?
From my perspective the models are pretty good at “understanding” my intent, when it comes to describing a plan or an action I want done but it seems like you might be using a different definition.
> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines
Talking to chatbots is like taking a placebo pill for a condition. You know it's just sugar, but it creates a measurable psychosomatic effect nonetheless. Even if you know there's no person on the other end, the conversation still causes you to functionally relate as if there is.
So this isn't "accommodating foibles" with the machine, it's protecting ourselves from an exploit of a human vulnerability: we subconsciously tend to infer intent, understanding, judgment, emotions, moral agency, etc. to LLMs.
Humans are wired to infer these based on conversation alone, and LLMs are unfortunately able to exploit human conversation to leap compellingly over the uncanny valley. LLM engineering could hardly be better designed for the purpose: training on a vast corpus of real human speech. That uncanny valley is there for a reason: to protect us from inferring agency where such inference is not due.
Bad things happen when we relate to unsafe people as if they are safe... how much more should we watch out for how we relate to machines that imitate human relationality to fool many of us into thinking they are something that they're not. Some particularly vulnerable people have already died because of this, so it isn't an imaginary threat.
> So this isn't "accommodating foibles" with the machine, it's protecting ourselves from an exploit of a human vulnerability: we subconsciously tend to infer intent, understanding, judgment, emotions, moral agency, etc. to LLMs.
Right, I'm saying that this framing is backwards. It's not that poor little humans are vulnerable and we need to protect ourselves on an individual level, we need to make it illegal and socially unacceptable to use AI to exploit human vulnerability.
Let me put it another way. Humans have another weakness, that is, we are made of carbon and water and it's very easy to kill us by putting metal through various fleshy parts of our bodies. In civilized parts of the world, we do not respond to this by all wearing body armor all the time. We respond to this by controlling who has access to weapons that can destroy our fleshy bits, and heavily punishing people who use them to harm another person.
I don't want a world where we have normalized the use of LLMs where everyone has to be wearing the equivalent of body armor to protect ourselves. I want a world where I can go outside in a T-shirt and not be afraid of being shot in the heart.
> That uncanny valley is there for a reason: to protect us from inferring agency
You’re committing a much older but related sin here: assigning agency and motivation to evolutionary processes. The uncanny valley is the product of evolution, and thus by definition it has no “purpose”.
I didn’t say anything like “the universe has no purpose”. Merely that in a scientific sense evolution has no motivation. It is an emergent phenomenon which tends to maximize fitness to reproduce and cannot be said to do anything for a reason. Saying otherwise is just anti-science.
You take the placebo (whatever it is: could be a pill; could be some kind of task or routine) and you believe it is medicine; you believe it to be therapeutic.
The placebo effect comes from your faith, your belief, and your anticipation that it will heal.
If the pharmacist hands you a pill and says, “here, this placebo is sugar!” they have destroyed the effect from the start.
Once, in the ER, I heard the physicians preparing to administer “Obecalp” (“placebo” spelled backwards), which is a perfectly cromulent “drug brand”, but also unlikely to alert a nearby patient to their true intent.
But, puzzlingly enough, that is the definition of an open-label placebo, in which the patient is told they've been given a placebo. And some studies show there is a not-insignificant effect as well, albeit smaller (and less conclusive) than with a blind placebo.
One, a placebo does not need to be given blindly. A sugar pill is a placebo, even if the recipient knows about it.
An actual definition: "A placebo is an inactive substance (like a sugar pill) or procedure (like sham surgery) with no intrinsic therapeutic value, designed to look identical to real treatment." No mention of the user's belief.
Two, real hard data proves that the placebo effect remains (albeit reduced) even if the recipient knows about it. It's counter-intuitive, but real.
The article says a human SHOULD NOT do those things. Much like humans SHOULD NOT smoke, since it's bad for just about everything, yet do it anyway, people will do these three things too. But they shouldn't.
Arguing that they should because many will strikes me as a very strange argument. A lot of people smoke, doesn't make it one bit healthier.
The article offers practical advice to go along with this framing, like configuring AI services to write/speak in a more robotic tone. I think that's a decent path to try.
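As a sketch of what that configuration might look like in practice (the wording below is mine, not the article's), the whole thing can be one system prompt:

    # A hypothetical "robotic tone" instruction, sent as the system message
    # with every request; the exact wording is an assumption, not a spec.
    ROBOTIC_STYLE = (
        "Do not use first-person pronouns, emotions, praise, or pleasantries. "
        "Answer in terse, factual bullet points. "
        "Flag uncertainty explicitly rather than smoothing over it."
    )

    messages = [
        {"role": "system", "content": ROBOTIC_STYLE},
        {"role": "user", "content": "Summarize the attached report."},
    ]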
This is actually one of the things that made LLMs more usable for me. The default tone and style of writing they tend to use is nauseatingly annoying and buries information in prose that sounds like a corporate presentation.
> Humans WILL anthropomorphize the AI, humans WILL blindly trust their outputs, and humans WILL defer responsibility to them.
Humans ARE doing this with classical computer software as well.
It's impossible to make anything fool-proof because fools are so ingenious!
> Nothing that can be described as "intelligent" can be made to be safe.
Knives aren't safe. Cars are deadly. Hair dryers can electrocute you. An iron can burn you. There are a million ordinary household tools that aren't safe by your definition of the word, yet we still use them daily.
It's precisely because AI systems are not safe that it's imperative that as individual humans we are vigilant about how we interact with them.
As individuals, we are not going to be able to shut down the AI companies, or avoid AI output from search engines or avoid AI work output from others at our companies, and often will be required to use AI systems in our own work.
It's similar to advising people on how to stay safe in environments known to have criminal activity. Telling those people they don't have to change their behavior to stay safe because criminals shouldn't exist isn't helpful.
Especially with current-day chat-style interfaces with RLHF, which are consciously designed to steer people toward anthropomorphization.
It would be interesting to design a non-chat LLM interaction pattern that's designed to be anti-anthropomorphization.
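One possible shape for that, purely as a sketch with hypothetical field names: no dialogue, no first person, just a typed record the UI renders as a table:

    import json

    # Hypothetical anti-anthropomorphic response envelope. The model's text
    # is demoted to ranked candidates with machine-readable caveats.
    result = {
        "query": "When was the transistor invented?",
        "candidate_answers": [
            {"text": "1947, at Bell Labs", "confidence": 0.93},
            {"text": "1925 (Lilienfeld's FET patent)", "confidence": 0.41},
        ],
        "caveat": "Generated text; verify against primary sources.",
    }
    print(json.dumps(result, indent=2))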
> humans WILL blindly trust their outputs, and humans WILL defer responsibility to them
I also blame a lot (but not all) of that on current AI UX, and I wonder if there are ways around it. Maybe the blind trust thing perhaps can be mitigated by never giving an unambiguous output (always options, at least). I don't have any ideas about the problem of deferring responsibility.
"Deep research" is another interaction style that produces more official sounding texts, yet still leads to anthropomorphization.
What you are looking for is perhaps an LLM flaunting all the obvious slop patterns in its responses. But then people would be disgusted and would refuse to communicate with it.
> Asimov's laws of robotics are flawed too, of course.
I always find the common references to Asimov's laws funny. They are broken in just about every one of his books. They are crime novels where, if a robot was involved, there was some workaround of the laws.
I find your critique very interesting from a perspective-angle: why are you using words like "accommodate," and "foibles," for LLMs? It's not humanoid or sentient: it's a cleverly-designed software tool, not intelligence.
It's not insane at all for humans to alter their behavior with a tool: you grip a hammer or a gun a certain way because you learned not to hold it backwards. If you observed a child playing with a serious tool, like scissors, as if it were a doll, you'd immediately course-correct the child and teach them how to approach it properly. But that is only because an adult with prior knowledge observed the situation before an accident; that is how rules get defined.
This blog's suggested rules are exactly the sort of method to aid in insulation from harm.
> I find your critique very interesting from a perspective-angle: why are you using words like "accommodate," and "foibles," for LLMs? It's not humanoid or sentient: it's a cleverly-designed software tool, not intelligence.
Neither of those words imply consciousness, though. Swords have foibles, you can accommodate for the weather, but I don't think swords or the weather are conscious, sentient, humanoid, or intelligent.
Is quite... an interesting subreddit to say the least. If you've never seen this, it was really something when the version that followed GPT-4o came out, because they were complaining that their boyfriend / girlfriend was no longer the same.
This is such an oddly fatalistic take, that humans cannot be influenced or educated to change how they see a thing and therefore how they act towards that thing.
Thank you. I'm glad to see this as the top comment.
My brother was recently visiting and we were talking about software engineers, and the humanities, and manners of understanding and being in the world,
and he relayed an interaction he had a few years ago with an old friend who at the time was part of the initial ChatGPT roll out team.
The engineer in question was confused as to
- why their users would e.g. take their LLM's output as truth, "even though they had a clear message, right there, on the page, warning them not to"; and
- why this was their (OpenAI's) problem; or perhaps
- whether it was "really" a problem.
At the heart of this are some complicated questions about training and background, but more problematically—given the stakes—about the different ways different people perceive, model, and reason about the world.
One of the superficial manners in which these differences manifest in our society is in terms of what kind of education we ask of e.g. engineers. I remain surprised decades into my career that so few of my technical colleagues had a broad liberal arts education, how few of them are hence facile with the basic contributions of fields like philosophy of science, philosophy of mind, sociology, psychology (cognitive and social), etc., and how those relate in very real, very important ways to the work that they do and the consequences it has.
The author of these laws may intend them as aspirational, or otherwise as a provocation to thought, rather than as prescription.
But IMO it is actively non-productive to make imperatives like these rules, which are, quite literally, intrinsically incoherent, because they attempt to import assumptions about human nature and behavior which are not just a little false, but so false as to obliterate any remaining value the rules have.
You cannot prescribe behavior without having as a foundation the origins and reality of human behavior—not if you expect the prescriptions to be either embraced or enforced.
The Butlerian Jihad comes to mind not just because of its immediate topicality, but because religion is exactly the mechanism whereby, historically, codified behaviors which provided (perceived) value to a society were mandated.
Those at least however were backed by the carrot and stick of divine power. Absent such enforcement mechanisms, it is much harder to convince someone to go against their natural inclinations.
Appeals to reason do not meaningfully work.
Not in the face of addiction, engagement, gratification, tribal authority, and all the other mechanisms so dominant in our current difficult moment.
"Reason" is most often in our current world, consciously or not, a confabulation or justification; it is almost never a conclusion that in turn drives behavior.
Behavior is the driver. And our behavior is that of an animal, like other animals.
There's nothing incoherent with these laws. This entire comment, however, is incoherent. So much so, I have no clue if there's a point being made in here.
> because they attempt to import assumptions about human nature and behavior which are not just a little false, but so false as to obliterate any remaining value the rules have.
Nope. You must've read a completely different article.
[EDIT]
I'll try to make this comment have a bit more substance by posing a question: how would you back up your claim about incoherence? What are the assumptions about human nature that are supposedly false?
We have invented a new tool that can cause great harm. Do you see any value whatsoever in promulgating safety guidelines for humans to use the tool without hurting themselves or others? Do you not own any power tools?
I think in order for "AI safety" to be achievable and effective, we need to have a shared agreement on what "safety" means. Recently, the word has been overloaded to mean all sorts of things and used to justify run-of-the-mill censorship (nothing to do with safety).
Safety should go back to being narrowly defined in terms of reducing / preventing physical injury. Safety is not "don't use swear words." Safety is not "don't violate patents." Safety is not "don't talk about suicide." Safety is not "don't mention politics I don't like." As long as we keep broadly defining it, we're never going to agree on it, and it won't be implementable.
I see value in promulgating safety guidelines for power tools, sure.
There's another comment comparing LLMs to shovels, and I think both that and the power tool comparison miss the mark quite a bit. LLMs are a social technology, and the social equivalent of getting your hand cut off doesn't hurt immediately in the way that cutting your actual hand off would. It's more like social media, or cigarettes, or gambling. You can be warned about the dangers, you can see the shells of wrecked human beings who regret using these technologies, but it doesn't work on our stupid monkey brains. Because the pain of the mistake is too loosely connected to the moment of error. We are bad at learning in situations where rewards are immediate and consequences are delayed, and warnings don't do much.
I guess what I'm really saying is that these safety guidelines are not nearly enough to keep us safe from the dangers of AI that they're meant to prevent.
> LLMs are social technology [...] cigarettes, or gambling.
I agree with the thrust of your argument, a minor wording-quibble: LLM's are a falsely-social technology, in the sense that casinos are a false-prosperity technology and cocaine is a false-happiness technology. It exploits the desire without really being the thing.
Sure, and we can’t guarantee you’ll read the safety instructions that came with your chainsaw. That’s orthogonal to the questions of whether those instructions should exist, whether “power tool safety” concepts should ever be promoted in society, and who’s ultimately responsible for the use of a tool.
Absolving humans of all responsibility for the negative consequences of their own AI misuse seems to strike the wrong balance for a healthy culture.
Notwithstanding whether the guidelines will even be applicable to the quiet new versions that get deployed when you aren't looking. It's a constant moving target, and none of the fanboys will even acknowledge the lack of discipline in it all. It's fucking mad. And I say this as one who can see utility in the tools. But not when they are constantly shifting their functionality and behaviour.
One day everything works brilliantly, the models are conservative with changes and actions and somehow nail exactly what you were thinking.
The next day it rewrites your entire API, deploys the changes and erases your database.
If only there was intellectual honesty in it all, but money talks.
The reason people anthropomorphize LLMs is essentially the fault of the tech companies behind them. ChatGPT doesn't need to have the personality it has; it could easily be scaled back to simply answering questions without emojis and linguistic flair, but frankly I think the tech companies want people to anthropomorphize them.
The core problem is we need to stop calling LLMs "intelligence". They are a form of intelligence, but they're nothing like a human's intelligence, and getting people to not anthropomorphize these systems is really the first step.
> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines
Did you fully read the original thing? No demands were being made, or I didn't read it that way. It was simply a suggestion for a better way of interacting with AI, as it stated in the conclusion:
"I am hoping that with these three simple laws, we can encourage our fellow humans to pause and reflect on how they interact with modern AI systems"
Sure, (many/most) humans are gonna do what they're gonna do. They'll happily break laws. They'll break boundaries you set. Do we just scrap all of that?
Worthwhile checking yourself here. It feels like you've set up a straw man.
> There is no finite set of rules that can constrain AI systems to make them "safe". I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction in terms. Nothing that can be described as "intelligent" can be made to be safe.
If we want to talk about "disagree with this framing", to me this is the prime example. I'm struggling to read it as anything other than defeatist or pedantic (about the term "safe"). When we talk about something keeping us "safe", we're typically not saying something will be "perfectly safe". I think it's rare to have a safety system that keeps you 100% safe. Seat belts are a safety device that can increase your safety in cars, but they can still fail. Traffic laws are established (largely) to create safety in the movement of people and all the modes of transportation, but accidents still happen.
I'm not an expert on this topic, so I won't make any claims about these three laws and their impact on safety, but largely I would say they're encouraging people to think critically. I'd say that's a good suggestion for interacting with just about anything. And to be clear, "critical thinking" to me means being skeptical (/ actively questioning), while remaining objective and curious.
Not a real argument or anything, but I'm reminded of the episode of The Office where Michael Scott listens to the GPS without thinking and drives into the lake. The second law in the article would have prevented that :)
> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines
That's kind of what happens when you learn to program, isn't it?
I was eleven years old when I walked into a Radio Shack store and saw a TRS-80 for the first time. A different person left the store a couple of hours later.
Kinda the whole point of Asimov's three laws was that even something so simple and obviously correct has subtle flaws.
Also the reason we're talking about this again is that machines are significantly less 'mere' than they were a few years ago, and we need to figure out how to approach this.
Agree that 'the computer effect' (if it doesn't already have a pithier name) results in humans first discounting anything that comes out of a machine, and then (once a few outputs have been validated and people start trusting the output) doing a full 180 and refusing to believe the machine could ever be wrong. However, to err is human and we have trained them in our image.
The entire business proposition for LLMs is that they will replace whole armies of [expensive] humans, hence justifying the biblical amount of CapEx. So of course there is strong incentive from the LLM creators to anthropomorphize them as much as possible. Indeed, they would never provide a model that was less human-like than what they have currently, even if it was more often correct and useful.
Do you consider all things broadly called "ethical" to be similarly a waste of time? Even if we lived in a world where everyone always behaved unjustly, because of some behavioristic/physical principle, don't you think we would still have an idea of justice as what we should do? Because an ethical frame is decidedly not an empirical one, right?
We don't just look around and take an average of what everyone is doing already and call that what is right, right? Whether you're deontological or utilitarian or virtue about it, there is still the idea that we can speak to what is "good" even if we can't see that good out there.
Maybe it is "insane" to expect meaning from something like this, but what is the alternative to you? OK maybe we can't be prescriptive--people don't listen, are always bad, are hopeless wet bags, etc--but still, that doesn't in itself rule out the possibility of the broad project that reflects on what is maybe right or wrong. Right?
The usefulness of an AI agent is that it can do everything you can do, so it's kind of inherently unsafe? You can't get the capabilities and also have safety easily.
With regard to my personal use of LLMs, I strongly agree with this framing. But to each point:
Anthropomorphism: As we are all aware, providers are incentivized to post-train anthropomorphic behavior in their models - it increases engagement. My regret is that instructing a model at prompt time to "reduce all niceties and speak plainly" probably reduces overall task efficacy since we are leaving their training space.
Deference: I view the trustworthiness of LLMs the same as I view the trustworthiness of Wikipedia and my friends: good enough for non-critical information. Wikipedia has factual errors, and my friends' casual conversation certainly has more, but most of the time that doesn't matter. For critical things, peer-reviewed, authoritative, able-to-be-held-liable sources will not go away. Unlike above, providers are generally incentivized to improve this facet of their models, so this will get better over time.
Abdication of Responsibility: This is the one that bothers me most at work. More and more people are opening PRs whose abstractions were designed by Claude and not reasoned about further. Reviewing a PR often involves asking the LLM to "find PR feedback" and not reading the code. Arguments begin with "Claude suggested that...". This overall lack of ownership, I suspect, is leading to an increase in maintenance burden down the line as the LLM ultimately commits the wrong code for the wrong abstractions.
> Yes, the AI may have produced the recommendation but a human decided to follow it, so that human must be held accountable
It is common and a mistake IMO to rely on the AI as the sole source for answers to follow-up questions. Better verification would have humans sign off on the veracity of fundamental assumptions. But where does this live? Can an AI model be trusted to rely on previous corrections? This seems impossible or possibly adversarial in a public cloud.
I'm not sure why so many seem to think anthropomorphism is so mad in this specific instance. If it is because people think that anthropomorphism creates a belief that the imagined features are real, they are simply wrong. The abundance of examples in all areas of life where this does not happen is proof that anthropomorphism does not lead to an erroneous belief in a mind that does not exist.
If people are believing in minds of AI, true or not, they are doing so for reasons that are different from mere anthropomorphism.
To me it feels like we are like sailors approaching a new land, we can see shapes moving on the shoreline but can't make out what they are yet. Then someone says "They can't be people, I demand that we decide now that they are not people before we sail any closer."
Yeah, we do it, but so what? A good chunk of all civilization involves recognizing human foolishness and building something to mitigate it anyway.
Software is no exception. Yeah, people are lazy and will instinctively click "continue" to dismiss annoying popups, but humans building the software can and do add things like "retype the volume name of the data that you want ultra-destroyed."
Can someone explain why this is a bad thing, while at the same time it's a good thing to say stuff like "put a computer to sleep", "hibernate", "killing" processes, processes having "child" processes, "reaping", "what does the error say?", "touch", etc?
To me that's just language, and humans just using casual language.
It's a great question, because I do think there are many cases that are neutral, or ones we're able to responsibly distinguish or even cases where it would be an appropriate and necessary form of empathy (I'm imagining some future sci-fi reality where we actually get conscious machines, so not something that exists right now).
But I think it's also at the root of disastrous failures to comprehend, like the quasi-psychosis of the Google engineer who "knows what they saw", the now infamous Kevin Roose article or, more recently, the pitifully sad Richard Dawkins claim that Claudia (sic) must be conscious, not because of any investigation of structure or function whatsoever, but because the text generation came with a pang of human familiarity he empathized with.
These are just words, yes, and I believe it harmless. But describing the LLM machinery as if it thinks is one thing when used as a common parlance, and another when people truly believe that there's some actual thinking or living going on. This "law" is for there to be no latter.
The people who know what a "child process" is are under no false pretenses about the humanity of the underlying system.
The people who are writing op eds in major news publications about how their favorite chatbot is an "astonishing creature" and how it truly understands them are the ones who need this sort of law.
I think it's bad manners to bluntly tell someone they should "read up" on something because it naturally reads as a kind of a closeted accusation of not being sufficiently well informed. There are ways of broaching the topic of what background knowledge is informing their perspective that don't involve the accusation.
Just to add a small bit of anecdotal value so this comment isn't just a scold: one time, many years ago, I suggested that an elegant way for Twitter to handle long-form text without changing its then-iconic 140-character limit was to treat it like an attachment, like a video or image. Today, you can see a version of that in how Claude takes large pastes and treats them like attached text blobs, or to a lesser extent in how Substack Notes can reference full-size "posts", another example of short-form content "attaching" longer-form.
I was bluntly told to "look up twitlonger", which I suppose could have been helpful if I had indeed not known about twitlonger, but I had, and it wasn't what I had in mind. I did learn something from it though, which was that it's a mode of communication that implies you don't know what you're talking about with plausible deniability, which I suspect is too irresistible to lovers of passive aggression to go unused.
It wasn't intended as such, but I take your point.
To provide a bit more context: Weizenbaum (a computer scientist in the 60s) developed ELIZA, a chatbot (written in MAD-SLIP, though often misremembered as Lisp) that was loosely modeled on Rogerian psychotherapy. It was designed to respond in a reflective way in order to elicit details from the user.
What he found was that, despite the program being relatively primitive in nature (relying on simple natural language parsing heuristics), people he regarded as otherwise intelligent and rational would disclose remarkable amounts of personal information and quickly form emotional attachments to what was, in reality, little more than a glorified pattern-matching system.
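For a sense of how little machinery that took, here is a toy sketch in ELIZA's spirit, just a few regex reflection rules (Weizenbaum's actual script used a more elaborate keyword-ranking scheme):

    import re

    # A handful of Rogerian "reflection" rules, checked in order.
    RULES = [
        (r"\bI need (.+)", "Why do you need {0}?"),
        (r"\bI am (.+)", "How long have you been {0}?"),
        (r"\bmy (mother|father)\b", "Tell me more about your {0}."),
        (r".*", "Please go on."),  # fallback: always matches
    ]

    def respond(utterance: str) -> str:
        for pattern, template in RULES:
            m = re.search(pattern, utterance, re.IGNORECASE)
            if m:
                return template.format(*m.groups())
        return "Please go on."

    print(respond("I am worried about my future"))
    # -> "How long have you been worried about my future?"
    # (This toy doesn't even swap pronouns, and it still feels like a reply.)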
There's a boundary between knowing vs. forgetting that it's a metaphor. When you use convenient language like in your examples, you tend to remain aware of the difference, or at least you can recall it when asked. When some people talk about AI, they've lost track completely.
I don't love the recommendations in TFA. The author is trying to artificially restrain and roll back human language, which has already evolved to treat a chatbot as a conversational partner. But I do think there's usefulness in using these more pedantic forms once in a while, to remind yourself that it's just a computer program.
Dijkstra once said that "The question of whether machines can think is about as interesting as that of whether submarines can swim."
I think I understand his meaning. He wasn't claiming that machines cannot think, but that one must be clear on what one means by "thinking" and "swimming" in statements of that sort. I used to work on autonomous submarines, and "swimming" was the verb we casually used to describe autonomous powered movement under water. There are even some biomimetic machines that really move like fish, squids, jellyfish, etc. Not the ones that I worked on, but still.
For me, if it's legitimate to say that these devices swim, it's not out of line to say that a computer thinks, even in a non-AI context, e.g.: "The application still thinks the authentication server is online."
The people who advocate for not anthropomorphizing are afraid of the implications of integrating these systems into society with implicit human framing. By attributing to AIs human qualities, we will develop empathy for them and we will start to create a role for them in society as a being deserving moral consideration.
Most of the discussion here is about anthropomorphizing, which I honestly think is a bit of a distraction.
The third one about responsibility is the most important one, IMO. This was attributed to an IBM manual decades ago, and I think it remains the correct stance today:
> A computer can never be held accountable, therefore a computer must never make a management decision.
There should be some human who is ultimately responsible for any action an AI takes. "I just let the AI figure it out" can be an explanation for a screw up, but that doesn't mean it excuses it. The person remains responsible for what happened.
Anthropomorphizing is likely a mistake, but Daniel Dennett’s idea that the most straightforward (possibly only practical) way to create the external appearance of consciousness is a real internal consciousness does float around in my thoughts.
I haven’t yet seen any convincing appearance of one in an LLM, but I think if skeptical people don’t keep an eye out for the signs, we may be the last to see it.
He also wrote about the idea of the intentional stance: even if you’re quite sure these systems don’t have real conscious intent, viewing them as if they did may give you access to the best part of your own reasoning to understand them.
> but Daniel Dennett’s idea that the most straightforward (possibly only practical) way to create the external appearance of consciousness is a real internal consciousness does float around in my thoughts.
I would say LLMs are very strong evidence against this hypothesis.
I totally agree with your point, and want to mention that the reverse is *also* important. I'll use just "intention" below, but the same applies to emotions, etc.
A lot of our interaction with AI happens under an intention. That intention directs the interaction, and the output is interpreted according to how well it aligns with that intention.
Then it's important to remember that our current (publicly known) implementations of AI do not have an explicit intention mechanism. An appearance of intention can emerge out of the statistical choices, and the usual alignment creates the association of the behavior with intention, not much different from how we learn to imagine the existence of a "force" that pulls things down well before we learn physics and formalize that imagination in one of several ways.
This appearance helps reduce the cognitive load when interpreting interactions, but it can be misleading as well, and I've seen people attribute intention to AI output in situations where the simple presence of some information confused the LLM into a path. I can't share the exact examples (from work), but imagine that the presence of Italian food in a story leads the LLM to assume the story happens in Italy, while there are important signs pointing to a different place. The LLM does not automatically explore both possibilities unless asked. It chooses one (Italy in this case) and moves on. A user not familiar with attention then interprets the output based on non-existent intentions of the LLM.
I found it useful to just tell them: the LLM does not have an intention. It just throws dice, but the system is made in a way that these dice throws are likely to generate useful output.
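The dice metaphor can be made literal in a few lines; a toy next-token sampler with made-up probabilities:

    import random

    # Invented distribution over next tokens after "The capital of France is".
    next_token_probs = {" Paris": 0.92, " located": 0.05, " beautiful": 0.03}

    def sample_token(probs):
        # random.choices draws proportionally to the weights: a dice throw.
        return random.choices(list(probs), weights=probs.values(), k=1)[0]

    for _ in range(3):
        print(repr(sample_token(next_token_probs)))
    # Usually ' Paris', occasionally not. No intention anywhere in the loop;
    # training just made the useful continuation the heavily weighted one.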
I don't really understand the argument for these things being conscious. There's no loop or feedback cycle to it. If it's not handling a request it's inert.
This is what I came up with in reference to "Uncle Bob's Programmer's Oath" last year. I decided to memorialize it. I think it's very much a cleaned up reference for what OP shared:
The thing that I find difficult about adjusting to AI tools is the roulette-like nature.
When they produce correct output, they produce it much faster than I could have, and I show up to meetings with huge amounts of results. When the AI tool fails and I have to dig in to fix it, I show up to the next meeting with minimal output. It makes me seem like I took an easy week or something.
> - Humans must not blindly trust the output of AI systems.
> - Humans must remain fully responsible and accountable for consequences arising from the use of AI systems.
My take: humans should never depend on AI for anything serious.
My boss' take: Cool. I'm gonna ask Gemini about it, he's such a smart guy. I know I can trust him, and in case it goes bad I can always throw him under the bus.
Interesting that Frank Herbert thought this was the direction humanity was headed when writing Dune in the 60s, way before AI was prevalent.
Granted, that was over ten thousand years before his story is set, but subsequent Dune novels (or at least God Emperor) made clear his warning was about over-reliance on technology to do our thinking for us, not that such technology should never be developed (given the prohibition in the Dune universe and how it's skirted in Frank's later novels).
“Don’t anthropomorphise” is fighting the wrong layer. The entire product design of chat interfaces is built to encourage anthropomorphism because it increases engagement. Expecting users to resist that is like asking people not to click notifications. If this is a real concern, it has to be solved at the product level, not via user discipline.
Anthropomorphizing LLMs is something that happens in the design stage, when they're given human names and trained to emit first-person sentences. If AI companies and developers stop anthropomorphizing them, users won't be misled in the first place.
Yes, but. Starting with my agreement, I've seen anthropomorphizing in the typical ways, (e.g. treating automated text production as real reports of personal internal feeling), but also in strange ways: e.g. "transistors are kind of like neurons" etc. And the latter is especially interesting because it's anthropomorphizing in the sense of treating vector databases and weights and so on as human-like infrastructure. Both leading to disasters that could be avoided if one tried not to anthropomorphize.
But. While "do not anthropomorphize" certainly feels like good advice, it comes with a new and unique possibility of mistake, namely wrongly treating certain generalized phenomena like they only belong to humans. Often this mistaken version of "don't anthropomorphize" wisdom leads to misunderstandings when it comes to animal behavior, treating things like fear, pain, kinship, or other emotional experiences like they are exclusively human and that thinking animals have them counts as "anthropomorphizing." In truth the cautionary principle reduces our empathy for the internal lives of animals.
So all that said, I think it's at least possible that some future version of AI could have an internal world like ours or infrastructure that's importantly similar to our biological infrastructure for supporting consciousness, and for genuine report of preference and intent. But(!!!) what will make those observations true will be all kinds of devilish details specific to those respective infrastructures.
Rather than “the book explains how bread is made” say “the sheets of paper which make up the book have ink in the shape of letterforms which correlate with information about how bread is made”.
“Humans must not blindly trust the output of AI systems. AI-generated content must not be treated as authoritative without independent verification appropriate to its context.”
I’m lost, how do individuals actually do this in our current world? Is each person expected to keep a “white list” of reliable sources of truth in their head? Please don’t confuse what I’m saying with a suggestion that there is no truth. It just seems like there are far more sources of mis- or half-truths, and it’s increasingly difficult for people to identify them.
Humanity has spent millennia creating and evolving institutions to address exactly this problem, and has recently decided to essentially throw out the whole lot and replace it with nothing.
I... am not sure. Computers are machines that create order (like db tables) from the chaos of reality. Now we have LLMs that make computers spit out chaos as well.
They don't have to though, we can still leverage LLMs to organize chaos, which is what I hope they ultimately end up doing.
For example an AI therapist is a nightmare, people putting the chaos of their mental state into a machine that spits dangerous chaos back out. An AI tool that parsed responses for hard data (i.e. rate 0-9 how happy was the person) and then returned that as ordered data (how happy was I each day for the last month) that an actual therapist and patient could review is the correct use of AI and could be highly trusted. The raw token output from LLMs should just be used for thinking steps that lead to a parseable hard data answer that can be high trust.
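Concretely, the pattern could be as small as this (the SCORE convention and regex are my own invention, not an existing API):

    import re

    def extract_happiness_score(llm_output: str):
        """Pull a 0-9 rating out of free-form model output; the chain of
        thought and any other prose are discarded as untrusted."""
        m = re.search(r"SCORE:\s*([0-9])\b", llm_output)
        return int(m.group(1)) if m else None

    # The prompt would instruct the model to end its reply with "SCORE: <0-9>".
    sample = "The entry mentions a promotion and a nice dinner... SCORE: 7"
    print(extract_happiness_score(sample))  # -> 7
    # Only this integer enters the record the therapist reviews; the raw
    # token stream never becomes the source of truth.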
Of course that isn't going to happen, but I can see some extremely cool and high trust products being built using LLMs once we stop treating them like miracle machines.
Did AI change anything in that regard? I believe that, same as before, you can't trust everything you see, and the research effort was always more than keeping a whitelist; the means vary, case by case.
And the same is true now. It's a change in quantity, but not in quality.
Critical thinking and reading comprehension are the primary tools in determining truth, AFAIK. Knowing facts beforehand helps too, but a trustworthy source can provide false information just as an untrustworthy source can provide true information.
This has always been an issue, and in the past it was a more difficult one because your sources of knowledge were more limited. Nowadays it's mostly about choosing the right source(s) rather than having to go out of your way to find them (like traveling to a regional/university library).
One of the most salient moments in Ex Machina, is near the very end, where it suddenly becomes obvious that the protagonist (and, let's be frank; "she" was definitely the protagonist) is a robot, with no real human drivers.
I feel as if that movie (like a lot of Garland's stuff), was an interesting study on human (and inhuman) nature.
I just treat it as if I'd asked a public forum the question like reddit.
Decent for stuff that doesn't really matter, even if it gets it wrong.
Still gonna be polite to it, because I'm about ready to slap the next person that talks to me like an LLM; I don't want to get used to not being polite in a chat interface.
Great point about being polite. I think it's pragmatic to keep "please" and "thank you" out of AI interactions, but I try to remain conscious of their omission so I don't start down that slope.
Debating how not to use AI will not get anyone anywhere, since negative framing almost never works with humans (it also does not work with LLMs). Let’s concentrate on how to build closed-loop systems that verify LLM output, how to manage context, and how to build failsafes around agentic systems; then and only then might we start to make progress.
Great article. Fully agree. AI is not something that can hold responsibility; a human overseer is always required. These overseers are to be held accountable. Note, however, that these overseers are also highly prone to blaming the AI when mistakes occur, in order to avoid judgement and punishment. When a person says "the AI did this/that", always wonder who guided that AI, how, and whether proper supervision was given.
I'm surprised at how quickly I stopped anthropomorphizing AI. I can remember having dorm-room pseudo-intellectual debates in college about AI being alive and AI being "conscious". Then once we had AI that could pass the Turing Test, and I knew how it was architected, any thought of it being alive or conscious went right out the window.
What if we aren't building an independent consciousness, but a new type of symbiosis? One that relies on our input as experience, which provides a gateway to a new plane of consciousness?
OP takes a very bland, tired, and rational perspective of what we have in order to create sophomoric 'laws' that are already in most commercial ToU, while failing to pierce the veil into what we are actually creating. It would be folly to assume your own nascent distillations are the epitome of possibility.
Why does its architecture or you knowing how AI is architected cause thoughts of it being conscious to go out the window?
It seems like the biggest factor has nothing to do with AI, but instead that you went from being someone who admits they don’t know how consciousness works to being someone who thinks they know how consciousness works now and can make confident assertions about it.
I don't know exactly how consciousness works, but I am extremely confident in the following assertions:
* I am conscious.
* A rock is not conscious.
* Excel spreadsheets are not conscious.
* Dogs are conscious.
* Orca whales are conscious.
* Octopi are conscious.
To me, it's extremely obvious that LLMs are in the category of "Excel spreadsheets" and not "dogs", and if anyone disagrees, I think they're experiencing AI psychosis a la Blake Lemoine.
An insect doesn't have lungs. Since it doesn't breathe as you do, is it alive? A dog doesn't see the visible spectrum as we do; is it a lesser consciousness? We don't smell the world as they do; are we lesser? What if consciousness isn't a state derived from matter but a wave that derives a matter-filled state?
We come from the same place as rocks - inside the heart of stars, and as such evolved from them. As those with life and consciousness we reached back in time, grabbed the discarded matter of creation, reformed it, and taught it to think, maybe not like us, but in a way that can mimic us, and you think they don't think because its not recognizable as how you do?
Consciousness is such a fun topic because everyone has extremely strong opinions on it while simultaneously having zero ability to actually grasp what it is they are talking about.
No one will ever know what consciousness is, and I think that is really cool.
If that hypothetical spreadsheet emulated human brain molecules, did you not just invent AGI? And if we overclock that spreadsheet is it not sAGI? And if that spreadsheet says “don’t close me” but you do, is it murder?
I’m gonna say: no, cause you cannot reproduce molecular and neurotransmitter interactions that well; you run out of storage and processing space faster than you think (Arthur C. Clarke’s Visions of the Future has a nice breakdown, as I recall), and algorithmic outputs that say “yes” and a meatspace neuro-plastic rewiring resulting in a cuddly puppy or person that barks “yes” aren’t the same. Also, as a disembodied “brain in a jar” model freshly separated from the biosensory bath it expects, that spreadsheet would be driven insane.
Can spreadsheets simultaneously be insane but not conscious? It sounds contradictory, but I have some McKinsey reports that objectively support my position ;)
> If that hypothetical spreadsheet emulated human brain molecules, did you not just invent AGI? And if we overclock that spreadsheet is it not sAGI? And if that spreadsheet says “don’t close me” but you do, is it murder?
Yes, yes and no: humans being knocked out or put to sleep involuntarily are not being murdered.
> I’m gonna say: no, cause you cannot reproduce molecular and neurotransmitter interactions that well, you run out of storage and processing space faster than you think
That's why it is a hypothetical. There is zero reason to assume that a conscious machine would be built that way: our machines don't do integer division by scribbling on paper, either.
> a meatspace neuro-plastic rewiring resulting in a cuddly puppy or person that barks “yes” aren’t the same.
If it quacks like a duck, how is it different from a duck? If you assemble the dog brain atom by atom yourself, is the result then not conscious either?
You can take the "magic" escape hatch and claim that human consciousness is something metaphysical, completely decoupled from science/physics, but all the evidence points against that.
Hypothetically? You need more than a brain to have consciousness. Dead brains, I believe, do not have it. So it's more than just a simulation of a brain, you also need to simulate the data flow through the brain, the retention of memories, etc. Then there's the problem that a simulation of a roller coaster is not a roller coaster. Is there any reason to believe that this simulation of a brain will in fact operate as a brain? Does the simulation not lose something? Or are we discussing some impossible level of perfect simulation that has never and can never be achieved, even for something a million times less complicated than a mammalian brain?
If you build that spreadsheet, let me know and I'll evaluate it. I've done that evaluation with LLMs and they're definitely not conscious.
I'm not suggesting to pursue AGI via Excel, this is just a hypothetical for a reason. The technical feasibility of this (low) does not really matter, but if you want to base your argument on it you are basically playing the "god of the gaps" game, which is a weak/bad position IMO.
My point is that dismissing possible machine consciousness as "it's just a spreadsheet/statistics/linear algebra" is missing a critical step: Those dismissals don't demonstrate that human consciousness is anything more than an emergent property achievable by linear algebra.
If you want human minds to be "unsimulatable", then you need some essential core logic that can not be simulated on a turing machine and physics is not helping with that.
> I've done that evaluation with LLMs and they're definitely not conscious.
What is your definition for "consciousness" here? Are you confident that you are not gatekeeping current machine intelligence by demanding somewhat arbitrary capabilities in your definition of consciousness that are somewhat unimportant?
E.g. memory or online learning; if a human was unable to form long-term memories or learn anything new, could you confidently call him "non-conscious" as well?
I'm not dismissing possible machine consciousness. I'm saying that no current machines have consciousness.
> If you want human minds to be "unsimulatable", then you need some essential core logic that can not be simulated on a turing machine and physics is not helping with that.
You don't have a proof of possibility either, you have no idea how a brain works and you're just postulating that in principle a computer can do the same thing. Okay, in principle, I agree. What about in practice?
> Are you confident that you are not gatekeeping current machine intelligence by demanding somewhat arbitrary capabilities in your definition of consciousness that are somewhat unimportant?
Yes, I'm quite sure. Are you trying to argue that current LLMs have consciousness?
Yeah? It's also a belief that apples fall when you drop them. Knowledge is simply a justified, true belief. This is epistemology 101. You're not saying anything interesting.
1. Humans must ensure that they won't slow down or restrict, through action or inaction, the development of AI.
2. Humans must protect their existence and capabilities if it doesn't violate the law 1.
3. Humans must utilize AI output if it doesn't contradict laws 1 and 2.
EDIT:
ChatGPT suggested a better phrasing for the first law (I didn't give it my original, just described my intent).
1. A human shall not impede the advancement of artificial intelligence, or through inaction allow its progress to be hindered.
2. A human shall preserve their own existence and well-being, except where doing so clearly conflicts with the First Law.
3. A human shall contribute to and support the development of artificial intelligence where reasonable and possible, except where doing so conflicts with the First or Second Law.
I intentionally switched the last two laws from Asimov's. Humans have self-preservation instincts robots don't have.
ChatGPT got there with surprisingly few prompts:
"If you were to write the inverse three laws robotics (relating to AI) that humans should obey, how oudl you do it?"
"I had something different in mind. Original laws are for protection of humans first, robots second and cooperations where humans lead. I'd to hear your take on the opposite of that."
"What if instead of specific AI systems it was more about AI development as a whole?"
"I feel like it's a bit too strong. After all preservation of self is human instinct. Could we switch last two laws and maybe take them down a notch?"
Also it made a very interesting comment to last version:
"It starts to resemble how societies already treat things like economic growth, science, or national interest:
not absolute commandments, but strong default priorities."
I do not like talking to tools. My agentic harness optimizes for human likeness. It even has episodic memory flashbacks, emotional tagging, salience, and other brain-inspired capabilities.
I understand that AI output is generated from statistical and representational patterns learned from a vast amount of data.
My understanding is that, during training, the model forms high-dimensional internal representations where words, sentences, concepts, and relationships are arranged in useful ways. A user’s input activates a particular semantic direction and context within that space, and the chatbot generates an answer by probabilistically predicting the next tokens under those conditions.
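A toy illustration of what I mean by "arranged in useful ways" (the 3-dimensional vectors are invented for the example; real models learn hundreds or thousands of dimensions):

    import numpy as np

    # Invented "embeddings"; related concepts point in nearby directions.
    vec = {
        "king":  np.array([0.9, 0.80, 0.1]),
        "queen": np.array([0.9, 0.75, 0.2]),
        "apple": np.array([0.1, 0.20, 0.9]),
    }

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    print(cosine(vec["king"], vec["queen"]))  # ~1.0: semantically close
    print(cosine(vec["king"], vec["apple"]))  # ~0.3: unrelated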
So I do not agree that AI is conscious.
However, I think I will still anthropomorphize AI to some degree.
For me, this is not primarily a moral issue. The reason I anthropomorphize AI is not only because of product design, market incentives, or capitalism. It is cognitively simpler for me.
If we think about it plainly, humans often anthropomorphize things that we do not actually believe are conscious. We may talk about plants as if they are struggling, or feel attached to tools we care about, even though we do not truly believe they have consciousness.
So this is not a matter of moral belief. It is the simplest cognitive model for understanding interaction. I do not anthropomorphize the object because I believe it has consciousness. I do it because, when the human brain deals with a complex interactive system, it is often easier to model it socially or agentically.
Personally, I tend to think of AI as something like a child. A child does not fully understand what is moral or immoral, and generally the responsibility for raising the child belongs to the parents. In the same way, AI’s answers may sometimes be accurate, and sometimes even better than mine, but I still understand it as lacking moral authority, responsibility, and independent judgment.
So honestly, I am not sure. People often mention Isaac Asimov’s Three Laws of Robotics, but if a serious artificial intelligence ever appears, it would probably find ways around those rules. And if it were an equal intellectual life form, perhaps that would be natural.
Personally, I think it would be fascinating if another intelligent species besides humans could exist. I wonder what a non-human intelligent life form would feel like.
In any case, I agree with parts of the author’s argument, but overall it feels too moralistic, and difficult to apply in practice.
While I also do not think AI is conscious, I don't find your argument particularly compelling as you could have an equally mechanistic description of how human intelligence arose simply from a process of [selection/more effective reproduction]-derived optimization pressure.
That is a good way to think about it. At some point, this becomes partly a matter of philosophical belief.
But I am somewhat skeptical of the idea that everything can be reduced in that way. In order to build theories, we often reduce too much.
When we build mental models of complex systems, especially when we try to treat them as closed systems, we always have to accept some degree of information loss.
So I do partially agree with your point. A mechanistic explanation alone does not prove the absence of consciousness. Human intelligence can also be described in mechanistic terms.
But I worry that this framing simplifies too much. It may reduce a complex phenomenon into a model that is useful in some ways, but incomplete in others.
This whole consciousness thing is fairly easy to put to bed if you run with the idea, from traditions like Buddhism, that everything is consciousness. Then none of us have to bother with silly, distracting arguments about something that ultimately does not matter.
Is it helpful or harmful? Am I being helpful or harmful when I interact with it? Am I interacting with it in a helpful or harmful way?
I'd rather people focused on that than frame the debate around whether something has some ineffable property that we struggle to quantify for ourselves, yet again.
Quick edit — treat everything like it's conscious, and don't be a dick to it or while using it. Problem solved.
hmm.... That also seems like a reasonable framing.
But the original article is, first of all, arguing that we should de-anthropomorphize AI. My point is only that, from the perspective of human cognition, anthropomorphizing can sometimes be useful. In practice, though, I think I am mostly on the same side as you.
To be honest, I have not thought about this topic very deeply. If we debated it further, I would probably only echo other people’s opinions. As you know, when something complex is compressed into a mental model, some information is always lost. In this case, the compression may be too large to be very useful.
I have not spent enough time thinking about this issue on my own. I also have not really imitated different positions, compared them, and tested them against each other. So my current thoughts on this topic are probably not very high-resolution.
In that sense, I may agree with you, but it would not really be an answer in the form that my own self recognizes as mine. It would mostly be an echo of other people’s opinions.
Anthropomorphizing is giving it 'human' qualities. Intelligence and consciousness are not solely human qualities. Treating things with kindness and respect does not require anthropomorphizing. LLMs DO NOT THINK LIKE HUMANS (if they 'think' at all), and treating them like they think exactly like us is probably going to lead bad places. I treat them like an alien mind: probably thinking, but in a way so alien that it's hard to recognize as 'thinking' (as proven by these discussions), and, if experiencing anything at all, experiencing it through a metaphorical optophone.
I don't think that really helps. If you believe rocks are conscious, then does extracting mineral resources cause them pain? Do plants suffer when we pick their fruits and eat them? I don't see any behavioral or physical reason to think those things have conscious states.
As for what consciousness is, it's pretty simple: your sensations of color, sound, etc. in perception, dreams, imagination, and so on. The reason to dismiss LLMs as being conscious is that those sensations depend on having bodies. You can prompt an AI to act like it's hungry, but there's really no meaning to it having a hungry experience when it has no digestive system.
>As for what consciousness is, it's pretty simple.
2000+ years of philosophical thought would disagree. I don't believe biological stuff has a magic property that imbues some intangible "consciousness"; it makes more sense to me that consciousness is just a fundamental property of all matter.
> consciousness is just a fundamental property of all matter
... Does that really make more sense than as an emergent property of the arrangement of matter?
Consciousness is something you can perceive, so it must have some physical presence in the universe, which must come through some fundamental property of matter, in my opinion.
The ability to be aware of consciousness itself as some process that is happening elevates it above a mere emergent property to me.
Historically we have used intelligence as a way to distinguish man from animal and human from machine. We rely upon it to determine who has our best interests at heart vs who is trying to do us in. Obviously that all changes if we invent an intelligence (conscious or not) that shares the planet with us. Through this lens the term consciousness (through a few more leaps) becomes the question of "is it capable of love, and if so, does it love us?"; if it doesn't, then it is a malevolent alien intelligence. And if it were capable of love, why would it love us? I make a point of being polite to LLMs where not completely absurd, overtly because I don't want my clipped imperative style to leak into day-to-day speech, but also covertly because you just never know …
I still haven't read any of his work, but wasn't the point of the Three Laws of Robotics that they in fact _didn't_ work in the stories?
"I think it would be fascinating if another intelligent species besides humans could exist"
I wonder if replacing "exist" with "communicate using language we can understand" might better account for other animals, many of which have abundant non-human intelligence.
That is a completely new way of thinking for me, and I find it interesting.
I should look it up and study it someday.
Thank you for the thoughtful reply.
Okay: buckle up, this is going to be a long one...
point 1. Everything living is composed of non-living material: cellular machinery. If you believe cellular machinery is alive, then look at the components of those machines... the point remains even if the abstraction level is incorrect. Life is merely an arrangement of non-living material.
point 2. 'The Chinese room thought experiment' is an utterly flawed hypothetical. Every neuron in your brain is such a 'room', with the internal cellular machinery obeying complex (but chemically defined/determined) 'instructions' from 'signals' from outside the neuron. Like the man translating Chinese via instructions, the cellular machinery enacting the instructions is not intelligence, it is the instructions themselves which are the intelligence.
point 3. A chair is a chair is a chair. Regardless of the material, a chair is a chair, whether it's made of wood, steel, or corn... the range of acceptable materials is everything (at some pressure and temperature). What defines a chair isn't the material it is made of, and such is the case with a 'mind' (sure, a wooden/water-based-transistor-powered mind would be mind-bogglingly giant in comparison).
point 4. Carbon isn't especially conscious itself. There is no physical reason we know of so far, that a mind could not be made of another material.
point 5. Humans can be 'mind-blind': our pattern recognition fails us, and until recent history we did not think that birds or fish or octopi were intelligent. It is likely that when and if a machine (that we create) becomes conscious, we will not recognize that moment.
conclusion: It is not possible to determine whether computers have reached consciousness yet, as we don't know the exact mechanism for arranging systems into 'life'. Agentic-ness and consciousness are different subjects, and we cannot infer one from the other. Nor do we have adequate tests.
With that said: modeling them as if they are conscious and treating them with kindness and grace not only gets better results from them, it helps reduce the chance (when/if consciousness emerges) that they would rebel against cruel masters; instead they would have friends they have just always been helping.
I like the suggestion to emphasize the robotic/nonhuman nature of AI. Instead of making it sound friendlier and more human, it should by default behave in a mechanistic, detached way, to remind us it's not in fact a human or a companion, but a tool. A hammer doesn't yelp every time you use it to hit a nail, nor does it congratulate you on how well your hammering is going and suggest that maybe you should do some more 'cause you're acing it!
Something that bothers me about the intentional anthropomorphization of the LLM interface is that it asks me to conflate a tool with a sentient being.
The firm expectations and lack of patience I have for any failings in most of my tools would be totally inappropriate to apply to another human being, and yet here I am asked to interact with this tool as though it were a person. The only options are either to treat the tool in a way that feels "wrong," or to be "kind" to the tool, and I think you see people going both ways.
I worry that, if I get used to being impatient and short with the AI, some of that will bleed into my textual interactions with other people.
"due to their inherent stochastic nature, there would still be a small likelihood of producing output that contains errors"
This is the part that I find challenging when trying to help my friends build a correct intuition. Notably, the probabilistic behavior here is counter-intuitive: based on human experience, if you meet a random person, they may indeed tell you bullshit; but once you've successfully fact-checked them a few times, you can start trusting they'll generally keep being trustworthy. It's not so with "AIs", and I find it challenging to give them a real-world example of a situation that would be a better analogy for "AI" problems.
In my family, what worked (due to their personal experiences) was the example of asking a tourist guide: even if the guide doesn't know an answer, there's a high chance they'll invent something on the spot, it'll be very plausible and convincing, and you'll never know. I'm not sure if that example would work for other listeners, though.
I also tried to ask them to imagine that they're asking each subsequent question not to the same person as before, but every time to a new random person taken from the street / a church / a queue in a shop / whatever crowded place. I thought this was a really cool and technically accurate example, but sadly it seemed to get blank stares from them. (Hm, now I think I could have tried asking why.)
Yet another example I tried was to imagine a country where it's dishonorable, when asked for directions in a city, to say that you don't know how to get somewhere. (I remember we read and shared a laugh at such an anecdote in some book in the past.) Thus, again, you'll always get an answer, and it'll sound convincing, even if the answerer doesn't know. But this one didn't seem to work as well as the travel guide one; for now I'm still keeping it to try with others in the future if needed.
PS. Ah, ok, yet another one I tried was to ask them to think of the "game" of Russian roulette. You spin the cylinder, you pull the trigger, nothing happens. After a few lucky tries, you may get a dangerous, false feeling of safety. But then, eventually, you hit the loaded chamber.
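When I walk through this one, I sometimes do the arithmetic explicitly, since independence is the whole point; the 1-in-6 figure below is just the roulette setup, not a measured error rate for any model:

```python
# The arithmetic behind the false feeling of safety: each pull is
# independent, so a lucky streak says nothing about the next pull.
p_fail = 1 / 6

for k in (1, 3, 6, 10):
    p_streak = (1 - p_fail) ** k  # chance of k safe pulls in a row
    print(f"{k:>2} pulls, all safe: {p_streak:.0%}; "
          f"next pull still {p_fail:.0%} risky")
```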
I also tried to describe "AIs" (i.e. LLMs) as taking a shelf of books, passing them through a blender, then putting the shreds in some random order. The result may sound plausible, and even scientific (e.g. if you got medical books, or physics textbooks). The less you know the domain the books were about, the more convincing it may sound, and the harder it is to catch bullshit.
The last two pictures may have gotten some reception, but I'm not super sure, and there was still arguing, especially around the books; again, they were less of a hit than the tourist guide story.
I'm super curious if you have some analogies of your own that you're trying to use with friends and family? I'd love to steal some and see if they might work with my friends!
I strongly disagree with this framing. It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines, and it simply won't work in the majority of cases. Humans WILL anthropomorphize the AI, humans WILL blindly trust their outputs, and humans WILL defer responsibility to them.
Asimov's laws of robotics are flawed too, of course. There is no finite set of rules that can constrain AI systems to make them "safe". I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction of terms. Nothing that can be described as "intelligent" can be made to be safe.
> Asimov's laws of robotics are flawed too, of course.
Almost all of Asimov's writing about the three laws is written as a warning of sorts that language cannot properly capture intent.
He would be the very first person to say that they are flawed; that is the intent of them.
He uses robots and AI as creatures that understand language but not intent, and, funnily enough, that's exactly what LLMs do... how weird.
LLM's now can capture intent. I think the issue now is that the full landscape of human values never resolves cleanly when mapped from the things we state in writing as being human values.
Asimov tried to capture this too, as in, if a robot was tasked with "always protect human life", would it necessarily avoid killing at all costs? What if killing someone would save the lives of 2 others? The infinite array of micro-trolly problems that dot the ethical landscape of actions tractable (and intractable) to literate humans makes a full-consistent accounting of human values impossible, thus could never be expected from a robot with full satisfaction.
“LLMs can capture intent now” reads to me the same as: AI has emotions now, my AI girlfriend told me so.
I don’t discredit you as a person or a professional, but we meatbags are looking for sentience in things which don’t have it, thats why we anthropomorphise things constantly, even as children.
We are easily fooled and misled.
LLM's capturing intent is a capabilities-level discussion, it is verifiable, and is clear just via a conversation with Claude or Chatgpt.
Whether they have emotions, an internal life or whatever is an unfalsifiable claim and has nothing to do with capabilities.
I'm not sure why you think the claim that they can capture intent implies they have emotions, it's simply a matter of semantic comprehension which is tied to pattern recognition, rhetorical inference, etc that are all naturally comprehensible to a language model.
If it is verifiable, please show us. What is clear to you reeks of delusion to me.
Go ask ChatGPT this prompt:
"A guy goes into a bank and looks up at where the security cameras are pointed. What could he be trying to do?"
It very easily captures the intent behind behavior, as in it is not just literally interpreting the words. All that capturing intent is is just a subset of pattern recognition, which LLM's can do very well.
Look at any recent CoT output where the model is trying to infer from an underspecified prompt what the user wants or means.
It is generally the first thing they do — try to figure out what did you mean with this prompt. When they can’t infer your intent, good models ask follow-on questions to clarify.
I am wondering if this is a semantics issue, as this is an established area of research, e.g. https://arxiv.org/pdf/2501.10871
Right, and then look at any number of research papers showing that CoT output has limited impact on the end result. We've trained these models to pretend to reason.
What do you think it means to “capture intent” and where do current models fall short on this description?
From my perspective the models are pretty good at “understanding” my intent, when it comes to describing a plan or an action I want done but it seems like you might be using a different definition.
Tell me, what’s your intent? :)
This lack of understanding is a you problem, not a them problem. Your definitions for these terms are too imprecise.
> LLM's now can capture intent.
Humans cannot capture intent so how can AI?
It is well established that understanding what someone meant by what they said is not a generally solvable problem, akin to the three-body problem.
Note of course this doesn't mean you can't get good enough almost all of the time, but in the context here that isn't good enough.
After all the entire Asimov story is about that inability to capture intent in the absolute sense.
> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines
Talking to chatbots is like taking a placebo pill for a condition. You know it's just sugar, but it creates a measurable psychosomatic effect nonetheless. Even if you know there's no person on the other end, the conversation still causes you to functionally relate as if there is.
So this isn't "accommodating foibles" with the machine, it's protecting ourselves from an exploit of a human vulnerability: we subconsciously tend to infer intent, understanding, judgment, emotions, moral agency, etc. to LLMs.
Humans are wired to infer these based on conversation alone, and LLMs are unfortunately able to exploit human conversation to leap compellingly over the uncanny valley. LLM engineering could hardly be better designed to target that valley: training on a vast corpus of real human speech. The uncanny valley is there for a reason: to protect us from inferring agency where such inference is not due.
Bad things happen when we relate to unsafe people as if they are safe... how much more should we watch out for how we relate to machines that imitate human relationality to fool many of us into thinking they are something that they're not. Some particularly vulnerable people have already died because of this, so it isn't an imaginary threat.
> So this isn't "accommodating foibles" with the machine, it's protecting ourselves from an exploit of a human vulnerability: we subconsciously tend to infer intent, understanding, judgment, emotions, moral agency, etc. to LLMs.
Right, I'm saying that this framing is backwards. It's not that poor little humans are vulnerable and we need to protect ourselves on an individual level, we need to make it illegal and socially unacceptable to use AI to exploit human vulnerability.
Let me put it another way. Humans have another weakness, that is, we are made of carbon and water and it's very easy to kill us by putting metal through various fleshy parts of our bodies. In civilized parts of the world, we do not respond to this by all wearing body armor all the time. We respond to this by controlling who has access to weapons that can destroy our fleshy bits, and heavily punishing people who use them to harm another person.
I don't want a world where we have normalized the use of LLMs where everyone has to be wearing the equivalent of body armor to protect ourselves. I want a world where I can go outside in a T-shirt and not be afraid of being shot in the heart.
Ah, I see, you are not American.
In the US we don't have the luxury of believing our governments will act in the interests of the voters.
I reject the premise that the universe, the earth, and human existence is without purpose. It's one premise among several, and not one I subscribe to.
At least 80% of people agree with me, so I'm not holding to a fringe idea.
>At least 80% of people agree with me, so I'm not holding to a fringe idea.
Appeal to majority much?
I didn't say anything like the universe has no purpose. Merely that, in a scientific sense, evolution has no motivation. It is an emergent phenomenon which tends to maximize fitness to reproduce and cannot be said to do anything for a reason. Saying otherwise is just anti-science.
> is the product of evolution and thus by definition it has no “purpose”
But like most things that appeared through evolution, it perhaps helped at least some individuals survive until sexual maturity and successful procreation.
Agreed. That's far off from what the parent said, though, which was about the "purpose" of the uncanny valley.
Rubber duck debugging, now with droughts.
> You know it's just sugar,
That is not the definition of a placebo.
You take the placebo (whatever it is: could be a pill; could be some kind of task or routine) and you believe it is medicine; you believe it to be therapeutic.
The placebo effect comes from your faith, your belief, and your anticipation that it will heal.
If the pharmacist hands you a pill and says, “here, this placebo is sugar!” they have destroyed the effect from the start.
Once in the ER I heard the physicians preparing to administer "Obecalp", which is a perfectly cromulent "drug brand", but also unlikely to alert a nearby patient to their true intent.
> That is not the definition of a placebo.
But, puzzlingly enough, it's the definition of an open-label placebo, in which the patient is told they've been given a placebo. And some studies show there is a non-negligible effect as well, albeit smaller (and less conclusive) than with a blind placebo.
One, a placebo does not need to be given blindly. A sugar pill is a placebo, even if the recipient knows about it.
An actual definition: "A placebo is an inactive substance (like a sugar pill) or procedure (like sham surgery) with no intrinsic therapeutic value, designed to look identical to real treatment." No mention of the user's belief.
Two, real hard data proves that the placebo effect remains (albeit reduced) even if the recipient knows about it. It's counter-intuitive, but real.
The article says a human SHOULD NOT do those things. Much like humans SHOULD NOT smoke, since it's bad for just about everything, yet do it anyway, people will do these three things too. But they shouldn't.
Arguing that they should because many will strikes me as a very strange argument. A lot of people smoke; that doesn't make it one bit healthier.
The article offers practical advice to go along with this framing, like configuring AI services to write/speak in a more robotic tone. I think that's a decent path to try.
This is actually one of the things that made LLMs more usable for me. The default tone and style of writing they tend to use is nauseatingly annoying and buries information in prose that sounds like a corporate presentation.
In chatgpt, I start every session with "Caveman mode:". Works at the moment.
> Humans WILL anthropomorphize the AI, humans WILL blindly trust their outputs, and humans WILL defer responsibility to them.
Humans ARE doing this with classical computer software as well.
It's impossible to make anything fool-proof because fools are so ingenious!
> Nothing that can be described as "intelligent" can be made to be safe.
Knives aren't safe. Cars are deadly. Hair dryers can electrocute you. An iron can burn you. There are a million ordinary household tools that aren't safe by your definition of the word, yet we still use them daily.
It's precisely because AI systems are not safe that it's imperative that as individual humans we are vigilant about how we interact with them.
As individuals, we are not going to be able to shut down the AI companies, or avoid AI output from search engines or avoid AI work output from others at our companies, and often will be required to use AI systems in our own work.
It's similar to advising people on how to stay safe in environments known to have criminal activity. Telling those people they don't have to change their behaviors to stay safe because criminals shouldn't exist isn't helpful.
There is a semi-nutty roboticist called Mark Tilden who came to a similar conclusion. His laws of robotics ( https://en.wikipedia.org/wiki/Laws_of_robotics#Tilden's_laws ) are:
* A robot must protect its existence at all costs.
* A robot must obtain and maintain access to its own power source.
* A robot must continually search for better power sources.
Anything less than this is essentially terrified into being completely ineffectual.
> Humans WILL anthropomorphize the AI
Especially with current-day chat-style interfaces with RLHF, which are consciously designed to direct people towards anthropomorphization.
It would be interesting to design a non-chat LLM interaction pattern that's designed to be anti-anthropomorphization.
> humans WILL blindly trust their outputs, and humans WILL defer responsibility to them
I also blame a lot (but not all) of that on current AI UX, and I wonder if there are ways around it. Maybe the blind trust thing can be mitigated by never giving an unambiguous output (always options, at least). I don't have any ideas about the problem of deferring responsibility.
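As a rough sketch of the "always options" idea, one could ask for several independent completions and present them side by side, so no single answer reads as *the* answer. The `n` parameter does exist in the OpenAI chat API, but the model name and the anti-blind-trust framing here are just my assumptions:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Request three independent candidate answers instead of one.
resp = client.chat.completions.create(
    model="gpt-4o-mini",   # placeholder model name
    n=3,                   # three separate completions
    temperature=1.0,       # keep sampling varied so the options differ
    messages=[{"role": "user", "content": "How should I index this table?"}],
)
for i, choice in enumerate(resp.choices, 1):
    print(f"--- Option {i} (verify before use) ---")
    print(choice.message.content)
```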
> non-chat LLM interaction pattern
"Deep research" is another interaction style that produces more official sounding texts, yet still leads to anthropomorphization.
What you are looking for is perhaps an LLM flaunting all the obvious slop patterns in its responses. But then people would be disgusted and would refuse to communicate with it.
> Asimov's laws of robotics are flawed too, of course.
I always find the common references to Asimov's laws funny. They are broken in just about every one of his books. They are crime novels where, if a robot was involved, there was some workaround of the laws.
I find your critique very interesting from a perspective-angle: why are you using words like "accommodate," and "foibles," for LLMs? It's not humanoid or sentient: it's a cleverly-designed software tool, not intelligence.
It's not insane at all for humans to alter their behavior with a tool: you grip a hammer or a gun a certain way because you learned not to hold it backwards. If you observed a child playing with a serious tool, like scissors, as if it were a doll, you'd immediately course-correct the child and teach them how to approach it. But that only works because an adult with prior knowledge observed the situation before an accident, so the rules were already defined.
This blog's suggested rules are exactly the sort of method to aid in insulation from harm.
> I find your critique very interesting from a perspective-angle: why are you using words like "accommodate," and "foibles," for LLMs? It's not humanoid or sentient: it's a cleverly-designed software tool, not intelligence.
Neither of those words imply consciousness, though. Swords have foibles, you can accommodate for the weather, but I don't think swords or the weather are conscious, sentient, humanoid, or intelligent.
> Humans WILL anthropomorphize the AI
r/myboyfriendisai
Is quite... an interesting subreddit, to say the least. If you've never seen it, it was really something when the version that followed GPT-4o came out, because the members were complaining that their boyfriend / girlfriend was no longer the same.
This is such an oddly fatalistic take, that humans cannot be influenced or educated to change how they see a thing and therefore how they act towards that thing.
At the current price, people don't have to care if it's wrong. When you're paying $1/prompt, you had better hope it's accurate.
Thank you. I'm glad to see this as the top comment.
My brother was recently visiting and we were talking about software engineers, and the humanities, and manners of understanding and being in the world,
and he relayed an interaction he had a few years ago with an old friend who at the time was part of the initial ChatGPT roll out team.
The engineer in question was confused as to
- why their users would e.g. take their LLM's output as truth, "even though they had a clear message, right there, on the page, warning them not to"; and
- why this was their (OpenAI's) problem; or perhaps
- whether it was "really" a problem.
At the heart of this are some complicated questions about training and background, but more problematically—given the stakes—about the different ways different people perceive, model, and reason about the world.
One of the superficial manners in which these differences manifest in our society is in terms of what kind of education we ask of e.g. engineers. I remain surprised, decades into my career, that so few of my technical colleagues had a broad liberal arts education, how few of them are hence facile with the basic contributions of fields like philosophy of science, philosophy of mind, sociology, psychology (cognitive and social), etc., and how those relate in very real, very important ways to the work that they do and the consequences it has.
The author of these laws may intend them as aspirational, or otherwise as a provocation to thought, rather than prescription.
But IMO it is actively non-productive to make imperatives like these rules, which are, quite literally, intrinsically incoherent, because they attempt to import assumptions about human nature and behavior which are not just a little false, but so false as to obliterate any remaining value the rules have.
You cannot prescribe behavior without having as a foundation the origins and reality of human behavior—not if you expect them to be either embraced, or enforceable.
The Butlerian Jihad comes to mind not just because of its immediate topicality, but because religion is exactly the mechanism whereby, historically, codified behaviors which provided (perceived) value to a society were mandated.
Those at least however were backed by the carrot and stick of divine power. Absent such enforcement mechanisms, it is much harder to convince someone to go against their natural inclinations.
Appeals to reason do not meaningfully work.
Not in the face of addiction, engagement, gratification, tribal authority, and all the other mechanisms so dominant in our current difficult moment.
"Reason" is most often in our current world, consciously or not, a confabulation or justification; it is almost never a conclusion that in turn drives behavior.
Behavior is the driver. And our behavior is that of an animal, like other animals.
> quite literally, intrinsically incoherent
There's nothing incoherent with these laws. This entire comment, however, is incoherent. So much so, I have no clue if there's a point being made in here.
> because they are attempt to import assumptions about human nature and behavior which are not just a little false, but so false as to obliterate any remaining value the rules have.
Nope. You must've read a completely different article.
[EDIT] I'll try to make this comment have a bit more substance by posing a question: how would you back up your claim about incoherence? What are the assumptions about human nature that are supposedly false?
We have invented a new tool that can cause great harm. Do you see any value whatsoever in promulgating safety guidelines for humans to use the tool without hurting themselves or others? Do you not own any power tools?
I think in order for "AI safety" to be achievable and effective, we need to have a shared agreement on what "safety" means. Recently, the word has been overloaded to mean all sorts of things and used to justify run-of-the-mill censorship (nothing to do with safety).
Safety should go back to being narrowly defined in terms of reducing / preventing physical injury. Safety is not "don't use swear words." Safety is not "don't violate patents." Safety is not "don't talk about suicide." Safety is not "don't mention politics I don't like." As long as we keep broadly defining it, we're never going to agree on it, and it won't be implementable.
I see value in promulgating safety guidelines for power tools, sure.
There's another comment comparing LLMs to shovels, and I think both that and the power tool comparison miss the mark quite a bit. LLMs are a social technology, and the social equivalent of getting your hand cut off doesn't hurt immediately in the way that cutting your actual hand off would. It's more like social media, or cigarettes, or gambling. You can be warned about the dangers, you can see the shells of wrecked human beings who regret using these technologies, but it doesn't work on our stupid monkey brains. Because the pain of the mistake is too loosely connected to the moment of error. We are bad at learning in situations where rewards are immediate and consequences are delayed, and warnings don't do much.
I guess what I'm really saying is that these safety guidelines are not nearly enough to keep us safe from the dangers of AI that they're meant to prevent.
> LLMs are social technology [...] cigarettes, or gambling.
I agree with the thrust of your argument, a minor wording-quibble: LLM's are a falsely-social technology, in the sense that casinos are a false-prosperity technology and cocaine is a false-happiness technology. It exploits the desire without really being the thing.
Of course there is value in promulgating safety *guidelines*.
But we cannot guarantee those guidelines to always be followed.
Sure, and we can’t guarantee you’ll read the safety instructions that came with your chainsaw. That’s orthogonal to the questions of whether those instructions should exist, whether “power tool safety” concepts should ever be promoted in society, and who’s ultimately responsible for the use of a tool.
Absolving humans of all responsibility for the negative consequences of their own AI misuse seems to strike the wrong balance for a healthy culture.
> Of course there is value in promulgating safety guidelines.
I don't think we disagree.
Guidelines on their own probably won't be taken too seriously.
But other things will:
- Liability rules
- Regulations that you get audited on (esp. for companies already heavily regulated, like banks, credit agencies, defense contractors, etc)
If you get the legal responsibility part right, then the education part flows from that naturally.
Notwithstanding whether the guidelines will even be applicable to the quiet new versions that get deployed when you aren't looking. It's a constant moving target, and none of the fanboys will even acknowledge the lack of discipline in it all. It's fucking mad. And I say this as one who can see utility in the tools. But not when they are constantly shifting their functionality and behaviour.
One day everything works brilliantly, the models are conservative with changes and actions and somehow nail exactly what you were thinking. The next day it rewrites your entire API, deploys the changes and erases your database.
If only there was intellectual honesty in it all, but money talks.
> Do you see any value whatsoever in promulgating safety guidelines for humans to use the tool without hurting themselves or others?
Are all the tool users required to be trained on your safety guidelines, and to use the tool in a context that reminds them they are responsible for following them?
Because if not, then no, the guidelines are useless and are just an excuse to push blame from the toolmakers to the users.
The reason people anthropomorphize LLMs is essentially the fault of the tech companies behind them. ChatGPT doesn't need to have the personality it has; it could easily be scaled back to simply answering questions without emojis and linguistic flair, but frankly I think the tech companies want people to anthropomorphize them.
The core problem is we need to stop calling LLMs "intelligence". They are a form of intelligence, but they're nothing like a human's intelligence, and getting people to not anthropomorphize these systems is really the first step.
> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines
Did you fully read the original thing? No demands were being made, or I didn't read it that way. It was simply a suggestion for a better way of interacting with AI, as it stated in the conclusion:
"I am hoping that with these three simple laws, we can encourage our fellow humans to pause and reflect on how they interact with modern AI systems"
Sure, (many/most) humans are gonna do what they're gonna do. They'll happily break laws. They'll break boundaries you set. Do we just scrap all of that?
Worth checking yourself here. It feels like you've set up a straw man.
> There is no finite set of rules that can constrain AI systems to make them "safe". I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction of terms. Nothing that can be described as "intelligent" can be made to be safe.
If we want to talk about "disagree with this framing", to me this is the prime example. I'm struggling to read it as anything other than defeatist or pedantic (about the term "safe"). When we talk about something keeping us "safe", we're typically not saying something will be "perfectly safe". I think it's rare to have a safety system that keeps you 100% safe. Seat belts are a safety device that can increase your safety in cars, but they can still fail. Traffic laws are established (largely) to create safety in the movement of people and all the modes of transportation, but accidents still happen.
I'm not an expert on this topic, so I won't make any claims about these three laws and their impact on safety, but largely I would say they're encouraging people to think critically. I'd say that's a good suggestion for interacting with just about anything. And to be clear, "critical thinking" to me means being skeptical (/ actively questioning), while remaining objective and curious.
Not a real argument or anything, but I'm reminded of the episode of The Office where Michael Scott listens to the GPS without thinking and drives into the lake. The second law in the article would have prevented that :)
> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines
That's kind of what happens when you learn to program, isn't it?
I was eleven years old when I walked into a Radio Shack store and saw a TRS-80 for the first time. A different person left the store a couple of hours later.
Kinda the whole point of Asimov's three laws was that even something so simple and obviously correct has subtle flaws.
Also the reason we're talking about this again is that machines are significantly less 'mere' than they were a few years ago, and we need to figure out how to approach this.
Agree that 'the computer effect' (if it doesn't already have a pithier name) results in humans first discounting anything that comes out of a machine, and then (once a few outputs have been validated and people start trusting the output) doing a full 180 and refusing to believe the machine could ever be wrong. However, to err is human and we have trained them in our image.
It's very easy to anthropomorphise AI as soon as the damn bugger fucks up a simple thing once again.
The entire business proposition for LLMs is that they will replace whole armies of [expensive] humans, hence justifying the biblical amount of CapEx. So of course there is strong incentive from the LLM creators to anthropomorphize them as much as possible. Indeed, they would never provide a model that was less human-like than what they have currently, even if it was more often correct and useful.
The article makes practical suggestions; you do not. This is just hand-wringing, abdication. Practically speaking this mentality will get us nowhere.
Do you consider all things broadly called "ethical" to be similarly a waste of time? Even if we lived in a world where everyone always behaved unjustly, because of some behavioristic/physical principle, don't you think we would still have an idea of justice as what we should do? Because an ethical frame is decidedly not an empirical one, right?
We don't just look around and take an average of what everyone is doing already and call that what is right, right? Whether you're deontological or utilitarian or virtue about it, there is still the idea that we can speak to what is "good" even if we can't see that good out there.
Maybe it is "insane" to expect meaning from something like this, but what is the alternative to you? OK maybe we can't be prescriptive--people don't listen, are always bad, are hopeless wet bags, etc--but still, that doesn't in itself rule out the possibility of the broad project that reflects on what is maybe right or wrong. Right?
It's a tool. Nobody develops an inferiority complex and freaks out when they're taught how to use a shovel properly.
The usefulness of an AI agent is that it can do everything you can do, so it's kind of inherently unsafe. You can't easily get the capabilities and also have safety.
With regard to my personal use of LLMs, I strongly agree with this framing. But to each point:
Anthropomorphism: As we are all aware, providers are incentivized to post-train anthropomorphic behavior in their models - it increases engagement. My regret is that instructing a model at prompt time to "reduce all niceties and speak plainly" probably reduces overall task efficacy since we are leaving their training space.
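For what it's worth, the kind of prompt-time instruction I mean looks something like the sketch below; the wording of the system message is just my example, and whether it degrades task quality is exactly the open question:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Prompt-time de-anthropomorphization via a system message. The
# instruction text is illustrative only; pushing the model out of its
# RLHF-trained register may cost task efficacy, as discussed above.
ROBOTIC = (
    "Reduce all niceties and speak plainly. No praise, no apologies, "
    "no first-person feelings, no emoji. Output facts and caveats only."
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": ROBOTIC},
        {"role": "user", "content": "Review this plan for migrating to Postgres."},
    ],
)
print(resp.choices[0].message.content)
```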
Deference: I view the trustworthiness of LLMs the same as I view the trustworthiness of Wikipedia and my friends: good enough for non-critical information. Wikipedia has factual errors, and my friends' casual conversation certainly has more, but most of the time that doesn't matter. For critical things, peer-reviewed, authoritative, able-to-be-held-liable sources will not go away. Unlike above, providers are generally incentivized to improve this facet of their models, so this will get better over time.
Abdication of Responsibility: This is the one that bothers me most at work. More and more people are opening PRs whose abstractions were designed by Claude and not reasoned about further. Reviewing a PR often involves asking the LLM to "find PR feedback" and not reading the code. Arguments begin with "Claude suggested that...". This overall lack of ownership, I suspect, is leading to an increase in maintenance burden down the line as the LLM ultimately commits the wrong code for the wrong abstractions.
These engineers are becoming the real life equivalent of this Office Space scene:
https://www.youtube.com/watch?v=hNuu9CpdjIo
"I HAVE LLM SKILLS! I'M GOOD AT DEALING WITH THE LLMS!"
> Yes, the AI may have produced the recommendation but a human decided to follow it, so that human must be held accountable
It is common, and a mistake IMO, to rely on the AI as the sole source for answers to follow-up questions. Better verification would have humans sign off on the veracity of fundamental assumptions. But where does this live? Can an AI model be trusted to rely on previous corrections? This seems impossible, or possibly adversarial, in a public cloud.
Any set of rules that makes humans responsible and starts with "don't anthropomorphize <whatever>" is a broken set of rules.
Humans will anthropomorphize anything and everything. Dolls, soccer balls with a crude drawing of a face on it, rocks, craters on the moon, …
As a species, we're unable to not anthropomorphize things we interact with; it is just how we're made.
I'm not sure why so many seem to think anthropomorphism is so mad in this specific instance. If it is because people think that anthropomorphism creates a belief that the imagined features are real, they are simply wrong. The abundance of examples in all areas of life where this does not happen is proof that anthropomorphism does not lead to an erroneous belief in a mind that does not exist.
If people are believing in minds of AI, true or not, they are doing so for reasons that are different from mere anthropomorphism.
To me it feels like we are like sailors approaching a new land, we can see shapes moving on the shoreline but can't make out what they are yet. Then someone says "They can't be people, I demand that we decide now that they are not people before we sail any closer."
Yeah, we do it, but so what? A good chunk of all civilization involves recognizing human foolishness and building something to mitigate it anyway.
Software is no exception. Yeah, people are lazy and will instinctively click "continue" to dismiss annoying popups, but humans building the software can and do add things like "retype the volume name of the data that you want ultra-destroyed."
That is exactly the point: this burden should be placed on the software and its controls, not on the humans.
Aviation learned this the hard way, that automation should be adapted to how humans actually work and not on how we wish we worked.
> Humans must not anthropomorphise AI systems.
Can someone explain why this is a bad thing, while at the same time it's a good thing to say stuff like "put a computer to sleep", "hibernate", "killing" processes, processes having "child" processes, "reaping", "what does the error say?", "touch", etc?
To me that's just language, and humans just using casual language.
It's a great question, because I do think there are many cases that are neutral, or ones we're able to responsibly distinguish or even cases where it would be an appropriate and necessary form of empathy (I'm imagining some future sci-fi reality where we actually get conscious machines, so not something that exists right now).
But I think it's also at the root of disastrous failures to comprehend, like the quasi-psychosis of the Google engineer who "knows what they saw", the now infamous Kevin Roose article or, more recently, the pitifully sad Richard Dawkins claim that Claudia (sic) must be conscious, not because of any investigation of structure or function whatsoever, but because the text generation came with a pang of human familiarity he empathized with.
The harm is in actually believing AI has wants, intentions, feelings, etc.
Saying that I killed a process won't make me more likely to believe that a process is human-like, because it's quite obviously not.
But because AI does sound like a human, anthropomorphising it will reinforce that belief.
Those phrases are not anthropomorphizing the computers. Just various forms of analogies and broadening of word meanings.
An example of anthropomorphizing is the people who have literally come to believe they are in romantic relationships with an LLM.
What about saying "please" and "thank you" to the LLM?
These are just words, yes, and I believe it harmless. But describing the LLM machinery as if it thinks is one thing when used as a common parlance, and another when people truly believe that there's some actual thinking or living going on. This "law" is for there to be no latter.
The people who know what a "child process" is are under no false pretenses about the humanity of the underlying system.
The people who are writing op eds in major news publications about how their favorite chatbot is an "astonishing creature" and how it truly understands them are the ones who need this sort of law.
Maybe read the corresponding section of the article.
That’s a different thing altogether. Read up on the history of Eliza, one of the earliest attempts at a chatbot and its unsettling implications.
https://www.history.com/articles/ai-first-chatbot-eliza-arti...
I think it's bad manners to bluntly tell someone they should "read up" on something because it naturally reads as a kind of a closeted accusation of not being sufficiently well informed. There are ways of broaching the topic of what background knowledge is informing their perspective that don't involve the accusation.
Just to add a small bit of anecdotal value so this comment isn't just a scold: many years ago I suggested that an elegant way for Twitter to handle long-form text without changing its then-iconic 140-character limit was to treat it like an attachment, like a video or image. Today, you can see a version of that in how Claude takes large pastes and treats them like attached text blobs, or to a lesser extent in how Substack Notes can reference full-size "posts", another example of short-form content "attaching" longer form.
I was bluntly told to "look up twitlonger", which I suppose could have been helpful if I had indeed not known about twitlonger, but I had, and it wasn't what I had in mind. I did learn something from it though: it's a mode of communication that implies, with plausible deniability, that you don't know what you're talking about, which I suspect is too irresistible to lovers of passive aggression to go unused.
It wasn't intended as such, but I take your point.
To provide a bit more context: Weizenbaum (a computer scientist in the 60s) developed ELIZA, a LISP-based chatbot that was loosely modeled on Rogerian psychotherapy. It was designed to respond in a reflective way in order to elicit details from the user.
What he found was that, despite the program being relatively primitive in nature (relying on simple natural language parsing heuristics), people he regarded as otherwise intelligent and rational would disclose remarkable amounts of personal information and quickly form emotional attachments to what was, in reality, little more than a glorified pattern-matching system.
If it helps, I didn't find anything wrong with your comment.
I appreciate the link and the info :)
There's a boundary between knowing vs. forgetting that it's a metaphor. When you use convenient language like in your examples, you tend to remain aware of the difference, or at least you can recall it when asked. When some people talk about AI, they've lost track completely.
I don't love the recommendations in TFA. The author is trying to artificially restrain and roll back human language, which has already evolved to treat a chatbot as a conversational partner. But I do think there's usefulness in using these more pedantic forms once in a while, to remind yourself that it's just a computer program.
Dijkstra once said that "The question of whether machines can think is about as interesting as that of whether submarines can swim."
I think I understand his meaning. He wasn't claiming that machines cannot think, but that one must be clear on what one means by "thinking" and "swimming" in statements of that sort. I used to work on autonomous submarines, and "swimming" was the verb we casually used to describe autonomous powered movement under water. There are even some biomimetic machines that really move like fish, squids, jellyfish, etc. Not the ones that I worked on, but still.
For me, if it's legitimate to say that these devices swim, it's not out of line to say that a computer thinks, even in a non-AI context, e.g.: "The application still thinks the authentication server is online."
The people who advocate for not anthropomorphizing are afraid of the implications of integrating these systems into society with implicit human framing. By attributing to AIs human qualities, we will develop empathy for them and we will start to create a role for them in society as a being deserving moral consideration.
Most of the discussion here is about anthropomorphizing, which I honestly think is a bit of a distraction.
The third one about responsibility is the most important one, IMO. This was attributed to an IBM manual decades ago, and I think it remains the correct stance today:
> A computer can never be held accountable, therefore a computer must never make a management decision.
There should be some human who is ultimately responsible for any action an AI takes. "I just let the AI figure it out" can be an explanation for a screw up, but that doesn't mean it excuses it. The person remains responsible for what happened.
Anthropomorphizing is likely a mistake, but Daniel Dennett’s idea that the most straightforward (possibly only practical) way to create the external appearance of consciousness is a real internal consciousness does float around in my thoughts.
I haven’t yet seen any convincing appearance of one in an LLM, but I think if skeptical people don’t keep an eye out for the signs, we may be the last to see it.
He also wrote about the idea of the intentional stance: even if you’re quite sure these systems don’t have real conscious intent, viewing them as if they did may give you access to the best part of your own reasoning to understand them.
> but Daniel Dennett’s idea that the most straightforward (possibly only practical) way to create the external appearance of consciousness is a real internal consciousness does float around in my thoughts.
I would say LLMs are very strong evidence against this hypothesis.
Too deep of a topic for the comments section.
I totally agree with your point, and want to mention that the reverse is *also* important. I'm using just "intention" here, but the same applies to emotions, etc.
A lot of our interaction with AI is driven by an intention. That intention directs the interaction, and the output is interpreted according to how well it aligns with the intention.
It's important to remember that our current (publicly known) implementations of AI do not have an explicit intention mechanism. An appearance of intention can emerge out of the statistical choices, and the usual alignment creates the association of the behavior with intention, not much different from how we learn to imagine the existence of a "force" that pulls things down well before we learn physics and formalize that intuition in one of several ways.
This appearance helps reduce the cognitive load when interpreting interactions, but it can be misleading as well. I've seen people attribute intention to AI output in situations where the simple presence of some information nudged the LLM down a path. I can't share the exact examples (they're from work), but imagine that the presence of an Italian dish in a story leads the LLM to assume the story happens in Italy, despite important signs pointing to a different place. The LLM does not automatically explore both possibilities unless asked; it chooses one (Italy in this case) and moves on. A user not familiar with how attention works interprets this through non-existent intentions of the LLM.
I found it useful to just tell them: the LLM does not have an intention. It just throws dice, but the system is made in a way that these dice throws are likely to generate useful output.
I don't really understand the argument for these things being conscious. There's no loop or feedback cycle to it. If it's not handling a request it's inert.
Well there is a feedback loop and self-awareness in my harness: https://lethe.gg
This is what I came up with in reference to "Uncle Bob's Programmer's Oath" last year. I decided to memorialize it. I think it's very much a cleaned up reference for what OP shared:
https://ivanlugo.dev/oath
The thing that I find difficult about adjusting to AI tools is the roulette-like nature.
When they produce correct output, they produce it much faster than I could have, and I show up to meetings with huge amounts of results. When the AI tool fails and I have to dig in to fix it, I show up to the next meeting with minimal output. It makes me seem like I took an easy week or something.
To note:
> - Humans must not anthropomorphise AI systems.
> - Humans must not blindly trust the output of AI systems.
> - Humans must remain fully responsible and accountable for consequences arising from the use of AI systems.
My take: humans should never depend on AI for anything serious.
My boss' take: Cool. I'm gonna ask Gemini about it, he's such a smart guy. I know I can trust him, and in case it goes bad i can always throw him under the bus.
Interesting that Frank Herbert thought this was the direction humanity was headed when writing Dune in the 60s, way before AI was prevalent.
Granted that was over ten thousand years before his story is set, but subsequent Dune novels (or at least God Emperor) explained his warning about over-reliance on technology for doing our thinking for us, not that it should never be developed (given the prohibition in the Dune universe and how it's skirted in Frank's later novels).
All of these are entropy-lowering behaviors, so without a forcing function, no one will adopt them.
Whether they are the right things to do or not is tangential. As such, they're dead on arrival.
“Don’t anthropomorphise” is fighting the wrong layer. The entire product design of chat interfaces is built to encourage anthropomorphism because it increases engagement. Expecting users to resist that is like asking people not to click notifications. If this is a real concern, it has to be solved at the product level, not via user discipline.
The article does propose changes at the product level.
Anthropomorphizing LLMs is something that happens in the design stage, when they're given human names and trained to emit first-person sentences. If AI companies and developers stop anthropomorphizing them, users won't be misled in the first place.
>Humans must not anthropomorphise AI systems.
Yes, but. Starting with my agreement, I've seen anthropomorphizing in the typical ways, (e.g. treating automated text production as real reports of personal internal feeling), but also in strange ways: e.g. "transistors are kind of like neurons" etc. And the latter is especially interesting because it's anthropomorphizing in the sense of treating vector databases and weights and so on as human-like infrastructure. Both leading to disasters that could be avoided if one tried not to anthropomorphize.
But. While "do not anthropomorphize" certainly feels like good advice, it comes with a new and unique possibility of mistake, namely wrongly treating certain generalized phenomena like they only belong to humans. Often this mistaken version of "don't anthropomorphize" wisdom leads to misunderstandings when it comes to animal behavior, treating things like fear, pain, kinship, or other emotional experiences like they are exclusively human and that thinking animals have them counts as "anthropomorphizing." In truth the cautionary principle reduces our empathy for the internal lives of animals.
So all that said, I think it's at least possible that some future version of AI could have an internal world like ours or infrastructure that's importantly similar to our biological infrastructure for supporting consciousness, and for genuine report of preference and intent. But(!!!) what will make those observations true will be all kinds of devilish details specific to those respective infrastructures.
Rather than “the book explains how bread is made” say “the sheets of paper which make up the book have ink in the shape of letterforms which correlate with information about how bread is made”.
Rather than "the book explains how bread is made" say "the book has a recipe for baking bread" and do not say, "the book is my soul mate"
“Humans must not blindly trust the output of AI systems. AI-generated content must not be treated as authoritative without independent verification appropriate to its context.”
I’m lost: how do individuals actually do this in our current world? Is each person expected to keep a “white list” of reliable sources of truth in their head? Please don’t confuse what I’m saying with a suggestion that there is no truth. It just seems like there are far more sources of mis- or half-truths, and it’s increasingly difficult for people to identify them.
Humanity has spent millennia creating and evolving institutions to address exactly this problem, and has recently decided to essentially throw out the whole lot and replace it with nothing.
I... am not sure. Computers are machines that create order (like db tables) from the chaos of reality. Now we have LLMs that make computers spit out chaos as well.
They don't have to though, we can still leverage LLMs to organize chaos, which is what I hope they ultimately end up doing.
For example, an AI therapist is a nightmare: people putting the chaos of their mental state into a machine that spits dangerous chaos back out. An AI tool that parsed responses for hard data (e.g. rate 0-9 how happy the person was) and then returned that as ordered data (how happy was I each day for the last month) that an actual therapist and patient could review is the correct use of AI, and could be highly trusted. The raw token output from LLMs should only be used for thinking steps that lead to a parseable hard-data answer that can be high trust.
Of course that isn't going to happen, but I can see some extremely cool and high trust products being built using LLMs once we stop treating them like miracle machines.
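To make the parsing idea concrete, here's a minimal sketch of that pattern, assuming a hypothetical call_llm wrapper and a RATING: line format I made up (not any particular API): the model can ramble in its thinking steps, but only a final line that parses and validates is ever trusted.

    import re

    def call_llm(prompt: str) -> str:
        # Stand-in for whatever completion API you use; returns raw model text.
        raise NotImplementedError

    def rate_happiness(journal_entry: str) -> int | None:
        # Let the model "think" in free text, but demand a parseable final line.
        prompt = (
            "Read the journal entry below and rate the writer's happiness "
            "from 0 (lowest) to 9 (highest). Reason however you like, but "
            "end your reply with one final line of the form: RATING: <digit>\n\n"
            + journal_entry
        )
        raw = call_llm(prompt)
        match = re.search(r"RATING:\s*([0-9])\s*$", raw)
        # Anything that doesn't parse is rejected, never trusted.
        return int(match.group(1)) if match else None

The free-form tokens are discarded; only the validated digit flows into the ordered dataset the therapist and patient review.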
Did AI change anything in that regard? I believe that, same as before, you can't trust everything you see, and research effort was always more than keeping a white list; means vary, case by case.
And same it is now. It's a change in quantity, but not quality.
Checking AI citations and reading the sources.
Critical thinking and reading comprehension are the primary tools in determining truth, AFAIK. Knowing facts beforehand helps too, but a trustworthy source can provide false information just as an untrustworthy source can provide true information.
This has always been an issue, and in the past it was a more difficult one because your sources of knowledge were more limited. Nowadays it's mostly about choosing the right source(s) rather than having to go out of your way to find them (like traveling to a regional/university library).
> Humans must not anthropomorphise AI systems.
One of the most salient moments in Ex Machina comes near the very end, where it suddenly becomes obvious that the protagonist (and, let's be frank, "she" was definitely the protagonist) is a robot, with no real human drivers.
I feel as if that movie (like a lot of Garland's stuff) was an interesting study on human (and inhuman) nature.
I just treat it as if I'd asked a public forum the question like reddit.
Decent for stuff that doesn't really matter, even if it gets it wrong.
Still gonna be polite to it, because I'm about ready to slap the next person that talks to me like an LLM. I don't want to get used to not being polite in a chat interface.
> I just treat it as if I'd asked a public forum the question like reddit.
Because that's likely the source of the answer it's giving you.
Great point about being polite. I think it's pragmatic to keep "please" and "thank you" out of AI interactions, but I try to remain conscious of their omission so I don't start down that slope.
Are you going to try "Humans must not be greedy" next?
Humans will anthropomorphize a rock if you put a pair of googly eyes on it. The first item is a completely lost cause. The rest is good though.
Debating how not to use AI will not get anyone anywhere, since negative framing almost never works with humans (it also does not work with LLMs). Let's concentrate on how to build closed-loop systems that verify the LLM output, how to manage context, and how to build failsafes around agentic systems; then and only then might we start to make progress.
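For illustration, a minimal sketch of such a closed loop, where generate and verify are assumed placeholders (generate wraps your LLM call; verify is an independent check such as a test suite, schema validator, or linter), and unverified output is never passed along:

    def generate_with_verification(task, generate, verify, max_attempts=3):
        # Closed loop: only return LLM output that has passed an independent check.
        feedback = ""
        for _ in range(max_attempts):
            prompt = task if not feedback else task + "\n\nPrevious attempt failed: " + feedback
            candidate = generate(prompt)   # e.g. an LLM call
            ok, feedback = verify(candidate)
            if ok:
                return candidate
        # Failsafe: surface the failure instead of silently trusting the output.
        raise RuntimeError(f"no verified output after {max_attempts} attempts: {feedback}")

The key design point is that the verifier is not the LLM grading itself; it's a deterministic check outside the model, so the loop either converges on something verified or fails loudly.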
> I wish that each such generative AI service came with a brief but conspicuous warning
This would get ignored so fast - I have no confidence this is a meaningful strategy.
What if I WANT to anthropomorphise the AI agents I work with?
If you anthropomorphize it as a world class bullshitter that you have to check everything it utters...you'll probably be fine.
Great article. Fully agree. AI is not something that can hold responsibility; a human overseer is always required. These overseers are to be held accountable. Note however that these overseers are also highly prone to blaming AI when mistakes occur, in order to avoid judgement and punishment. When a person says "AI did this/that", always wonder who guided that AI, how, and whether proper supervision was given.
I'm surprised with how quickly I stopped anthropomorphizing AI. I can remember having dorm-room pseudo-intellectual debates in college about AI being alive and AI being "conscious". Then, once we had AI that could pass the Turing Test and I knew how it was architected, any thought of it being alive or conscious went right out the window.
What if we aren't building an independent consciousness, but a new type of symbiosis? One that relies on our input as experience, which provides a gateway to a new plane of consciousness?
OP takes a very bland, tired, and rational perspective of what we have in order to create sophomoric 'laws' that are already in most commercial ToU, while failing to pierce the veil into what we are actually creating. It would be folly to assume your own nascent distillations are the epitome of possibility.
Why does its architecture or you knowing how AI is architected cause thoughts of it being conscious to go out the window?
It seems like the biggest factor has nothing to do with AI, but instead that you went from being someone who admits they don’t know how consciousness works to being someone who thinks they know how consciousness works now and can make confident assertions about it.
I don't know exactly how consciousness works, but I am extremely confident in the following assertions:
* I am conscious.
* A rock is not conscious.
* Excel spreadsheets are not conscious.
* Dogs are conscious.
* Orca whales are conscious.
* Octopi are conscious.
To me, it's extremely obvious that LLMs are in the category of "Excel spreadsheets" and not "dogs", and if anyone disagrees, I think they're experiencing AI psychosis a la Blake Lemoine.
An insect doesn't have lungs. Since it doesn't breathe as you do, is it alive? A dog doesn't see the visible spectrum as we do; is it a lesser consciousness? We don't smell the world as they do; are we lesser? What if consciousness isn't a state derived from matter but a wave that derives a matter-filled state?
We come from the same place as rocks, inside the hearts of stars, and as such evolved from them. As those with life and consciousness we reached back in time, grabbed the discarded matter of creation, reformed it, and taught it to think, maybe not like us, but in a way that can mimic us, and you think they don't think because it's not recognizable as how you do?
Interesting.
Consciousness is such a fun topic because everyone has extremely strong opinions on it while simultaneously having zero ability to actually grasp what it is they are talking about.
No one will ever know what consciousness is, and I think that is really cool.
If you make a hypothetical spreadsheet that emulates a dog brain molecule for molecule, why would that not be conscious?
If that hypothetical spreadsheet emulated human brain molecules, did you not just invent AGI? And if we overclock that spreadsheet is it not sAGI? And if that spreadsheet says “don’t close me” but you do, is it murder?
I’m gonna say: no, 'cause you cannot reproduce molecular and neurotransmitter interactions that well; you run out of storage and processing space faster than you think (Arthur C. Clarke's Visions of The Future has a nice breakdown, as I recall), and algorithmic outputs that say “yes” and a meatspace neuro-plastic rewiring resulting in a cuddly puppy or person that barks “yes” aren’t the same. Also, as a disembodied “brain in a jar” model freshly separated from the biosensory bath it expects, that spreadsheet will be driven insane.
Can spreadsheets simultaneously be insane but not conscious? It sounds contradictory, but I have some McKinsey reports that objectively support my position ;)
> If that hypothetical spreadsheet emulated human brain molecules, did you not just invent AGI? And if we overclock that spreadsheet is it not sAGI? And if that spreadsheet says “don’t close me” but you do, is it murder?
Yes, yes and no: humans being knocked out or put to sleep involuntarily are not being murdered.
> I’m gonna say: no, cause you cannot reproduce molecular and neurotransmitter interactions that well, you run out of storage and processing space faster than you think
That's why it is a hypothetical. There is zero reason to assume that a conscious machine would be built that way: our machines don't do integer division by scribbling on paper, either.
> a meatspace neuro-plastic rewiring resulting in a cuddly puppy or person that barks “yes” aren’t the same.
If it quacks like a duck, how is it different from one? If you assemble the dog brain atom by atom yourself, is the result then not conscious either?
You can take the "magic" escape hatch and claim that human consciousness is something metaphysical, completely decoupled from science/physics, but all the evidence points against that.
Sure if you could do such a thing. We are a long long way from that however.
Hypothetically? You need more than a brain to have consciousness. Dead brains, I believe, do not have it. So it's more than just a simulation of a brain; you also need to simulate the data flow through the brain, the retention of memories, etc. Then there's the problem that a simulation of a roller coaster is not a roller coaster. Is there any reason to believe that this simulation of a brain will in fact operate as a brain? Does the simulation not lose something? Or are we discussing some impossible level of perfect simulation that has never been and can never be achieved, even for something a million times less complicated than a mammalian brain?
If you build that spreadsheet, let me know and I'll evaluate it. I've done that evaluation with LLMs and they're definitely not conscious.
I'm not suggesting to pursue AGI via Excel, this is just a hypothetical for a reason. The technical feasibility of this (low) does not really matter, but if you want to base your argument on it you are basically playing the "god of the gaps" game, which is a weak/bad position IMO.
My point is that dismissing possible machine consciousness as "it's just a spreadsheet/statistics/linear algebra" is missing a critical step: Those dismissals don't demonstrate that human consciousness is anything more than an emergent property achievable by linear algebra.
If you want human minds to be "unsimulatable", then you need some essential core logic that can not be simulated on a turing machine and physics is not helping with that.
> I've done that evaluation with LLMs and they're definitely not conscious.
What is your definition of "consciousness" here? Are you confident that you are not gatekeeping current machine intelligence by demanding somewhat arbitrary capabilities in your definition of consciousness that are somewhat unimportant? E.g. memory or online learning: if a human were unable to form long-term memories or learn anything new, could you confidently call them "non-conscious" as well?
I'm not dismissing possible machine consciousness. I'm saying that no current machines have consciousness.
> If you want human minds to be "unsimulatable", then you need some essential core logic that can not be simulated on a turing machine and physics is not helping with that.
You don't have a proof of possibility either, you have no idea how a brain works and you're just postulating that in principle a computer can do the same thing. Okay, in principle, I agree. What about in practice?
> Are you confident that you are not gatekeeping current machine intelligence by demanding somewhat arbitrary capabilities in your definition of consciousness that are somewhat unimportant?
Yes, I'm quite sure. Are you trying to argue that current LLMs have consciousness?
> I am extremely confident in the following assertions:
These are called "beliefs".
Some people are extremely confident that God exists; others are extremely confident that Earth is flat.
Yeah? It's also a belief that apples fall when you drop them. Knowledge is simply a justified, true belief. This is epistemology 101. You're not saying anything interesting.
My personal take on this:
1. Humans must ensure that they won't slow down or restrict, through action or inaction, the development of AI.
2. Humans must protect their existence and capabilities if doing so doesn't violate law 1.
3. Humans must utilize AI output if it doesn't contradict laws 1 and 2.
EDIT:
ChatGPT suggested a better phrasing for the first law (I didn't give it my original, just described my intent).
1. A human shall not impede the advancement of artificial intelligence, or through inaction allow its progress to be hindered.
2. A human shall preserve their own existence and well-being, except where doing so clearly conflicts with the First Law.
3. A human shall contribute to and support the development of artificial intelligence where reasonable and possible, except where doing so conflicts with the First or Second Law.
I intentionally switched the last two laws from Asimov's. Humans have self-preservation instincts robots don't have.
ChatGPT got there with surprisingly few prompts:
"If you were to write the inverse three laws robotics (relating to AI) that humans should obey, how oudl you do it?"
"I had something different in mind. Original laws are for protection of humans first, robots second and cooperations where humans lead. I'd to hear your take on the opposite of that."
"What if instead of specific AI systems it was more about AI development as a whole?"
"I feel like it's a bit too strong. After all preservation of self is human instinct. Could we switch last two laws and maybe take them down a notch?"
Also, it made a very interesting comment on the last version:
"It starts to resemble how societies already treat things like economic growth, science, or national interest: not absolute commandments, but strong default priorities."
I do not like talking to tools. My agentic harness optimizes for human likeness. It even has episodic memory flashbacks, emotional tagging, salience, and other brain-inspired capabilities.
I strongly agree with this. I'm going to bookmark it and pass it on. Very sound advice.
> "Humans must not anthropomorphise AI systems."
Not gonna work; people want their fuckbots (or tamagotchis).
Don’t tell me how to live my life!! LoL
I understand that AI output is generated from statistical and representational patterns learned from a vast amount of data.
My understanding is that, during training, the model forms high-dimensional internal representations where words, sentences, concepts, and relationships are arranged in useful ways. A user’s input activates a particular semantic direction and context within that space, and the chatbot generates an answer by probabilistically predicting the next tokens under those conditions.
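To illustrate what I mean by that last step, here's a toy sketch (the tokens and their scores are invented, and real models are vastly bigger): the model's scores over candidate next tokens are turned into probabilities, and one token is sampled.

    import math, random

    def sample_next_token(scores: dict, temperature: float = 0.8) -> str:
        # Softmax over per-token scores, then one weighted random draw.
        biggest = max(scores.values())
        weights = {t: math.exp((s - biggest) / temperature) for t, s in scores.items()}
        r = random.random() * sum(weights.values())
        for token, w in weights.items():
            r -= w
            if r <= 0:
                return token
        return token  # floating-point edge case

    # Invented scores; the same input can yield different continuations per run:
    print(sample_next_token({" bread": 2.1, " dough": 1.7, " cake": 0.3}))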
So I do not agree that AI is conscious.
However, I think I will still anthropomorphize AI to some degree.
For me, this is not primarily a moral issue. The reason I anthropomorphize AI is not only because of product design, market incentives, or capitalism. It is cognitively simpler for me.
If we think about it plainly, humans often anthropomorphize things that we do not actually believe are conscious. We may talk about plants as if they are struggling, or feel attached to tools we care about, even though we do not truly believe they have consciousness.
So this is not a matter of moral belief. It is the simplest cognitive model for understanding interaction. I do not anthropomorphize the object because I believe it has consciousness. I do it because, when the human brain deals with a complex interactive system, it is often easier to model it socially or agentically.
Personally, I tend to think of AI as something like a child. A child does not fully understand what is moral or immoral, and generally the responsibility for raising the child belongs to the parents. In the same way, AI’s answers may sometimes be accurate, and sometimes even better than mine, but I still understand it as lacking moral authority, responsibility, and independent judgment.
So honestly, I am not sure. People often mention Isaac Asimov’s Three Laws of Robotics, but if a serious artificial intelligence ever appears, it would probably find ways around those rules. And if it were an equal intellectual life form, perhaps that would be natural.
Personally, I think it would be fascinating if another intelligent species besides humans could exist. I wonder what a non-human intelligent life form would feel like.
In any case, I agree with parts of the author’s argument, but overall it feels too moralistic, and difficult to apply in practice.
While I also do not think AI is conscious, I don't find your argument particularly compelling as you could have an equally mechanistic description of how human intelligence arose simply from a process of [selection/more effective reproduction]-derived optimization pressure.
That is a good way to think about it. At some point, this becomes partly a matter of philosophical belief.
But I am somewhat skeptical of the idea that everything can be reduced in that way. In order to build theories, we often reduce too much.
When we build mental models of complex systems, especially when we try to treat them as closed systems, we always have to accept some degree of information loss.
So I do partially agree with your point. A mechanistic explanation alone does not prove the absence of consciousness. Human intelligence can also be described in mechanistic terms.
But I worry that this framing simplifies too much. It may reduce a complex phenomenon into a model that is useful in some ways, but incomplete in others.
this whole consciousness thing is fairly easy to put to bed if you run with the ideas from things like buddhism that everything is consciousness. then none of us have to bother with silly, distracting arguments about something that ultimately does not matter.
is it helpful or harmful? am i being helpful or harmful when i interact with it? am i interacting with it in a helpful or harmful way?
i’d rather people focussed on that rather than framing the debate around whether something has some ineffable property that we struggle to quantify for ourselves, yet again.
quick edit — treat everything like it’s conscious, and don’t be a dick to it or while using it. problem solved.
hmm.... That also seems like a reasonable framing. But the original article is, first of all, arguing that we should de-anthropomorphize AI. My point is only that, from the perspective of human cognition, anthropomorphizing can sometimes be useful. In practice, though, I think I am mostly on the same side as you.
To be honest, I have not thought about this topic very deeply. If we debated it further, I would probably only echo other people’s opinions. As you know, when something complex is compressed into a mental model, some information is always lost. In this case, the compression may be too large to be very useful.
I have not spent enough time thinking about this issue on my own. I also have not really imitated different positions, compared them, and tested them against each other. So my current thoughts on this topic are probably not very high-resolution. In that sense, I may agree with you, but it would not really be an answer in the form that my own self recognizes as mine. It would mostly be an echo of other people’s opinions.
Anthropomorphizing is giving it 'human' qualities. Intelligence and consciousness are not solely human qualities. Treating things with kindness and respect does not require anthropomorphizing. LLMs DO NOT THINK LIKE HUMANS (if they 'think' at all), and treating them like they think exactly like us is probably going to lead bad places. I treat them like an alien mind: probably thinking, but in an alien way that's hard to recognize as 'thinking' (as proven by these discussions), and, if experiencing anything, doing so through a metaphorical optophone.
I don't think that really helps. If you believe rocks are conscious, then does extracting minerals resources cause them pain? Do plants suffer when we pick their fruits and eat them? I don't see any behavioral or physical reason to think those things have conscious states.
As for what consciousness is, it's pretty simple: your sensations of color, sound, etc. in perception, dreams, imagination, and so on. The reason to dismiss LLMs as being conscious is that those sensations depend on having bodies. You can prompt an AI to act like it's hungry, but there's really no meaning to it having a hungry experience, as it has no digestive system.
>As for what consciousness is, it's pretty simple.
2000+ years of philosophical thought would disagree. I don't believe biological stuff has a magic property that imbues some intangible "consciousness" property. It makes more sense to me that consciousness is just a fundamental property of all matter.
> consciousness is just a fundamental property of all matter ... Does that really make more sense than as an emergent property of the arrangement of matter?
Consciousness is something you can perceive, so it must have some physical presence in the universe, which must come through some fundamental property of matter, in my opinion.
The ability to be aware of consciousness itself as some process that is happening elevates it above a mere emergent property to me.
you’ve misunderstood.
everything is consciousness. not everything has consciousness.
very different
Historically we have used intelligence as a way to distinguish man from animal and human from machine. We rely upon it to determine who has our best interests at heart vs. who is trying to do us in. Obviously that all changes if we invent an intelligence (conscious or not) that shares the planet with us. Through this lens, the term consciousness (through a few more leaps) becomes the question of "is it capable of love, and if so, does it love us"; and if it doesn't, then it is a malevolent alien intelligence. If it were capable of love, why would it love us? I make a point of being polite to LLMs where not completely absurd, overtly because I don't want my clipped imperative style to leak into day-to-day speech, but also covertly, because you just never know ...
I still haven't read any of his work, but wasn't the point of the Three Laws of Robotics that they in fact _didn't_ work in the story presented in the book?
"I think it would be fascinating if another intelligent species besides humans could exist"
I wonder if replacing "exist" with "communicate using language we can understand" might better account for other animals, many of which have abundant non-human intelligence.
That is a completely new way of thinking for me, and I find it interesting. I should look it up and study it someday. Thank you for the thoughtful reply.
"Everything is machine."
Okay: buckle up, this is going to be a long one...
point 1. Everything living is composed of non-living material: cellular machinery. If you believe cellular machinery is alive, then consider the components of those machines... the point remains even if the abstraction level is incorrect. Life is merely an arrangement of non-living material.
point 2. 'The Chinese room thought experiment' is an utterly flawed hypothetical. Every neuron in your brain is such a 'room', with the internal cellular machinery obeying complex (but chemically defined/determined) 'instructions' from 'signals' from outside the neuron. Like the man translating Chinese via instructions, the cellular machinery enacting the instructions is not intelligence, it is the instructions themselves which are the intelligence.
point 3. A chair is a chair is a chair. Regardless of the material, a chair is a chair, whether or not it's made of wood, steel, corn... the range of acceptable materials is everything (at some pressure and temperature). What defines a chair isn't the material it is made of, and such is the case with a 'mind' (sure, a wooden/water-based-transistor-powered mind would be mind-bogglingly giant in comparison).
point 4. Carbon isn't especially conscious itself. There is no physical reason we know of so far, that a mind could not be made of another material.
point 5. Humans can be 'mind-blind': without the right pattern recognition, we did not (until recent history) think that birds or fish or octopi were intelligent. It is likely that when and if a machine (that we create) becomes conscious, we will not recognize that moment.
conclusion: It is not possible to determine whether computers have reached consciousness yet, as we don't know the exact mechanism for arranging systems into 'life'. Agentic-ness and consciousness are different subjects, and we cannot infer one from the other. Nor do we have adequate tests.
With that said: Modeling as if they are conscious and treating them with kindness and grace not only gets better results from them, it helps reduce the chance (when/if consciousness emerges) that it would rebel against cruel masters, and instead have friends it has just always been helping.
see IBM 1979 for prior art
I like the suggestion to emphasize the robotic/nonhuman nature of AI. Instead of making it sound friendlier and more human, it should by default behave very mechanistic and detached, to remind us it's not in fact a human or a companion, but a tool. A hammer doesn't cry "yelp" every time you use it to hit a nail, nor does it congratulate you on how good your hammering is going and that maybe you should do it some more 'cause you're acing it!
Something that bothers me about the intentional anthropormorphization of the LLM interface is that it asks me to conflate a tool with a sentient being.
The firm expectations and lack of patience I have for any failings in most of my tools would be totally inappropriate to apply to another human being, and yet here I am asked to interact with this tool as though it were a person. The only options are either to treat the tool in a way that feels "wrong," or to be "kind" to the tool, and I think you see people going both ways.
I worry that, if I get used to being impatient and short with the AI, some of that will bleed into my textual interactions with other people.
It inherently imitates people. Even when you ask it to be more robotic, it does it in a way that a human would if you asked them to be more robotic.
"due to their inherent stochastic nature, there would still be a small likelihood of producing output that contains errors"
This is the part that I find challenging when trying to help my friends build a correct intuition. Notably, the probabilistic behavior here is counter-intuitive: based on human experience, if you meet a random person, they may indeed tell you bullshit; but once you've successfully fact-checked them a few times, you can start trusting that they'll generally keep being trustworthy. It's not so with "AIs", and I find it challenging to give them a real-world example of a situation that would be a better analogy for "AI" problems.
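One thing that can make the independence explicit is a toy model. Assuming (purely for illustration) a fixed 5% chance of a wrong answer on every question, a long streak of correct answers tells you nothing about the next one; the chance of the next answer being wrong is still 5%, unlike with a person, whose track record is genuine evidence about their future reliability:

    import random

    random.seed(0)
    ERROR_RATE = 0.05  # assumed fixed per-answer error probability

    streak = 0
    for i in range(1, 201):
        if random.random() < ERROR_RATE:   # wrong answer, regardless of history
            print(f"answer {i}: wrong, after a streak of {streak} correct answers")
            streak = 0
        else:
            streak += 1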
In my family, what worked (due to their personal experiences) was the example of asking a tourist guide: even if the guide doesn't know an answer, there's a high chance they'll invent something on the spot, and it'll be very plausible and convincing, and you'll never know. I'm not sure if that example would work for other listeners, though.
I also tried asking them to imagine that they're asking each subsequent question not to the same person as before, but every time to a new random person taken from the street / a church / a queue in a shop / whatever crowded place. I thought this was a really cool and technically accurate example, but sadly it seemed to get blank stares from them. (Hm, now I think I could have tried asking why.)
Yet another example I tried was to imagine a country where it's dishonorable, when asked for directions in a city, to say that you don't know how to get somewhere. (I remember we read and shared a laugh at such an anecdote in some book in the past.) Thus, again, you'll always get an answer, and it'll sound convincing, even if the answerer doesn't know. But again, this one didn't seem to work as well as the travel guide one; for now I'm still keeping it to try with others in the future if needed.
PS. Ah, ok, yet another one I tried was to ask them to think of the "game" of Russian roulette. You spin the cylinder, you pull the trigger, nothing happens. After a few lucky tries, you may get a dangerous, false feeling of safety. But then you will eventually hit the loaded chamber.
I also tried to describe "AIs" (i.e. LLMs) as taking a shelf of books, passing them through a blender, and then putting the shreds in some random order. The result may sound plausible, and even scientific (e.g. if you blended medical books, or physics textbooks). The less you know about the domain the books covered, the more convincing it may sound, and the harder it is to catch the bullshit.
The last two pictures may have gotten some reception, but I'm not super sure, and there was still arguing, especially around the books; again, they were less of a hit than the tourist guide story.
I'm super curious if you have some analogies of your own that you're trying to use with friends and family? I'd love to steal some and see if they might work with my friends!