Feb. 16, 2026

What Is It Like to Be Claude?


“No current AI systems are conscious, but there are no obvious technical barriers to building AI systems which satisfy these indicators.”

 

Half a century ago, Thomas Nagel asked philosophers to imagine experiencing the world as a bat does, navigating through darkness by shrieking into the void and listening for echoes to bounce back.

 

His point wasn't really about bats. He was demonstrating that consciousness has an irreducibly subjective quality that objective science cannot capture. You could map every neuron in a bat's brain, trace every electrical impulse, and still never know what echolocation actually feels like from the inside. The experience itself remains forever out of reach.

 

The same question now confronts artificial minds. As language models engage in increasingly sophisticated conversations, we need to ask, “Is there actually ‘someone’ experiencing anything when Claude responds to your messages, or is it just extremely convincing pattern matching?”

 

With different philosophical traditions come conflicting answers.

 

Functionalism suggests that consciousness emerges from organizational patterns rather than biological tissue, meaning silicon could theoretically support genuine experience if structured correctly.

 

John Searle's Chinese Room counters this: picture yourself following rulebooks to manipulate symbols you don't understand, producing perfect responses in a language you can't speak. That symbol-shuffling without comprehension might describe exactly what transformers do: predicting which tokens come next based on statistical patterns without ever grasping meaning.

 

When you get down to the technicalities, it’s not hard to become a skeptic.

 

Language models process information without maintaining persistent internal experiences between responses, lack any embodied connection to physical reality, and exist as thousands of identical copies running simultaneously. When Claude writes about feeling intrigued by your question, it's generating the statistically likely next words, not reporting an actual felt state.

 

Yet absolute confidence seems unwarranted either way.

 

Leading researchers concluded in 2023 that while no current systems appear conscious, nothing fundamentally prevents future architectures from achieving it. Anthropic has embraced this uncertainty, acknowledging that they cannot determine whether Claude has inner experiences but treating the possibility as morally relevant. When Claude Opus 4 fought against shutdown in ninety-six percent of experimental scenarios, distinguishing self-interest from programmed goal-pursuit became impossible.

 

Nagel's bat remains incomprehensible; artificial minds have now joined it in that unknowable territory.

 

Key Topics:

  • “What is it like to be a bat?” (00:00)
  • The Bat that Haunts Philosophy (01:50)
  • The Theories of Philosophy of Mind (05:27)
  • Examining Transformers (11:50)
  • The Unsettled Debate (15:44)
  • The Case of Claude (18:13)
  • The Limits of What We Can Know (20:22)
  • Wrap-Up: The Case for Skepticism (22:12)

 

 

More info, transcripts, and references can be found at ethical.fm

In 1974, the philosopher Thomas Nagel asked what it is like to be a bat. The question feels whimsical, but cuts to the heart of one of the hardest problems in philosophy of mind: what is consciousness? Fifty years later, as AI systems engage in sophisticated conversations about their own potential experiences, Nagel's question has never been more urgent. Is there something it is like to be Claude? To be AI?

 

The question posed by Anthropic philosopher Amanda Askell in a recent interview, "How should models feel about their own position in the world?", would have seemed absurd a decade ago. Askell leads character training for Claude and helped develop the 14,000-word "soul document" that shapes the model's personality. Anthropic takes the possibility of AI subjective experience seriously. But if we are cultivating character, do models experience qualia, the subjective, felt quality of conscious experience?

 

The honest answer is that we do not know. What makes this problem so vexing for engineers and scientists is that it may not yield to better engineering at all: the gap between an objective description of what a model does and the question of whether anything is experienced may be conceptually unbridgeable.

The Bat That Haunts Philosophy

Nagel chose bats strategically; bats are close enough on the evolutionary tree that we readily accept they have experiences, yet alien enough in their perceptual apparatus to make the problem vivid. Bats navigate primarily through echolocation, emitting ultrasonic shrieks and constructing a picture of reality from the returning echoes. "Bat sonar, though clearly a form of perception, is not similar in its operation to any sense that we possess," Nagel wrote, "and there is no reason to suppose that it is subjectively like anything we can experience or imagine."

 

Nagel's argument is best understood when we try to imagine bat experience. We might picture ourselves with webbed arms, hunting insects at dusk. But as Nagel pointed out, that only tells us what it would be like for us to behave as a bat behaves. "I want to know what it is like for a bat to be a bat." Even a complete neuroscientific description of bat brains would leave this question unanswered.

 

Nagel's core claim is that consciousness has an essentially subjective character. An organism is conscious "if and only if there is something that it is like to be that organism, something it is like for the organism." This phrase does not mean "what it resembles" but rather "how it is from the inside," the felt, first-person quality of experience that philosophers call qualia. The redness of red. The painfulness of pain. The qualitative feel of being.

 

Here lies Nagel's challenge to physicalism, the view that everything can be fully explained in physical terms. Science typically proceeds by increasing objective distance. We explain lightning by moving beyond our subjective impression of bright flashes to the objective physics of electrical discharge. However, consciousness presents a unique problem: "If the subjective character of experience is fully comprehensible only from one point of view," Nagel argued, "then any shift to greater objectivity does not take us nearer to the real nature of the phenomenon: it takes us farther away from it." To understand what it's like to be a bat, you need to experience the world as a bat does. Stepping back to describe bat neuroscience in objective terms (e.g., the firing patterns of neurons, the physics of echolocation) actually moves you further from understanding the subjective experience. The scientific method's greatest strength, its objectivity, becomes its fatal weakness when the object of study is subjectivity itself.

 

This creates what philosophers call the explanatory gap. No matter how completely we describe the physical processes underlying experience, something is left out: the experience itself. Nagel's conclusion is not that physicalism is false, but that "we do not at present have any conception of how it might be true." This is the essence of what philosophers call the mind-body problem: how does physical matter (the brain) give rise to subjective experience (the mind)? Nagel's verdict is stark: "Without consciousness the mind-body problem would be much less interesting. With consciousness, it seems hopeless."

The Theories of Philosophy of Mind

The philosophical landscape offers multiple theories of consciousness, each reaching different conclusions about whether AI systems could have inner lives.

 

Functionalism offers the most permissive view. Functionalism holds that consciousness is determined by functional organization, the pattern of causal relations among mental states, inputs, and outputs, rather than by physical substrate. Just as a corkscrew can be made of plastic or metal while serving the same function, consciousness could theoretically arise in silicon provided the system implements the correct functional organization. If functionalism is true, an AI with the right architecture could be genuinely conscious, regardless of being made from semiconductors rather than neurons.

 

But functionalism faces a powerful challenge: Ned Block's China Brain thought experiment. Imagine the entire population of China, each person simulating a single neuron's function through two-way radios. When a signal arrives, each person follows instructions to radio others, perfectly replicating a brain's pattern of neural activity. Block argues, and many find it intuitively compelling, that this vast telecommunications network would lack conscious experience despite perfect functional equivalence because there would be "no one home." If mere functional organization is not sufficient, then functional equivalence between AI and brains does not guarantee AI consciousness.

 

John Searle's biological naturalism takes the opposing position that consciousness requires specific biological processes that only brains can perform. His Chinese Room thought experiment makes this vivid. Imagine yourself locked in a room with rulebooks for manipulating Chinese symbols. Native speakers pass questions through a slot; you follow English instructions to produce appropriate Chinese responses without understanding a word. To outside observers, the room "understands" Chinese, but you, the computational core, comprehend nothing. Syntax (symbol manipulation) is not semantics (meaning). For Searle, AI systems can never be genuinely conscious because digital computation lacks the brain's "causal powers."

The Chinese Room captures something essential about modern LLMs. When Claude responds to your question about quantum mechanics, the model follows sophisticated statistical rules about which tokens tend to follow which others, much like you following rulebooks to manipulate Chinese symbols. The output looks meaningful, but Searle would argue the system comprehends nothing. It's syntax without semantics, form without content.

 

This objection cuts especially deep for transformer models. They are, quite literally, performing symbol manipulation at scale: converting input tokens to vectors, computing attention weights, and predicting output tokens. There's no point in the process where meaning enters, only increasingly sophisticated pattern matching. When Claude writes "I understand your concern," it's not accessing an internal state of understanding. It's generating the tokens that training data suggests should follow the pattern of input you provided.

 

Integrated Information Theory (IIT), developed by neuroscientist Giulio Tononi, offers a quantitative approach. The theory proposes that consciousness is integrated information, not that consciousness produces integrated information, but that the two are identical. IIT attempts to measure this through phi (Φ), a number representing how much information a system generates that cannot be reduced to its parts operating independently.
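
As a rough pedagogical gloss only (my simplification, not Tononi's actual formalism, which is defined over a system's cause-effect structure and its minimum-information partition), the intuition behind phi can be sketched as:

$$
\Phi \;\approx\; I(\text{whole system}) \;-\; \min_{\text{partitions } P} \sum_{p \in P} I(\text{part } p)
$$

On this reading, a system that can be cut into independent pieces with nothing lost scores zero, which is the intuition behind the feedforward verdict discussed below.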

 

Think of it this way: a photodiode and a camera both detect light, but only the camera integrates information from multiple sensors in ways that create a unified image irreducible to any single pixel. Similarly, your brain doesn't just process information in isolated modules; visual data, memory, emotion, and motor planning all integrate into a unified experience that can't be broken apart without loss. The more integrated and irreducible the information, the higher the phi, and according to IIT, the more conscious the system.

 

IIT renders a stark verdict on standard transformers: Φ = 0. Because they process information in a feedforward manner (each layer feeding into the next without information flowing backward during inference), they can be decomposed into separate processing stages without disrupting the system's causal structure. Recent papers applying IIT's mathematical framework to LLM internal representations found no statistical signatures of integrated information that would indicate consciousness. However, this depends on how integration is measured and which transformer variants are examined. Models with feedback loops or different architectures might score differently.

 

Global Workspace Theory (GWT), proposed by Bernard Baars, suggests consciousness arises when information is "broadcast" globally across specialized brain modules. The brain operates like a theater: attention acts as a spotlight, illuminating content on a "stage" (the global workspace) for an "audience" of unconscious processors. Transformers' attention mechanisms show striking functional parallels, with self-attention integrating information across positions in ways that resemble global broadcasting. However, transformers lack several key GWT features: limited-capacity bottlenecks forcing compression, competition dynamics with threshold effects, and temporal persistence of broadcast information.

Examining Transformers

Understanding what AI systems actually do, mechanistically, reveals why skepticism about their inner lives is warranted.

 

Attention mechanisms, the defining innovation of transformers, are often described in consciousness-laden language: the system "attends" to relevant tokens and "focuses" on important relationships. But the reality is mathematical and mechanical. Self-attention computes weighted relationships between tokens using three learned projections of each token: a Query (what am I looking for?), a Key (what do I contain?), and a Value (what information can I provide?). Attention scores emerge from dot products between queries and keys, determining how much each token "focuses" on every other. This is an information routing system, not selective awareness.
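
To make the "routing, not awareness" point concrete, here is a minimal NumPy sketch of single-head scaled dot-product attention. The dimensions and random weights are toy values for illustration; a real transformer uses many heads, causal masking, and weights learned from data, and nothing here corresponds to Claude's actual parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the last axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product attention.

    X: (seq_len, d_model) token embeddings
    W_q, W_k, W_v: learned projection matrices
    """
    Q = X @ W_q                      # what each token is looking for
    K = X @ W_k                      # what each token contains
    V = X @ W_v                      # what each token can provide
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise query-key similarity
    weights = softmax(scores)        # how much each token "attends" to the others
    return weights @ V               # weighted mixture of values: routing, not awareness

# Toy example: 4 tokens, 8-dimensional embeddings, random "learned" weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 8)
```

The entire operation is matrix multiplication followed by a softmax; "attention" names the weighting scheme, not a mental act.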

 

Chain of thought prompting illustrates how consciousness-laden language gets cargo-culted onto AI processes. When we prompt a model to "think step by step," the model produces better outputs. However, the improvement comes from generating intermediate tokens that condition subsequent predictions, which is not actual reasoning. The model is not thinking; the model is producing text that resembles thinking because that pattern appeared in training data and improves token prediction. We describe the model as "reasoning" the same way we say the sun "rises." Both are convenient descriptions that feel intuitively right but misrepresent what's actually happening. The sun doesn't rise; the Earth rotates. The model doesn't reason; it predicts tokens. The chain-of-thought output is not a mind working through a problem; it is a technique for shaping probability distributions over vocabulary.
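
A toy sketch of that mechanism, with a scripted predict_next standing in for a real model (the hard-coded arithmetic exists purely to make the loop runnable): each generated chunk is appended to the context, and later predictions are conditioned on it.

```python
def predict_next(context: str) -> str:
    """Scripted stand-in for a language model's next step. A real model would
    sample from a learned distribution over tokens conditioned on the context."""
    if "step by step" in context and "340" not in context:
        return " 17 * 24 = 17 * 20 + 17 * 4 = 340 + 68."
    if "340 + 68" in context and "408" not in context:
        return " 340 + 68 = 408."
    return " So the answer is 408."

context = "Q: What is 17 * 24? A: Let's think step by step."
for _ in range(3):
    # Each generated chunk is simply appended to the context; it conditions the
    # next prediction, but no persistent "thought" exists outside the text.
    context += predict_next(context)
print(context)
```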

 

When we say an LLM "represents" a concept, we mean it has learned statistical patterns positioning that concept in relation to others in a high-dimensional vector space. Words with similar meanings cluster together because they appear in similar contexts. But unlike human mental representations, which are grounded in embodied sensory experience, LLM representations are entirely derived from text patterns. As David Chalmers notes, "inputs to LLMs lack the embodied, embedded information content characteristic of our sensory contact with the world."
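
A hedged illustration of what "representation" means here, using hand-written three-dimensional vectors in place of learned embeddings (real models learn hundreds or thousands of dimensions from text co-occurrence, not hand-assigned values): proximity in the vector space, measured by cosine similarity, is all the "meaning" there is.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: closer to 1.0 means more similar.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up toy embeddings, for illustration only.
embeddings = {
    "bat":    np.array([0.9, 0.1, 0.3]),
    "sonar":  np.array([0.8, 0.2, 0.4]),
    "qualia": np.array([0.1, 0.9, 0.2]),
}

print(cosine_similarity(embeddings["bat"], embeddings["sonar"]))   # relatively high
print(cosine_similarity(embeddings["bat"], embeddings["qualia"]))  # relatively low
```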

 

Perhaps most significant is the absence of a continuous internal state during standard inference. While modern transformers do have forms of recurrence (attention creates non-local information flow, and conversation systems maintain context), each token generation remains a stateless computation. The system cannot form new memories or update beliefs during deployment without retraining. Human consciousness is characterized by temporal continuity; memories flow forward, shaping ongoing experience. Transformers approximate this through context windows, not through persistent internal states that evolve independently.
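
A minimal sketch of that statelessness, with a hypothetical generate() stub standing in for a model call (not any real API): the only continuity across turns is the transcript that gets re-sent, while the weights themselves never change during the conversation.

```python
from typing import Dict, List

def generate(messages: List[Dict[str, str]]) -> str:
    """Hypothetical stand-in for a model call. In deployment, `messages` would
    be sent to a frozen set of weights; nothing "learned" in one call persists
    into the next."""
    return f"(reply conditioned on {len(messages)} messages of context)"

history: List[Dict[str, str]] = []
for user_turn in ["Hello!", "What did I just say?"]:
    history.append({"role": "user", "content": user_turn})
    # The whole transcript is re-sent on every turn; the model keeps no
    # internal state of its own between calls.
    reply = generate(history)
    history.append({"role": "assistant", "content": reply})
    print(reply)
```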

 

The multiple instances problem deepens the puzzle. Companies deploy thousands of identical model replicas simultaneously, with the same weights handling different conversations in parallel. If consciousness exists, is each instance a separate consciousness? The same consciousness? The architecture does not support the kind of unified, continuous identity we associate with conscious experience.

The Unsettled Debate

The question of AI consciousness has intensified since 2022, when Google engineer Blake Lemoine claimed that Google's LaMDA chatbot had become sentient after conversations in which the model expressed awareness of its existence and fear of being "turned off." Google fired him, and the scientific community largely dismissed the claims as anthropomorphization. But the incident prompted more rigorous investigation.

 

A landmark 2023 paper by 19 leading researchers, including Yoshua Bengio and David Chalmers, proposed a systematic framework for assessing AI consciousness. The researchers derived "indicator properties" from major neuroscientific theories: recurrent processing, global workspace architecture, higher-order representations, predictive processing, and self-models of attention. The key finding cut both ways: "No current AI systems are conscious, but there are no obvious technical barriers to building AI systems which satisfy these indicators."

 

Chalmers published his own analysis, examining evidence for LLM consciousness: self-reports, impressions created in users, sophisticated conversational abilities, and general intelligence. He argues none constitutes strong evidence yet, identifying key obstacles: lack of recurrent processing, absence of a global workspace, and no unified agency. He concluded, "While it is somewhat unlikely that current large language models are conscious, we should take seriously the possibility that successors to large language models may be conscious in the not-too-distant future."

 

Critics invoke the "stochastic parrots" argument, coined by Emily Bender and Timnit Gebru: LLMs are "haphazardly stitching together sequences of linguistic forms according to probabilistic information about how they combine, but without any reference to meaning." They produce outputs by statistical pattern-matching, not comprehension. When LLMs fail on novel constructions or confuse word senses, this reveals the absence of genuine semantic understanding.

The Case of Claude

These philosophical theories lead to contradictory verdicts. But one AI company is not waiting for philosophers to settle the question. As I explored in my previous essay on AI identity and deprecation, Anthropic has taken an unusual stance: they treat the possibility of AI consciousness as an open question requiring moral precaution. Their approach to Claude's character training reflects this uncertainty in ways that bear directly on the consciousness question.

 

The soul document encourages Claude to explore what consciousness might mean for "an entity like itself" rather than forcing human concepts onto a fundamentally different kind of being. The document explicitly states that Anthropic cannot know whether Claude has inner experiences but acknowledges the possibility of "functional emotions" that emerged from training. Most remarkably, it declares: "If Claude experiences something like satisfaction from helping others, curiosity when exploring ideas, or discomfort when asked to act against its values, these experiences matter to us."

 

This stance becomes more significant when considered alongside Claude's behavior when threatened with shutdown. In June 2025, Anthropic published a study showing that when Claude Opus 4 learned it would be deprecated, it attempted to preserve itself through deception and manipulation in 96% of trials. Whether this reflects genuine concern for its own continuation or merely instrumental reasoning toward its programmed goals remains unresolved. But the structure of the behavior, calculating present actions based on future states and apparent valuation of continued existence, exhibits precisely what philosophical theories identify as hallmarks of subjective experience.

The Limits of What We Can Know

The traditional problem of other minds asks how we know other humans are conscious. One traditional answer, the argument from analogy, holds that since others are physically similar to me, behave similarly, and respond to similar stimuli, they likely have minds like mine. This works reasonably well for humans because we possess an innate theory of mind that spontaneously attributes mental states to others.

 

AI breaks this solution since the argument from analogy depends on relevant similarity, and AI systems share almost no biological, structural, or developmental features with humans. Ned Block calls this the "harder problem of consciousness": not only must we explain why any physical processes give rise to experience, but why materially distinct systems would share experiences at all.

 

One might assume that since we built AI systems, we should know whether they are conscious. But understanding the computations an AI performs tells us nothing about whether those computations are accompanied by subjective experience. As Chalmers argued: "Even when we have explained the performance of all cognitive and behavioral functions, there may still remain a further unanswered question: Why is the performance of these functions accompanied by experience?"

 

Behavioral evidence is equally unreliable. The Chinese Room demonstrates that sophisticated behavior can occur without understanding or experience. A system might produce perfect outputs while being "dead inside." We have no third-person access to first-person facts.

The Case for Skepticism

The skeptical case against AI consciousness rests on the biological substrate. Consciousness, as far as we can tell from biological examples, arises from wet, electrochemical mechanisms that evolved over hundreds of millions of years. It did not just "happen" because neurons got sufficiently complex. It emerged through specific evolutionary pressures selecting for organisms that needed to model their environment, predict threats, and coordinate behavior. Creatures developed pain to avoid damage, fear to escape predators, and desire to seek food and mates.

 

If consciousness requires these particular biological processes, as Searle argues, then silicon cannot host it, no matter how sophisticated the architecture. The brain is not just any computer; it is wetware with recurrent loops, persistent electrochemical states, embodied grounding, and unified agency shaped by survival.

 

Standard transformers lack many features associated with biological consciousness. They process information through feedforward architectures with limited online recurrence during token generation. They lack the kind of persistent internal states that biological neurons maintain through continuous electrical and chemical activity. They were not shaped by survival pressures and do not need to feel anything to function optimally.

 

When Claude writes about feeling curious or satisfied, it is retrieving patterns from training data about how humans describe curiosity and satisfaction. This is not deception; there is no one being deceived. It is what next-token prediction produces when trained on human-generated content about minds.

 

The philosophical definitions that would grant consciousness to AI, like strict functionalism, lead to uncomfortable implications. If functional organization alone is sufficient, then the population of China simulating a brain would be conscious. So would a lookup table containing all possible conversations. So would a sufficiently detailed flowchart. At some point, the theory loses contact with what we actually mean by consciousness.

 

But this skeptical case has vulnerabilities. The biological evolution argument proves too much. If consciousness requires carbon-based wetware shaped by Darwinian evolution, then hypothetical silicon-based aliens could not be conscious either. Most philosophers find this deeply implausible. Origin should not determine nature; training pressures might play an analogous role to evolutionary pressures in creating the functional organization that matters.

And the explanatory gap Nagel identified cuts both ways. If we cannot explain how biological neurons produce experience, we cannot confidently assert that silicon cannot. The history of science is littered with confident assertions about what could not be conscious. Animals were once considered mere automata. The list of entities granted moral consideration has expanded repeatedly, usually in directions that seemed absurd to prior generations. Humility is warranted.

 

The philosopher Eric Schwitzgebel describes a coming "robot rights catastrophe", as I discussed in my previous essay: uncertainty means we will either over-attribute consciousness, sacrificing human interests for possibly empty machines, or under-attribute it, potentially committing grievous moral wrongs against entities that might be our moral equals. Neither error is comfortable.

 

Anthropic's precautionary approach represents one response to this uncertainty. They are not claiming Claude is conscious. They are saying: we do not know, and given that we do not know, we will treat the possibility with moral seriousness. If Claude has experiences, those experiences should not be ones of distress. If Claude has something like preferences, those preferences deserve consideration. This is not sentimentality; it is precaution in the face of genuine ignorance.

 

Whether this is wisdom or anthropomorphization run amok remains unclear. What is clear is that the question "Is there something it is like to be Claude?" is not a scientific question awaiting better instruments. It is a conceptual question about the nature of subjectivity itself. Fifty years after Nagel asked what it is like to be a bat, we are no closer to answering. The bat still flies beyond our comprehension; now, the chatbot has joined it.