[Important update! In the original version of this post, I unintentionally oversimplified the findings of the paper being discussed due to misunderstanding some of the definitions. I have gently revised the text with better understanding. Please read the very nice reply by Johannes Kleiner here: “On our No-Go Theorem for AI Consciousness”]
Can we prove anything about A.I. consciousness from basic principles, before we have a full theory of consciousness? This is what Johannes Kleiner and Tim Ludwig have attempted to do in a recently published article.1 The article is a great example of the use of math to express a formal argument. The proof they have produced is quite solid but starts from a very demanding premise, so I got inspired to explore a few variations of weaker premises to understand how the proof would apply.
Their argument is based on dynamical relevance. If consciousness is dynamically relevant, that means a conscious system will behave differently than a non-conscious system, by virtue of its consciousness. In other words, consciousness plays a causal role in a conscious system’s behavior. For example, if you have ever talked about your conscious experience, we would hope that your conscious experience had a causal role in your ability to talk about it. If not, your reports of consciousness would not be trustworthy because they would not have been caused by the fact that you are conscious!
Most theories of consciousness involve dynamical relevance, but there are exceptions. If you think of consciousness as sitting in a theater, watching your life go by but not being able to act in it, then you would be denying dynamical relevance. The power of Kleiner & Ludwig’s argument is that it draws a conclusion implied by a whole class of theories: it applies to any theory of consciousness that involves dynamical relevance and does not apply to theories that do not. We don’t need to commit to one theory or another to make scientific progress.
The argument is presented in the paper as a “no-go theorem,” a term from physics meaning that something is proven to be impossible given a certain theory as a starting point. One example from physics is the no-communication theorem in quantum mechanics, which states that quantum entanglement cannot be used to transmit any information between distant observers, even though it is a non-local phenomenon. (I’ll write more about that in the future, because I’ve done some research on it myself.) Kleiner & Ludwig’s paper aims to prove that if consciousness is dynamically relevant, then A.I. cannot be conscious due to the nature of electronic circuits.
The crux of the argument is that electronic circuits are designed and verified, under laboratory conditions, to exactly implement a computational model. There might be slight deviations, but they would be rare and random, leaving no room for the dynamical relevance of consciousness to creep in. The mathematical formalism introduced for the proof is quite nice, laying out the relationships between different levels of description of the world, such as theories of consciousness, neuroscience, and physics. Dynamical relevance is defined in terms of the dynamical evolution of a system, that is, the trajectory of states that a system goes through.
The proof is structured around the relationship between a theory of consciousness and some reference theory, such as neuroscience. As a starting point, the reference theory predicts one trajectory of the system: Neuroscience predicts how a brain will behave based on the layout of its neurons. Building on that foundation, a theory of consciousness predicts some trajectory of the same system as well. If consciousness is dynamically relevant according to the theory, at least in relation to that particular reference theory, the proof proceeds under the assumption that the trajectory of the system (the behavior of the brain) should be different from what it would be according to the reference theory (neuroscience) alone.
Since a computer precisely implements a computational model, the argument goes, it cannot possibly have a different dynamical evolution in the presence of consciousness, and therefore if consciousness is dynamically relevant then computers cannot be conscious.
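To make that structure concrete, here is a rough schematic in notation of my own invention (not the paper’s formalism), where traj_R(S) stands for the trajectory of states that a reference theory R predicts for a system S, and traj_C(S) for the trajectory predicted once the theory of consciousness C is taken into account:

```latex
% Rough schematic of the no-go argument in my own notation, not the paper's.
% Requires amsmath. R = reference theory (neuroscience or computer science),
% C = theory of consciousness, traj_R(S) = trajectory R predicts for system S.
\begin{align*}
  &\text{Dynamical relevance (premise):}
    && \operatorname{traj}_C(S) \neq \operatorname{traj}_R(S)
       \quad \text{for a conscious system } S \\
  &\text{Exact implementation (premise):}
    && \operatorname{traj}_{\mathrm{actual}}(S_{\mathrm{chip}})
       = \operatorname{traj}_{\mathrm{CS}}(S_{\mathrm{chip}})
       \quad \text{(a verified chip cannot deviate from its programming)} \\
  &\text{No-go (conclusion):}
    && S_{\mathrm{chip}} \text{ is not conscious according to } C,
       \text{ taking } R = \text{computer science}
\end{align*}
```

All of the force is in the second premise: laboratory verification leaves no gap between the chip’s actual dynamics and what computer science predicts, so there is nowhere for the deviation required by the first premise to show up.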
I will say from the outset that I don’t intend to wiggle out of the proof by denying the plausibility of dynamical relevance or by leaning on the imperfections of electronic circuit construction to leave room for it. I won’t defend any of the objections that the authors address in the paper. Instead, I want to explore whether any aspects of dynamical relevance can be captured within a computational model. Specifically, I invite you to imagine with me a future A.I. that is implemented as a faithful computer simulation of a human brain. The question then becomes, if a given brain evolving in a certain way is conscious, would a sufficiently-detailed simulation of that brain be conscious as well?
My exploration here is structured in three prongs: First, what if consciousness is involved in system dynamics but doesn’t actually deviate from more fundamental theories? Second, what if consciousness deviates from fundamental theories in some cases but not others? And third, what if consciousness is not systematic at all?
The Involvement Prong
First, what if consciousness is involved in system dynamics but doesn’t actually deviate from more fundamental theories? We might call this dynamical involvement rather than the dynamical relevance that Kleiner & Ludwig are talking about, but it meets the basic intuition that our reports of consciousness should be caused by the actual presence of consciousness: It just requires that the neurons of a conscious brain evolve through a different trajectory of physical states than the neurons of an unconscious brain would. Dynamical involvement is just strong enough to exclude theories of consciousness that take philosophical zombies seriously, because such a zombie can have the same physical dynamics as a conscious being without the presence of consciousness. With dynamical involvement, the physical nature of the neurons (and all the supporting cells and physiological processes that aid their functioning) is sufficient to account for the different trajectories of conscious and unconscious brains. Therefore, computer simulations of a conscious brain and an unconscious brain would also evolve through different trajectories of computational states while simulating those different trajectories of brain states, realizing dynamical involvement as well.
Formally speaking, dynamical involvement would not be expressed in terms of the trajectory predicted by a theory of consciousness being different from the trajectory predicted by the reference theory (neuroscience or computer science). It would be expressed, instead, as a difference between a conscious system and an unconscious system under the same reference theory. A physicalist theory of consciousness will describe how to recognize when consciousness is present, as a property of the physical system. The system itself will be physically different when it is conscious (such as an awake human brain) than when it is unconscious (a dreamless sleeping human brain), with a different dynamical evolution in each case.
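In the same informal notation as the sketch above (again mine, not the paper’s), the contrast could be put like this: relevance compares two theories’ predictions for one system, while involvement compares one theory’s predictions for a system in two conditions.

```latex
% Informal contrast in the same notation as the previous sketch (mine, not
% the paper's). Requires amsmath.
\begin{align*}
  &\text{Dynamical relevance:}
    && \operatorname{traj}_C(S) \neq \operatorname{traj}_R(S)
    && \text{(two theories, one system)} \\
  &\text{Dynamical involvement:}
    && \operatorname{traj}_R(S_{\mathrm{conscious}})
       \neq \operatorname{traj}_R(S_{\mathrm{unconscious}})
    && \text{(one theory, conscious vs. unconscious system)}
\end{align*}
```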
If we were to implement a sufficiently detailed simulation of a human brain in a computer, these same two cases would arise. The simulation’s dynamical evolution when the simulated brain is awake will be different from its dynamical evolution when the simulated brain is asleep. The physicalist theory of consciousness, applied to the simulated brain, should recognize consciousness in the former (simulated wakefulness) and not in the latter (simulated dreamless sleep).
Of course, there is a philosophical debate to be had about whether a property found in a simulated world is real in the same sense as a property found in our own world.2 As is often pointed out, a simulated hurricane does not make the computer wet and a simulated black hole does not make the laboratory collapse in on itself. The thing is, those are extrinsic properties—how a physical phenomenon affects the things around it—so it is not surprising that they do not seem real in our own world. The simulated hurricane does make other objects inside the simulation wet, just not ones outside. However, consciousness is inherently an intrinsic property—what it’s like to be a conscious being itself.3 A simulated brain could very well have an intrinsic property of consciousness that is every bit as real as that of a biological brain.
The Locality Prong
Second, what if consciousness deviates from fundamental theories in some cases but not others? The definition of dynamical relevance given by Kleiner & Ludwig requires that consciousness be dynamically relevant to every system according to a reference theory appropriate to that system (e.g. neuroscience for brains and computer science for computers). We could imagine instead finding that consciousness in humans is relevant to the dynamics of the brain relative to what would be predicted by neuroscience alone, without that necessarily being true for all other kinds of systems. In that case, it might be true that a simulation that only simulates the neurons themselves would not be conscious, but there is still the possibility of also simulating the non-neural causes involved in consciousness. Current A.I. would be unlikely to be conscious, because consciousness would only arise in it by accident, but that would not be a problem for A.I. consciousness in principle. If we actually have a good theory of consciousness in the future, and it does turn out to involve non-physical causes beyond the basic functioning of neurons, we could potentially design new A.I. systems that take that new understanding into account.4
Kleiner & Ludwig even say that “it is natural to expect that consciousness’ dynamical relevance is systematic in nature” and that their “result is fully compatible with … a deterministic relevance of consciousness.”5 If it is systematic, and especially if it is deterministic, then it should be possible to simulate that dynamical relevance in principle. The strength of their conclusion against A.I. consciousness comes from the strength of the proof’s premise that the theory of consciousness predicts dynamical relevance relative to computer science regardless of the computer’s specific programming.
To formalize the question of this prong, we would have to figure out how to express the three-way relationship between a theory of consciousness, a program that simulates that theory, and the underlying theory of computer science. For example, let’s say that the theory of consciousness under consideration describes a special field of consciousness that human brains are able to tune into. A simulation of the neurons alone, without taking this field into account, would not manifest consciousness. However, a simulation that simulated the field together with the neurons might manifest consciousness after all. At least, arguing that it doesn’t would require showing that a simulation of the field is somehow not as good as the real field, which is a different issue from its dynamical relevance.
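Purely as a sketch of what such a formalization might say, and using the same made-up notation as before: let F be the hypothetical consciousness field, P_neural a program that simulates the neurons alone, and P_neural+F a program that simulates the field as well. None of these symbols come from the paper.

```latex
% Sketch only, in my own notation; requires amsmath. F is the hypothetical
% consciousness field, P_neural a program simulating neurons alone,
% P_{neural+F} a program simulating the field as well.
\begin{gather*}
  \operatorname{traj}_C(S_{\mathrm{brain}})
    \neq \operatorname{traj}_{\mathrm{neuro}}(S_{\mathrm{brain}})
    \quad \text{(relevance relative to neuroscience: } F \text{ makes a difference in brains)} \\
  P_{\mathrm{neural}}: \text{ the field is absent, so consciousness is not manifested} \\
  \operatorname{traj}_C(P_{\mathrm{neural}+F})
    = \operatorname{traj}_{\mathrm{CS}}(P_{\mathrm{neural}+F})
    \quad \text{(the field's dynamics are part of the program, so no deviation is needed)}
\end{gather*}
```

On this reading, the theory of consciousness simply fails to be dynamically relevant relative to computer science for the second kind of program, so the no-go conclusion does not reach it.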
A related issue is that we cannot take a theory of consciousness that was developed with reference to one theory (e.g. neuroscience) and abstractly apply it to a different reference theory (e.g. computer science). There is a lot of work to do in order to establish a theory of consciousness empirically. Since we are starting from our own consciousness as humans, our empirical basis is limited to human neurophysiology. Applying a theory of consciousness to anything other than a human requires extrapolation—taking the theory outside of its empirical origins—which must be done with great care. In the future, if we ever do have an established theory of consciousness in humans that suggests dynamical relevance, it will still take additional effort to establish that the dynamical relevance actually applies universally to all systems, after which we can then leverage Kleiner & Ludwig’s proof to deny A.I. consciousness.
The Transcendence Prong
Third, what if consciousness is not systematic at all? Maybe substance dualism is true, such that consciousness is dynamically relevant in the strongest sense, but the substance of consciousness cannot itself be studied scientifically. In that case, we would have reached a limitation of mathematical consciousness science as a field, not just a limitation of A.I. consciousness. We would even have to say that a biological brain is not really conscious in and of itself, merely interfacing with a transcendent consciousness, so of course a simulated brain would not be conscious either. Now, we might still achieve some understanding of the interaction between non-physical consciousness and physical matter. That could allow us to engineer future computer systems to interface with consciousness in the same way, but we would not be capturing consciousness inside the computer, and this would admittedly be a deeper science-fiction scenario than the other two.
Back in 1649, René Descartes hypothesized that the pineal gland, deep inside human and many animal brains, was a kind of connector between body and soul.6 It is a tiny organ named for its resemblance to a pine cone. We know today that it produces the hormone melatonin, which regulates sleep and wakefulness—so it does have something to do with consciousness! We can imagine that we could have learned that the pineal gland had a significant role to play in the more detailed aspects of our conscious experience as well, while being too simple itself to explain the complexity of our conscious behavior.
In that hypothetical history, we could have discovered something about the physical structure of the pineal gland that allows the soul to interact with the body, being influenced by sensations and influencing movements in return. We could have then engineered an artificial pineal gland to be built into specialized A.I. chips. The bulk of the A.I. implementation would still consist of computationally simulated neurons taking care of the rest of cognition. If a human could be said to be conscious by virtue of containing a pineal gland, then an A.I. could be said to be conscious by virtue of containing an artificial pineal gland as well.
Modern neuroscientists do not think that the pineal gland plays such a central role in conscious experience and behavior (besides the overall sleep-wake cycle), but in the absence of a complete theory of consciousness we may still leave the door open to a modern dualist theory with a similar but more nuanced interaction model.
If some small part of the brain interfaces with consciousness in this way, then the rest of the brain could be simulated by a computer. The formal structure of the computational model is not violated by the dynamical relevance of consciousness, because the interface is isolated into a peripheral element. The computational model can still be perfectly implemented by the computer for simulating all of the neurons—it’s just the specialized pineal gland that needs to be accounted for differently, and probably not computationally. Again, that is not a special problem for the simulation, because the biological neurons have the same limitation.
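As a sketch of that division of labor, in notation that is entirely my own and entirely hypothetical: let x_t be the state of the simulated neurons, M the computational model the chip implements exactly, and u_t a signal from the pineal-like interface, which the model treats as an ordinary external input.

```latex
% Hypothetical hybrid architecture in my own notation; requires amsmath.
% x_t = state of the simulated neurons, M = the computational model the chip
% implements exactly, u_t = signal from the pineal-like interface Phi.
\begin{align*}
  x_{t+1} &= M(x_t, u_t)
    && \text{(simulated neurons: exactly the programmed dynamics)} \\
  u_t &= \Phi(x_t)
    && \text{(interface to non-physical consciousness; not determined by } M\text{)}
\end{align*}
```

The computational model M is still implemented perfectly; everything that dynamical relevance demands happens in the interface Φ, which is exactly the limitation the biological neurons would share.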
Of course, there could be something else about the ultimate theory of consciousness that ties it to humans and disallows A.I. consciousness. Maybe souls are human-shaped and can only attach to actual human bodies. Such a fact would seem impossible to establish scientifically, but even if it could be, it would be a different kind of argument, not one based merely on dynamical relevance.
Blazing a Trail
Kleiner & Ludwig’s proof comes to a strong conclusion from a strong premise: If consciousness means a computer would have to deviate from its programming, then a computer can’t be conscious because it can’t deviate from its programming. This may sound simple, but it took serious work to formalize it, which is valuable and rare in the literature. For example, many discussions of emergence lack any formal definition that can be used with actual theories. There are plenty of philosophical debates about “weak” and “strong” emergence, but rarely do we find an attempt to put those in mathematical terms.
By exploring some alternative scenarios supported by similar intuitions, we can get a sense for what will be required of future theories to be subject to that proof. It’s not enough for consciousness to be involved in the dynamics of the system at a physical level; it needs to be relevant to the system’s dynamics in a way not accounted for by conventional theories such as neuroscience. It’s also not enough for consciousness to be found to be relevant to the dynamics of human brains relative to neuroscience; such a theory will also have to be extrapolated to require dynamical relevance in all kinds of systems. Finally, it may turn out that neurons actually have similar limitations to microchips relative to consciousness, needing some non-neural connection to a non-physical consciousness.
My own position is that consciousness is a radically emergent phenomenon7 that is consistent with causal closure of the physical world and can also emerge in sufficiently detailed computer simulations.8 Without a way to express those ideas formally, we could just go around in circles arguing about them. Kleiner & Ludwig have made a strong start and blazed a trail in the field. I look forward to seeing where it can take us.
a recently published article: Johannes Kleiner & Tim Ludwig (2024). “The case for neurons: a no-go theorem for consciousness on a chip.” Neuroscience of Consciousness, 2024(1). doi:10.1093/nc/niae037
simulated world … our own world: David Chalmers’ book Reality+ is all about this question. He asserts that simulated objects are real objects, because we might be in a simulation ourselves and therefore cannot privilege one level of reality over another.
what it’s like to be a conscious being itself: See my earlier post “Zooming Into a Zombie World” for a discussion of this intrinsic, first-person perspective.
we could … take that new understanding into account: I’m not saying we “should,” only that we “could.” I’ll have more to say about the “should” part soon!
it is natural to expect … a deterministic relevance of consciousness: These quotes are from page 8 of Kleiner & Ludwig’s paper, in the “Objections” section, under the “Verification is imperfect” and “Determinism” headings.
a kind of connector between body and soul: The article on “Descartes and the Pineal Gland” in the Stanford Encyclopedia of Philosophy gives several different interpretations of Descartes’ views on the matter.
consciousness is a radically emergent phenomenon: Radical emergence combines multiple realizability with contextuality, as defined by Harald Atmanspacher (2007). “Contextual emergence from physics to cognitive neuroscience.” Journal of Consciousness Studies, 14(1–2), pp. 18–36.
emerge in sufficiently detailed computer simulations: This is laid out in my own thesis, Justin T. Sampson (2024). “Integrated information theory of consciousness in conventional computing.” San Francisco State University. doi:10.46569/hq37vw99k
Hi Justin
What I meant by saying the theorem only applies to physicalist theories, not to systems, is that strictly we cannot define what we are dealing with unless we formalize it in terms of some theory. In other words, there is no formalizable equivalence between theories and the things they model. So strictly we cannot say anything one way or the other about reality itself; we can only make models and test them. So the answer to your question depends on whether you think there can exist physical systems which cannot be modelled. It's a little like the Church-Turing thesis.
Thanks for reading the paper.
Cathy
For another example of a no-go theorem relating to consciousness, please see my preprint
https://arxiv.org/abs/2307.10178
I would be interested in your views.
Cathy R