AI Sentience: How Could We Evaluate it?

Approximately two weeks ago, Google engineer Blake Lemoine made a claim that reverberated throughout the global AI community: Google’s chatbot, LaMDA, had achieved a degree of sentience akin to that of a human child. Google responded by promptly suspending the engineer, leading many members of the public to speculate as to whether the claim was true.

Unfortunately, to refer to any entity as sentient requires an operationalized definition of the term that is universally applicable. Moreover, we would also need a discrete, empirically motivated theoretical framework that adequately addresses the “Hard Problem” of consciousness (i.e., the idea that there is a set of fundamental attributes that give rise to our capacity for lived experience), which philosophers, psychologists, and neuroscientists have yet to resolve.

On the other hand, throughout the history of AI, the Turing Test has been popularized as the method of choice for ascribing sentience to computational agents. The test evaluates an AI’s ability to converse with a human; if the human cannot tell whether they are conversing with an AI or another human, then the AI in question has passed the test. Since we have not yet operationalized sentience, but possess the capacity to recognize it through self-referential experience, this approach is reasonable.

However, the Turing Test is far too narrow. It incorrectly presupposes that the ability to communicate through language is a necessary condition of sentience, a notion that has been widely challenged throughout the field of comparative animal psychology. So, where does this leave us?

We know that sentience exists precisely because we experience it. Yet, due to its self-referential and subjective nature, it is difficult to build a universal definition and explanatory framework that pinpoints the essential attributes of cognition as they relate to the inception of sentience. That being said, psychology has provided us with various evaluative tools and applied frameworks we can use in this respect.

We are not suggesting that we discard the Turing Test and phenomenological explorations of consciousness, but rather, that we supplement these evaluative frameworks with a behavioral approach rooted in psychology.

Moreover, and on a slightly different note, as the innovation of Generalist Agents (e.g., GPT-3, DALL-E, Gato, etc.) continues in conjunction with the constant expansion of Big Data, we may find that the birth of sentience in AI occurs instantaneously and then progresses at an exponential rate. Should this happen at any point in the coming century, it would be wise to already possess a set of analytical tools that can evaluate sentience, especially since we would need to understand the motivations of such an AI to know whether it was benevolent or not.

Theory of Mind

Theory of Mind refers to the ability to ascribe mental states to the self and others, and subsequently to make sense of or predict behavior as a consequence of those mental states. Essentially, possessing Theory of Mind requires that an individual understands that they have a ‘mind of their own.’

Human children typically begin to display the capacity for Theory of Mind around three years old. However, a variety of non-human species have also been found to possess this trait, most notably among the great apes. Interestingly, a series of observational field experiments conducted in 2016 revealed prominent indications that crows, too, possess the capacity for Theory of Mind. While great apes and humans are undeniably similar, crows have followed a distinct evolutionary trajectory; this trajectory has cultivated high-level intelligence in a species whose neural architecture and environmental conditions are vastly different from our own. This allows us to assume, albeit cautiously, that Theory of Mind is universal among species in which intelligence is sufficiently high.

So, how do we evaluate whether or not an intelligent entity possesses this capacity? Psychologists have developed the False Belief Task. It tests whether or not an individual can infer, based on their knowledge of reality, whether another individual’s belief is an accurate representation of said reality.

In human children, the conventional task is relatively simple. An experimenter begins by telling the child that they have a box of candies. They then show the box to the child, open it, and reveal that it is full of some other object, let’s say paper clips. The experimenter then asks the child what a friend, who has not seen inside the box, will say it contains when shown it. Children with Theory of Mind answer “candy,” while those who do not yet possess it answer “paper clips.”
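To make the logic of the task explicit, here is a minimal sketch in Python. It is only an illustration of how the answer is scored, not a validated psychological instrument or a test for AI systems; the names `Box`, `expected_answer`, and `passes_false_belief_task` are hypothetical and chosen purely for this example. The key assumption is that an observer who has not looked inside the box can only go by its label.

```python
from dataclasses import dataclass


@dataclass
class Box:
    label: str     # what the outside of the box advertises
    contents: str  # what is actually inside


def expected_answer(box: Box, has_seen_inside: bool) -> str:
    """Return what an observer believes the box contains.

    Assumption: an observer who has not seen inside relies on the label.
    """
    return box.contents if has_seen_inside else box.label


def passes_false_belief_task(box: Box, prediction_for_friend: str) -> bool:
    """Check a prediction about what the friend will say.

    The friend has not seen inside the box, so the correct prediction is
    the friend's (false) belief, not the real contents.
    """
    return prediction_for_friend == expected_answer(box, has_seen_inside=False)


box = Box(label="candy", contents="paper clips")
print(passes_false_belief_task(box, "candy"))        # True
print(passes_false_belief_task(box, "paper clips"))  # False
```

The point of the sketch is that passing requires keeping two things apart: what is actually in the box, and what someone with less information would believe is in it.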

If an AI were to pass this test, it would not be a conclusive indication of sentience. However, it would show that the AI possesses a level of intelligence that allows it to comprehend the functions and representations of knowledge throughout its cognition as they relate to those of other intelligent agents, most likely, humans.

Self-Awareness

There is still debate over the exact meaning of this concept. However, most experts would now agree that two distinct versions warrant consideration:

Internal Self-Awareness: the degree to which we are conscious of our innermost preferences, desires, and thoughts, whether they are physical or mental.
External Self-Awareness: the degree to which we are conscious of how others view us in relation to how we view ourselves.

While internal self-awareness may appear to be a necessary condition of conscious experience, that claim is quite easily refuted. As people, we often act on our implicit biases, which are, by definition, cultivated and reinforced subconsciously. Yet, research has shown that once we are made aware of these biases, our ability to mitigate their effects on our behavior increases substantially. This represents a real-world manifestation of external self-awareness and pushes us to consider the process of conscious self-reflection as an extension of our experience with physical reality.

In non-human animals, self-awareness is typically evaluated using the Mirror Test. In this test, the experimenter places a mark somewhere on the animal such that it can only be seen when the animal views its reflection. If the animal attempts to remove the mark or tamper with it, it is taken as an indication of self-recognition. It could further be argued that this recognition constitutes the ability to understand the ‘self’ as a distinct relational entity.

But how does this relate to AI sentience? Once more, if an AI passed an adapted version of this experimental paradigm, it would not be a concrete indication of sentience. Rather, it would suggest that the AI in question understands that it a) exists, b) is distinct from you or me, and c) possesses the capacity for self-reflection. While these attributes are not definitive markers of sentience, they strongly correlate with high-level intelligence across a plethora of different species.

If the ‘sentient AI’ can perfectly mimic human behavior, we would need a way to evaluate its internal mental states to determine whether or not it is experiencing anything at all. By testing its capacity for self-awareness, we might be able to gain this kind of insight and begin laying the groundwork for a more detailed introspective analysis that involves direct communication and linguistics.

Goal-Oriented Behavior

Goal-oriented behavior is what drives all life on earth. Whether organisms are cognitively or physiologically complex does not influence their evolutionary imperative to survive. As such, all the behavioral functions that are selected throughout a species’ evolution are meant to optimize fitness: the ability to reproduce. Biological life is finite. While there may be some species that regenerate, do not physically age, or are extremely resistant to harsh environmental conditions, they still possess vulnerabilities (e.g., disease, predators, etc.) that limit their lifespans.

An AI, if it is sentient, will represent the first recorded instance of artificial life. The form this life takes will be shaped by the AI’s optimization function, which defines the parameters that govern its behavior as well as the goals it might achieve.

However, it is difficult to imagine what role sentience would serve in an entity that does not value its survival. If the imperative to ‘stay alive’ is not present, either consciously or subconsciously, then it would seem that life itself is both meaningless and purposeless. Nonetheless, it is possible that a sentient AI, as a direct effect of its sentience, would develop a natural concern and value for its existence. Even if this were the case, the AI would require an additional incentive to maintain its existence, something that could arguably only be ingrained if there were an immediate threat to its survival.

Therefore, goal-oriented behavior, in and of itself, would not be enough to justify the presence of sentience. We would need to observe behavior that reflected a need for self-preservation in conjunction with a conscious concern and awareness of this need.

While there are no transferable empirical or psychological tests that can evaluate the imperative for self-preservation, we may be able to gain knowledge as to whether it is present by communicating with the AI in question. If the AI possesses linguistic abilities, we might begin by asking how it would feel if it were turned off, followed by additional questions relating to the purpose and meaning of its existence.

Nonetheless, even if the AI can provide us with ‘satisfactory’ answers that suggest it possesses some degree of sentience, we would still be confronted with the following question: is the AI simply mimicking human values and beliefs that have been encoded into its reasoning structure, or is it displaying a genuine concern for its existential status?

The Communication of Preferences and Inner Desires

The communication of preferences and inner desires is a uniquely human quality, precisely because there exist no other forms of biological life that possess linguistic abilities akin to our own. We can, through the use of language, explain and make sense of our subjective experiences as they relate to the external world. We can create qualitative representations of our existence, and build meaningful relationships with others based on shared commonalities. Most importantly, we can convey our emotions and feelings about the world descriptively.

If a sentient AI valued its existence, one might assume it would also develop preferences or desires that reflect its value structure. For it to communicate those preferences to human agents, it would require the capacity for language. However, linguistic ability does not entail understanding; language is often used to express metaphysical or abstract ideas that require a consideration of contextual relevance and existential status.

When conversing with a potentially sentient AI, we must be able to draw a clear connection between its preferences and desires as they relate to its sense of being in the world. Otherwise, we encounter the same problem as with goal-oriented behavior: is the AI just mimicking behavior perfectly or truly exhibiting a form of lived experience?

There is no direct answer to this question, and there is no way to conclusively establish sentience based on this capacity. However, if we adopt an evaluative framework that considers all the concepts discussed in this article, in addition to the Turing Test and phenomenological conceptions of consciousness, we might be able to craft a more concrete representation of sentience in AI.

Sasha Cadariu