ARTIFICIAL INTELLIGENCE, LOGIC
AND FORMALIZING COMMON SENSE
John McCarthy
Computer Science Department
Stanford University
Stanford, CA 94305
jmc@cs.stanford.edu
http://www-formal.stanford.edu/jmc/
1990
1 Introduction
This is a position paper about the relations among artificial intelligence (AI),
mathematical logic and the formalization of common-sense knowledge and
reasoning. It also treats other problems of concern to both AI and philosophy.
I thank the editor for inviting it. The position advocated is that philosophy
can contribute to AI if it treats some of its traditional subject matter in
more detail and that this will advance the philosophical goals also. Actual
formalisms (mostly first order languages) for expressing common-sense facts
are described in the references.
Common-sense knowledge includes the basic facts about events (including
actions) and their effects, facts about knowledge and how it is obtained, facts
about beliefs and desires. It also includes the basic facts about material
objects and their properties.
One path to human-level AI uses mathematical logic to formalize common-
sense knowledge in such a way that common-sense problems can be solved
by logical reasoning. This methodology requires understanding the common-
sense world well enough to formalize facts about it and ways of achieving
goals in it. Basing AI on understanding the common-sense world is different
from basing it on understanding human psychology or neurophysiology. This
approach to AI, based on logic and computer science, is complementary to
approaches that start from the fact that humans exhibit intelligence, and
that explore human psychology or human neurophysiology.
This article discusses the problems and difficulties, the results so far, and
some improvements in logic and logical languages that may be required to
formalize common sense. Fundamental conceptual advances are almost cer-
tainly required. The object of the paper is to get more help for AI from
philosophical logicians. Some of the requested help will be mostly philosoph-
ical and some will be logical. Likewise the concrete AI approach may fertilize
philosophical logic as physics has repeatedly fertilized mathematics.
There are three reasons for AI to emphasize common-sense knowledge
rather than the knowledge contained in scientific theories.
(1) Scientific theories represent compartmentalized knowledge. In pre-
senting a scientific theory, as well as in developing it, there is a common-sense
pre-scientific stage. In this stage, it is decided or just taken for granted what
phenomena are to be covered and what is the relation between certain formal
terms of the theory and the common-sense world. Thus in classical mechan-
ics it is decided what kinds of bodies and forces are to be used before the
differential equations are written down. In probabilistic theories, the sample
space is determined. In theories expressed in first order logic, the predicate
and function symbols are decided upon. The axiomatic reasoning techniques
used in mathematical and logical theories depend on this having been done.
However, a robot or computer program with human-level intelligence will
have to do this for itself. To use science, common sense is required.
Once developed, a scientific theory remains imbedded in common sense.
To apply the theory to a specific problem, common-sense descriptions must
be matched to the terms of the theory. For example, $d = \frac{1}{2}gt^2$ does not in
itself identify d as the distance a body falls in time t and identify g as the
acceleration due to gravity. (McCarthy and Hayes 1969) uses the situation
calculus discussed in that paper to imbed the above formula in a formula
describing the common-sense situation, for example
$$\mathit{dropped}(x,s) \land \mathit{height}(x,s) = h \land d = \tfrac{1}{2}gt^2 \land d < h \supset \exists s'\,(F(s,s') \land \mathit{time}(s') = \mathit{time}(s) + t \land \mathit{height}(x,s') = h - d). \tag{1}$$
Here x is the falling body, and we are presuming a language in which
the functions height, time, etc. are formalized in a way that corresponds to
what the English words suggest. s and s′ denote situations as discussed in
that paper, and F(s,s′) asserts that the situation s′ is in the future of the
situation s.
(2) Common-sense reasoning is required for solving problems in the common-
sense world. From the problem solving or goal-achieving point of view, the
common-sense world is characterized by a different informatic situation than
that within any formal scientific theory. In the typical common-sense infor-
matic situation, the reasoner doesn’t know what facts are relevant to solving
his problem. Unanticipated obstacles may arise that involve using parts of
his knowledge not previously thought to be relevant.
(3) Finally, the informal metatheory of any scientific theory has a common-
sense informatic character. By this I mean the thinking about the structure of
the theory in general and the research problems it presents. Mathematicians
invented the concept of a group in order to make previously vague parallels
between different domains into a precise notion. The thinking about how to
do this had a common-sense character.
It might be supposed that the common-sense world would admit a con-
ventional scientific theory, e.g. a probabilistic theory. But no one has yet
developed such a theory, and AI has taken a somewhat different course that
involves nonmonotonic extensions to the kind of reasoning used in formal
scientific theories. This seems likely to work better.
Aristotle, Leibniz, Boole and Frege all included common-sense knowledge
when they discussed formal logic. However, formalizing much of common-
sense knowledge and reasoning proved elusive, and the twentieth century
emphasis has been on formalizing mathematics. Some important philoso-
phers, e.g. Wittgenstein, have claimed that common-sense knowledge is un-
formalizable or mathematical logic is inappropriate for doing it. Though it is
possible to give a kind of plausibility to views of this sort, it is much less easy
to make a case for them that is well supported and carefully worked out. If a
common-sense reasoning problem is well presented, one is well on the way to
formalizing it. The examples that are presented for this negative view bor-
row much of their plausibility from the inadequacy of the specific collections
of predicates and functions they take into consideration. Some of their force
comes from not formalizing nonmonotonic reasoning, and some may be due
to lack of logical tools still to be discovered. While I acknowledge this opin-
ion, I haven’t the time or the scholarship to deal with the full range of such
arguments. Instead I will present the positive case, the problems that have
arisen, what has been done and the problems that can be foreseen. These
problems are often more interesting than the ones suggested by philosophers
trying to show the futility of formalizing common sense, and they suggest
productive research programs for both AI and philosophy.
Insofar as the arguments against the formalizability of common sense
attempt to make precise the intuitions of their authors, they can be helpful in
identifying problems that have to be solved. For example, Hubert Dreyfus
(1972) said that computers couldn’t have “ambiguity tolerance” but didn’t
offer much explanation of the concept. With the development of nonmono-
tonic reasoning, it became possible to define some forms of ambiguity toler-
ance and show how they can and must be incorporated in computer systems.
For example, it is possible to make a system that doesn’t know about possi-
ble de re/de dicto ambiguities and has a default assumption that amounts to
saying that a reference holds both de re and de dicto. When this assumption
leads to inconsistency, the ambiguity can be discovered and treated, usually
by splitting a concept into two or more.
If a computer is to store facts about the world and reason with them,
it needs a precise language, and the program has to embody a precise idea
of what reasoning is allowed, i.e. of how new formulas may be derived from
old. Therefore, it was natural to try to use mathematical logical languages to
express what an intelligent computer program knows that is relevant to the
problems we want it to solve and to make the program use logical inference in
order to decide what to do. (McCarthy 1959) contains the first proposals to
use logic in AI for expressing what a program knows and how it should reason.
(Proving logical formulas as a domain for AI had already been studied by
several authors.)
The 1959 paper said:
The advice taker is a proposed program for solving problems
by manipulating sentences in formal languages. The main differ-
ence between it and other programs or proposed programs for ma-
nipulating formal languages (the Logic Theory Machine of Newell,
Simon and Shaw and the Geometry Program of Gelernter) is that
in the previous programs the formal system was the subject mat-
ter but the heuristics were all embodied in the program. In this
program the procedures will be described as much as possible
in the language itself and, in particular, the heuristics are all so
described.
The main advantages we expect the advice taker to have is
that its behavior will be improvable merely by making state-
ments to it, telling it about its symbolic environment and what
is wanted from it. To make these statements will require little if
any knowledge of the program or the previous knowledge of the
advice taker. One will be able to assume that the advice taker
will have available to it a fairly wide class of immediate logical
consequences of anything it is told and its previous knowledge.
This property is expected to have much in common with what
makes us describe certain humans as having common sense. We
shall therefore say that a program has common sense if it auto-
matically deduces for itself a sufficiently wide class of immediate
consequences of anything it is told and what it already knows.
The main reasons for using logical sentences extensively in AI are better
understood by researchers today than in 1959. Expressing information in
declarative sentences is far more modular than expressing it in segments of
computer programs or in tables. Sentences can be true in much wider contexts
than specific programs can be useful. The supplier of a fact does not have to
understand much about how the receiver functions, or how or whether the
receiver will use it. The same fact can be used for many purposes, because
the logical consequences of collections of facts can be available.
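
A minimal sketch of this modularity claim, assuming ground facts stored as tuples and variable-free rules given as (premises, conclusion) pairs. The function name close and the sample facts are hypothetical and serve only as illustration. The same stored facts feed two different conclusions, and nothing in a fact says how it will be used.

```python
def close(facts, rules):
    """Return every fact derivable from `facts` by repeatedly applying `rules`.

    Each rule is a pair (premises, conclusion): a frozenset of facts and one fact.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical ground facts and rules, chosen only for illustration.
facts = {("at", "i", "desk"), ("at", "desk", "home"), ("at", "car", "home")}
rules = [
    (frozenset({("at", "i", "desk"), ("at", "desk", "home")}),
     ("at", "i", "home")),
    (frozenset({("at", "i", "home"), ("at", "car", "home")}),
     ("can_reach", "i", "car")),
]

consequences = close(facts, rules)
print(("at", "i", "home") in consequences)
print(("can_reach", "i", "car") in consequences)
```

Adding a further fact or rule requires no change to close, which is the sense in which the sentences are more modular than program segments.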
The advice taker prospectus was ambitious in 1959, would be considered
ambitious today and is still far from being immediately realizable. This is
especially true of the goal of expressing the heuristics guiding the search for
a way to achieve the goal in the language itself. The rest of this paper is
largely concerned with describing what progress has been made, what the
obstacles are, and how the prospectus has been modified in the light of what
has been discovered.
The formalisms of logic have been used to differing extents in AI. Most
of the uses are much less ambitious than the proposals of (McCarthy 1959).
We can distinguish four levels of use of logic.
1. A machine may use no logical sentences—all its “beliefs” being implicit
in its state. Nevertheless, it is often appropriate to ascribe beliefs and goals
to the program, i.e. to remove the above sanitary quotes, and to use a
principle of rationality—It does what it thinks will achieve its goals. Such
ascription is discussed from somewhat different points of view in (Dennett
1971), (McCarthy 1979a) and (Newell 1981). The advantage is that the intent
of the machine’s designers and the way it can be expected to behave may be
more readily described intentionally than by a purely physical description.
The relation between the physical and the intentional descriptions is most
readily understood in simple systems that admit readily understood descrip-
tions of both kinds, e.g. thermostats. Some finicky philosophers object to
this, contending that unless a system has a full human mind, it shouldn’t be
regarded as having any mental qualities at all. This is like omitting the num-
bers 0 and 1 from the number system on the grounds that numbers aren’t
required to count sets with no elements or one element. Indeed if your main
interest is the null set or unit sets, numbers are irrelevant. However, if your
interest is the number system you lose clarity and uniformity if you omit
0 and 1. Likewise, when one studies phenomena like belief, e.g. because
one wants a machine with beliefs and which reasons about beliefs, it works
better not to exclude simple cases from the formalism. One battle has been
over whether it should be forbidden to ascribe to a simple thermostat the
belief that the room is too cold. (McCarthy 1979a) says much more about
ascribing mental qualities to machines, but that’s not where the main action
is in AI.
2. The next level of use of logic involves computer programs that use
sentences in machine memory to represent their beliefs but use other rules
than ordinary logical inference to reach conclusions. New sentences are often
obtained from the old ones by ad hoc programs. Moreover, the sentences
that appear in memory belong to a program-dependent subset of the logical
language being used. Adding certain true sentences in the language may even
spoil the functioning of the program. The languages used are often rather
unexpressive compared to first order logic, for example they may not admit
quantified sentences, or they may use a different notation from that used
for ordinary facts to represent “rules”, i.e. certain universally quantified
implication sentences. Most often, conditional rules are used in just one
direction, i.e. contrapositive reasoning is not used. Usually the program
cannot infer new rules; rules must have all been put in by the “knowledge
engineer”. Sometimes programs have this form through mere ignorance, but
the usual reason for the restriction is the practical desire to make the program
run fast and deduce just the kinds of conclusions its designer anticipates. We
believe the need for such specialized inference will turn out to be temporary
and will be reduced or eliminated by improved ways of controlling general
inference, e.g. by allowing the heuristic rules to be also expressed as sentences
as promised in the above extract from the 1959 paper.
3. The third level uses first order logic and also logical deduction. Typ-
ically the sentences are represented as clauses, and the deduction methods
are based on J. Alan Robinson’s (1965) method of resolution. It is common
to use a theorem prover as a problem solver, i.e. to determine an x such that
P(x) as a byproduct of a proof of the formula ∃xP(x). This level is less used
for practical purposes than level two, because techniques for controlling the
reasoning are still insufficiently developed, and it is common for the program
to generate many useless conclusions before reaching the desired solution.
Indeed, unsuccessful experience (Green 1969) with this method led to more
restricted uses of logic, e.g. the STRIPS system of (Fikes and Nilsson 1971).
The commercial “expert system shells”, e.g. ART, KEE and OPS-5,
use logical representation of facts, usually ground facts only, and separate
facts from rules. They provide elaborate but not always adequate ways of
controlling inference.
In this connection it is important to mention logic programming, first
introduced in Microplanner (Sussman et al., 1971) and from different points
of view by Robert Kowalski (1979) and Alain Colmerauer in the early 1970s.
A recent text is (Sterling and Shapiro 1986). Microplanner was a rather
unsystematic collection of tools, whereas Prolog relies almost entirely on one
kind of logic programming, but the main idea is the same. If one uses a
restricted class of sentences, the so-called Horn clauses, then it is possible
to use a restricted form of logical deduction. The control problem is then
much eased, and it is possible for the programmer to anticipate the course
the deduction will take. The price paid is that only certain kinds of facts are
conveniently expressed as Horn clauses, and the depth first search built into
Prolog is not always appropriate for the problem.
Even when the relevant facts can be expressed as Horn clauses supple-
mented by negation as failure, the reasoning carried out by a Prolog program
may not be appropriate. For example, the fact that a sealed container is ster-
ile if all the bacteria in it are dead and the fact that heating a can kills a
bacterium in the can are both expressible as Prolog clauses. However, the
resulting program for sterilizing a container will kill each bacterium individ-
ually, because it will have to index over the bacteria. It won’t reason that
heating the can kills all the bacteria at once, because it doesn’t do universal
generalization.
Here’s a Prolog program for testing whether a container is sterile. The
predicate symbols have obvious meanings.
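% Negation as failure: not(P) succeeds exactly when P cannot be proved.
not(P) :- P, !, fail.
not(_).
% A container is sterile unless some bacterium in it cannot be shown dead.
sterile(X) :- not(nonsterile(X)).
nonsterile(X) :-
    bacterium(Y), in(Y,X), not(dead(Y)).
% Whatever is in a hot container is hot, and a hot bacterium is dead.
hot(Y) :- in(Y,X), hot(X).
dead(Y) :- bacterium(Y), hot(Y).
bacterium(b1).
bacterium(b2).
bacterium(b3).
bacterium(b4).
in(b1,c1).
in(b2,c1).
in(b3,c2).
in(b4,c2).
hot(c1).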
Giving Prolog the goal sterile(c1) and sterile(c2) gives the answers yes
and no respectively. However, Prolog has indexed over the bacteria in the
containers.
The following is a Prolog program that can verify whether a sequence
of actions, actually just heating it, will sterilize a container. It involves
introducing situations analogous to those discussed in (McCarthy and Hayes
1969).
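% Negation as failure, as in the first program.
not(P) :- P, !, fail.
not(_).
% Each predicate that can change takes a situation argument S.
sterile(X,S) :- not(nonsterile(X,S)).
nonsterile(X,S) :-
    bacterium(Y), in(Y,X), not(dead(Y,S)).
hot(Y,S) :- in(Y,X), hot(X,S).
dead(Y,S) :- bacterium(Y), hot(Y,S).
bacterium(b1).
bacterium(b2).
bacterium(b3).
bacterium(b4).
in(b1,c1).
in(b2,c1).
in(b3,c2).
in(b4,c2).
% Heating a container C in any situation S yields a situation in which C is hot.
hot(C,result(heat(C),S)).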
When the program is given the goals sterile(c1,result(heat(c1),s0)) and sterile(c2,result(heat(c1),s0))
it answers yes and no respectively. However, if it is given the goal sterile(c1,s),
it will fail because Prolog lacks what logic programmers call “constructive
negation”.
The same facts as are used in the first Prolog program can be expressed
in a first order language as follows.
(∀X)(sterile(X) ≡ (∀Y )(bacterium(Y ) ∧ in(Y,X) ⊃ dead(Y ))),
(∀XY )(hot(X) ∧ in(Y,X) ⊃ hot(Y )),
(∀Y )(bacterium(Y ) ∧ hot(Y ) ⊃ dead(Y )),
and
hot(a).
However, from them we can prove sterile(a) without having to index over
the bacteria.
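
A derivation sketch of this claim follows. The constant c is an assumption of the sketch: it stands for an arbitrary object and does not occur in the axioms. Heating is used once, and universal generalization then covers every bacterium in the container.

```latex
\begin{align*}
&1.\ \mathit{bacterium}(c) \land \mathit{in}(c,a)
  && \text{assumption, } c \text{ arbitrary}\\
&2.\ \mathit{hot}(a) \land \mathit{in}(c,a) \supset \mathit{hot}(c)
  && \text{second axiom with } X := a,\ Y := c\\
&3.\ \mathit{hot}(c)
  && \text{from 1, 2 and } \mathit{hot}(a)\\
&4.\ \mathit{bacterium}(c) \land \mathit{hot}(c) \supset \mathit{dead}(c)
  && \text{third axiom with } Y := c\\
&5.\ \mathit{dead}(c)
  && \text{from 1, 3 and 4}\\
&6.\ \mathit{bacterium}(c) \land \mathit{in}(c,a) \supset \mathit{dead}(c)
  && \text{discharging assumption 1}\\
&7.\ (\forall Y)(\mathit{bacterium}(Y) \land \mathit{in}(Y,a) \supset \mathit{dead}(Y))
  && \text{universal generalization on } c\\
&8.\ \mathit{sterile}(a)
  && \text{first axiom with } X := a \text{, and 7}
\end{align*}
```

The length of the derivation does not depend on how many bacteria the container holds, whereas the Prolog search visits each bacterium in turn.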
Expressibility in Horn clauses, whether supplemented by negation as fail-
ure or not, is an important property of a set of facts and logic programming
has been successfully used for many applications. However, it seems unlikely
to dominate AI programming as some of its advocates hope.
Although third level systems express both facts and rules as logical sen-
tences, they are still rather specialized. The axioms with which the programs
begin are not general truths about the world but are sentences whose mean-
ing and truth is limited to the narrow domain in which the program has to
act. For this reason, the “facts” of one program usually cannot be used in a
database for other programs.
4. The fourth level is still a goal. It involves representing general facts
about the world as logical sentences. Once put in a database, the facts
can be used by any program. The facts would have the neutrality of purpose
characteristic of much human information. The supplier of information would
not have to understand the goals of the potential user or how his mind works.
The present ways of “teaching” computer programs by modifying them or
directly modifying their databases amount to “education by brain surgery”.
A key problem for achieving the fourth level is to develop a language for a
general common-sense database. This is difficult, because the common-sense
informatic situation is complex. Here is a preliminary list of features and
considerations.
1. Entities of interest are known only partially, and the information about
entities and their relations that may be relevant to achieving goals cannot
be permanently separated from irrelevant information. (Contrast this with
the situation in gravitational astronomy in which it is stated in the informal
introduction to a lecture or textbook that the chemical composition and
shape of a body are irrelevant to the theory; all that counts is the body’s
mass, and its initial position and velocity.)
Even within gravitational astronomy, non-equational theories arise and
relevant information may be difficult to determine. For example, it was
recently proposed that periodic extinctions discovered in the paleontological
record are caused by showers of comets induced by a companion star to the
sun that encounters and disrupts the Oort cloud of comets every time it
comes to perihelion. This theory is qualitative because neither the orbit of
the hypothetical star nor those of the comets are available.
2. The formalism has to be epistemologically adequate, a notion intro-
duced in (McCarthy and Hayes 1969). This means that the formalism must
be capable of representing the information that is actually available, not
merely capable of representing actual complete states of affairs.
For example, it is insufficient to have a formalism that can represent
the positions and velocities of the particles in a gas. We can’t obtain that
information, our largest computers don’t have the memory to store it even if
it were available, and our fastest computers couldn’t use the information to
make predictions even if we could store it.
As a second example, suppose we need to be able to predict someone’s
behavior. The simplest example is a clerk in a store. The clerk is a complex
individual about whom a customer may know little. However, the clerk can
usually be counted on to accept money for articles brought to the counter,
wrap them as appropriate and not protest when the customer then takes
the articles from the store. The clerk can also be counted on to object if
the customer attempts to take the articles without paying the appropriate
price. Describing this requires a formalism capable of representing infor-
mation about human social institutions. Moreover, the formalism must be
capable of representing partial information about the institution, such as a
three year old’s knowledge of store clerks. For example, a three year old
doesn’t know the clerk is an employee or even what that means. He doesn’t
require detailed information about the clerk’s psychology, and anyway this
information is not ordinarily available.
The following sections deal mainly with the advances we see as required
to achieve the fourth level of use of logic in AI.
2 Formalized Nonmonotonic Reasoning
It seems that fourth level systems require extensions to mathematical logic.
One kind of extension is formalized nonmonotonic reasoning, first proposed
in the late 1970s (McCarthy 1977, 1980, 1986), (Reiter 1980), (McDermott
and Doyle 1980), (Lifschitz 1989a). Mathematical logic has been monotonic
in the following sense. If we have A ⊢ p and A ⊂ B, then we also have B ⊢ p.
If the inference is logical deduction, then exactly the same proof that
proves p from A will serve as a proof from B. If the inference is model-
theoretic, i.e. p is true in all models of A, then p will be true in all models
of B, because the models of B will be a subset of the models of A. So we
see that the monotonic character of traditional logic doesn’t depend on the
details of the logical system but is quite fundamental.
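
The model-theoretic half of this argument can be checked by brute force in the propositional case. The following is a minimal sketch, assuming three atoms and sentences represented as Python predicates over a truth assignment. The names ATOMS, models and entails, and the premise sets A and B, are hypothetical.

```python
from itertools import product

ATOMS = ("p", "q", "r")


def models(sentences):
    """Return the set of truth assignments over ATOMS satisfying every sentence."""
    found = set()
    for values in product((False, True), repeat=len(ATOMS)):
        world = dict(zip(ATOMS, values))
        if all(sentence(world) for sentence in sentences):
            found.add(values)
    return found


def entails(sentences, goal):
    """True if `goal` holds in every model of `sentences`."""
    return all(goal(dict(zip(ATOMS, values))) for values in models(sentences))


# Hypothetical premise sets with A a subset of B.
A = [lambda w: w["q"], lambda w: (not w["q"]) or w["p"]]  # q, and q implies p
B = A + [lambda w: w["r"]]                                  # A together with r

goal_p = lambda w: w["p"]

print(models(B) <= models(A))                  # True: B has fewer models
print(entails(A, goal_p), entails(B, goal_p))  # True True
```

Because adding premises can only remove models, anything true in all models of A remains true in all models of B.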
While much human reasoning is monotonic, some important human common-
sense reasoning is not. We reach conclusions from certain premisses that we
would not reach if certain other sentences were included in our premisses.
For example, if I hire you to build me a bird cage, you conclude that it is
appropriate to put a top on it, but when you learn the further fact that my
bird is a penguin you no longer draw that conclusion. Some people think
it is possible to try to save monotonicity by saying that what was in your
mind was not a general rule about birds flying but a probabilistic rule. So
far these people have not worked out any detailed epistemology for this ap-
proach, i.e. exactly what probabilistic sentences should be used. Instead AI
has moved to directly formalizing nonmonotonic logical reasoning. Indeed it
seems to me that when probabilistic reasoning (and not just the axiomatic
basis of probability theory) has been fully formalized, it will be formally
nonmonotonic.
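
By contrast, the bird cage example loses a conclusion when a premise is added. This toy sketch hard-codes a single default rule. The function name conclusions and the premise strings are hypothetical, and the code is not one of the formalisms cited above.

```python
def conclusions(premises):
    """Return the premises plus what a toy default rule draws from them.

    Default: a bird is presumed to fly unless it is known to be a penguin.
    """
    drawn = set(premises)
    if "bird" in drawn and "penguin" not in drawn:
        drawn.add("flies")
    if "flies" in drawn:
        drawn.add("cage_needs_top")
    return drawn


before = conclusions({"bird"})
after = conclusions({"bird", "penguin"})
print("cage_needs_top" in before)  # True
print("cage_needs_top" in after)   # False
```

The premise set grew from {bird} to {bird, penguin}, yet the set of conclusions did not grow with it, which is exactly what monotonicity forbids.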
Nonmonotonic reasoning is an active field of study. Progress is often
driven by examples, e.g. the Yale shooting problem (Hanks and McDer-
mott 1986), in which obvious axiomatizations used with the available rea-
soning formalisms don’t seem to give the answers intuition suggests. One
direction being explored (Moore 1985, Gelfond 1987, Lifschitz 1989a) involves
putting facts about belief and knowledge explicitly in the axioms
—even when the axioms concern nonmental domains. Moore’s classical ex-
ample (now 4 years old) is “If I had an elder brother I’d know it.”
Kraus and Perlis (1988) have proposed to divide much nonmonotonic rea-
soning into two steps. The first step uses Perlis’s (1988) autocircumscription
to get a second order formula characterizing what is possible. The second
step involves default reasoning to choose what is normally to be expected
out of the previously established possibilities. This seems to be a promising
approach.
(Ginsberg 1987) collects the main papers up to 1986. Lifschitz (1989c)
summarizes some example research problems of nonmonotonic reasoning.
3 Some Formalizations and their Problems
(McCarthy 1986) discusses several formalizations, proposing those based on
nonmonotonic reasoning as improvements of earlier ones. Here are some.
1. Inheritance with exceptions. Birds normally fly, but there are excep-
tions, e.g. ostriches and birds whose feet are encased in concrete. The first
exception might be listed in advance, but the second has to be derived or
verified when mentioned on the basis of information about the mechanism of
flying and the properties of concrete.
There are many ways of nonmonotonically axiomatizing the facts about
which birds can fly. The following axioms using a predicate ab standing for
“abnormal” seem to me quite straightforward.
(1) (∀x)(¬ab(aspect1(x)) ⊃ ¬flies(x))
Unless an object is abnormal in aspect1, it can’t fly.
It wouldn’t work to write ab(x) instead of ab(aspect1(x)), because we
don’t want a bird that is abnormal with respect to its ability to fly to be
automatically abnormal in other respects. Using aspects limits the effects of
proofs of abnormality.
(2) (∀x)(bird(x) ⊃ ab(aspect1(x))).
(3) (∀x)(bird(x) ∧ ¬ab(aspect2(x)) ⊃ flies(x)).
Unless a bird is abnormal in aspect2, it can fly.
When these axioms are combined with other facts about the problem,
the predicate ab is then to be circumscribed, i.e. given its minimal extent
compatible with the facts being taken into account. This has the effect
that a bird will be considered to fly unless other axioms imply that it is
abnormal in aspect2. (2) is called a cancellation of inheritance axiom, because
it explicitly cancels the general presumption that objects don’t fly. This
approach works fine when the inheritance hierarchy is given explicitly. More
elaborate approaches, some of which are introduced in (McCarthy 1986)
and improved in (Haugh 1988), are required when hierarchies with indefinite
numbers of sorts are considered.
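
For a small finite domain, the effect of circumscribing ab in axioms (1) to (3) can be checked by brute force. The sketch below makes several assumptions that are not in the text above: three hypothetical individuals, fixed extensions for bird and ostrich, and two extra ostrich axioms (an ostrich is abnormal in aspect2, and an ostrich that is not abnormal in aspect3 does not fly). It enumerates all models and keeps those whose ab extension is subset-minimal, letting flies vary.

```python
from itertools import product

# Hypothetical individuals and fixed facts, chosen for illustration.
INDIVIDUALS = ("tweety", "ozzie", "rock")
BIRD = {"tweety", "ozzie"}
OSTRICH = {"ozzie"}
ASPECTS = (1, 2, 3)

AB_ATOMS = [(k, x) for k in ASPECTS for x in INDIVIDUALS]


def satisfies(ab, flies):
    """Check the axioms for every individual.

    `ab` is a set of (aspect, individual) pairs; `flies` is a set of individuals.
    """
    for x in INDIVIDUALS:
        if (1, x) not in ab and x in flies:                    # axiom (1)
            return False
        if x in BIRD and (1, x) not in ab:                     # axiom (2)
            return False
        if x in BIRD and (2, x) not in ab and x not in flies:  # axiom (3)
            return False
        # Assumed extra axioms for ostriches, not among (1)-(3).
        if x in OSTRICH and (2, x) not in ab:
            return False
        if x in OSTRICH and (3, x) not in ab and x in flies:
            return False
    return True


def all_models():
    """Enumerate every (ab, flies) pair that satisfies the axioms."""
    found = []
    for ab_bits in product((False, True), repeat=len(AB_ATOMS)):
        ab = frozenset(a for a, bit in zip(AB_ATOMS, ab_bits) if bit)
        for fl_bits in product((False, True), repeat=len(INDIVIDUALS)):
            flies = frozenset(x for x, bit in zip(INDIVIDUALS, fl_bits) if bit)
            if satisfies(ab, flies):
                found.append((ab, flies))
    return found


def circumscribe(candidates):
    """Keep the models whose ab extension is subset-minimal; flies may vary."""
    extents = {ab for ab, _ in candidates}
    minimal = {ab for ab in extents if not any(other < ab for other in extents)}
    return [(ab, flies) for ab, flies in candidates if ab in minimal]


minimal_models = circumscribe(all_models())
for ab, flies in minimal_models:
    print(sorted(ab), sorted(flies))
```

Under these assumptions there is a single minimal model: both birds are abnormal in aspect1, only the ostrich is abnormal in aspect2, and tweety is the only individual that flies.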
2. (McCarthy 1986) contains a similar treatment of the effects of actions
like moving and painting blocks using the situation calculus. Moving and
painting are axiomatized entirely separately, and there are no axioms saying
that moving a block doesn’t affect the positions of other blocks or the colors
of blocks. A general “common-sense law of inertia”
(∀pes)(holds(p,s) ∧ ¬ab(aspect1(p,e,s))
⊃ holds(p,result(e,s))),
(2)
asserts that a fact p that holds in a situation s is presumed to hold in the
situation result(e,s) that results from an event e unless there is evidence
to the contrary. Unfortunately, Lifschitz (1985 personal communication)
and Hanks and McDermott (1986) showed that simple treatments of the
common-sense law of inertia admit unintended models. Several authors have
given more elaborate treatments, but in my opinion, the results are not yet
entirely satisfactory. The best treatment so far seems to be that of (Lifschitz
1987).
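
The intended reading of the law of inertia can be sketched procedurally, assuming a situation is a finite map from fluents to values and each event lists only the fluents it changes. The names result, effects and the block-world data are hypothetical. This sketch encodes the intended model directly, so it sidesteps rather than solves the problem of unintended models noted above.

```python
def result(event, situation, effect_axioms):
    """Return the situation after `event`, carrying unaffected facts over.

    `situation` maps fluents to values. `effect_axioms(event, situation)`
    returns the fluents the event changes; only those count as abnormal.
    """
    changed = effect_axioms(event, situation)
    successor = dict(situation)  # inertia: every fact persists by default
    successor.update(changed)    # facts the event affects are overridden
    return successor


def effects(event, situation):
    """Hypothetical effect axioms; moving and painting are stated separately."""
    kind, block, value = event
    if kind == "move":
        return {("loc", block): value}
    if kind == "paint":
        return {("color", block): value}
    return {}


s0 = {("loc", "a"): "table", ("color", "a"): "red", ("loc", "b"): "table"}
s1 = result(("move", "a", "top_of_b"), s0, effects)
s2 = result(("paint", "a", "blue"), s1, effects)
print(s2)
```

No clause says that moving block a leaves its color or the position of block b alone; those facts persist because nothing marks them as affected.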
4 Ability, Practical Reason and Free Will
An AI system capable of achieving goals in the common-sense world will have
to reason about what it and other actors can and cannot do. For concreteness,
consider a robot that must act in the same world as people and perform tasks
that people give it. Its need to reason about its abilities puts the traditional
philosophical problem of free will in the following form. What view shall we
build into the robot about its own abilities, i.e. how shall we make it reason
about what it can and cannot do? (Wishing to avoid begging any questions,
by reason we mean compute using axioms, observation sentences, rules of
inference and nonmonotonic rules of conjecture.)
Let A be a task we want the robot to perform, and let B and C be
alternate intermediate goals either of which would allow the accomplishment
of A. We want the robot to be able to choose between attempting B and
attempting C. It would be silly to program it to reason: “I’m a robot and
a deterministic device. Therefore, I have no choice between B and C. What
I will do is determined by my construction.” Instead it must decide in some
way which of B and C it can accomplish. It should be able to conclude
in some cases that it can accomplish B and not C, and therefore it should
take B as a subgoal on the way to achieving A. In other cases it should
conclude that it can accomplish either B or C and should choose whichever
is evaluated as better according to the criteria we provide it.
(McCarthy and Hayes 1969) proposes conditions on the semantics of any
formalism within which the robot should reason. The essential idea is that
what the robot can do is determined by the place the robot occupies in the
world—not by its internal structure. For example, if a certain sequence of
outputs from the robot will achieve B, then we conclude or it concludes that
the robot can achieve B without reasoning about whether the robot will
actually produce that sequence of outputs.
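
A minimal sketch of this notion of ability, assuming a deterministic world model given as a step function and a bound on the length of output sequences. The names can_achieve and step, the corridor world and the two goals are hypothetical. The search consults only the world model and never the robot's own program.

```python
from itertools import product


def can_achieve(goal, start, step, outputs, max_len):
    """Return an output sequence that reaches `goal`, or None if none is found.

    Only the world model `step` is consulted. What the robot will in fact
    output plays no part in the answer.
    """
    for length in range(max_len + 1):
        for sequence in product(outputs, repeat=length):
            state = start
            for output in sequence:
                state = step(state, output)
            if goal(state):
                return sequence
    return None


def step(position, output):
    """Hypothetical world: a corridor with integer positions 0 to 3."""
    if output == "forward":
        return min(position + 1, 3)
    if output == "back":
        return max(position - 1, 0)
    return position


reach_b = lambda position: position == 2  # intermediate goal B
reach_c = lambda position: position == 5  # intermediate goal C, beyond the corridor

plan_b = can_achieve(reach_b, 0, step, ("forward", "back"), max_len=4)
plan_c = can_achieve(reach_c, 0, step, ("forward", "back"), max_len=4)
print(plan_b, plan_c)  # ('forward', 'forward') None
```

Here the robot concludes that it can accomplish B and not C, so B is the subgoal to adopt on the way to A.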
Our contention is that this is approximately how any system, whether
human or robot, must reason about its ability to achieve goals. The basic
formalism will be the same, regardless of whether the system is reasoning
about its own abilities or about those of other systems including people.
The above-mentioned paper also discusses the complexities that come up
when a strategy is required to achieve the goal and when internal inhibitions
or lack of knowledge have to be taken into account.
5 Three Approaches to Knowledge and Belief
Our robot will also have to reason about its own knowledge and that of other
robots and people.
This section contrasts the approaches to knowledge and belief character-
istic of philosophy, philosophical logic and artificial intelligence. Knowledge
and belief have long been studied in epistemology, philosophy of mind and in
philosophical logic. Since about 1960, knowledge and belief have also been
studied in AI. (Halpern 1986) and (Vardi 1988) contain recent work, mostly
oriented to computer science including AI.
It seems to me that philosophers have generally treated knowledge and
belief as complete natural kinds. According to this view there is a fact to
be discovered about what beliefs are. Moreover, once it is decided what the
objects of belief are (e.g. sentences or propositions), the definitions of belief
ought to determine for each such object p whether the person believes it or
not. This last is the completeness mentioned above. Of course, mainly human
and sometimes animal beliefs have been considered. Philosophers
have differed about whether machines can ever be said to have beliefs, but
even those who admit the possibility of machine belief consider that what
beliefs are is to be determined by examining human belief.
The formalization of knowledge and belief has been studied as part of
philosophical logic, certainly since Hintikka’s book (1964), but much of the
earlier work in modal logic can be seen as applicable. Different logics and
axiom systems sometimes correspond to the distinctions that less formal
philosophers make, but sometimes the mathematics dictates different dis-
tinctions.
AI takes a different course because of its different objectives, but I’m
inclined to recommend this course to philosophers also, partly because we
want their help but also because I think it has philosophical advantages.
The first question AI asks is: Why study knowledge and belief at all?
Does a computer program solving problems and achieving goals in the common-
sense world require beliefs, and must it use sentences about beliefs? The an-
swer to both questions is approximately yes. At least there have to be data
structures whose usage corresponds closely to human usage in some cases.
For example, a robot that could use the American air transportation system
has to know that travel agents know airline schedules, that there is a book
(and now a computer accessible database) called the OAG that contains this
information. If it is to be able to plan a trip with intermediate stops it has to
have the general information that the departure gate from an intermediate
stop is not to be discovered when the trip is first planned but will be avail-
able on arrival at the intermediate stop. If the robot has to keep secrets, it
has to know about how information can be obtained by inference from other
information, i.e. it has to have some kind of information model of the people
from whom it is to keep the secrets.
However, none of this tells us that the notions of knowledge and belief to
be built into our computer programs must correspond to the goals philoso-
phers have been trying to achieve. For example, the difficulties involved in
building a system that knows what travel agents know about airline schedules
are not substantially connected with questions about how the travel agents
can be absolutely certain. Its notion of knowledge doesn’t have to be com-
plete; i.e. it doesn’t have to determine in all cases whether a person is to
be regarded as knowing a given proposition. For many tasks it doesn’t have
to have opinions about when true belief doesn’t constitute knowledge. The
designers of AI systems can try to evade philosophical puzzles rather than
solve them.
Maybe some people would suppose that if the question of certainty is
avoided, the problems of formalizing knowledge and belief become straightforward.
That has not been our experience.
As soon as we try to formalize the simplest puzzles involving knowledge,
we encounter difficulties that philosophers have rarely if ever attacked.
Consider the following puzzle of Mr. S and Mr. P.
Two numbers m and n are chosen such that 2 ≤ m ≤ n ≤ 99. Mr. S is
told their sum and Mr. P is told their product. The following dialogue ensues:
Mr. P: I don’t know the numbers.
Mr. S: I knew you didn’t know them. I don’t know them
either.
Mr. P: Now I know the numbers.
Mr. S: Now I know them too.
In view of the above dialogue, what are the numbers?
Formalizing the puzzle is discussed in (McCarthy 1989). For the present
we mention only the following aspects.
1. We need to formalize knowing what, i.e. knowing what the numbers
are, and not just knowing that.
2. We need to be able to express and prove non-knowledge as well as
knowledge. Specifically we need to be able to express the fact that as far as
Mr. P knows, the numbers might be any pair of factors of the known product.
3. We need to express the joint knowledge of Mr. S and Mr. P of the
conditions of the problem.
4. We need to express the change of knowledge with time, e.g. how
Mr. P’s knowledge changes when he hears Mr. S say that he knew that Mr. P
didn’t know the numbers and doesn’t know them himself. This includes
inferring what Mr. S and Mr. P still won’t know.
The first order language used to express the facts of this problem involves
an accessibility relation A(w1,w2,p,t), modeled on Kripke’s semantics for
modal logic. However, the accessibility relation here is in the language itself
rather than in a metalanguage. Here w1 and w2 are possible worlds, p is a
person and t is an integer time. The use of possible worlds makes it convenient
to express non-knowledge. Assertions of non-knowledge are expressed as the
existence of accessible worlds satisfying appropriate conditions.
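The possible-worlds reading can be illustrated by a brute-force enumeration. This sketch is not the first order formalization of (McCarthy 1989); it only mimics its semantics under simplifying assumptions. A world is a pair (m, n), a world is accessible to Mr. P from another when the two have the same product (for Mr. S, the same sum), and each utterance shrinks the set of worlds still under consideration. The names WORLDS, accessible and solve are ours.

```python
from collections import defaultdict
from itertools import combinations_with_replacement

# Every world is a pair (m, n) with 2 <= m <= n <= 99.
WORLDS = list(combinations_with_replacement(range(2, 100), 2))


def product(w):
    return w[0] * w[1]


def total(w):
    return w[0] + w[1]


def accessible(worlds, observed):
    """Map each world to the worlds an agent cannot tell apart from it,
    given that the agent is told only observed(world)."""
    cells = defaultdict(list)
    for w in worlds:
        cells[observed(w)].append(w)
    return {w: cells[observed(w)] for w in worlds}


def solve():
    p_acc = accessible(WORLDS, product)
    s_acc = accessible(WORLDS, total)

    # Mr. P: "I don't know the numbers."
    # Non-knowledge is the existence of more than one accessible world.
    p_ignorant = {w for w in WORLDS if len(p_acc[w]) > 1}

    # Mr. S: "I knew you didn't know them. I don't know them either."
    after_s = [
        w for w in WORLDS
        if all(v in p_ignorant for v in s_acc[w]) and len(s_acc[w]) > 1
    ]

    # Mr. P: "Now I know the numbers."
    p_acc_later = accessible(after_s, product)
    after_p = [w for w in after_s if len(p_acc_later[w]) == 1]

    # Mr. S: "Now I know them too."
    s_acc_later = accessible(after_p, total)
    return [w for w in after_p if len(s_acc_later[w]) == 1]


solution = solve()  # the single surviving world is (4, 13)
```

The enumeration touches each of the four aspects listed above in a crude way: knowing what (a single accessible world), non-knowledge (several accessible worlds), joint knowledge of the conditions (the shared set WORLDS), and change of knowledge with time (the successively smaller world sets).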
The problem was successfully expressed in the language in the sense that
an arithmetic condition determining the values of the two numbers can be de-
duced from the statement. However, this is not good enough for AI. Namely,
we would like to include facts about knowledge in a general purpose common-
sense database. Instead of an ad hoc formalization of Mr. S and Mr. P, the
problem should be solvable from the same general facts about knowledge
that might be used to reason about the knowledge possessed by travel agents
supplemented only by the facts about the dialogue. Moreover, the language
of the general purpose database should accommodate all the modalities that
might be wanted and not just knowledge. This suggests using ordinary logic,
e.g. first order logic, rather than modal logic, so that the modalities can be
ordinary functions or predicates rather than modal operators.
Suppose we are successful in developing a “knowledge formalism” for our
common-sense database that enables the program controlling a robot to solve
puzzles and plan trips and do the other tasks that arise in the common-sense
environment requiring reasoning about knowledge. It will surely be asked
whether it is really knowledge that has been formalized. I doubt that the
question has an answer. This is perhaps the question of whether knowledge
is a natural kind.
I suppose some philosophers would say that such problems are not of
philosophical interest. It would be unfortunate, however, if philosophers were
to abandon such a substantial part of epistemology to computer science. This
is because the analytic skills that philosophers have acquired are relevant to
the problems.
6 Reifying Context
We propose the formula holds(p,c) to assert that the proposition p holds in
context c. It expresses explicitly how the truth of an assertion depends on
context. The relation c1 ≤ c2 asserts that the context c2 is more general
than the context c1.¹
Formalizing common-sense reasoning needs contexts as objects, in order
to match human ability to consider context explicitly. The proposed database
of general common-sense knowledge will make assertions in a general context
called C0. However, C0 cannot be maximally general, because it will surely
involve unstated presuppositions. Indeed we claim that there can be no
maximally general context. Every context involves unstated presuppositions,
both linguistic and factual.
Sometimes the reasoning system will have to transcend C0, and tools will
have to be provided to do this. For example, if Boyle’s law of the dependence
of the volume of a sample of gas on pressure were built into C0, discovery of its
dependence on temperature would have to trigger a process of generalization
that might lead to the perfect gas law.
The following ideas about how the formalization might proceed are tenta-
tive. Moreover, they appeal to recent logical innovations in the formalization
of nonmonotonic reasoning. In particular, there will be nonmonotonic “inheritance
rules” that allow default inference from holds(p,c) to holds(p,c′),
where c′ is either more general or less general than c.
Almost all previous discussion of context has been in connection with
natural language, and the present paper relies heavily on examples from nat-
ural language. However, I believe the main AI uses of formalized context will
not be in connection with communication but in connection with reasoning
about the effects of actions directed to achieving goals. It’s just that natural
language examples come to mind more readily.
As an example of intended usage, consider
holds(at(he,inside(car)),c17).
Suppose that this sentence is intended to assert that a particular person is in
a particular car on a particular occasion, i.e. the sentence is not just being
used as a linguistic example but is meant seriously. A corresponding English
sentence is “He’s in the car” where who he is and which car and when is
determined by the context in which the sentence is uttered. Suppose, for
simplicity, that the sentence is said by one person to another in a situation
in which the car is visible to the speaker but not to the hearer and the time
at which the subject is asserted to be in the car is the same time at which
the sentence is uttered.
¹ 1996: In subsequent papers the notation ist(c, p) was used.
In our formal language c17 has to carry the information about who he is,
which car and when.
Now suppose that the same fact is to be conveyed as in example 1, but the
context is a certain Stanford Computer Science Department 1980s context.
Thus familiarity with cars is presupposed, but no particular person, car or
occasion is presupposed. The meanings of certain names are presupposed,
however. We can call that context (say) c5. This more general context
requires a more explicit proposition; thus, we would have
holds(at(“Timothy McCarthy”,inside((ιx)(iscar(x) ∧ belongs(x,“John McCarthy”)))),c5).    (3)
A yet more general context might not identify a specific John McCarthy, so
that even this more explicit sentence would need more information. What
would constitute an adequate identification might also be context dependent.
Here are some of the properties formalized contexts might have.
1. In the above example, we will have c17 ≤ c5, i.e. c5 is more general
than c17. There will be nonmonotonic rules like
(∀c1 c2 p)((c1 ≤ c2) ∧ holds(p,c1) ∧ ¬ab1(p,c1,c2) ⊃ holds(p,c2))    (4)
and
(∀c1 c2 p)((c1 ≤ c2) ∧ holds(p,c2) ∧ ¬ab2(p,c1,c2) ⊃ holds(p,c1)).    (5)
Thus there is nonmonotonic inheritance both up and down in the generality
hierarchy.
2. There are functions forming new contexts by specialization. We could
have something like
c19 = specialize(he = Timothy McCarthy,belongs(car,John McCarthy),c5).    (6)
We will have c19 ≤ c5.
3. Besides holds(p,c), we may have value(term,c), where term is a term.
The domain in which term takes values is defined in some outer context.
4. Some presuppositions of a context are linguistic and some are factual.
In the above example, it is a linguistic matter who the names refer to. The
properties of people and cars are factual, e.g. it is presumed that people fit
into cars.
5. We may want meanings as abstract objects. Thus we might have
meaning(he,c17) = meaning(“Timothy McCarthy”,c5).
6. Contexts are “rich” entities not to be fully described. Thus the “nor-
mal English language context” contains factual assumptions and linguistic
conventions that a particular English speaker may not know. Moreover, even
assumptions and conventions in a context that may be individually accessible
cannot be exhaustively listed. A person or machine may know facts about a
context without “knowing the context”.
7. Contexts should not be confused with the situations of the situation
calculus of (McCarthy and Hayes 1969). Propositions about situations can
hold in a context. For example, we may have
holds(Holds1(at(I,airport),result(drive-to(airport),result(walk-to(car),S0))),c1).    (7)
This can be interpreted as asserting that under the assumptions embodied
in context c1, a plan of walking to the car and then driving to the airport
would get the robot to the airport starting in situation S0.
8. The context language can be made more like natural language and
more extensible if we introduce notions of entering and leaving a context.
These will be analogous to the notions of making and discharging assump-
tions in natural deduction systems, but the notion seems to be more general.
Suppose we have holds(p,c). We then write
enter c.
This enables us to write p instead of holds(p,c). If we subsequently infer q,
we can replace it by holds(q,c) and leave the context c. Then holds(q,c) will
itself hold in the outer context in which holds(p,c) holds. When a context is
entered, there need to be restrictions analogous to those that apply in natural
deduction when an assumption is made.
One way in which this notion of entering and leaving contexts is more
general than natural deduction is that formulas like holds(p,c1) and (say)
holds(not p,c2) behave differently from c1 ⊃ p and c2 ⊃ ¬p which are their
natural deduction analogs. For example, if c1 is associated with the time 5pm
and c2 is associated with the time 6pm and p is at(I,office), then holds(p,c1)∧
holds(not p,c2) might be used to infer that I left the office between 5pm and
6pm. (c1 ⊃ p)∧(c2 ⊃ ¬p) cannot be used in this way; in fact it entails
¬c1 ∨ ¬c2.
9. The expression Holds(p,c) (note the caps) represents the proposition
that p holds in c. Since it is a proposition, we can assert holds(Holds(p,c),c′).
10. Propositions will be combined by functional analogs of the Boolean
operators as discussed in (McCarthy 1979b). Treating propositions involving
quantification is necessary, but it is difficult to determine the right formal-
ization.
11. The major goals of research into formalizing context should be to
determine the rules that relate contexts to their generalizations and special-
izations. Many of these rules will involve nonmonotonic reasoning.
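To make the inheritance rules of item 1 concrete, here is a small sketch under strong simplifying assumptions. The class ContextStore and its fields are hypothetical. Abnormality is treated as false unless explicitly recorded, which is only a crude stand-in for a proper nonmonotonic formalism such as circumscription, and inheritance is applied for a single step between directly related contexts.

```python
from dataclasses import dataclass, field


@dataclass
class ContextStore:
    # (c1, c2) records c1 <= c2, i.e. c2 is more general than c1.
    more_general: set = field(default_factory=set)
    # (p, c) records that p is asserted directly in context c.
    asserted: set = field(default_factory=set)
    # (p, c1, c2) in ab1 blocks upward inheritance, as ab1 does in rule (4).
    ab1: set = field(default_factory=set)
    # (p, c1, c2) in ab2 blocks downward inheritance, as ab2 does in rule (5).
    ab2: set = field(default_factory=set)

    def holds(self, p, c):
        """Default inference of holds(p, c), one inheritance step at most."""
        if (p, c) in self.asserted:
            return True
        for c1, c2 in self.more_general:
            # Rule (4): from the less general c1 up to c2.
            if c2 == c and (p, c1) in self.asserted and (p, c1, c2) not in self.ab1:
                return True
            # Rule (5): from the more general c2 down to c1.
            if c1 == c and (p, c2) in self.asserted and (p, c1, c2) not in self.ab2:
                return True
        return False


store = ContextStore()
store.more_general.add(("c17", "c5"))
store.asserted.add(("fit-into(people, cars)", "c5"))

# Inherited downward by default.
before = store.holds("fit-into(people, cars)", "c17")

# Learning that the proposition is abnormal for this pair of contexts
# withdraws the earlier conclusion: the inference is nonmonotonic.
store.ab2.add(("fit-into(people, cars)", "c17", "c5"))
after = store.holds("fit-into(people, cars)", "c17")
```

Adding a fact (the abnormality) removes a conclusion that was previously drawn, which is exactly what distinguishes these rules from ordinary monotonic deduction.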
7 Remarks
The project of formalizing common-sense knowledge and reasoning raises
many new considerations in epistemology and also in extending logic. The
role that the following ideas might play is not clear yet.
7.1 Epistemological Adequacy often Requires Approx-
imate Partial Theories
(McCarthy and Hayes 1969) introduces the notion of epistemological ade-
quacy of a formalism. The idea is that the formalism used by an AI system
must be adequate to represent the information that a person or program
with given opportunities to observe can actually obtain. Often an episte-
mologically adequate formalism for some phenomenon cannot take the form
of a classical scientific theory. I suspect that some people’s demand for a
classical scientific theory of certain phenomena leads them to despair about
formalization. Consider a theory of a dynamic phenomenon, i.e. one that
changes in time. A classical scientific theory represents the state of the phe-
nomenon in some way and describes how it evolves with time, most classically
by differential equations.
What can be known about common-sense phenomena usually doesn’t
permit such complete theories. Only certain states permit prediction of the
future. The phenomenon arises in science and engineering theories also, but
I suspect that philosophy of science sweeps these cases under the rug. Here
are some examples.
(1) The theory of linear electrical circuits is complete within its model
of the phenomena. The theory gives the response of the circuit to any time
varying voltage. Of course, the theory may not describe the actual physics,
e.g. the current may overheat the resistors. However, the theory of sequential
digital circuits is incomplete from the beginning. Consider a circuit built
from NAND-gates and D flipflops and timed synchronously by an appropriate
clock. The behavior of a D flipflop is defined by the theory when one of its
inputs is 0 and the other is 1, provided the inputs are appropriately clocked.
However, the behavior is not defined by the theory when both inputs are 0
or both are 1. Moreover, one can easily make circuits in such a way that
both inputs of some flipflop get 0 at some time.
This lack of definition is not an oversight. The actual signals in a dig-
ital circuit are not ideal square waves but have finite rise times and often
overshoot their nominal values. However, the circuit will behave as though
the signals were ideal provided the design rules are obeyed. Making both
inputs to a flipflop nominally 0 creates a situation in which no digital theory
can describe what happens, because the behavior then depends on the actual
time-varying signals and on manufacturing variations in the flipflops.
(2) Thermodynamics is also a partial theory. It tells about equilibria and
it tells which directions reactions go, but it says nothing about how fast they
go.
(3) The common-sense database needs a theory of the behavior of clerks in
stores. This theory should cover what a clerk will do in response to bringing
items to the counter and in response to a certain class of inquiries. How he
will respond to other behaviors is not defined by the theory.
(4) (McCarthy 1979a) refers to a theory of skiing that might be used by ski
instructors. This theory regards the skier as a stick figure with movable joints.
It gives the consequences of moving the joints as it interacts with the shape of
the ski slope, but it says nothing about what causes the joints to be moved in
a particular way. Its partial character corresponds to what experience teaches
ski instructors. It often assigns truth values to counterfactual conditional
assertions like, “If he had bent his knees more, he wouldn’t have fallen”.
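The partial character of such theories can be mirrored in a program by a function that is allowed to give no answer. The sketch below models the two-input flipflop of example (1) as the text describes it. Which input combination stores 1 and which stores 0 is an assumption of the sketch, as are the names flipflop_next and predict.

```python
from typing import List, Optional, Sequence, Tuple


def flipflop_next(a: int, b: int) -> Optional[int]:
    """Digital theory of the flipflop: the next stored bit, or None
    where the theory is silent."""
    if (a, b) == (1, 0):
        return 1  # assumption: this combination stores 1
    if (a, b) == (0, 1):
        return 0  # assumption: this combination stores 0
    return None  # both inputs 0 or both 1: not defined by the theory


def predict(input_pairs: Sequence[Tuple[int, int]]) -> List[int]:
    """Predict stored bits for as long as the theory applies."""
    states = []
    for a, b in input_pairs:
        nxt = flipflop_next(a, b)
        if nxt is None:
            break  # from here on the digital theory predicts nothing
        states.append(nxt)
    return states


predicted = predict([(1, 0), (0, 1), (0, 0), (1, 0)])
```

Only the first two steps are predicted. After the undefined input the theory says nothing further, even about later inputs that would otherwise be well behaved, because the stored state is no longer known.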
7.2 Meta-epistemology
If we are to program a computer to think about its own methods for gath-
ering information about the world, then it needs a language for expressing
assertions about the relation between the world, the information gathering
methods available to an information seeker and what it can learn. This leads
to a subject I like to call meta-epistemology. Besides its potential appli-
cations to AI, I believe it has applications to philosophy considered in the
traditional sense.
Meta-epistemology is proposed as a mathematical theory in analogy to
metamathematics. Metamathematics considers the mathematical properties
of mathematical theories as objects. In particular model theory as a branch of
metamathematics deals with the relation between theories in a language and
interpretations of the non-logical symbols of the language. These interpre-
tations are considered as mathematical objects, and we are only sometimes
interested in a preferred or true interpretation.
Meta-epistemology considers the relation between the world, languages
for making assertions about the world, notions of what assertions are consid-
ered meaningful, what are accepted as rules of evidence and what a knowl-
edge seeker can discover about the world. All these entities are considered as
mathematical objects. In particular the world is considered as a parameter.
Thus meta-epistemology has the following characteristics.
1. It is a purely mathematical theory. Therefore, its controversies, assum-
ing there are any, will be mathematical controversies rather than controver-
sies about what the real world is like. Indeed metamathematics gave many
philosophical issues in the foundations of mathematics a technical content.
For example, the theorem that intuitionist arithmetic and Peano arithmetic
are equi-consistent removed at least one area of controversy between those
whose mathematical intuitions support one view of arithmetic or the other.
2. While many modern philosophies of science assume some relation
between what is meaningful and what can be verified or refuted, only spe-
cial meta-epistemological systems will have the corresponding mathematical
property that all aspects of the world relate to the experience of the knowl-
edge seeker.
This has several important consequences for the task of programming a
knowledge seeker.
A knowledge seeker should not have a priori prejudices (principles) about
what concepts might be meaningful. Whether and how a proposed concept
about the world might ever connect with observation may remain in suspense
for a very long time while the concept is investigated and related to other
concepts.
We illustrate this by a literary example. Molière’s play Le Malade imaginaire
includes a doctor who explains sleeping powders by saying that they
contain a “dormitive virtue”. In the play, the doctor is considered a pompous
fool for offering a concept that explains nothing. However, suppose the doctor
had some intuition that the dormitive virtue might be extracted and concen-
trated, say by shaking the powder in a mixture of ether and water. Suppose
he thought that he would get the same concentrate from all substances with
soporific effect. He would certainly have a fragment of scientific theory sub-
ject to later verification. Now suppose less—namely, he only believes that a
common component is behind all substances whose consumption makes one
sleepy but has no idea that he should try to invent a way of verifying the
conjecture. He still has something that, if communicated to someone more
scientifically minded, might be useful. In the play, the doctor obviously sins
intellectually by claiming a hypothesis as certain. Thus a knowledge seeker
must be able to form new concepts that have only extremely tenuous relations
with their previous linguistic structure.
7.3 Rich and poor entities
Consider my next trip to Japan. Considered as a plan it is a discrete object
with limited detail. I do not yet even plan to take a specific flight or to fly on
a specific day. Considered as a future event, lots of questions may be asked
about it. For example, it may be asked whether the flight will depart on time
and what precisely I will eat on the airplane. We propose characterizing the
actual trip as a rich entity and the plan as a poor entity. Originally, I thought
that rich events referred to the past and poor ones to the future, but this
seems to be wrong. It’s only that when one refers to the past one is usually
referring to a rich entity, while the future entities one refers to are more often
poor. However, there is no intrinsic association of this kind. It seems that
planning requires reasoning about the plan (poor entity) and the event of its
execution (rich entity) and their relations.
(McCarthy and Hayes 1969) defines situations as rich entities. However,
the actual programs that have been written to reason in situation calculus
might as well regard them as taken from a finite or countable set of discrete
states.
Possible worlds are also examples of rich entities as ordinarily used in
philosophy. One never prescribes a possible world but only describes classes
of possible worlds.
Rich entities are open ended in that we can always introduce more prop-
erties of them into our discussion. Poor entities can often be enumerated, e.g.
we can often enumerate all the events that we consider reasonably likely in
a situation. The passage from considering rich entities in a given discussion
to considering poor entities is a step of nonmonotonic reasoning.
It seems to me that it is important to get a good formalization of the
relations between corresponding rich and poor entities. This can be regarded
as formalizing the relation between the world and a formal model of some
aspect of the world, e.g. between the world and a scientific theory.
8 Acknowledgements
I am indebted to Vladimir Lifschitz and Richmond Thomason for useful
suggestions. Some of the prose is taken from (McCarthy 1987), but the
examples are given more precisely in the present paper, since Daedalus allows
no formulas.
The research reported here was partially supported by the Defense Ad-
vanced Research Projects Agency, Contract No. N00039-84-C-0211.
9 References
Dennett, D.C. (1971): “Intentional Systems”, Journal of Philosophy, vol.
68, No. 4, Feb. 25.
Dreyfus, Hubert L. (1972): What Computers Can’t Do: the Limits of
Artificial Intelligence, revised edition 1979, New York : Harper & Row.
Fikes, R. and Nils Nilsson (1971): “STRIPS: A New Approach to the
Application of Theorem Proving to Problem Solving”, Artificial Intelligence,
Volume 2, Numbers 3,4, January, pp. 189-208.
Gelfond, M. (1987): “On Stratified Autoepistemic Theories”, AAAI-87 1,
207-211.
Ginsberg, M. (ed.) (1987): Readings in Nonmonotonic Reasoning, Mor-
gan Kaufmann, 481 pp.
Green, C., (1969): “Application of Theorem Proving to Problem Solving,”
First International Joint Conference on Artificial Intelligence, pp. 219-239.
Halpern, J. (ed.) (1986): Reasoning about Knowledge, Morgan Kauf-
mann, Los Altos, CA.
Hanks, S. and D. McDermott (1986): “Default Reasoning, Nonmono-
tonic Logics, and the Frame Problem”, AAAI-86, pp. 328-333.
Haugh, Brian A. (1988): “Tractable Theories of Multiple Defeasible In-
heritance in Ordinary Nonmonotonic Logics”, Proceedings of the Seventh Na-
tional Conference on Artificial Intelligence (AAAI-88), Morgan Kaufmann.
Hintikka, Jaakko (1964): Knowledge and Belief; an Introduction to the
Logic of the Two Notions, Cornell Univ. Press, 179 pp.
Kowalski, Robert (1979): Logic for Problem Solving, North-Holland, Am-
sterdam.
Kraus, Sarit and Donald Perlis (1988): “Names and Non-Monotonic-
ity”, UMIACS-TR-88-84, CS-TR-2140, Computer Science Technical Report
Series, University of Maryland, College Park, Maryland 20742.
Lifschitz, Vladimir (1987): “Formal theories of action”, The Frame Prob-
lem in Artificial Intelligence, Proceedings of the 1987 Workshop, reprinted in
(Ginsberg 1987).
Lifschitz, Vladimir (1989a): Between Circumscription and Autoepistemic
Logic, to appear in the Proceedings of the First International Conference on
Principles of Knowledge Representation and Reasoning, Morgan Kaufmann.
Lifschitz, Vladimir (1989b): “Circumscriptive Theories: A Logic-based
Framework for Knowledge Representation,” this collection.
Lifschitz, Vladimir (1989c): “Benchmark Problems for Formal Nonmono-
tonic Reasoning”, Non-Monotonic Reasoning, 2nd International Workshop,
Grassau, FRG, Springer-Verlag.
McCarthy, John (1959): “Programs with Common Sense”, Proceedings of
the Teddington Conference on the Mechanization of Thought Processes, Her
Majesty’s Stationery Office, London.
McCarthy, John and P.J. Hayes (1969): “Some Philosophical Problems
from the Standpoint of Artificial Intelligence”, D. Michie (ed.), Machine
Intelligence 4, American Elsevier, New York, NY.
McCarthy, John (1977): “On The Model Theory of Knowledge” (with
M. Sato, S. Igarashi, and T. Hayashi), Proceedings of the Fifth International
Joint Conference on Artificial Intelligence, M.I.T., Cambridge, Mass.
McCarthy, John (1977): “Epistemological Problems of Artificial Intelli-
gence”, Proceedings of the Fifth International Joint Conference on Artificial
Intelligence, M.I.T., Cambridge, Mass.
McCarthy, John (1979a): “Ascribing Mental Qualities to Machines”,
Philosophical Perspectives in Artificial Intelligence, Ringle, Martin (ed.),
Harvester Press, July 1979.
McCarthy, John (1979b): “First Order Theories of Individual Concepts
and Propositions”, Michie, Donald (ed.), Machine Intelligence 9, (University
of Edinburgh Press, Edinburgh).
McCarthy, John (1980): “Circumscription—A Form of Non-Monotonic
Reasoning”, Artificial Intelligence, Volume 13, Numbers 1,2, April.
McCarthy, John (1983): “Some Expert Systems Need Common Sense”,
Computer Culture: The Scientific, Intellectual and Social Impact of the Com-
puter, Heinz Pagels (ed.), vol. 426, Annals of the New York Academy of
Sciences.
McCarthy, John (1986): “Applications of Circumscription to Formalizing
Common Sense Knowledge”, Artificial Intelligence, April 1986.
McCarthy, John (1987): “Mathematical Logic in Artificial Intelligence”,
Daedalus, vol. 117, No. 1, American Academy of Arts and Sciences, Winter
1988.
McCarthy, John (1989): “Two Puzzles Involving Knowledge”, Formaliz-
ing Common Sense, Ablex 1989.
McDermott, D. and J. Doyle (1980): “Non-Monotonic Logic I”, Artificial
Intelligence, Vol. 13, No. 1.
Moore, R. (1985): “Semantical Considerations on Nonmonotonic Logic”,
Artificial Intelligence 25 (1), pp. 75-94.
Newell, Allen (1981): “The Knowledge Level”. AI Magazine, Vol. 2, No.
2.
Perlis, D. (1988): “Autocircumscription”, Artificial Intelligence, 36,
pp. 223-236.
Reiter, Raymond (1980): “A Logic for Default Reasoning”, Artificial
Intelligence, Volume 13, Numbers 1,2, April.
Russell, Bertrand (1913): “On the Notion of Cause”, Proceedings of the
Aristotelian Society, 13, pp. 1-26.
Robinson, J. Alan (1965): “A Machine-oriented Logic Based on the
Resolution Principle”, JACM, 12(1), pp. 23-41.
Sterling, Leon and Ehud Shapiro (1986): The Art of Prolog, MIT Press.
Sussman, Gerald J., Terry Winograd, and Eugene Charniak (1971):
“Micro-planner Reference Manual”, Report AIM-203A, Artificial Intelligence
Laboratory, Massachusetts Institute of Technology, Cambridge.
Vardi, Moshe (1988): Conference on Theoretical Aspects of Reasoning
about Knowledge, Morgan Kaufmann, Los Altos, CA.
Department of Computer Science
Stanford University
Stanford, CA 94305
This is the html version of the file http://www.media.mit.edu/~lieber/Lieberary/Common-Sense/Beating-Common-Sense/Beating-Common-Sense.pdf.
Abstract
A long-standing dream of artificial intelligence
has been to put common sense knowledge into
computers—enabling machines to reason about
everyday life. Some projects, such as Cyc, have
begun to amass large collections of such knowl-
edge. However, it is widely assumed that the use
of common sense in interactive applications will
remain impractical for years, until these collec-
tions can be considered sufficiently complete
and common sense reasoning sufficiently robust.
Recently, at the MIT Media Lab, we have had
some success in applying common sense knowl-
edge in a number of intelligent Interface Agents,
despite the admittedly spotty coverage and unre-
liable inference of today's common sense knowl-
edge systems. This paper will survey several of
these applications and reflect on interface design
principles that enable successful use of common
sense knowledge.
1 Introduction
Things fall down, not up. Weddings have a bride and a
groom. If someone yells at you, they're probably angry.
One of the reasons that computers seem dumber than
humans is that they don't have common sense—a myriad
of simple facts about everyday life and the ability to
make use of that knowledge easily when appropriate. A
long-standing dream of Artificial Intelligence has been to
put that kind of knowledge into computers, but applica-
tions of common sense knowledge have been slow in
coming.
Researchers like Minsky [2000] and Lenat [1995], rec-
ognizing the importance of common sense knowledge,
have proposed that common sense constitutes the bottle-
neck for making intelligent machines, and they advocate
working directly to amass large collections of such
knowledge and heuristics for using it.
Considerable progress has been made over the last few
years. There are now large knowledge bases of common
sense knowledge and better ways of using it then we have
had before. We may have gotten too used to putting
common sense in that category of "impossible" problems
and overlooked opportunities to actually put this kind of
knowledge to work. We need to explore new interface
designs that don't require complete solutions to the com-
mon sense problem, but can make good use of partial
knowledge and human-computer collaboration.
As the complexity of computer applications grows, it
may be that the only way to make applications more
helpful and avoid stupid mistakes and annoying interrup-
tions is to make use of common sense knowledge.Cell
phones should know enough to switch to vibrate mode if
you're at the symphony. Calendars should warn you if
you try to schedule a meeting at 2 AM or plan to take a
vegetarian to a steak house. Cameras should realize that
if you took a group of pictures within a span of two
hours, at around the same location, they are probably of
the same event.
Initial experimentation with using common sense en-
countered significant obstacles. First, despite the vast
amount of effort put into common sense knowledge
bases, coverage is still sparse relative to the amount of
knowledge humans typically bring to bear. Second, infer-
ence with such knowledge is still unreliable, due to
vagueness, exceptional cases, logical paradoxes, and
other problems.
2
Question-Answering versus Interface
Agent Applications
Many early attempts at applying common sense fell into
the category of question-answering, story understanding,
or information retrieval kind of problems. The hope was
that use of common sense inference would improve re-
sults beyond what was possible with simple keyword
matching or statistical methods.
For example, in a retrieval demo of Cyc [Lenat, 1995],
one could ask "Show me a picture of someone who is
disappointed", and receive a picture of the second fin-
isher in the Boston Marathon, by a chain of reasoning
like: A marathon is a contest; The goal of a contest is to
be first; If you do not achieve your goals, then you will
be disappointed. When it works, this is great. But direct
Beating Common Sense into Interactive Applications
Henry Lieberman, Hugo Liu, Push Singh, Barbara Barry
MIT Media Lab, 20 Ames St., Cambridge, MA 02139 USA
Page 2
question-answering places very exacting demands on a
system.
First, the user is expecting a direct answer. If the answer
is good, the user will be happy; if it is not,
the user will be critical of the system. If the accuracy
falls below a certain threshold in the long term, the user
will give up using the system completely. Second, the
system only gets one shot at finding the correct answer,
and it must do so quickly enough to maintain the feeling
of interactivity (no more than a few seconds).
Over the last few years, we have been exploring the
domain of Intelligent Interface Agents [Maes, 1994]. An
interface agent is an AI program that attaches itself to a
conventional interactive application (text or graphical
editor, Web browser, spreadsheet, etc.) and both watches
the user's interactions, and is capable of operating the
interface as would the user. The jobs of the agent are to
provide help, assistance, suggestions, automation of
common tasks, adaptation and personalization of the in-
terface.
Our experience has been that Interface Agents can use
common sense knowledge much more effectively than
direct question-answering applications, because they
place fewer demands on the system. Since all the capa-
bilities of the interactive application remain available for
the user to use in a conventional manner, it is no big deal
if common sense knowledge does not cover a particular
situation. If a common sense inference turns out wrong,
the user is often no worse off than they would be without
any assistance.
The user is not expecting a direct answer to every ac-
tion, only that the agent will come up with something
helpful every once in a while. Since the agent operates in
a continuous, long-term manner, if it cannot respond im-
mediately, it can gather further evidence and perhaps
deliver a meaningful interaction in the future. If the
agent's knowledge is not sufficient, it can ask the user to
fill in the gaps.
In short, the use of common sense in Interface Agents
can be made fail-soft. Interface agents are often proac-
tive, “pushing” information rather than “pulling” it as
query-response systems do, and it is easier to make the
former kind of agents fail-soft.
3 Applications of Common Sense in Interface Agents
The remainder of this paper will survey several of our
lab’s recent projects in this area, to illustrate the princi-
ples above. Except where noted, these applications were
built using knowledge drawn from Open Mind Common
Sense (OMCS, see sidebar), a common sense knowledge
base of over 675,000 natural language assertions built
from the contributions of over 13,000 people over the
World Wide Web [Singh et al., 2002]. Many of these
applications made use of early versions of OMCSNet, a
semantic network of 280,000 relations extracted from the
OMCS corpus with 20 link types covering taxonomic,
meronomic, temporal, spatial, causal, functional, and
other kinds of relations.
3.1 Common Sense in an Agent for Digital Photography
Figure 1. Telling stories with ARIA
In ARIA (Annotation and Retrieval Integration Agent,
Figure 1) [Lieberman et al., 2001], we attempt to lever-
age common sense knowledge to semi-automatically an-
notate photos and proactively suggest relevant photos
[Lieberman & Liu, 2002a]. ARIA observes a user as s/he
types a story, parses the text in real time, and continu-
ously displays a relevance-ordered list of photos. When
the user inserts photos in text, the system automatically
annotates the photos with relevant keywords.
Common sense knowledge is used to inform semantic
recognition agents, which recognize people, places, and
events in the text. These recognition agents extract ap-
propriate annotations to be added to photos inserted in
the text. In retrieval, common sense knowledge is com-
piled into a semantic network, and associative reasoning
helps to bridge semantic gaps (e.g. connect text about
“wedding” to a photo annotated with “bride”) [Liu &
Lieberman, 2002b]. The system also learns from per-
sonal assertions from the text (e.g. “My sister’s name is
Mary.”), presumably unique to the author’s context,
which can be treated as a source of implicit knowledge in
much the same manner as the common sense assertions
coming from Open Mind.
The application of common sense in ARIA has several
fail-soft aspects. Annotations suggested by the agent
carry less weight than a user’s annotations in retrieval,
and can be rejected or revised by the user. Similarly, in
retrieval, common sense is used only to bridge semantic
gaps, and would never supersede explicit keyword
matching. If a user finds a suggestion useful, s/he can
choose to drag that photo in the text. But if the sugges-
tion is inappropriate, the user’s writing task is not dis-
rupted.
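The fail-soft ranking described above can be sketched in a few lines. This is only an illustration under stated assumptions, not ARIA's actual code: the `RELATED` table, the two weights, and the function names are hypothetical and chosen for the example.

```python
# Hypothetical sketch of fail-soft photo retrieval; not ARIA's implementation.
# RELATED stands in for associations drawn from a common sense semantic network.
RELATED = {
    "wedding": {"bride", "groom", "cake"},
    "beach": {"sand", "ocean"},
}

EXPLICIT_WEIGHT = 1.0      # assumed weight for a direct keyword match
ASSOCIATIVE_WEIGHT = 0.3   # assumed lower weight for a common sense association


def score(text_words, annotations):
    """Score one photo; an explicit match counts more than an associative one."""
    total = 0.0
    for word in text_words:
        if word in annotations:
            total += EXPLICIT_WEIGHT
        elif RELATED.get(word, set()) & annotations:
            total += ASSOCIATIVE_WEIGHT
    return total


def rank_photos(text_words, photos):
    """Return the names of photos with a nonzero score, most relevant first."""
    scored = [(score(text_words, ann), name) for name, ann in photos.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for s, name in ordered if s > 0]


photos = {
    "img1": {"bride", "church"},   # reachable only through common sense
    "img2": {"wedding"},           # explicit keyword match
    "img3": {"dog"},               # unrelated, never suggested
}
ranking = rank_photos(["wedding"], photos)
```

Here the photo annotated "bride" is still suggested for text about a wedding, but it ranks below the explicit match, and the unrelated photo is simply left out.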
3.2 Common Sense in Affective Classification of Text
Consider the text, “My wife left me; she took the kids
and the dog.” There are no obvious mood keywords such
as “cry” or “depressed”, or any other obvious cues, but
the implications of the event described here are decidedly
sad. This presents an opportunity for common sense
knowledge, a subset of which concerns the affective
qualities of things, actions, events, and situations. From
the Open Mind Common Sense knowledge base, a small
society of linguistic models of affect was mined out, us-
ing a set of mood keywords as a starting point. The im-
port of common sense knowledge to this application is to
make affective classification of text more comprehensive
and reliable by considering underlying semantics, in ad-
dition to surface features.
Figure 2. Empathy Buddy reacts to an email.
Using this commonsense-informed approach, two ap-
plications were built. One is an email editor, Empathy
Buddy, above, which uses Chernoff-style faces to inter-
actively react to a user as s/he composes an email using
one of six basic Ekman emotions [Liu, Lieberman, Selker
2003]. A user study showed that users rated the affective
Software Agent as being more interactive and intelligent
than a randomized-face control.
Another application uses a hyperlinked color bar to
help users visualize and navigate the affective structure
of a text document [Liu, Lieberman, Selker, 2002]. Us-
ing the tool, users were able to improve the speed of
within-document information access tasks.
The affective model approach has been recently ex-
tended to modeling point-of-view and personality, ana-
lyzing an author's writings and making a comparison of
what several authors "might have thought" about a speci-
fied topic [Liu and Maes, 2004].
3.3 Common Sense in Video Capture and Editing
The Cinematic Common Sense project [Barry & Daven-
port, 2003] is being developed to provide feedback to
documentary videographers during production. Common
sense knowledge relevant to the documentary subject
domain is retrieved to assist the videographer when they
are in the field recording video footage about a docu-
mentary subject. After each shot is recorded, metadata is
created by the videographer in natural language and sub-
mitted as a query to a subset of the Open Mind database.
For example, the shot metadata "a street artist is painting
a painting" would yield shot suggestions such as "the
last thing you do when you paint a painting is clean the
brushes" or "something that might happen when you
paint a picture is paint gets on your hands." These assertions
can be used by the filmmaker as a flexible shot
list that is dynamically updated in accordance with the
events the filmmaker is experiencing. Annotation of
content is enriched, as in ARIA, to support later search of
image-based content. Collections of shots can also be
ordered into rough temporal and causal sequences based
on the associated common sense annotations.
Figure 3. Common Sense helps associate story elements with
video clips.
3.4 Common Sense in Other Storytelling Applications
A common thread throughout the above applications is
that they all assist the user in some sort of storytelling
process. Storytelling is a great area for common sense
because it draws on a wide spectrum of understanding of
situations of everyday life. It can provide an intermediate
level for the agent to understand and assist the user that is
better than simple keywords but stops short of full natu-
ral language understanding.
David Gottlieb and Josh Juster’s OMAdventure [Vari-
ous Authors, 2003] (Figure 3) dynamically generates a
Dungeons-and-Dragons type virtual environment by using
common sense knowledge. If the current game location
is a kitchen, the system poses the questions to Open
Mind, “What do you find in a kitchen?” and “What loca-
tions are associated with a kitchen?” If “You find an
oven in a kitchen”, we ask “What can you do with an
oven?” Objects such as the oven or operations such as
cooking are then made available as moves in the game
for the player to make, and the associated locations are
the exits from the current situation. If the player is given
the opportunity to create new objects or locations in the
game, that can be a way of extending the knowledge. If
the player adds a blender to a kitchen, now we know that
blenders are something that can be found in a kitchen.
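The loop described above (query for objects, query for what can be done with them, query for neighboring locations, and accept player additions as new knowledge) might look like the following sketch. The relation names and the tiny in-memory table are assumptions made for illustration; they stand in for answers retrieved from Open Mind and are not OMAdventure's actual code.

```python
from collections import defaultdict

# Hypothetical stand-in for answers retrieved from Open Mind.
knowledge = defaultdict(set)
for relation, subject, value in [
    ("found_in", "kitchen", "oven"),
    ("found_in", "kitchen", "refrigerator"),
    ("used_for", "oven", "cook"),
    ("near", "kitchen", "dining room"),
]:
    knowledge[(relation, subject)].add(value)


def describe_location(location):
    """Build the objects, moves, and exits for one game location."""
    objects = sorted(knowledge[("found_in", location)])
    moves = sorted(
        f"{action} with {obj}"
        for obj in objects
        for action in knowledge[("used_for", obj)]
    )
    exits = sorted(knowledge[("near", location)])
    return {"objects": objects, "moves": moves, "exits": exits}


def player_adds_object(location, obj):
    """An object created by the player becomes new knowledge about the location."""
    knowledge[("found_in", location)].add(obj)


room = describe_location("kitchen")
player_adds_object("kitchen", "blender")
room_after = describe_location("kitchen")
```

After the player adds the blender, later visits to the kitchen list it alongside the oven and refrigerator.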
Figure 3. OMAdventure dynamically generates
an adventure game’s universe by using common sense
knowledge.
Alexandro Artola’s StoryIllustrator [Various Authors,
2003] (Figure 4) is like ARIA in that it gives the user a
story editor and photo database and tries to continuously
retrieve photos relevant to the user’s typing. However,
instead of using an annotated personal photo collection, it
employs Yahoo’s image search to retrieve images from
the Web. Common sense knowledge is used for query
expansion, so that a picture of a baby is associated with
the mention of milk.
Chian Chuu and Hana Kim’s StoryFighter [Various
Authors, 2003] plays a game where the system and the
user take turns contributing lines to a story. The game
proposes a start state, e.g. “John is sleepy” and an end
state, “John is in prison”, and the goal is to get from the
start state to the end state in a specified number of sen-
tences. Along the way there are “taboo” words that can’t
be mentioned (“You can’t use the word ‘arrest’”) as an
additional constraint to make the game more challenging.
Common sense is used to deduce the consequences of an
event (“If you commit a crime, you might go to jail”)
and to propose taboo words to exclude the most obvious
continuations of the story.
3.6 Common Sense for Topic Spotting in Conversation
Nathan Eagle, Push Singh and Sandy Pentland [Eagle,
Singh, Pentland, 2003] are exploring the idea of a wear-
able computer with continuous audio (and perhaps ulti-
mately, video) recording. They are interested not only in
audio transcription, but in situational understanding --
understanding general properties of the physical and
social environment in which the computer finds itself,
even if the user is not directly interacting with the ma-
chine.
Speech recognition is used to roughly transcribe the
audio, but with current technology, speech transcription
accuracy, especially for conversation, is poor. However,
understanding general aspects of the situation such as
whether the user is at home or at work, alone or with
people, with friends or strangers, etc., is indeed possible.
Such recognition is vastly improved by using common
sense knowledge to map from topic-spotting words out-
put by the speech recognizer, ("lunch", "fries", "styro-
foam") to knowledge about everyday activities that the
user might be engaged in (eating in a fast-food restau-
rant). Bayesian inference is used to rank hypotheses
generated by OMCSNet.
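As a rough illustration of ranking activity hypotheses from spotted words, the sketch below uses a naive Bayes model. The paper says only that Bayesian inference is used; the naive Bayes form, the priors, the likelihoods, and the smoothing constant are all assumptions made up for this example.

```python
import math

# Hypothetical numbers: P(activity) and P(word | activity), for illustration only.
PRIOR = {"eating fast food": 0.2, "working at office": 0.5, "cooking at home": 0.3}
LIKELIHOOD = {
    "eating fast food": {"lunch": 0.3, "fries": 0.3, "styrofoam": 0.1},
    "working at office": {"lunch": 0.1, "meeting": 0.4},
    "cooking at home": {"lunch": 0.2, "fries": 0.1, "oven": 0.3},
}
UNSEEN = 0.01  # assumed smoothing probability for words not linked to an activity


def rank_activities(words):
    """Rank activities by naive Bayes posterior given the spotted words."""
    log_scores = {}
    for activity, prior in PRIOR.items():
        log_p = math.log(prior)
        for word in words:
            log_p += math.log(LIKELIHOOD[activity].get(word, UNSEEN))
        log_scores[activity] = log_p
    # Normalize in a numerically stable way so the posteriors sum to one.
    top = max(log_scores.values())
    unnormalized = {a: math.exp(s - top) for a, s in log_scores.items()}
    total = sum(unnormalized.values())
    ranked = [(a, p / total) for a, p in unnormalized.items()]
    return sorted(ranked, key=lambda pair: -pair[1])


ranked = rank_activities(["lunch", "fries", "styrofoam"])
```

Even though "working at office" has the highest prior in this toy table, the three spotted words together push "eating fast food" to the top of the ranking.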
Austin Wang and Justine Cassell used common sense
in a virtual collaborative storytelling partner for children,
[Wang and Cassell, 2003], whose goal is to improve lit-
eracy and storytelling skills. An on-screen character,
SAM, starts telling a story and invites the child to con-
tinue the story at certain points. For example, "Jack and
Jane were playing hide and seek. Jane hid in… now it's
your turn".
The system uses speech recognition to listen to the
child's story, but the recognition is not good enough to
be sure of understanding everything the child had to say.
Instead, the results of the recognition are used for rough
topic-spotting, in the manner of Eagle's system.
In the hide and seek example, the system could hear
the word "bedroom". Then common sense knowledge is
used to determine what is likely to be in a bedroom, e.g.
bed, closet, dresser, etc. The result is used to concoct a
plausible continuation of the story, when it is the virtual
character's turn again to talk, e.g. "Jane's parents walked
into the bedroom while she was hiding under the bed".
3.7 Common Sense for a Dynamic Tourist Phrasebook
Globuddy [Musa et al., 2003], by Rami Musa, Andrea
Kulas, Yoan Anguilete, and Madleina Scheidegger uses
common sense to aid tourists with translation. Phrase-
books like Berlitz will commonly provide a set of words
and phrases useful in a common situation, such as a res-
taurant or hotel. But they can only cover a few such
situations. With Globuddy, you can type in your (perhaps
unusual) situation (“I’ve just been arrested”) and it retrieves
common sense surrounding that situation and
feeds it to a translation service. “If you are arrested, you
should call a lawyer.” “Bail is a payment that allows an
accused person to get out of jail until a trial”. A recent
implementation by Alex Faaborg and José Espinosa puts
Globuddy on handheld and cell phone platforms.
Figure 4. The Globuddy 2 dynamic phrasebook gives
you translations of phrases conceptually related to a
seed word or phrase
3.7 Common Sense for Word Completion
Applications like Globuddy play up the role of common
sense knowledge bases in determining what kinds of
topics are "usual" or "ordinary". A simple, but powerful
application of this is in predictive typing or word or
phrase completion. Predictive typing can vastly speed
up interfaces, especially in cases where the user has dif-
ficulty typing normally, or on small devices such as cell
phones whose keyboards are small. Conventional ap-
proaches to predictive typing select a prediction either
from a list of words the user recently typed, or from an
ordered list of the most commonly occurring words in
English. Alex Faaborg and Tom Stocky [Stocky,
Faaborg, Lieberman, 2004] have implemented a Common
Sense predictive text entry facility for a cell phone plat-
form. It uses Open Mind Common Sense Net to find the
next word that "makes sense" in the current context. For
example, typing "train st" leads to the completion "train
station" even though the user may not have typed that
phrase before, nor is "station" the most common "st"
word.
Figure 5. Common Sense can lead to good sugges-
tions for word completion
Performance of Common Sense alone in this task is comparable
to or slightly better than conventional statistical
methods and may be much better when combined with
conventional methods, especially where the conventional
methods don't make strong predictions in particular
cases. Similar approaches have great potential for use in
other kinds of predictive and corrective interfaces.
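One way to combine the two sources of evidence is to prefer completions that are semantically linked to the preceding word and to fall back on plain frequency otherwise. The sketch below is a minimal illustration of that idea, not the cell phone implementation: `RELATED` stands in for links in a common sense semantic network, `FREQUENCY` for a conventional word-frequency list, and both contain made-up data.

```python
# Hypothetical data for illustration only.
RELATED = {"train": {"station", "track", "ticket"}}
FREQUENCY = {"still": 900, "state": 800, "start": 700, "station": 100}


def complete(previous_word, prefix):
    """Suggest a completion for prefix, preferring words related to the context."""
    candidates = [w for w in FREQUENCY if w.startswith(prefix)]
    related = [w for w in candidates if w in RELATED.get(previous_word, set())]
    pool = related or candidates   # fall back to frequency when no link exists
    if not pool:
        return None
    return max(pool, key=lambda w: FREQUENCY[w])


suggestion = complete("train", "st")   # context makes "station" the best guess
fallback = complete("my", "st")        # no context link, so frequency decides
```

With "train" as context the sketch suggests "station" even though it is the least frequent "st" word in the table; without a useful context it behaves like a conventional frequency-based predictor.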
3.8 Common Sense in a Disk Jockey's Assistant
Joan Morris-DiMicco, Carla Gomez, Arnan Sipitakiat,
and Luke Ouko implemented a Common Sense Disk
Jockey [Various Authors], an assistant for music selec-
tion in dance clubs. DJs often select music initially based
on a few superficial parameters (age, ethnicity, dress) of
the audience, and then adjust their subsequent choices
based on the reaction of the audience.
CSDJ uses Erik Mueller’s ThoughtTreasure as a rea-
soning engine [Mueller, 1998] to filter a list of MP3 files
according to common sense assumptions about what kind
of music particular groups might like. It also incorporates
an interface to a camera that measures activity levels of
the dance floor to give feedback to the system as to
whether the selection of a particular piece of music in-
creased or decreased activity.
3.9 Common Sense for Mapping User Goals to Concrete Actions
We also have worked on some projects incorporating
common sense knowledge into conventional search en-
gines. These applications still maintain the “one-shot”
query-response interaction that we criticized in the be-
ginning as being less suited to common sense applica-
tions than continuously operating interface agents. How-
ever, we apply the common sense in a fundamentally
different way than conventional attempts to add inference
to search engines. The role of common sense is to map
from the user’s search goals, which are sometimes not
explicitly stated, to keywords appropriate for a conven-
tional search engine. We believe that this process will
make it more likely that the user would receive good re-
sults in the case where conventional keywords wouldn’t
work well, thereby making the interface more fail-soft.
Two systems, Reformulator [Singh, 2002] and GOOSE
[Liu, Lieberman & Selker, 2002] are common sense ad-
juncts to Google.
Reformulator, like Cyc, does inference on the subject
matter of the search itself. Our work in improving search
engine interfaces [Liu, Lieberman & Selker, 2002; Singh
2002], is motivated by the observation that forming good
search queries can often be a tricky proposition. We
studied expert users composing queries [Liu, Lieberman &
Selker, 2002], and concluded that they usually already
know something about the structure and contents of
pages they are expecting to find. Once a little bit of
search common sense has been used to decide on the nature of
the expected results, the chain of reasoning leading from
the high-level search intent to query formation is usually
very straightforward and commonsensical.
By contrast, novice users lack experience in chaining
reasoning from a high-level search intent to query formation,
so they often state their search goal directly. For
example, a novice may often type "my cat is sick" into a
search engine rather than looking for "veterinarians,
Boston, MA" even though the chain of reasoning is very
straightforward.
In this situation, there is an opportunity for a search
engine Interface Agent to observe a novice user's queries.
The Agent attempts to infer the user's intent and when it
is detected that a query may not return the best results,
the Agent can help to reformulate the query using search
expertise and inferencing over commonsense knowledge,
and opportunistically suggest "Did you mean to look for
veterinarians in Boston, MA?" above the displayed re-
sults. In GOOSE, we were able to improve a significant
number of queries made by novice users. However, in
that system, we still needed users to help the system by
manually disambiguating the type of search goal. Our
current work on automated disambiguation will allow us
to develop an Interface Agent which does not interfere
with the user's task at all, and only suggests a better
query (appearing above the search results) if it is able to
offer a better one. This allows the Interface Agent to
make use of common sense to improve the user experience
in a fail-soft way. If common sense is too spotty to
reformulate a query, no suggestion is offered.
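The fail-soft behavior described above can be sketched as a reformulation step that may return nothing. The rule table and function names below are hypothetical and are not GOOSE's actual interface; the point is only that the original query always runs and a suggestion appears only when one can be inferred.

```python
from typing import List, Optional

# Hypothetical rules mapping a stated problem to the kind of page that addresses it.
GOAL_RULES = {
    ("cat", "sick"): "veterinarians",
    ("car", "broken"): "auto repair shops",
}


def reformulate(query: str, location: str) -> Optional[str]:
    """Return a suggested query, or None when no rule applies (fail-soft)."""
    words = set(query.lower().split())
    for trigger, target in GOAL_RULES.items():
        if set(trigger) <= words:
            return f"{target}, {location}"
    return None


def search_page(query: str, location: str) -> List[str]:
    """Always show results for the original query; add a suggestion if one exists."""
    lines = []
    suggestion = reformulate(query, location)
    if suggestion is not None:
        lines.append(f"Did you mean to look for {suggestion}?")
    lines.append(f"Results for: {query}")
    return lines


with_hint = search_page("my cat is sick", "Boston, MA")
without_hint = search_page("python tutorial", "Boston, MA")
```

When no rule matches, the page is exactly what a conventional search would have produced, so the user is no worse off.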
Figure 6. The GOOSE common sense search engine
Another application that also maps between users'
goals and concrete actions is currently under develop-
ment by Alex Faaborg, Sakda Chaiworawitkul and Henry
Lieberman for the composition of Web services.
In Tim Berners-Lee's proposed vision of the next-
generation Semantic Web [Berners-Lee, Hendler, Lassila,
2004], users can state high-level goals, and agent pro-
grams can scout out Web services that can satisfy those
goals, possibly composing multiple services, each of
which accomplishes a subgoal, without explicit direction
from the user. For example, a request "Schedule a doc-
tor's appointment for my mother within ten miles of her
house" might involve looking up directories of doctors
with a certain specialty; checking a reputation server;
consulting a geographic server to check addresses, routes,
or transit; synchronizing the mother's and doctor's sched-
ules; etc.
We fully concur with this vision. However, to date, most
of the work on the Semantic Web has focused on the
formalisms such as XML, OWL, SOAP and UDDI that
will be used to represent metadata stored on the Web
pages that will presumably be accessed by these agents.
Little work is concerned with how an agent might actu-
ally put together Semantic Web services to accomplish
high-level goals for the user.
Looking at currently available and proposed Web service
descriptions, we see that even if everyone agrees on the
representation formalism, different services might ask for
and return different kinds of information for the same
services, and connecting them is still a task that now re-
quires a human programmer to anticipate the form and
structure of such services.
For example, a weather service might deliver a weather
report given a Zip code. But if the user asked "What's the
weather in Denver?", then something has to know how
Zip codes are associated with cities. This is a job for
common sense.
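The weather example can be made concrete with a small sketch in which a piece of bridging knowledge connects the user's term (a city) to the input the service expects (a Zip code). The service, the lookup table, and the report are placeholders invented for illustration; no real Web service API is implied.

```python
# Hypothetical bridging knowledge and service; all values are placeholders.
CITY_TO_ZIP = {"denver": "80202"}


def weather_service(zip_code):
    """Stand-in for a Web service that only accepts Zip codes."""
    reports = {"80202": "sunny"}
    return reports.get(zip_code)


def weather_for_city(city):
    """Bridge from a city name to the Zip code the service requires."""
    zip_code = CITY_TO_ZIP.get(city.lower())
    if zip_code is None:
        return None   # no bridging knowledge: decline rather than guess
    return weather_service(zip_code)


report = weather_for_city("Denver")
missing = weather_for_city("Atlantis")
```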
Common sense is used to compose Web services in a
manner similar to the way it is used in GOOSE. User
goals are obtained through two different interfaces; one
that allows natural language statement of goals, and an-
other that provides a sidebar to a browser that proposes
relevant services interactively as the user is browsing.
OMCSNet is used to expand the user goal so that it can
potentially match semantically related concepts which
may appear in the Web service descriptions. Thus we
can achieve a much broader and more appropriate map-
ping of Web services than is possible with literal search
through Web service descriptions alone.
3.11 Interfaces for Improving Common Sense Knowledge Bases
One criticism of Open Mind and similar efforts is that
knowledge expressed in single sentences is often implic-
itly dependent on an unstated context. For example, the
sentence “At a wedding, the bride and groom exchange
rings” might assume the context of a Christian or Jewish
wedding, and might not be true in other cultures. Re-
becca Bloom and Avni Shah [Various Authors, 2003]
implemented a system for contextualizing Open Mind
knowledge by prompting the user to add explicit context
elements to each assertion. Retrieval can then supply in-
formation about what context an assertion depends on or
find analogous assertions in other contexts. For example,
in a Hindu wedding, the bride and groom exchange
necklaces that serve the same ritual function as rings do
in the West.
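A contextualized assertion of this kind can be represented as a record with an explicit context field. The sketch below is a hypothetical data layout, not the one used by Bloom and Shah's system; it supports the two retrievals mentioned above, finding the contexts an assertion depends on and finding the analogous assertion in another context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    event: str     # e.g. "wedding"
    role: str      # the function the assertion fills, e.g. "exchange"
    value: str     # what fills that role
    context: str   # explicit context supplied by the contributor


# Hypothetical contextualized assertions modeled on the wedding example.
ASSERTIONS = [
    Assertion("wedding", "exchange", "rings", "Christian"),
    Assertion("wedding", "exchange", "rings", "Jewish"),
    Assertion("wedding", "exchange", "necklaces", "Hindu"),
]


def contexts_of(event, role, value):
    """Which contexts does this assertion depend on?"""
    return sorted(a.context for a in ASSERTIONS
                  if (a.event, a.role, a.value) == (event, role, value))


def analogous(event, role, context):
    """What fills the same role for this event in another context?"""
    return sorted(a.value for a in ASSERTIONS
                  if (a.event, a.role, a.context) == (event, role, context))


ring_contexts = contexts_of("wedding", "exchange", "rings")
hindu_analogue = analogous("wedding", "exchange", "Hindu")
```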
Several projects involved interfaces for knowledge
elicitation or feedback about the knowledge base itself.
The Open Mind web site itself contains several of what it
calls “activities” that encourage users to fill in templates
that call for a particular type of knowledge. Knowledge
about the function of objects is elicited with a template
“You __ with a __”. Tim Chklovski [Chklovski & Mihal-
cea, 2002] developed an interface for prompting the user
to disambiguate word senses in Open Mind and for auto-
matically performing simple analogies and asking the
user to confirm or deny them.
Andrea Lockerd’s ThoughtStreams [Various Authors,
2003] aims to acquire common sense knowledge through
simulation. Everyday life is modeled in a game world,
similar to the game, The Sims. An agent tracks user be-
havior in the world and tries to discover behavioral
regularities with a similarity-based learning algorithm. It
is also envisioned that a game character “bot” would be
introduced that would occasionally ask human characters
why they do things, in the manner of an inquisitive (but
hopefully not too annoying) child.
4 Roles for Common Sense in Applications
Each of these applications uses commonsense differently.
None of them actually does ‘general purpose’ common-
sense reasoning—while each makes use of a broad range
of commonsense knowledge, each makes use of it in a
particular way by performing only certain types of infer-
ences.
Retrieving event-subevent structure. It is some-
times useful to collect together all the knowledge that is
relevant to some particular class of activity or event. For
example the Cinematic Common Sense project makes use
of common sense knowledge about event-subevent
structure to make suitable shot suggestions at common
events like birthdays and marathons. For the topic ‘get-
ting ready for a marathon’, the subevents gathered might
include: putting on your running shoes, picking up your
number, and getting in your place at the starting line.
Goal recognition and planning. The Reformulator
and GOOSE search engines exploit common sense
knowledge about typical human goals to infer the real
goal of the user from their search query. These search
engines can make use of knowledge about actions and
their effects to engage in a simple form of planning. Af-
ter inferring the user’s true intention, they look for a way
to achieve it.
Temporal projection. The MakeBelieve storytelling
system [Liu & Singh, 2002] makes use of the knowledge
of temporal and causal relationships between events in
order to guess what is likely to happen next. Using this
knowledge it can generate stories like: David fell off his
bike. David scraped his knee. David cried like a baby.
David was laughed at. David decided to get revenge.
David hurt people.
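Temporal projection of this kind can be sketched as repeatedly following "what happens next" links from a starting event. The link table below is hypothetical and modeled on the example story; it is not MakeBelieve's knowledge or code.

```python
import random

# Hypothetical "what happens next" links, modeled on the example story.
NEXT_EVENTS = {
    "fell off his bike": ["scraped his knee"],
    "scraped his knee": ["cried like a baby"],
    "cried like a baby": ["was laughed at"],
    "was laughed at": ["decided to get revenge"],
}


def make_story(hero, first_event, max_sentences, rng):
    """Follow causal links from the first event until none remain."""
    story, event = [], first_event
    while event is not None and len(story) < max_sentences:
        story.append(f"{hero} {event}.")
        followers = NEXT_EVENTS.get(event)
        # With several possible followers, one is picked at random.
        event = rng.choice(followers) if followers else None
    return " ".join(story)


story = make_story("David", "fell off his bike", 4, random.Random(0))
```

Because each event in this toy table has a single follower, the result is deterministic; a richer table would yield different stories on different runs.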
Particular consequences of broad classes of actions.
Empathy Buddy senses the affect in passages of
text by predicting only those consequences of actions and
events that have some emotional significance. This can
be done by chaining backwards from knowledge about
desirable and undesirable states. For example, if being
out of work is undesirable, and being fired causes one to
be out of work, then the passage ‘I was fired from work
today’ can be sensed as undesirable.
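The backward chaining described above can be sketched as label propagation: start from states of known valence and assign the same label to anything that causes them. The causal links and seed labels below are hypothetical, and this is not Empathy Buddy's actual model.

```python
# Hypothetical causal links and seed judgments, modeled on the example above.
CAUSES = {
    "miss a deadline": ["be fired"],
    "be fired": ["be out of work"],
    "win a prize": ["be happy"],
}
SEEDS = {"be out of work": "undesirable", "be happy": "desirable"}


def propagate(causes, seeds):
    """Chain backwards: whatever causes a labeled state inherits its label."""
    labels = dict(seeds)
    changed = True
    while changed:
        changed = False
        for cause, effects in causes.items():
            if cause in labels:
                continue
            for effect in effects:
                if effect in labels:
                    # Simplification: the first labeled effect decides the label.
                    labels[cause] = labels[effect]
                    changed = True
                    break
    return labels


labels = propagate(CAUSES, SEEDS)
```

Events with no path to a labeled state, such as "read a book", receive no label and so contribute nothing to the sensed affect.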
Specific facts about particular things. Specific
facts like “Golden Gate Bridge is located in San Fran-
cisco”, or “a PowerBook is a kind of laptop computer”
are often useful. ARIA can reason that an e-mail that
mentions that “I saw the Golden Gate Bridge” means that
“I was in San Francisco at the time”, and proactively re-
trieves photos taken in San Francisco for the user to in-
sert into the e-mail.
Conceptual relationships. A commonsense knowl-
edgebase can be used to supply ‘conceptually related’
concepts. The Globuddy program retrieves knowledge
about the events, actions, objects, and other concepts
related to a given situation in order to make a custom
phrasebook of concepts you might wish to have transla-
tions for in a given situation.
4.1 Do Try This at Home
We invite the AI community to make use of the Open
Mind Common Sense knowledge base and associated
tools to prototype applications as we have. We hope these
application descriptions will inspire others to continue
along these lines. Please see
http://openmind.media.mit.edu/.
We also welcome feedback from those who do choose to
try this and would appreciate hearing of similar application
projects.
5 Conclusions
We think that system implementers often fail to realize
how underconstrained many user interface situations are.
In many cases, systems either do nothing or perform ac-
tions that are essentially arbitrary. These applications
show that there exists the potential to use common sense
knowledge to do something that at least might make
sense as far as the user is concerned.
A little bit of knowledge is often better than nothing.
Many applications, such as storytelling, or language
translation for tourists, can cover a broad range of sub-
jects. With such applications, it is better to know a little
bit about a lot of things than a lot about just a few things.
Many past efforts have been stymied by insisting that
coverage of the knowledge base be complete. They are
often afraid to perform inferences because of the possi-
bility of error. We rely on the interactive nature of the
interface to provide feedback to the user and the opportu-
nity for correction and completion.
Explicit input from the user is very expensive in the
interface, so common sense knowledge can act as an am-
plifier of that input, bringing in related facts and concepts
that broaden the scope of the application.
Although our descriptions of each of these projects
have been necessarily brief, we hope that the reader will
be impressed by the breadth and variety of the applica-
tions of common sense knowledge. We don’t have to wait
for complete coverage or completely reliable inference to
put this knowledge to work, although as these improve,
the applications will only get better. We think that the AI
community ought to be paying more attention to this ex-
citing area. After all, it’s only common sense.
Sidebar: Open Mind Common Sense
We built the Open Mind Common Sense (OMCS)
web site [http://openmind.media.mit.edu/] to make it easy
and fun for members of the general public to work to-
gether to construct a commonsense database. OMCS was
launched in September 2000, and as of January 2004 it
has accumulated a corpus of about 675,000 pieces of
commonsense knowledge from over 13,000 people across
the web, many with no special training in computer sci-
ence or artificial intelligence. The contributed knowledge
is expressed in natural language, and consists largely of
the kinds of simple assertions shown in Table 1.
Table 1. Sample of OMCS corpus
People live in houses.
Running is faster than walking.
A person wants to eat when hungry.
Things often found together: light bulb, contact, glass.
Coffee helps wake you up.
A bird flies.
The effect of going for a swim is getting wet.
The first thing you do when you wake up is open your eyes.
Rain falls from the sky.
Apples are not blue.
A voice is the sound of a person talking.
Rather than formulating a precise ontology in advance
and then having knowledge enterers contribute knowl-
edge expressed in terms of that ontology, we instead en-
couraged our users to provide information clearly in
English via free-form and structured templates. Indeed,
we sometimes think of OMCS not so much as a ‘knowl-
edge base’ per se, but as a corpus of commonsense
statements from which a more organized knowledge base
can be constructed using information extraction tech-
niques. In particular, we have extracted a large-scale se-
mantic network called OMCSNet [Liu and Singh, 2004]
consisting of 25 types of binary relations such as is-a,
has-function, has-subevent, and located-in. The most re-
cent version of OMCSNet contains 280,000 links relating
80,000 concepts, where the concepts are simple English
phrases like ‘go to restaurant’ or ‘shampoo bottle’.
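One way to picture a semantic network of binary relations like the one described above is as a set of (concept, relation, concept) links. The sketch below is only an illustration under that assumption: the class name `SemanticNet`, its methods, and the three example links are hypothetical and are not taken from OMCSNet or its toolkit.

```python
from collections import defaultdict


class SemanticNet:
    """Hypothetical in-memory store of (source, relation, target) links."""

    def __init__(self):
        # Maps each source concept to a list of (relation, target) pairs.
        self._links = defaultdict(list)

    def add(self, source, relation, target):
        """Record one binary relation between two English-phrase concepts."""
        self._links[source].append((relation, target))

    def related(self, source, relation=None):
        """Return targets linked from `source`, optionally filtered by relation."""
        return [
            target
            for rel, target in self._links.get(source, [])
            if relation is None or rel == relation
        ]


# Illustrative links only; these are not quoted from the OMCS corpus.
net = SemanticNet()
net.add("shampoo bottle", "located-in", "bathroom")
net.add("shampoo bottle", "is-a", "container")
net.add("go to restaurant", "has-subevent", "order food")
```

Calling `net.related("shampoo bottle", "is-a")` returns the single target stored under that relation, and a concept with no links simply yields an empty list.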
We were surprised by the high quality of the contribu-
tions, given that the OMCS site had no special mecha-
nisms for knowledge validation or correction. A manual
evaluation of the corpus revealed that about 90% of the
corpus sentences were rated 3 or higher (on a 5 point
scale) along the dimensions of truth and objectivity, and
about 85% of the corpus sentences were rated as things
anyone with a high school education or more would be
expected to know. Thus the data, while noisy, was not
entirely overwhelmed by noise, as we had originally
feared it might be, and it consisted largely of knowl-
edge one might consider shared in our culture.
The Open Mind Word Expert site
[http://www.teach-computers.org/] lets users tag the
senses of the words in individual sentences drawn from
both the OMCS corpus and the glosses of WordNet word
senses. The Open Mind 1001 Questions site
[http://www.teach-computers.org/] uses analogical rea-
soning to pose questions to the user by analogy to what it
already knows, and hence makes the user experience
more interactive and engaging. The Open Mind Experi-
ences site [http://omex.media.mit.edu/] lets users teach
stories in addition to facts by presenting them with story
templates based on Wendy Lehnert's plot units. Finally,
the latest Open Mind LifeNet site lets users directly build
probabilistic graphical models, and uses those models to
immediately make inferences based on the knowledge
that has been contributed so far.
References
[Barry & Davenport, 2003] Barry, B., and Davenport, G.
(2003). Documenting Life: Videography and Common
Sense. In Proceedings of IEEE International Conference
on Multimedia. New York: IEEE, 2003.
[Berners-Lee, Hendler, Lassila, 2004] Berners-Lee, T.,
Hendler, J., Lassila, O. The Semantic Web. Scientific
American, May 2001.
[Chklovski & Mihalcea, 2002] Chklovski, T. and R. Mi-
halcea, (2002). Building a Sense Tagged Corpus with
Open Mind Word Expert. In Proceedings of the Work-
shop on "Word Sense Disambiguation: Recent Successes
and Future Directions", ACL 2002.
[Eagle, et. al, 2003] Eagle, N., P. Singh, A. Pentland,
Common Sense Conversations: Understanding Casual
Conversation using a Common Sense Database, Artificial
Intelligence, Information Access, and Mobile Computing
Workshop at the 18th International Joint Conference on
Artificial Intelligence (IJCAI) Acapulco, Mexico. August
2003.
[Lenat, 1995] Lenat, D. B. (1995). CYC: A large-scale
investment in knowledge infrastructure. Communications
of the ACM, 38(11): 33-38.
[Lieberman & Liu, 2002a] Lieberman, H. and H. Liu,
(2002). Adaptive Linking between Text and Photos Using
Common Sense Reasoning. In Proceedings of the 2nd
International Conference on Adaptive Hypermedia and
Adaptive Web Based Systems, (AH2002) Malaga, Spain.
[Liu & Lieberman, 2002b] Liu, H. and H. Lieberman,
(2002). Robust photo retrieval using world semantics.
Proceedings of the 3rd International Conference on Lan-
guage Resources And Evaluation Workshop: Using Se-
mantics for Information Retrieval and Filtering
(LREC2002), Las Palmas, Canary Islands.
[Liu, Liberman & Selker, 2002] Liu, H., Lieberman, H.,
Selker, T. (2002). GOOSE: A Goal-Oriented Search En-
gine With Commonsense. Proceedings of the 2nd Inter-
national Conference on Adaptive Hypermedia and Adap-
tive Web Based Systems, (AH2002) Malaga, Spain.
[Liu and Maes, 2004] Liu, H. and P. Maes, (2004). What
Would They Think? A Computational Model of Atti-
tudes. International Conference on Intelligent User Inter-
faces (IUI '04), January 2004, Funchal, Portugal.
[Lieberman et al., 2001] Lieberman, H., E. Rosenzweig,
P. Singh, (2001). Aria: An Agent For Annotating And
Retrieving Images, IEEE Computer, July 2001, pp. 57-
61.
[Liu et al., 2003] Liu, H., H. Lieberman, T. Selker,
(2003). A Model of Textual Affect Sensing using Real-
World Knowledge. In Proceedings of IUI 2003. Miami,
Florida.
[Liu, Lieberman & Selker, 2002] Liu, H., H. Lieberman,
and T. Selker, (2002) Automatic Affective Feedback in
an Email Browser. MIT Media Lab Software Agents
Group Technical Report SA02-01. November, 2002.
[Liu & Singh, 2002] Liu, H., P. Singh, (2002).
MAKEBELIEVE: Using Commonsense to Generate Sto-
ries. In Proceedings of the 20th National Conference on
Artificial Intelligence, (AAAI-02), 957-958, Edmonton,
Canada
[Liu and Singh, 2004] Liu, H. and P. Singh (2004). The
Open Mind Common Sense Net Toolkit. Draft at
http://web.media.mit.edu/~hugo/publications/drafts/OMCSNet%20(CIKM).5.doc
[Maes, 1994] Maes, P. (1994). Agents that Reduce Work
and Information Overload. Communications of the ACM,
37(7).
[Minsky, 2000] Minsky, Marvin (2000). Commonsense-
based interfaces. Communications of the ACM, 43(8), 67-
73.
[Mueller, 1998] Mueller, Erik T. (1998). Natural lan-
guage processing with ThoughtTreasure. New York: Sig-
niform. Available at: http://www.signiform.com/tt/book/
[Musa, et al., 2003] Musa, R., A. Kulas, Y. Anguilette,
M. Scheidegger. (2003) Globuddy, A Broad-Context
Dynamic Phrasebook, International Conference on Mod-
eling and Using Context (CONTEXT '03), Stanford, CA.,
August 2003. Lecture Notes in Computer Science,
Springer-Verlag, Heidelberg, 2003.
[Singh, 2002] Singh, P. (2002). The public acquisition of
commonsense knowledge. In Proceedings of AAAI
Spring Symposium: Acquiring (and Using) Linguistic
(and World) Knowledge for Information Access. Palo
Alto, CA, AAAI
[Stocky, Faaborg, Lieberman, 2004] Common Sense for
Predictive Text Entry, submitted to CHI 2004. Vienna,
April 2004.
[Various Authors, 2003] Various Authors (2003). Com-
mon Sense Reasoning for Interactive Applications Pro-
jects Page.
http://www.media.mit.edu/~lieber/Teaching/Common-
Sense-Course/Projects/Projects-Intro.html.
[Wang and Cassell, 2003] Wang, A. and J. Cassell,
(2003). Co-authoring, Collaborating, Criticizing: Col-
laborative Storytelling between Real and Virtual Chil-
dren, Vienna Workshop '03: Educational Agents - More
than Virtual Tutors, Austrian Research Institute for Arti-
ficial Intelligence, Vienna, Austria, June 2003.
This is the html version of the file http://www.cs.rochester.edu/u/brown/242/assts/termprojs/phil.pdf.
Practical applications of Philosophy in Artificial Intelligence
Karim Oussayef
Among the sciences, Artificial Intelligence holds a special attraction for
philosophers. A.I. involves using computers to solve problems that seem to require
human reasoning. This includes computer programs that can beat human opponents at
games, automatically find and prove theorems, and understand natural language. Some
people in the AI field contend that programs that solve these types of problems have the
possibility of not only thinking like humans, but also understanding concepts and
becoming conscious. This viewpoint is called strong AI [1]. Many philosophers are
concerned with this bold statement and there is no shortage of arguments against the
metaphysical possibility of strong AI. If these philosophical arguments against strong AI
are true then there are limits to machine intelligence that cannot be surpassed by better
algorithms, faster computers or more clever ideas.
Hilary Putnam in his paper Much Ado About Not Very Much asks “AI may
someday teach us something about how we think, but why are we so exercised about it
now? Perhaps it is the prospect that exercises us, but why do we think now is the time to
decide what might in principle be possible?” The reason we are so exercised about
A.I. is that knowing whether true intelligence is a possibility will change the goals of
researchers in the field. If strong AI is not possible then the best we can hope for is a
program that acts humanly but doesn’t think humanly. Even this goal is very difficult,
and many programs seek to achieve it. Cycorp [2] is a company whose software attempts to
mimic human intelligence by creating a huge database of common sense facts. Their
website gives some examples: “Cyc knows that trees are usually outdoors, that once
people die they stop buying things, and that glasses of liquid should be carried right side
up.”
[1] Coined by John Searle in Minds, Brains and Programs.
[2] Information from Cycorp’s website.
To illustrate how a fact-based program such as Cycorp’s would try to solve a
simple problem let us turn to the Turing test [3]. Turing reasoned that a computer could
prove that it was artificially intelligent by fooling a person into thinking it was another
human being. His test was modeled from this reasoning: A human would type questions
to either another human or a computer (he or she wouldn’t know which) for a certain
amount of time. If that person couldn’t tell at the end of the time which of the two he or
she was talking to, the computer would pass the test (and therefore Turing reasoned, be
artificially intelligent). Let me stress that I am not arguing that the Turing test is a good
one for determining if a computer can think; I am simply using it to demonstrate how a
program might go about solving a problem. The fact-based program mentioned above
might try to answer the simple question “What is a car?” by supplying the information
that was in its code: “A car is a small vehicle with 4 wheels”. A harder question might
consist of a description of a car followed by “What am I describing?” This
could be answered by going down a tree of facts as follows: The description is of a
vehicle, search for all the objects under the vehicle topic. It has four wheels; discard the
possibility of the motorcycle. It is light; discard the possibility of the truck. Conclusion:
It must be a car.
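The elimination steps above can be sketched in a few lines of code. This is only a toy illustration, not Cycorp's method: the fact table `VEHICLES`, its attribute names, and the function `identify` are all hypothetical, and a real fact base would hold millions of entries rather than three.

```python
# Hypothetical hand-entered facts under the "vehicle" topic.
VEHICLES = {
    "car": {"wheels": 4, "weight": "light"},
    "motorcycle": {"wheels": 2, "weight": "light"},
    "truck": {"wheels": 4, "weight": "heavy"},
}


def identify(description, facts=VEHICLES):
    """Discard every candidate that contradicts a described attribute."""
    candidates = set(facts)
    for attribute, value in description.items():
        candidates = {
            name for name in candidates if facts[name].get(attribute) == value
        }
    # Sorted so the answer does not depend on set ordering.
    return sorted(candidates)


# "It has four wheels" rules out the motorcycle; "it is light" rules out the truck.
answer = identify({"wheels": 4, "weight": "light"})
```

The program never needs to know what a wheel is; it only compares stored values, which is the point the essay goes on to make about fact-based systems.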
A program like this could pass the Turing test if it were given enough data.
However it has many disadvantages. First, it requires someone to input a vast amount of
information manually. Although the program is capable of making some extensions of
the given information, it still needs millions of hard facts. Cycorp’s database has been
painstakingly entered using over 600 person-years of effort since 1984. The list of facts
now stands at 3 million (Anthes). Second, the machine doesn’t seem to work like a
human; it looks up rules and then gives an answer instead of figuring out what the
question means.
[3] Introduced by Alan Turing’s article Computing Machinery and Intelligence in 1950.
Searle’s Chinese room analogy shows why this program isn’t an example of
strong AI. Imagine an English speaking person inside of a small room. This person has
access to a large rulebook, which is written in English. Other people outside the room
can pass notes written in Chinese to him through a small hole in the wall. Although the
person inside the small room cannot speak Chinese, he uses the complex rulebook to give
back an appropriate response to the Chinese writing in Chinese. Also imagine that this
rulebook is so well written that the answers the person inside the room gives back are
indistinguishable from the answers that a native Chinese speaker might give back. This
“man in a room” system would be able to carry on a written conversation with a native
Chinese speaker on the other side of the wall. In fact the Chinese person might assume
he was speaking to another person who understands Chinese. We can plainly see
however, that the person does not.
This analogy is disastrous for fact-based AI. In the same way that the computer
passes the Turing test by fooling humans into thinking it is another human, the English
speaker can fool native Chinese speakers into thinking that he understands Chinese. To
further explain, the person inside the room is analogous to the computer CPU; they both
know how to interpret instructions. The rulebook is analogous to the program; both
supply the instructions to obtain the intended result. The computer programmed with this
fact-based knowledge does not understand English any more than the English speaker
understands Chinese. Both of them are following rules instead of understanding what is
being asked and responding based on their interpretation.
The defeat of the fact-based program poses problems for strong A.I. supporters. It
shows that any program that relies on a pre-made set of rules (no matter how complex)
cannot understand in the same way that a human mind does. In fact Searle argues: “… in
the literal sense the programmed computer understands what the car and the adding
machine understand, namely, exactly nothing” (Searle 511). However Searle’s argument
doesn’t rule out all programs. A program that learns from scratch, without the use of a
rulebook or a prefabricated fact database, can understand in the same way that a human
can. I will now go about describing such a program.
To construct the fact-based program we attempted to record facts about the world.
The learning program takes an orthogonal approach. It attempts to program the computer
to learn these facts for itself. To see how to go about this let us examine how a small
child learns. A child comes into the world knowing very little. She does not know how
to talk, walk or understand English. She goes about learning these abilities with three
tools. First she has basic goals or needs. Some of a child’s needs are food, water and
shelter. Second she can observe the world. A child can tell that when she is eating, she is
getting less hungry. Finally she can remember what has happened to her. Let me
demonstrate how these three tools allow her to learn something. Imagine that this child is
hungry. She observes that when she cries her mother brings her food. She remembers
what has happened to her and finally her need for food causes her to cry again the next
time she’s hungry. Her tools have allowed her to learn that crying results in getting food.
These three tools are the core of the learning program. However, the goals of a
computer will differ from the goals of a human. A computer has no need for food or
water so they are not appropriate goals. Instead these goals can be anything that A.I.
programmers think is important. Isaac Asimov proposed three such goals (or laws) in
his fictional stories [4]:
1. A robot may not injure a human being or, through inaction, allow a
human being to come to harm.
2. A robot must obey the orders given it by human beings, except where
such orders would conflict with the First Law.
3. A robot must protect its own existence, as long as such protection does
not conflict with the First and Second Laws.
In short a robot’s goals are human well-being, human will and its own well-being. These
goals can be implemented in the form of variables linked to actions that the computer
might perform. Whenever the computer does something that accomplishes one of its
goals it might raise the value of the variables connected with its current state or action.
Similarly it would lower the values of these action-variables when it did something
against its goals. These variables also represent the computer’s memory. This is where
the computer remembers what to do the next time it is in a similar situation. Finally the
computer needs a console, sensors or some other form of input so it can observe what is
happening around it. Let me demonstrate how it works with a simple example.
[4] First published in Runaround in 1942.
Imagine a robot equipped with a camera, a flashlight and wheels. The robot is put
in an environment and given the extra goal of reaching a certain spot. If the robot had
never been in this situation before it might have no idea of how to reach the goal, in much
the same way that the child does not know how to get food. So it might begin by doing
any number of things. Perhaps it would turn on its flashlight. This would not help it
reach its goal, so it would try something different. Maybe it starts driving towards the goal.
The robot would observe that it is accomplishing a goal, so the “driving forward” action
might get “+1 points” in the “trying to reach an object” context. Perhaps there is a wall
in front of it halfway to the spot. It runs into the wall and damages itself. This is bad for
the “well-being of self” goal, so the “driving forward” action might get “–1 points” in the
“wall in front of me” context. These point values will help it remember what to do the next
time it is trying to get from one point to another. When it sees a wall in front of it in the
future, the robot will see that “driving forward” has fewer points than, say, “driving
sideways” and might pick that option. The fact that it wants to reach its goals will teach
the robot through trial and error. Eventually it will learn how to drive around objects
(instead of into them).
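The point-keeping scheme in the robot example can be sketched as a table of scores indexed by context and action. The class and method names below are hypothetical, and the sketch makes two simplifying assumptions that the essay does not state: the robot always picks the highest-scoring action, and ties go to whichever action is listed first.

```python
from collections import defaultdict


class TrialAndErrorLearner:
    """Keeps a running point total for each (context, action) pair."""

    def __init__(self, actions):
        self.actions = list(actions)
        # Every pair starts at zero points: the robot begins knowing nothing.
        self.points = defaultdict(int)

    def record(self, context, action, reward):
        """Raise or lower the score after seeing how the action served a goal."""
        self.points[(context, action)] += reward

    def choose(self, context):
        """Pick the best-scoring action; ties go to the earliest listed action."""
        return max(self.actions, key=lambda a: self.points[(context, a)])


robot = TrialAndErrorLearner(["driving forward", "driving sideways"])
# Driving towards the spot helped, so it earns a point in that context.
robot.record("trying to reach an object", "driving forward", +1)
# Hitting the wall hurt the "well-being of self" goal.
robot.record("wall in front of me", "driving forward", -1)
```

After these two experiences the robot still prefers "driving forward" when nothing is in the way, but `robot.choose("wall in front of me")` now selects "driving sideways", which is the behavior the paragraph describes.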
I argue that a robot constructed in this fashion would actually understand how to
accomplish goals. To support this belief, let’s see if it does any better with the Chinese
room example. Remember that for the fact-based program the person inside the room is
analogous to the computer CPU and the rulebook is analogous to the program. However,
for the learning program there is no rulebook. The person inside the room is analogous to
both the CPU and the program. Instead of people asking questions and having him
answer back, imagine that the input through the slot in his room is the information he
receives from the outside world. At first he has no idea what this input means. He sends
random symbols back but after a while he notices a correlation between what he sends
out and what he gets back. He starts to write his own rulebook in his head from this
information that allows him to translate Chinese input into English. When he writes back
he translates the answers that he thought of in English back to Chinese.
The way the “learning-program person” can communicate in Chinese is
drastically different than the way the “fact-based person” does. The “learning-program
person” learns what the Chinese means by association. From his knowledge he knows
the sense of the words. Some people may point out that he does not actually think in
Chinese so he must not understand the language. However, there are many people who
converse in a non-native tongue. We cannot claim that these people’s understanding of
the world is different than our own.
Searle might respond to this learning-program by saying that the person inside the
Chinese room would simulate the entire learning process and that the learning is not
internal but external. This means that the person inside of the room is following
directions that correspond to learning but he himself is not learning. But if such a
program falls victim to the Chinese room, wouldn’t a human brain fall victim as well?
Let us imagine a modified Chinese room for the human brain. Instead of the man inside
of the Chinese room simulating a computer program, he simulates the neurons in
someone’s brain. When he receives input, he would keep track of what neurons get
excited and calculate whether or not they fire. He would know from his rulebook (a
compendium of the laws of physics, chemistry and biology that would allow him to
completely simulate the inner workings of the brain) that when certain neurons fired that
he should output an answer. The person simulating the brain doesn’t understand Chinese
any better than the one simulating a computer program. Why would one be different than
the other? Searle’s opinion is that “actual human mental phenomena might be dependent
on actual physical-chemical properties of actual human brains” (Searle 519). Penrose’s
The Emperor’s New Mind provides insight as to why this may be the case.
Penrose mentions many physical processes that are not computable. He first
examines the Mandelbrot set. The Mandelbrot set is defined by repeatedly applying a formula
to complex numbers, and it is drawn as a region of the Argand plane. Here is
where Penrose brings up an important comment: “We might think of using some
algorithm for generating the successive digits of an infinite decimal expansion, but it
turns out that only a tiny fraction of the possible decimal expansions are obtainable in this
way: the computable numbers” (Penrose 648). In other words, the exact notion of the
Mandelbrot set cannot be computed with a computer. Penrose also mentions quantum
mechanical principles. Tiny sub-atomic particles do not follow the same laws of physics
that larger objects do. The superposition principle states that a particle can be in many
different states at the same time. These states are defined by factors of complex numbers
and thus are another example of a physical law that cannot be simulated in a computer.
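For readers unfamiliar with the Mandelbrot set, the usual definition is that a complex number c belongs to the set when repeatedly applying z → z² + c, starting from z = 0, never carries z off to infinity. The sketch below is the standard escape-time check, not anything taken from Penrose; the function name `escapes` and the iteration cap are assumptions. It illustrates the limitation at issue: a finite loop can confirm that a point escapes, but it can never prove that a point stays bounded forever.

```python
def escapes(c, max_iter=100):
    """Return the step at which |z| first exceeds 2, or None if it never did.

    None means only "no escape within max_iter steps"; it is not a proof
    that c belongs to the Mandelbrot set.
    """
    z = 0j
    for step in range(1, max_iter + 1):
        z = z * z + c
        # Once |z| > 2 the orbit is known to grow without bound.
        if abs(z) > 2.0:
            return step
    return None


inside_guess = escapes(0j)      # the orbit stays at 0
outside = escapes(1 + 0j)       # the orbit runs 1, 2, 5, ...
```

Raising `max_iter` makes the approximation sharper near the boundary, but no finite cap turns the `None` answer into a certainty.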
These two examples may show why the Chinese room cannot simulate the human
brain. When the person inside of the room was following the directions for simulating a
computer the steps he took were explained by a well-defined algorithm. This is because
computers are Turing machines, a concept that was formalized elegantly by Alan Turing.
Every Turing machine can be thought of as a device that reads and writes on an
infinitely long tape. On the tape is a sequence of sections that are either blank or
marked. The device operates by moving either left or right on the tape. It can read
the current section and change it to either “marked” or “blank”. It does this by
following a finite set of instructions. This simple abstraction is enough to run any
computer program no matter how complex. It is easy to think of the human inside of the
Chinese room controlling a Turing machine.
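The tape-and-instructions description above is small enough to sketch directly. The code below is a minimal illustration, assuming a particular encoding that the essay does not specify: instructions are stored in a dictionary keyed by (state, symbol), the names `run`, `BLANK`, `MARKED` and the states "start" and "halt" are hypothetical, and a step limit stands in for the fact that a real machine might never stop.

```python
from collections import defaultdict

BLANK, MARKED = 0, 1


def run(program, tape, state="start", head=0, max_steps=1000):
    """Follow a finite instruction table over a tape of blank/marked sections.

    program maps (state, symbol) -> (symbol_to_write, "L" or "R", next_state).
    Returns the visited portion of the tape once the machine halts or the
    step limit is reached.
    """
    # Sections that were never written read as BLANK, imitating an endless tape.
    cells = defaultdict(int, enumerate(tape))
    for _ in range(max_steps):
        if state == "halt":
            break
        write, move, state = program[(state, cells[head])]
        cells[head] = write
        head += 1 if move == "R" else -1
    if not cells:
        return []
    return [cells[i] for i in range(min(cells), max(cells) + 1)]


# Example table: skip right over marked sections, mark the first blank, stop.
add_one_mark = {
    ("start", MARKED): (MARKED, "R", "start"),
    ("start", BLANK): (MARKED, "R", "halt"),
}
result = run(add_one_mark, [MARKED, MARKED, MARKED])
```

Here the table has only two instructions, and a person in the room could carry them out by hand with a pencil and a strip of paper, which is what makes the comparison with the Chinese room natural.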
The brain may, however, rely on non-algorithmic processes that the person inside
the Chinese room will not be able to follow. If, for example, neuron X would fire only
because of a certain arrangement of subatomic particles, there would be no hard-and-fast
directions for what the Chinese-room-person should do. Perhaps the next instruction has
a random chance of occurring; if so, the person will be confused and unable to complete
the instruction. It is important to find out whether the brain makes use of these processes
because if it does, it would explain why the Chinese room works for computers but not
for the human brain.
In the chapter “Where lies the physics of the mind,” Penrose argues that the brain
does indeed make use of non-computable phenomena. He contends that expressions
that deal with consciousness such as “understanding” and “judgment” and those that do
not such as “mindlessly” and “automatically” suggest a distinction between two parts of
the brain: algorithmic and non-algorithmic (Penrose 653). Penrose brings up Gödel’s
incompleteness theorem as an example of how the brain makes use of its non-algorithmic
part. Gödel encoded first order predicate calculus into normal arithmetic
using prime numbers. By breaking down F.O.P.C. in this way, he could write out
arithmetic formulas that would equate to either true or false. He used this trick to
demonstrate that there are some statements that cannot be proven or disproved. One such
sentence would be: “A computer which knows the answer to all questions will never
prove that this sentence is true.” [5] Human beings know that this sentence is true without
actually going through the process of proving it. If, however, a computer attempts to
assess the validity of the statement through a formal proof it will be confused because the
statement remains true until the proof is complete.
Penrose argues that these types of sentences, which humans can reason about,
would be impossible for a computer to understand. What Penrose doesn’t notice is that
even if some statements could not be proved or disproved using FOPC logic, there are
other ways for computers to approach these problems. There is no reason that computers
couldn’t use higher logic to solve puzzles just like a human does. Penrose’s goal of
proving strong A.I. impossible fails because he doesn’t make the link between the
non-algorithmic/non-computable physical phenomena and the human brain. If in the future
neuroscientists discovered that the brain relies on such processes then his argument
would hold more weight. Still, it would be possible for a program to simulate the
workings of the brain without simulating the actual physical processes.
In fact, computers and human brains excel at different tasks, a fact which makes
literal simulations wasteful. A computer can remember things for an infinite amount of
time (assuming the file isn’t deleted). It can also compute complicated mathematical
expressions in milliseconds. Even a human with the best eidetic memory or an
extraordinary mathematical talent couldn’t rival a computer in these tasks. On the other
hand, computers have a very hard time recognizing objects such as human faces. In dark
or light, different clothes or dyed hair, we can still recognize our best friend. Similarly
the human ability to understand language is amazing. We can utter sentences that we
have never said or heard before and understand a variety of accents and slang. These
“human algorithms”, which require almost no effort for us, are very difficult for a
computer. To throw away a computer’s advantages in mathematics, memory and many
other tasks seems a waste. Yet attempting to create a model of human neurons seems to
do exactly that. It would be better to attempt to simulate the way a human brain
solves problems rather than the actual physical processes behind human thinking.
[5] Adapted from Denton.
In this paper I have shown how various arguments against strong A.I. interact.
These arguments do not show that it is impossible but do restrict what kind of programs
can be thought of as “truly intelligent”. Searle’s Chinese room argument shows that fact-
based programs are incapable of understanding things in the same way as humans do. It
also excludes programs that have all their information hard coded in. Learning is
essential to programs that wish to support strong A.I. because information has to come
from the program, not from the programmer. Penrose has suggested that the brain is
unable to be simulated by a computer. If this is true then computers must be a simulation
of how the brain thinks, not how the brain works. Finally, Gödel’s incompleteness
theorem shows that programs must use higher reasoning to achieve their goals. Philosophy
is often criticized for being unconcerned with real world implications but in this case it
has shown the best direction for A.I. researchers to explore.
References
Books
Clancey, William J. 1997. Situated Cognition. Cambridge, UK: Cambridge University Press.
Dreyfus, Hubert. 1992. What Computers Still Can't Do: A Critique of Artificial Reason. Cambridge, MA:
MIT Press.
Kim, Jaegwon. 1998. Philosophy of Mind. Boulder Colorado: Westview Press Inc.
Penrose, Roger. 1989. The Emperor's New Mind: Concerning Computers, Minds and the Laws of Physics.
Oxford: Oxford University Press.
Russell, Stuart and Norvig, Peter. 1995. Artificial Intelligence: A Modern Approach.
Smith, Brian Cantwell. 1996. On the Origin of Objects. Cambridge, MA: MIT Press/Bradford Books.
Papers
Dennett, Daniel C. 1988. When Philosophers Encounter Artificial Intelligence. The Artificial Intelligence
Debate: False Starts, Real Foundations: 283-296.
Fodor, J.A. 1980. Searle on What Only Brains Can Do. The Nature of Mind: 520.
Fodor, J.A. 1998. After-thoughts: Yin and Yang in the Chinese Room. The Nature of Mind: 524.
LaForte, Geoffrey, Patrick J. Hayes, and Kenneth M. Ford. 1998. Why Gödel's Theorem Cannot Refute
Computationalism. Artificial Intelligence: 211-264.
McCarthy, John. 1988. Mathematical Logic in Artificial Intelligence. The Artificial Intelligence
Debate: False Starts, Real Foundations: 297-311.
Putnam, Hilary. 1988. Much Ado About Not Very Much. The Artificial Intelligence Debate: False Starts,
Real Foundations: 269-282.
Sokolowski, Robert. 1988. Natural and Artificial Intelligence. The Artificial Intelligence Debate: False
Starts, Real Foundations: 45-64.
Searle, John R. 1980. Minds, Brains and Programs. The Nature of Mind: 509-519.
Searle, John R. 1980. Author’s response. The Nature of Mind: 521-523.
Searle, John R. 1998. Yin and Yang Strike Out. The Nature of Mind: 525.
Turing, A.M. (1950). Computing machinery and intelligence. Mind, 59, 433-460.
Journals
Gary H. Anthes, Computerizing Common Sense. Computerworld. 4/8/02.
Electronic
Cycorp: Company Overview. http://www.cyc.com/overview.html
Denton, William. 2000. Gödel’s Incompleteness Theorem http://www.miskatonic.org/godel.html
This is the html version of the file http://courses.cs.vt.edu/~masc1044/slidesfolder/Ch12.pdf.
Chapter 12
The Computer Continuum
Chapter 12:
Artificial Intelligence and
Modeling the Human State
Are computers smart enough to replace people?
Artificial Intelligence and
Modeling the Human State
In this chapter:
• Does “looking intelligent” mean that intelligence is present?
• How does the human brain differ from a computer?
• How does a computer gain and retrieve knowledge as
compared to how a human gains and retrieves knowledge?
• How is it that a computer can recognize text, speech, or a
human face?
• How are computer scientists making computers “smarter?”
What is Intelligence:
Artificial or Not?
Attempts to understand intelligence:
• Plato (400 BC) - This Greek philosopher believed that
ethereal spirits were rained down from heaven and entered the
body.
• Aristotle (Plato’s student) - The heart must contain the soul
and the brain’s function was to cool the blood.
• Galen - Treated fallen gladiators with spinal cord injuries.
Noted that feeling lost in certain limbs sometimes came back.
• Galvani - Used Benjamin Franklin’s findings about static
electricity to show that static electricity stimulated the nerves
causing a frog to jump.
• Subsequently - Human nervous system found to be a complex
network of billions of neurons.
What is Intelligence:
Artificial or Not?
Does “looking intelligent” mean that intelligence is present?
• Maillardet’s Automaton (Henri Maillardet, 1805):
– Object having human form seemed to mimic the intelligence of
the human.
– Drawing machine.
• Disguised as a young boy.
• Containing levers, ratchets, cams and other mechanical
devices.
• Could draw several complex images.
– Because it had human form and could draw complex images, a
certain feeling of intelligence was ascribed to the machine.
What is Intelligence:
Artificial or Not?
Sailing vessel drawn by
Maillardet’s Automaton.
What is Intelligence:
Artificial or Not?
Alan Turing (1912 - 1954)
• Proposed a test - Turing’s
Imitation Game
– Tests the intelligence of the
computer.
• Phase 1:
– Man and woman separated
from an interrogator.
– The interrogator types in a
question to either party.
– By observing responses, the
interrogator’s goal was to
identify which was the man
and which was the woman.
[Figure: the Interrogator, the Honest Woman and the Lying Man.]
What is Intelligence:
Artificial or Not?
Phase 2 of Turing’s test:
• The man was replaced by the
computer.
• If the computer could fool the
interrogator as often as the
person did, it could be said that
the computer had displayed
intelligence.
[Figure: the Interrogator, the Honest Woman and the Computer.]
Modeling Human Intelligence
Modeling human intelligence systems:
• One way to study complex systems is to build a working
model of the system, and observe it in action.
• Two (of several) approaches to model some of the thinking
patterns of the human brain:
– Semantic networks
– Rule-based systems or Expert systems
Modeling Human Intelligence
Semantic networks are designed after the psychological model of
the human associative memory.
[Figure: semantic network describing an ownership situation.
Nodes: John, Plumber, Worker, Owner, Ford, Car, May 97, Oct 00, Time, Ownership, Situation.
Link labels: Is a, Owner, Ownee, Start-time, End-time.]
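A semantic network can be sketched as a list of (node, link, node) triples. The node and link labels below come from the figure, but which nodes each link joins is an assumption, and the node name "Own-1" is hypothetical (a label for one particular ownership situation).

```python
# Assumed reconstruction of the slide's network: the labels are from the
# figure, the exact connections are guesses made for illustration.
TRIPLES = [
    ("John", "is a", "Plumber"),
    ("Plumber", "is a", "Worker"),
    ("John", "is a", "Owner"),
    ("Ford", "is a", "Car"),
    ("Own-1", "is a", "Ownership"),        # "Own-1" is a hypothetical node
    ("Ownership", "is a", "Situation"),
    ("Own-1", "owner", "John"),
    ("Own-1", "ownee", "Ford"),
    ("Own-1", "start-time", "May 97"),
    ("Own-1", "end-time", "Oct 00"),
]


def lookup(node, link):
    """Return every node reached from `node` by one link with this label."""
    return [obj for subj, rel, obj in TRIPLES if subj == node and rel == link]


def is_a(node, category):
    """Follow 'is a' links transitively, as associative memory is said to do."""
    frontier, seen = [node], set()
    while frontier:
        current = frontier.pop()
        if current == category:
            return True
        if current not in seen:
            seen.add(current)
            frontier.extend(lookup(current, "is a"))
    return False
```

Retrieval is then a matter of following links: `is_a("John", "Worker")` walks John, Plumber, Worker, while `lookup("Own-1", "ownee")` reads a single link.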
Modeling Human Intelligence
Rule-based or Expert systems - Knowledge bases consisting of
hundreds or thousands of rules of the form:
IF (condition) THEN (action).
• Use rules to store knowledge (“rule-based”).
• The rules are usually gathered from experts in the field being
represented (“expert system”).
– Most widely used knowledge model in the commercial
world.
– IF (it is raining AND you must go outside)
– THEN (put on your raincoat)
Modeling Human Intelligence
For any of these models of the human knowledge system to work,
it must be able to make use of this human knowledge in three
different ways:
• Acquisition - Must be some way of putting information or
knowledge into the system.
• Retrieval - Must be able to find knowledge when it is wanted
or needed.
• Reasoning - Must be able to use that knowledge through
“thinking” or reasoning.
Modeling Human Intelligence
Knowledge Acquisition:
• A fact is the simplest type of knowledge that can be acquired.
– Bees sting.
• Ideas, concepts, and relationships are more difficult for
humans and machines.
– Provoking bees causes them to sting.
– What is a chair?
Modeling Human Intelligence
Knowledge Retrieval by Searching
• After knowledge has been acquired and stored in one’s
memory, it can be retrieved and used to solve problems.
• Brute-force search - Looks at every possible solution before
choosing among them.
– Hexapawn game example: The program searches through
all the possible moves and then selects the best.
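A brute-force search for Hexapawn might look like the sketch below. It assumes the usual rules (3x3 board, pawns step straight ahead onto an empty square or capture diagonally; a player wins by reaching the far row or by leaving the opponent with no legal move). The board encoding and the function names are choices made for this sketch.

```python
from functools import lru_cache

# Row-major 3x3 board: "W" moves down the board, "B" moves up, "." is empty.
START = ("W", "W", "W",
         ".", ".", ".",
         "B", "B", "B")


def moves(board, player):
    """Return every board reachable by one legal move of `player`."""
    step = 1 if player == "W" else -1
    enemy = "B" if player == "W" else "W"
    result = []
    for i, piece in enumerate(board):
        if piece != player:
            continue
        row, col = divmod(i, 3)
        new_row = row + step
        if not 0 <= new_row < 3:
            continue
        for dc in (-1, 0, 1):
            new_col = col + dc
            if not 0 <= new_col < 3:
                continue
            j = new_row * 3 + new_col
            # Straight moves need an empty square; diagonal moves must capture.
            if (dc == 0 and board[j] == ".") or (dc != 0 and board[j] == enemy):
                nxt = list(board)
                nxt[i], nxt[j] = ".", player
                result.append(tuple(nxt))
    return result


@lru_cache(maxsize=None)
def winner(board, to_move):
    """Search every line of play to the end and return the player who wins."""
    other = "B" if to_move == "W" else "W"
    if "W" in board[6:9]:          # a white pawn reached the far row
        return "W"
    if "B" in board[0:3]:          # a black pawn reached the far row
        return "B"
    options = moves(board, to_move)
    if not options:                # blocked, or no pawns left: the mover loses
        return other
    if any(winner(nxt, other) == to_move for nxt in options):
        return to_move
    return other
```

Calling `winner(START, "W")` examines the whole game tree, which is feasible only because Hexapawn is tiny.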
Modeling Human Intelligence
[Figure: Hexapawn game tree. Shows different moves (“mirror images” are not shown).]
Modeling Human Intelligence
Heuristic search - Rules of thumb, which are used to limit the
number of items that must be searched in solving a problem. (Not
guaranteed to lead to a solution.)
• Used by more complex systems such as those that diagnose
individuals that are prone to heart attacks.
• Chess game tree would have 10^120 possible moves.
– Uses rules of thumb to reduce the number of possible plays.
• Example: Examine a few plays ahead instead of all the
ways to the end of the game.
– Deep Blue (1996) by IBM - Garry Kasparov, world-champion
chess player, won over Deep Blue 4 points to 2.
– Deep Blue (1997) by IBM - Garry Kasparov conceded victory to
Deep Blue, 3.5 points to 2.5.
Modeling Human Intelligence
Reasoning with knowledge
• Humans: Reasoning is what we do when we solve problems.
• In Artificial Intelligence: Two types of reasoning are
commonly used.
– Shallow reasoning: Based on heuristics or rule-based
knowledge.
• Computers, for the most part, do shallow reasoning.
– Deep reasoning: Deals with models of the problem
obtained from analyzing the structure and function of
component parts of the problem.
• Humans commonly apply deep reasoning.
Modeling Human Intelligence
How can the knowledge base be built up so that there is sufficient
knowledge to reason with?
• Learning systems: Intelligent computer programs that are
capable of learning.
• Types of learning that are used to write intelligent programs:
– Rote learning - Memorization of facts.
– Learning by instruction - Similar to student/teacher
relationship found in classrooms.
– Learning by deduction - Drawing conclusions from certain
premises (This is a cat. All cats are animals. Therefore, this is an
animal.)
– Learning by induction - Includes subcategories: learning by
example, experimentation, observation, and by discovery.
– Learning by analogy - Combines both deductive and inductive
learning. (Being bitten by a teased dog may make an individual
not tease bees.)
Modeling Human Intelligence
Common Sense
• Problems that seemed to be most difficult, such as playing
chess, turned out to be relatively simple.
• The computer must be able to make inferences from the
knowledge base.
– Answers to problems might not be listed.
– The computer will need to come up with its own answers!
– This has been a very difficult area in Artificial
Intelligence.
• Cyc (enCYClopedia): Computer program that exhibits and can
apply common sense.
– e-Cyc: (Electronic commerce) Advanced search engine
narrows a search and gives list of meaningful subtopics.
Natural Language
Communication
The semantics (meaning of the words) is the goal in
communication with the computer.
• ELIZA (1965) by Joseph Weizenbaum of MIT
– Simulation of a therapist mirroring a patient’s
conversation in a positive manner.
– The program fooled people into believing it was a
therapist.
– In response to ELIZA’s popularity, Weizenbaum wrote a
book in 1976: Computer Power and Human Reason.
• “Science promised man power. But, as so often happens
when people are seduced by promises of power… the price
actually paid is servitude and impotence.”
Natural Language
Communication
Part of an ELIZA Conversation:
All men are alike.
IN WHAT WAY
They’re always bugging us about something or other.
CAN YOU THINK OF A SPECIFIC EXAMPLE
Well, my boyfriend made me come here.
YOUR BOYFRIEND MADE YOU COME HERE
He says I’m depressed much of the time.
I AM SORRY TO HEAR YOU ARE DEPRESSED
It’s true, I am unhappy.
DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY
I need some help, that much seems certain.
WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP
Perhaps I could learn to get along with my mother.
TELL ME MORE ABOUT YOUR FAMILY
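Replies like these can be produced by simple pattern matching, with no grasp of meaning. The sketch below is a toy in that spirit: the rules, the `REFLECT` table and the function names are invented for illustration and are not Weizenbaum's script.

```python
import re

# Hypothetical pronoun swaps so that "me" in the input becomes "you" in the reply.
REFLECT = {"i": "you", "me": "you", "my": "your", "am": "are"}

# Hypothetical rules, tried in order: (pattern, reply template).
RULES = [
    (re.compile(r"\bmy (mother|father|sister|brother)\b", re.I),
     "TELL ME MORE ABOUT YOUR FAMILY"),
    (re.compile(r"\bi(?:'m| am) (.+)", re.I),
     "I AM SORRY TO HEAR YOU ARE {0}"),
    (re.compile(r"\bmy (.+)", re.I), "YOUR {0}"),
    (re.compile(r"\balike\b", re.I), "IN WHAT WAY"),
]


def reflect(fragment):
    """Swap first-person words for second-person ones and drop end punctuation."""
    words = fragment.lower().rstrip(".!?").split()
    return " ".join(REFLECT.get(word, word) for word in words)


def respond(sentence):
    """Return the reply of the first rule whose pattern occurs in the sentence."""
    for pattern, template in RULES:
        match = pattern.search(sentence)
        if match:
            groups = [reflect(group) for group in match.groups()]
            return template.format(*groups).upper()
    return "PLEASE GO ON"
```

For instance, `respond("Well, my boyfriend made me come here.")` echoes the patient's own words back with the pronouns swapped, which is all the "understanding" such a program has.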
Natural Language
Communication
Semantic Translation Problems (Problems with
language translation).
• A classic example, called the Bar-Hillel paradox, illustrates a
difficult semantic problem:
The pen is in the box.
The box is in the pen.
– Both sentences have identical syntax structures.
– Interpretations:
• First statement: A writing instrument is in the box.
• Second statement: A box is in the playpen.
• Convinced Bar-Hillel that computer translation of languages
was impossible.
Natural Language
Communication
Early attempts at language translation:
• An early attempt to translate an English expression to Russian
and back again to English:
– Typed in English (sentence to be translated...):
• The spirit is willing, but the flesh is weak.
– Translated by the program into Russian and back into
English:
• The vodka is strong, but the meat is rotten.
Translation programs have come a long way.
• WWW translation programs
– Accuracy and interpretation still very crude.
Expert Systems
Expert systems are commercially the most successful domain in
Artificial Intelligence.
• These programs mimic the experts in whatever field.
Auto mechanic
Telephone networking
Cardiologist
Delivery routing
Organic compounds
Professional auditor
Mineral prospecting
Manufacturing
Infectious diseases
Pulmonary function
Diagnostic internal medicine
Weather forecasting
VAX computer configuration
Battlefield tactician
Engineering structural analysis
Space-station life support
Audiologist
Civil law
Expert Systems
Expert systems are also called Rule-based systems.
• Expert’s expertise is built into the program through a
collection of rules.
• The desired program functions at the same level as the human
expert.
• The rules are typically of the form:
– If (some condition) then (some action)
– Example: If (gas near empty AND going on long trip)
then (stop at gas station AND fill the gas tank AND check
the oil).
• XCON: An expert system used by Digital Equipment Corp.
to help configure the old VAX family of minicomputers.
Expert Systems
Two major parts of an expert system:
• The knowledge base: The collection of rules that make up the
expert system.
• The inference engine: A program that uses the rules by
making several passes over them.
– On each pass, the inference engine looks for all rules
whose condition is satisfied (if part).
– It then takes the action (then part) and makes another pass
over all the rules looking for matching condition.
– This goes on until no rules’ conditions are matched.
– The results are all those action parts left.
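The repeated passes described above can be sketched as a small loop. The rule format (a set of conditions plus one action) is an assumption of this sketch, and the gas-station rule from the earlier slide is split into three rules so that one action can trigger another.

```python
# Knowledge base: each rule is (set of conditions, action). Assumed format.
RULES = [
    ({"it is raining", "you must go outside"}, "put on your raincoat"),
    ({"gas near empty", "going on long trip"}, "stop at gas station"),
    ({"stop at gas station"}, "fill the gas tank"),
    ({"stop at gas station"}, "check the oil"),
]


def forward_chain(rules, facts):
    """Make passes over the rules until a pass fires no new rule."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, action in rules:
            # A rule fires when all of its conditions are already known.
            if conditions <= known and action not in known:
                known.add(action)
                changed = True
    return known
```

Starting from `{"gas near empty", "going on long trip"}`, the first pass adds "stop at gas station" and the two rules that depend on it, and the second pass finds nothing new and stops.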
Expert Systems
Inference engines can pass through the rules in
different directions:
• Forward chaining: Going from a rule’s condition to a rule’s
action and using the action as a new condition.
• Backward chaining: Goes in the other direction.
– Example: Medical doctors use both.
• Forward chaining: Going to the doctor with
symptoms (stomach pain). The doctor will come up
with a diagnosis (ulcer).
• Backward chaining: The doctor asks if patient has
been eating green apples knowing green apples cause
stomach aches.
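Backward chaining starts from the conclusion and asks what would have to be true for it to hold. In the sketch below the rule format, the second rule and the fact names are invented for illustration.

```python
# Each rule is (set of conditions, conclusion). Assumed format and contents.
RULES = [
    ({"ate green apples"}, "stomach ache"),
    ({"stomach ache", "pain persists for weeks"}, "suspect ulcer"),
]


def backward_chain(goal, rules, facts, _seen=frozenset()):
    """Return True if `goal` is a known fact or follows from rules whose
    conditions can themselves be established in the same way."""
    if goal in facts:
        return True
    if goal in _seen:              # guard against circular rules
        return False
    for conditions, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(cond, rules, facts, _seen | {goal})
            for cond in conditions
        ):
            return True
    return False
```

Asking for "stomach ache" leads the program to check for "ate green apples", much as the doctor in the example works back from the complaint to a possible cause.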
Expert Systems
Harold Cohen created an
expert system called AARON
to create art in 1973.
• AARON is a collection of
over 1,000 rules.
– Includes information
regarding human anatomy
and gravity.
• AARON is free to draw what
it may draw. It then colors the
drawings.
• A PC-version of AARON is
being prepared for mass
distribution.
Neural Networks
Neuron: Basic building-block of the brain.
• There are several specialized types, but all have the same
basic structure:
• The basic structure of an animal neuron.
Neural Networks
Artificial models of the brain are of two distinct types:
• Electronic: Has electronic circuits that act like neurons.
• Software: This version runs a program on the computer that
simulates the action of the neurons.
Neural Networks
Artificial neurons: Commonly called processing elements, they
are modeled after real neurons of humans and other animals.
• Has many inputs and one output.
– The inputs are signals that are strengthened or weakened
(weighted).
– If the sum of all the signals is strong enough, the neuron will put
out a signal to the output.
[Figure: an artificial neuron with several inputs and one output.]
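The weighted-sum behavior can be written in a few lines. The step threshold used here is one common simplification, and the weights and threshold in the usage note are made-up values.

```python
def neuron(inputs, weights, threshold):
    """Weight each input signal, sum the results, and fire (return 1)
    only if the total is strong enough."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0
```

With weights `[0.5, 0.9, 0.4]` and threshold `0.8`, the inputs `[1, 0, 1]` sum to 0.9 and the neuron fires, while `[0, 0, 1]` sums to 0.4 and it stays silent.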
Neural Networks
Neural Network: A collection of neurons which are
interconnected. The output of one connects to several others with
different strength connections.
• Initially, neural networks have no knowledge. (All
information is learned from experience using the network.)
[Figure: neural network with Input 1, Input 2, Input 3, Neuron 1 and Neuron 2, showing the output from each neuron.]
Neural Networks
Training a Neural Network
• Supervised training:
– Occurs when the neural network is given input data.
– The resulting output is compared to the correct output.
– The strengths of the connections are then modified so as
to minimize errors in succeeding input/output pairs.
• Example: Back propagation: This method of learning is
divided into two phases:
1. The inputs are applied to the network, and the outputs
compared with the correct output.
2. The resulting information about any error is fed
backwards through the network, adjusting the connection
strengths to minimize the error.
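The compare-then-adjust cycle can be shown on the smallest possible case. The sketch below is not back propagation through several layers; it swaps in the simpler perceptron learning rule for a single neuron, which has the same two phases (apply inputs and compare, then adjust the connection strengths by the error). The training data and names are chosen for illustration.

```python
def predict(weights, bias, inputs):
    """Phase 1: apply the inputs and produce the neuron's output."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(samples, rate=1, max_epochs=100):
    """Phase 2, repeated: nudge each weight in proportion to the error."""
    weights, bias = [0, 0], 0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            if error:
                mistakes += 1
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
        if mistakes == 0:          # every sample is now answered correctly
            break
    return weights, bias


# Supervised training data for logical AND: (inputs, correct output).
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
```

After `weights, bias = train(AND_DATA)`, calling `predict(weights, bias, (1, 1))` returns 1 and the other three input pairs return 0.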
Neural Networks
Neural networks in action: A case study.
• Mortgage Risk Evaluator.
– Data from several thousand mortgage applicants was used
to train a neural network.
• Credit data of each individual was paired with each
loan result.
• Patterns for successful loans and defaults of
mortgages were contained in the data.
• The neural network’s weights (measurements of
strengths) were adjusted to match the actual output.
– Now, a new mortgage applicant is entered as input. The
program determines whether they are a bad risk.
Evolutionary Systems
Alan Turing, in 1950, identified three attributes that
are the basis for what is now termed genetic
programming.
• Heredity
• Mutation
• Natural selection
• Evolution is being used to create or grow programs.
Evolutionary Systems
Genetic Algorithm (simulated evolution):
• Mimics the processes in the genetics of living systems.
• Created by John Holland (mid-1960’s) U. of Michigan.
• The human puts together the system and specifies the desired
results, but the details on how it is done are left to evolve.
• Example: Koza, a student of Holland, developed a system that
had tree-structured chromosomes.
– Using basic astronomical data, his system came up with
Kepler’s 3rd law of planetary motion.
• “the cube of a planet’s distance from the sun is
proportional to the square of its period”
• Major problem with genetic algorithms: Intimate
knowledge of the system is required.
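A minimal genetic algorithm, assuming bit-string chromosomes (not the tree-structured ones mentioned above), might look like this. The parameter values, the elitism step and the toy fitness function in the usage note are assumptions of the sketch.

```python
import random


def evolve(fitness, length=20, pop_size=30, generations=40,
           mutation_rate=0.02, seed=0):
    """Evolve bit strings toward higher fitness; return (best, history)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []                                  # best fitness per generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]     # natural selection
        children = [population[0][:]]             # keep the best one unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, length)
            child = mother[:cut] + father[cut:]   # heredity (crossover)
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]            # mutation
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

Calling `evolve(sum)` rewards strings with many 1 bits. The programmer states only the desired result (the fitness function); how the population gets there is left to evolve.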
Evolutionary Systems
Genetic Programming:
• A technique that follows Darwinian evolution.
• The evolution takes place directly on the programs in the
population that are striving to reach the goal specified by the
programmer.
– Only the goal is known and possibly some of the structure of
the solution.
Complex Adaptive Systems
Complex adaptive systems: A collection of
many parts that individually operate under
relatively simple rules and are highly
interactive in a nonlinear way.
• Their parts are self organizing, operate in
parallel, and exhibit emergent behavior (totally
unpredictable results can occur).
• The system of parts evolves with natural
selection operating.
• Example: Mound-building termite colonies in
Australia.
– Mounds can be several feet high.
– Termites follow a simple set of rules.
– Mounds affect what can grow around them.
Complex Adaptive Systems
Chaos:
• Described as a situation where things seem unorganized and
unpredictable.
• Tiny changes in the starting point produce solutions to a
problem that seem to have almost random results.
• “Butterfly effect”: A tiny flip of a butterfly’s wings could start
a hurricane.
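Sensitivity to the starting point can be seen in a one-line formula. The sketch below uses the logistic map, a standard textbook example of chaos that the slide does not mention; the starting values are arbitrary.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return the whole sequence."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ by one part in ten billion.
a = logistic_orbit(0.2, 100)
b = logistic_orbit(0.2 + 1e-10, 100)
gaps = [abs(x - y) for x, y in zip(a, b)]
```

The first entries of `gaps` are tiny, but the gap roughly doubles at each step, so within a few dozen steps the two sequences bear no resemblance to each other.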
Artificial life: (a-life)
• A phenomenon in computers that has attributes of life.
• Some argue that computer viruses are a form of a-life.
Natural Language Translation
Two distinct classes of translation software:
• One works while you are on the WWW.
– Can be a direct translation of a complete Web page or
parts of its foreign language text.
• The other is a standalone piece of software that is used to
translate files of foreign language text.
– Many are available.
• Simply Translating is a program that costs under
$50.00.
Natural Language Translation
Web-based Language Translation
• Babel Fish (Free service on Alta Vista)
– Text is cut and then pasted into a translation box.
– “Test translation” from English to Italian and back:
• The spirit is willing, but the flesh is weak.
• The spirit is arranged, but the meat is weak person.
• FreeTranslation.com
– Allows you to enter a URL and then translates it.
– Also does text entry for direct translation to and from English.
– “Test translation” from English to German and back:
• The spirit is willing, but the flesh is weak.
• The intellect is ready, but the meat is weak.
Formalizing the Semantics of Derived Words
Richmond H. Thomason
Philosophy Department
University of Michigan
Ann Arbor, MI 48109-2110
U.S.A.
March 24, 2001
Working Draft of a Paper in Progress
This is a working draft:
version of March 24, 2001.
The material is volatile; do not quote.
Comments welcome.
1. Introduction
The logical approach that has been so successful in the semantic interpretation of syntactic
structure has never produced a very satisfactory account of word meaning. This paper is
intended to promote and illustrate an approach to that problem.
I believe that this approach leads to a wider problem that brings together elements of
linguistics and philosophy in an illuminating way. But the single case study that I provide
here, while it may be suggestive, does not go far enough to make a good case for the more
general point. This paper is extracted from a larger collection of documents, and is intended
to motivate and illustrate the ideas.
But I hope that even a partially successful and fragmentary sketch of the larger project
may convince some members of my audience that the natural language semantics community
and the subgroup of the AI community interested in formalizing common sense knowledge
have a great deal in common, and much to learn from one another, and that what they have
to learn is useful and important for philosophy.
2. Logicism¹
I want to begin by situating certain problems in natural language semantics with respect to
larger trends in logicism, including:
(i) Attempts by positivist philosophers earlier in this century to provide a logical
basis for the physical sciences;
(ii) Attempts by linguists and logicians to develop a “natural language ontology”
(and, presumably, a logical language that is related to this ontology by
formally explicit rules) that would serve as a framework for natural language
semantics;
(iii) Attempts in artificial intelligence to formalize common sense knowledge.
Frege did a lot for logic, but I think he left us with an undeservedly narrow and unpromising
version of logicism that is entirely too focused on the subject matter of mathematics and
the analytic tool of definition.
Let X be a topic of inquiry. X logicism is the view that X should be presented as an
axiomatic theory from which the rest can be deduced by logic. Science logicism is expressed
as an ideal in Aristotle’s Organon. But Aristotle’s logic is far too weak to serve as a means of
representing Aristotelian science, and logicism remained impracticable until the 17th century,
when a separation of theoretical science from common sense simplified the task of designing
an underlying logic.²
There is a moral here about logicism. X logicism imposes a program: the project of
actually presenting X in the required form. But for the project to be feasible, we have to
choose a logic that is adequate to the demands of the topic. If a logic must involve explicit
formal patterns of valid reasoning, the central problem for X logicism is then to articulate
formal patterns that will be adequate for formalizing X.
¹ The material in this and the subsequent section is lifted in part from [Thomason, 1991].
² Despite the simplification, of course, a workable formalism did not begin to emerge until the 19th century.
The fact that very little progress was made for over two millennia on a problem that
can be made to seem urgent to anyone who has studied Aristotle indicates the difficulty of
finding the right match of topic and formal principles of reasoning. Though some philosophers
(Leibniz, for one) saw the problem clearly, the first instance of a full solution is Frege’s choice
of mathematical analysis as the topic, and his development of the Begriffsschrift as the logical
vehicle. It is a large part of Frege’s achievement to have discovered a choice that yields a
logicist project that is neither impossible nor easy.
I will summarize some morals. (1) Successful logicism requires a combination of a formally
presented logic and a topic that can be formalized so that its inferences become logical
consequences. (2) When logicist projects fail, we may need to seek ways to develop the logic.
(3) Logic development can be difficult and protracted.
3. Extensions to the empirical world
The project of extending Frege’s achievement to the empirical sciences has not fared so well.
Of course, the mathematical parts of sciences such as physics can be formalized in much the
same way as mathematics. Though the metamathematical payoffs of formalization are most
apparent in mathematics, they can occasionally be extended to other sciences.³ But what of
the empirical character of sciences like physics? One wants to relate the systems described
by these sciences to observations.
Rudolf Carnap’s Aufbau⁴ was an explicit and ambitious attempt to extend mathematics
logicism to science logicism, by providing a basis for formalizing the empirical sciences. The
Aufbau begins by postulating elementary units of subjective experience, and attempts to
build the physical world from these primitives in a way that is modeled on the constructions
used in Frege’s mathematics logicism.
Carnap believed strongly in progress in philosophy through cooperative research. In this
sense, and certainly compared with Frege’s achievement, the Aufbau was a failure. Nelson
Goodman, one of the few philosophers who attempted to build on the Aufbau, calls it “a
crystallization of much that is widely regarded as worst in 20th century philosophy.”⁵
After the Aufbau, the philosophical development of logicism becomes somewhat fragmented.
The reason for this may have been a general recognition, in the relatively small
community of philosophers who saw this as a strategically important line of research, that
the underlying logic stood in need of fairly drastic revisions.⁶
This fragmentation emerges in Carnap’s later work, as in the research of many other
logically minded philosophers. Deciding after the Aufbau to take a more direct, high-level
approach to the physical world, in which it was unnecessary to construct it from phenomenal
primitives, Carnap noticed that many observation predicates, used not only in the sciences
but in common sense, are “dispositional”—they express expectations about how things will
behave under certain conditions. A malleable material will deform under relatively light
pressure; a flammable material will burn when heated sufficiently. It is natural to use the
word ‘if’ in defining such predicates; but the “material conditional” of Frege’s logic gives
³ See [Montague, 1962].
⁴ [Carnap, 1928].
⁵ [Goodman, 1963], page 545.
⁶ I can vouch for this as far as I am concerned.
incorrect results in formalizing such definitions. Much of [Carnap, 1936–1937] is devoted to
presenting and examining this problem.
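To see why the material conditional misfires here, it may help to work a small case. The sketch below assumes the naive definition "x is flammable if and only if (if x is heated sufficiently, then x burns)", read truth-functionally; the sample objects and their histories are invented for illustration.

```python
def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q


# Hypothetical objects, each with a record of whether it was ever heated
# sufficiently and whether it burned.
samples = {
    "match":   {"heated": True,  "burns": True},
    "wet log": {"heated": True,  "burns": False},
    "granite": {"heated": False, "burns": False},
}

# Naive definition: flammable(x) iff (heated(x) -> burns(x)).
flammable = {name: implies(s["heated"], s["burns"])
             for name, s in samples.items()}
```

The granite block was never heated, so the conditional is vacuously true and the definition wrongly counts it as flammable. This is the kind of incorrect result the text attributes to formalizing dispositional predicates with the material conditional.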
Rather than devising an extension of Frege’s logic capable of solving this problem, Carnap
suggests dropping the requirement that these predicates should be explicated by definitions.
This relaxation makes it harder to carry out the logicist program, because a natural way
of formalizing dispositionals is forfeited. But it also postpones a difficult logical problem,
which was not, I think, solved adequately even by later conditional logics in [Stalnaker and
Thomason, 1970] and [Lewis, 1973]. Such theories do not capture the notion of normality
that is built into dispositionals: a more accurate definition of ‘flammable’, for instance, is
‘what will normally burn when heated sufficiently’. Thus, logical constructions that deal
with normality offer some hope of a solution to Carnap’s problem of defining dispositionals.
Such constructions have only become available with the development of nonmonotonic logics.
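The role of normality can be illustrated with a toy default rule. This is only a sketch of the nonmonotonic pattern, not an implementation of any particular nonmonotonic logic; the fact encoding and the predicate names are assumptions.

```python
def burns_when_heated(facts, thing):
    """Default rule: a flammable thing that is heated burns,
    unless it is known to be abnormal in this respect."""
    flammable = ("flammable", thing) in facts
    heated = ("heated", thing) in facts
    abnormal = ("abnormal", thing) in facts
    return flammable and heated and not abnormal


facts = {("flammable", "paper"), ("heated", "paper")}
before = burns_when_heated(facts, "paper")

# Learning more (say, that the paper is soaked) withdraws the conclusion.
more_facts = facts | {("abnormal", "paper")}
after = burns_when_heated(more_facts, "paper")
```

Here `before` is true and `after` is false: adding a premise removes a conclusion, which cannot happen in a monotonic logic such as Frege's.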
Although the logicist program has turned into a number of disparate logicist projects,
from around 1970 on we have seen steady, cumulative progress on these projects. Most of
this progress has been made not by philosophers, but by linguists and computer scientists;
large-scale formalization projects and the development of logics appropriate for them are
now far more common in these other fields than in philosophy. Works like [Dowty, 1979] and
[Link, 1983] (by linguists) and [Davis, 1991] (by a computer scientist) illustrate the point.
The logical tools that are currently used by philosophers in thinking about philosophical
problems are over thirty years old. In fact, except for a relatively narrow group of specialists,
the philosophical community remains unaware of the newer developments and their relevance
to philosophy.
This project is meant to illustrate what can be done to illuminate a historically important
problem using methods from nonmonotonic logic (a contribution from computer science) and
the theory of eventuality structure (a contribution from linguistics). It relies heavily on work
of Mark Steedman, who works in both linguistics and computer science.⁷
It can also be seen as part of a linguistic project concerning the meanings of complex or
derived words.
4. Linguistic logicism
In linguistics, a clear logicist tradition emerged from the work of Richard Montague, a
philosopher who (building to a large extent on Carnap’s work in [Carnap, 1956]) developed
a logic he presented as appropriate for philosophy logicism.
Montague motivates his logical framework in [Montague, 1969] with a problem in the
semantics of derived words: the need to relate empirical predicates like ‘red’ to their
nominalizations, like ‘redness’. He argued that many such nominalizations denote properties, that
terms like ‘event’, ‘obligation’, and ‘pain’ denote properties of properties, and that properties
should be treated as functions taking possible worlds into extensions. The justification
of this formal ontology, and of the logical framework that goes with it, consists in its ability
to formalize certain sentences in a way that allows their inferential relations with other
sentences to be captured by the underlying logic.
Philosophers other than Montague—not only Frege, but Carnap in [Carnap, 1956] and
⁷ See [Steedman, 1998].
Church in [Church, 1951]—had resorted informally to this methodology. But Montague was
the first to see the task of natural language logicism as a formal challenge. By actually
formalizing the syntax of a natural language, the relation between the natural language
and the logical framework could be made explicit, and systematically tested for accuracy.
Montague developed such formalizations of several ambitious fragments of English syntax in
several papers, of which [Montague, 1973] was the most influential.
The impact of this work has been more extensive in linguistics than in philosophy. Formal
theories of syntax were well developed in the early 1970s, and linguists were used to using
semantic arguments to support syntactic conclusions, but there was no theory of semantics
to match the informal arguments. “Montague grammar” quickly became a paradigm for
some linguists, and Montague’s ideas and methodology have influenced the semantic work
of all the subsequent approaches that take formal theories seriously.
As practiced by linguistic semanticists, language logicism would attempt to formalize
a logical theory capable of providing translations for natural language sentences so that
sentences will entail one another if and only if the translation of the entailed sentence follows
logically from the translation of the entailing sentence and a set of “meaning postulates”
of the semantic theory. It is usually considered appropriate to provide a model-theoretic
account of the primitives that appear in the meaning postulates.
This methodology gives rise naturally to the idea of “natural language metaphysics,”
which tries to model the high-level knowledge that is involved in analyzing systematic
relations between linguistic expressions. For instance, the pattern relating the transitive verb
‘bend’ to the adjective ‘bendable’ is a common one that is productive not only in English
but in many languages. So a system for generating derived lexical meanings should include
an operator able that would take the meaning of ‘bend’ into the meaning of ‘bendable’.
To provide a theory of the system of lexical operators and to explain logical interactions
(for instance, to derive the relationship between ‘bendable’ and ‘deformable’ from the
relationship between ‘bend’ and ‘deform’), it is important to provide a model theory of the
lexical operators. So, for instance, this approach to lexical semantics leads naturally to a
model-theoretic investigation of ability,⁸ a project that is also suggested by a natural train
of thought in logicist AI.⁹
Theories of natural language meaning that, like Montague’s, grew out of theories of
mathematical language, are well suited to dealing with quantificational expressions, as in
(4.1) Every boy gave two books to some girl,
In practice, despite the original motivation of his theory in the semantics of word formation,
Montague devoted most of his attention to the problems of quantification, and its interaction
with the intensional and higher-order apparatus of his logical framework.
But some of those who developed Montague’s framework turned their attention to lexical
problems, and a body of the later research in Montague semantics (especially David Dowty’s
early work in [Dowty, 1979] and the work that derives from it) concentrates on semantic
problems of word formation, which of course is an important part of lexical semantics.¹⁰
⁸ That the core concept that needs to be clarified here is ability rather than the bare conditional ‘if’ is
suggested by cases like ‘drinkable’. ‘This water is drinkable’ doesn’t mean ‘If you drink this water it will
have been consumed’. (Of course, ability and the conditional are related in deep ways.) I will return briefly
to the general problem of ability in Section 7.5, below.
⁹ See, for example, [Shoham, 1993].
5. Formalizing common sense
Due to the influence of John McCarthy, a group of common sense logicists has emerged
among the logically minded members of the Artificial Intelligence community. McCarthy’s
views have been strongly and consistently expressed in a series of papers beginning in 1959.¹¹
The idea is that we will not know how to build algorithms that express intelligent behavior
until we have an explicit theory of the core phenomena of intelligent thought; and the term
‘common sense’ is merely a way of indicating the phenomena in question. In practice, the
research of the AI logicists is integrated with much less ambitious formalization tasks having
to do with specialized sorts of reasoning such as planning and temporal reasoning. But
formalizing common sense remains as an important high-level goal for most of us.
To a certain extent, the motives of the common sense logicists overlap with Carnap’s
reasons for the Aufbau. The idea is that the theoretical component of science is only part of
the overall scientific project, which involves situating science in the world of experience to
explain the reasoning that goes into the testing and application of theories; see [McCarthy,
1984] for explicit motivation of this sort. For extended projects in the formalization of
common sense reasoning, see [Hobbs and Moore, 1988] and [Davis, 1991].
The project of developing a broadly successful logic-based account of semantic interre-
lationships among the lexical items of a natural language is roughly comparable in scope
with the project of developing a high-level theory of common sense knowledge. Linguists
are mainly interested in explanations, and computer scientists are (ultimately, at any rate)
interested in implementations. But for logicist computer scientists who have followed Mc-
Carthy’s advice of seeking understanding before implementing, the immediate goals of the
linguistic and AI projects are not that different.
And, at the outset at least, the subject matters of the linguistic and the computational
enterprises are remarkably similar. The linguistic research motivated by lexical decomposition
beginning in [Dowty, 1979] and the computational research motivated largely by problems
in planning (or practical reasoning) both lead naturally to a focus on the problems of repre-
senting change, causal notions, and ability.
¹⁰ This emphasis on compositionality in the interpretation of lexical items is similar to the policy that
Montague advocated in syntax, and it has a similar effect of shifting attention from representing the content
of individual lexical items to operators on types of contents. But this research program seems to require
a much deeper investigation of “natural language metaphysics” or “common sense knowledge” than the
syntactic program, and one can hope that it will build bridges between the more or less pure logic with
which Montague worked and a system that may be more genuinely helpful in applications that involve
representation of and reasoning with linguistic meaning.
¹¹ See the papers collected in [Lifschitz, 1990].
6. Formalizing nonmonotonic reasoning
See [Ginsberg, 1987] for a good guide to the field of nonmonotonic reasoning and its early
development. For subsequent developments, some good book-length treatments have become
available, including [Antoniou, 1997, Brewka et al., 1997, Schlechta, 1997]. Also see the
relevant chapters of [Gabbay et al., 1994].
Among the available theories of defeasible reasoning, I find circumscription the most
congenial for application to problems of natural language semantics, and to lexical
semantics in particular, for the following reasons.
– Circumscription is relatively conservative from a logical point of view. For
instance, its language is simply the language of classical first-order or
higher-order logic, and the local semantics of expressions—their
satisfaction conditions in a model—are left unchanged. This makes it
relatively easy to use circumscription as a development tool.
– It is a straightforward matter to convert Montague’s formalism into a
circumscriptive theory.
– The more sophisticated versions of circumscription provide an explicit formalism
for dealing with abnormalities.¹² I believe that such a formalism is
needed in the linguistic applications.
This version of the paper is designed to be understandable without going into technicalities.
In particular, to understand the ideas behind circumscription, readers need only
know the following things.
1. A number of abnormality predicates are introduced into the language.
2. In defining logical consequence, attention is restricted to models in which
the abnormalities are simultaneously minimized, while certain terms (the
ones that are deemed independent of the abnormalities) are held constant,
and certain other terms are allowed to vary.
3. This has the effect of taking only certain “preferred models” into account.
A theory Γ circumscriptively implies a consequence A if A is true in all the
preferred models of Γ.
4. These preferences can be constrained by an explicit “abnormality theory”
using the predicates.
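The four points above can be illustrated with a minimal propositional sketch in Python. This is an illustration only, not the first-order or higher-order formalism assumed in the paper: models are plain truth assignments, and the atom names (put_in, saturated, ab, dissolve) and helper functions are hypothetical. A model counts as preferred when its set of true abnormality atoms is subset-minimal among the models that agree with it on the fixed atoms.

```python
from itertools import product

# Hypothetical atoms for a toy, propositional version of the solubility example.
ATOMS = ("put_in", "saturated", "ab", "dissolve")

def models(theory):
    """Return every truth assignment (a dict) that satisfies all formulas in theory."""
    result = []
    for values in product([False, True], repeat=len(ATOMS)):
        m = dict(zip(ATOMS, values))
        if all(formula(m) for formula in theory):
            result.append(m)
    return result

def preferred(theory, minimized=("ab",), fixed=()):
    """Keep the models whose true minimized atoms are subset-minimal
    among the models that agree with them on the fixed atoms."""
    ms = models(theory)

    def ab_set(m):
        return {a for a in minimized if m[a]}

    def better(n, m):
        return all(n[a] == m[a] for a in fixed) and ab_set(n) < ab_set(m)

    return [m for m in ms if not any(better(n, m) for n in ms)]

def circ_implies(theory, conclusion, **kw):
    """A is circumscriptively implied if it holds in every preferred model."""
    return all(conclusion(m) for m in preferred(theory, **kw))

# Default rule: put_in and not ab -> dissolve.
def rule(m):
    return (not (m["put_in"] and not m["ab"])) or m["dissolve"]

# Abnormality theory: put_in and saturated -> ab.
def ab_axiom(m):
    return (not (m["put_in"] and m["saturated"])) or m["ab"]

def fact_put(m):
    return m["put_in"]

def fact_saturated(m):
    return m["saturated"]

def dissolves(m):
    return m["dissolve"]

gamma = [rule, ab_axiom, fact_put]
default_conclusion = circ_implies(gamma, dissolves)
after_new_fact = circ_implies(gamma + [fact_saturated], dissolves)
classical = all(dissolves(m) for m in models(gamma))
```

In this toy theory the conclusion that the salt dissolves holds in the one preferred model, although it is not a classical consequence. Adding the fact that the water is saturated withdraws the conclusion, which is the nonmonotonic behavior the list describes.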
7. Thesis
The following is an appropriate and illuminating logicist project.
To use a nonmonotonic version of Montague’s Intensional Logic, combined with
specialized domains dealing with eventuality types, plurals, and mass nouns, as the
means of formalizing the logical relations between the meanings of semantically
related words.
I try to make a case for this idea by illustrating it with several case studies. This version
of the paper will contain only one such study. But readers familiar with lexical semantics
should be able to see that the techniques can readily be generalized to other cases.
¹² See [Lifschitz, 1988].
8. Case studies
The first case study (and the only one presented in this abbreviated version) has to do with
words involving the suffix ‘able’.
8.1. The -able suffix
The -able suffix illustrates a number of characteristics that challenge semantics.
1. There is variation in the meanings it assumes, but this variation is across a
family of closely related shades of meaning. As usual in these cases, it is
hard to tell whether to treat the variation by listing senses, by finding a
single common meaning allowing for different uses, or by making the
meaning context-dependent.
2. The meanings themselves are difficult to formalize.
3. These meanings seem to invoke references to concepts via relations of
common-sense real world knowledge rather than linguistic knowledge.
4. There are exceptional patterns.
8.1.1. Sense 1 of able: the ability to perform actions
The most usual pattern of Verb+able involves transitive verbs V that are broadly telic.
Such verbs have three characteristics: they correspond to procedures that are in the normal
repertoire of actions of human agents, there are normal or standard ways of initiating these
actions, and there is a successful end state associated with the performance of the actions.
In what I will call the paradigmatic case, the meaning of the derived adjectival form is that
a thing normally will achieve the state s successfully when a test action associated with V
is applied to it. The term ‘successful’ is deliberately used here to cover both the cases in
which the state is really achieved, and in which the state is achieved without undesirable
side effects. (This last condition, we can see, can shade into cases in which there are not
only no undesirable side effects, but in which the state is worthy of being achieved.)
Here are some examples illustrating this paradigmatic case. (Warning, some of these
cases are ambiguous, and also fall under other cases.)
acceptable     dispensable     observable
adjustable     doable          openable
admissible     driveable       provable
adoptable      expendable      printable
applicable     expressible     readable
approachable   fixable         recognizable
bearable       flexible        reusable
believable     formalizable    reversible
breakable      imaginable      solvable
cleanable      implementable   TeXable
communicable   learnable       trainable
consumable     liftable        transferable
defeatable     loveable        transportable
defensible     modifiable      wearable
detectable     moveable        withdrawable
8.1.2. Carnap’s problem: defining ‘soluble’
The natural way to define ‘x is water-soluble’ is this.
(8.2) If x were put in some water, then x would dissolve in the water.
So at first glance, it may seem that the resources for carrying out the definition that Carnap
found problematic will be available in a logic with a subjunctive conditional. We have had
such logics, based on the apparatus of modal logic, since around 1970; see [Stalnaker and
Thomason, 1970, Lewis, 1973].
But the fact that these conditionals conform to the rule of modus ponens makes them
unsuitable for this purpose. Suppose it happens to be true that if one were to put this lump
of salt in some water, it would be in this water—and this water is already saturated with
salt. The fact that the lump would not then dissolve is no reason why this salt should count
as not water-soluble.
This and other such thought experiments indicate that what is wanted is not the tra-
ditional subjunctive conditional, but a “conditional normality” of the sort that is used in
deontic logics. Such conditionals can also be integrated into a nonmonotonic formalism.¹³
In a circumscriptive framework, we do not introduce a special conditional, but formulate
normalcy constraints using truth functional logic with abnormality predicates. Thus, (8.2)
becomes something like this:
(8.3) ∀x, y, t[[Water(y) ∧ Put-in(x, y, t) ∧ ¬Ab(x, y, t)]
→ Dissolve(x, y, t)].
The abnormality predicate in the antecedent of this formulation removes difficulties that
arise from the defeasibility of the generalization captured by ‘soluble’. We can formulate a
theory of abnormalities by explicitly adding an axiom to the effect that for all quantities x
of salt, Ab(x, y, t) holds for any quantity of water y that is saturated with salt at t. We can
add other conditions of this sort as they occur to us.¹⁴ The fact that we are circumscribing
the predicate Ab will make constraint (8.3) apply as a default in the nonexceptional cases.
¹³ See [Boutilier, 1992], [Asher and Morreau, 1991].
This repairs one problem in (8.2); (8.3) is able to deal with counterexamples relating to
the defeasibility of the causal relationship indicated by this sense of -able. But there are two
other problems from which (8.3) suffers.
(i) Vacuous cases. It follows from (8.3) that a lump of iron that is never put
in water is water-soluble.
(ii) Delayed effects. According to (8.3), a lump of salt that is put in water at
t will normally dissolve at t. This is never true. There is always a delay in
the effect. But it would be hopeless to find a formula that would predict
the delay.
To my knowledge, Problem (i) was first noticed by Nelson Goodman in connection with
the problem of conditionals (see [Goodman, 1947]). Problem (ii), which on reflection appears
to be even more difficult, has hardly been mentioned in the philosophical literature.
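Both problems can be seen in a small Python sketch that evaluates the body of (8.3) over a finite domain. The sketch assumes that the body of (8.3) is read as the defining condition for ‘x is water-soluble’. The function name, the sample substances and the set-of-triples encoding of the predicates are hypothetical choices made for illustration.

```python
def soluble_by_8_3(x, waters, times, put_in, ab, dissolve):
    """Evaluate the body of (8.3) for a fixed x over finite sets of waters and times.

    put_in, ab and dissolve are sets of (x, y, t) triples giving the
    extensions of Put-in, Ab and Dissolve.
    """
    return all(
        (not ((x, y, t) in put_in and (x, y, t) not in ab))
        or (x, y, t) in dissolve
        for y in waters
        for t in times
    )

waters, times = {"w1"}, {0, 1}
put_in = {("salt", "w1", 0), ("sand", "w1", 0)}
ab = set()

# Idealized case: the salt dissolves at the very time it is put in.
dissolve_instant = {("salt", "w1", 0)}
salt_ok = soluble_by_8_3("salt", waters, times, put_in, ab, dissolve_instant)
sand_ok = soluble_by_8_3("sand", waters, times, put_in, ab, dissolve_instant)

# Problem (i): iron is never put in water, so the condition holds vacuously.
iron_ok = soluble_by_8_3("iron", waters, times, put_in, ab, dissolve_instant)

# Problem (ii): if dissolving is only recorded one step later, salt fails the test.
dissolve_delayed = {("salt", "w1", 1)}
salt_delayed_ok = soluble_by_8_3("salt", waters, times, put_in, ab, dissolve_delayed)
```

Under these assumptions the untested iron counts as soluble, while salt whose dissolving is recorded at a later time than the putting-in does not.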
The first problem would have been solved by using a deontic conditional operator rather
than circumscription, as follows.
∀x, y, t[[Water(y) ∧ Put-in(x, y, t)Ab(x, y, t)] →Dissolve(x, y, t)].
The conditional shifts attention to worlds and times at which x has successfully been put
in water, and in which things go normally. With abnormality predicates, we need to ensure
separately that everything that can be tested in the appropriate way is somehow tested in
the appropriate way. We can do this by using a modal operator □ which ranges over the
worlds obtainable by the performance of suitable test actions. This idea yields the following
reformulation.
(8.4) ∀x, y, t □[[Water(y) ∧ Put-in(x, y, t) ∧ ¬Ab(x, y, t)]
→ Dissolve(x, y, t)].
Problem (ii) remains. The use of times in (8.2)–(8.4) is the source of the problem. Times
are fine in reasoning domains where quantitative measurements of change are appropriate,
but here they introduce a level of detail that is distracting.
However, we do need some sort of universal quantifier in formulating these constraints—
we wish to say that whenever x is put into water, x dissolves. We could begin by saying that
this means that x dissolves in any case in which x is put in water.
But this is either too vague for comfort, or it makes (8.4) false. We put a lump of salt
in ordinary water—let’s call this a case. We can now see the salt dissolving. Is this the
same case, or another? We wait, and now the salt is dissolved. Is this too a different case?
The truth of (8.4), with t construed as a quantifier over cases, depends crucially on how we
individuate cases in this example. But we do not have very robust intuitions about cases, and
in particular the term ‘case’ doesn’t provide any help about how we should perform this
individuation.
¹⁴ There is a division of labor here; these conditions belong to the abnormality theory. The abnormality
theory is not part of the definition of soluble, but it contributes to the adequacy of the definition.
But the exercise we have just gone through makes it clear that the “cases” we are considering
in this example are happenings, and we do have good intuitions about these. Thinking
of the quantifier in (8.4) as ranging over happenings, or (to use a technical term) eventualities,
we are able to make progress on Problem (ii).
This is actually a sign that we are on the right track, since years of work in natural
language semantics have made a very plausible case for the importance of eventualities and
their structure in the ontology that is needed for semantics, and especially for the semantics
of words.¹⁵
Here, we are interested in eventualities that exhibit a typical structure; they consist of an
inception, a body, and a culmination. The inception is usually an action, and may itself be
an eventuality with this same three-part structure; in our example, this is putting something
in a quantity of water. The body is usually a process, often one that can be measured in
some way which tracks stages in which the culmination is reached. The culmination is a
state; in our example, the state of the salt’s being dissolved. We will call such a three-part
happening a telic eventuality.
Our final definition of ‘water-soluble’ is obtained by using eventualities in place of times.¹⁶
(8.5) ∀x[Water-Soluble(x) ↔
        ∀e₁∀y[[Put-in(e₁) ∧ movee(e₁) = x ∧ container(e₁) = y ∧ Water(y)] →
          ∃e[Dissolving(e) ∧ dissolvee(e) = x ∧ medium(e) = y]
          ∧ ¬Ab(e)] →
        ∃e₂[culmination(e) = e₂ ∧ Dissolved(e₂)
          ∧ dissolvee(e₂) = x ∧ medium(e₂) = y]].
In words: x is water-soluble if and only if necessarily if an event e₁ of putting x in a
quantity of water occurs then e₁ is the inception of a dissolving eventuality e involving the
same x and quantity of water, which (unless something is abnormal about e) will culminate
in a state in which x is dissolved.
This is not only a definition, but it appears to solve Carnap’s problem of defining ‘soluble’.
Such definitions can be simplified considerably by refining the definition of a telic eventuality,
and by appealing to general properties of these eventualities.
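As a rough illustration of the three-part structure, the following Python sketch represents eventualities as records with an inception and a culmination, and checks the condition stated in the prose gloss of (8.5) over a finite list of eventualities. It is a sketch under stated assumptions, not the definition itself: the class and field names are hypothetical, the link between a dissolving eventuality and its inception is taken from the gloss, and the modal force of ‘necessarily’ is not modelled, so untested substances still pass vacuously.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Eventuality:
    kind: str                                   # "put-in", "dissolving" or "dissolved"
    theme: str                                  # the thing moved or dissolved
    medium: str                                 # the container or solvent
    inception: Optional["Eventuality"] = None   # the initiating eventuality, if any
    culmination: Optional["Eventuality"] = None # the resulting state, if any
    abnormal: bool = False                      # stands in for Ab(e)

def water_soluble(x, events, waters):
    """Check the gloss of (8.5) against a finite record of eventualities."""
    for e1 in events:
        if e1.kind != "put-in" or e1.theme != x or e1.medium not in waters:
            continue
        # Look for a dissolving eventuality begun by e1 with the same x and water.
        candidates = [
            e for e in events
            if e.kind == "dissolving" and e.inception == e1
            and e.theme == x and e.medium == e1.medium
        ]
        # Unless abnormal, the dissolving must culminate in x being dissolved.
        ok = any(
            e.abnormal
            or (
                e.culmination is not None
                and e.culmination.kind == "dissolved"
                and e.culmination.theme == x
                and e.culmination.medium == e1.medium
            )
            for e in candidates
        )
        if not ok:
            return False
    return True

put_salt = Eventuality("put-in", "salt", "w1")
salt_gone = Eventuality("dissolved", "salt", "w1")
dissolving = Eventuality("dissolving", "salt", "w1",
                         inception=put_salt, culmination=salt_gone)

put_sat = Eventuality("put-in", "salt", "w2")
stalled = Eventuality("dissolving", "salt", "w2", inception=put_sat, abnormal=True)

put_sand = Eventuality("put-in", "sand", "w1")

record = [put_salt, dissolving, salt_gone, put_sat, stalled, put_sand]
salt_soluble = water_soluble("salt", record, {"w1", "w2"})
sand_soluble = water_soluble("sand", record, {"w1", "w2"})
```

Because the delay between inception and culmination is carried by the structure of the eventuality rather than by time indices, no time argument is needed, and the saturated-water case is absorbed by the abnormality flag.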
The other common senses of X-able include: “If an appropriate test is performed then
a result X that is not undesirable will normally be achieved” and “If an appropriate test
is performed then a result X that is desirable will normally be achieved.” These cases are
illustrated by drinkable and despicable. There are many more or less idiosyncratic cases that
do not fit any of these patterns, such as palatable, comfortable, and reasonable. There are
many suppletive cases that do not appear to be derived at all, such as capable and liable.
We have to be prepared for such exceptions in lexical semantics.
Those who know a little modal logic are likely to think that modal auxiliaries like can
are formalized with the modal operator ◇. This creates a pleasant analogy between the
linguistic expression of universal and existential quantification on the one hand, and that of
necessity and possibility on the other.
¹⁵ [Dowty, 1979] is one of the classic sources for this topic.
¹⁶ This formula uses more or less standard formalization techniques in event-centered semantics, where
something like ∃e[Push(e) ∧ Past(e) ∧ Pusher(e) = Charlie ∧ Pushee(e) = Piano₄₃] is used to represent
Charlie pushed the piano.
The modal can and the suffix -able are not the same, but in one important respect they
are alike: they are more like causal conditionals than like possibility operators. This
point is linked to another tradition that, like many of the ideas presented here, goes back to
Aristotle’s account of change in the common sense world. For more on the modal issue, see
[Cross, 1986].
9. Conclusion
I have said that this is part of a larger project. To get a sense of how the thesis articulated
in Section 7 fares, it will be necessary to investigate a number of cases in considerable detail.
I have developed partial studies of the following cases.
1. Some causal constructions.
2. Agency.
3. Some denominal verbs.
4. The -er of normal function, as in fastener.
And I have begun a separate study of compound nominals (such as water meter cover ad-
justment screw), which intersects with many of the issues described here.
However the thesis itself fares, I want to recommend projects that seek to develop logical
theories of word meaning to all formally-minded people interested in linguistic meaning. In
the end, we will obtain a much better understanding of the common sense world and how
it is reflected in language and reasoning through cooperative work that uses the best ideas
of linguistics, computer science, and philosophy. I recommend this cooperative approach to
anyone who is interested in projects of this kind.
Bibliography
[Antoniou, 1997] Grigoris Antoniou. Nonmonotonic Reasoning. The MIT Press, Cambridge,
Massachusetts, 1997.
[Asher and Morreau, 1991] Nicholas Asher and Michael Morreau. Commonsense entailment:
a modal theory of nonmonotonic reasoning. In J. Mylopoulos and R. Reiter, editors,
Proceedings of the Twelfth International Joint Conference on Artificial Intelligence, pages
387–392, Los Altos, California, 1991. Morgan Kaufmann.
[Boutilier, 1992] Craig Boutilier. Conditional logics for default reasoning and belief revision.
Technical Report KRR–TR–92–1, Computer Science Department, University of Toronto,
Toronto, Ontario, 1992.
[Brewka et al., 1997] Gerhard Brewka, Jürgen Dix, and Kurt Konolige. Nonmonotonic
Reasoning: An Overview. CSLI Publications, Stanford, 1997.
[Carnap, 1928] Rudolf Carnap. Der logische Aufbau der Welt. Weltkreis-Verlag, Berlin-
Schlachtensee, 1928.
[Carnap, 1936 1937] Rudolf Carnap. Testability and meaning. Philosophy of Science, 3
and 4:419–471 and 1–40, 1936–1937.
[Carnap, 1956] Rudolf Carnap. Meaning and Necessity. Chicago University Press, Chicago,
2nd edition, 1956. (First edition published in 1947.)
[Church, 1951] Alonzo Church. The need for abstract entities in semantic analysis. Proceed-
ings of the American Academy of Arts and Sciences, 80:100–112, 1951.
[Cross, 1986] Charles B. Cross. ‘Can’ and the logic of ability. Philosophical Studies, 50:53–64,
1986.
[Davis, 1991] Ernest Davis. Common Sense Reasoning. Morgan Kaufmann, San Francisco,
1991.
[Dowty, 1979] David R. Dowty. Word Meaning in Montague Grammar. D. Reidel Publishing
Co., Dordrecht, Holland, 1979.
[Gabbay et al., 1994] Dov Gabbay, Christopher Hogger, and J.A. Robinson, editors. Hand-
book of Logic in Artificial Intelligence and Logic Programming, Volume 2: Nonmonotonic
Reasoning. Oxford University Press, Oxford, 1994.
[Ginsberg, 1987] Matthew L. Ginsberg, editor. Readings in Nonmonotonic Reasoning. Mor-
gan Kaufmann, Los Altos, California, 1987. (Out of print.).
[Goodman, 1947] Nelson Goodman. The problem of counterfactual conditionals. The Jour-
nal of Philosophy, 44:113–118, 1947.
[Goodman, 1963] Nelson Goodman. The significance of Der logische Aufbau der Welt. In Paul
Schilpp, editor, The Philosophy of Rudolf Carnap, pages 545–558. Open Court, LaSalle,
Illinois, 1963.
[Hobbs and Moore, 1988] Jerry R. Hobbs and Robert C. Moore, editors. Formal Theories
of the Commonsense World. Ablex Publishing Corporation, Norwood, New Jersey, 1988.
[Lewis, 1973] David K. Lewis. Counterfactuals. Harvard University Press, Cambridge, Mas-
sachusetts, 1973.
[Lifschitz, 1988] Vladimir Lifschitz. Circumscriptive theories: A logic-based framework for
knowledge representation. Journal of Philosophical Logic, 17(3):391–441, 1988.
[Lifschitz, 1990] Vladimir Lifschitz, editor. Formalizing Common Sense: Papers by John
McCarthy. Ablex Publishing Corporation, Norwood, New Jersey, 1990.
[Link, 1983] Godehard Link. The logical analysis of plurals and mass terms: A lattice-
theoretical approach. In Rainer Bäuerle, Christoph Schwarze, and Arnim von Stechow,
editors, Meaning, Use, and Interpretation of Language, pages 302–323. Walter de Gruyter,
Berlin, 1983.
[McCarthy, 1984] John McCarthy. Some expert systems need common sense. In H. Pagels,
editor, Computer Culture: the Scientific, Intellectual and Social Impact of the Computer,
volume 426 of Annals of The New York Academy of Sciences, pages 129–137. The New
York Academy of Sciences, 1984.
[Montague, 1962] Richard Montague. Deterministic theories. In Decisions, Values, and
Groups, volume 2, pages 325–370. Pergamon Press, Oxford, 1962. Reprinted in Formal
Philosophy, by Richard Montague, Yale University Press, New Haven, CT, 1974, pp. 303–
359.
[Montague, 1969] Richard Montague. On the nature of certain philosophical entities. The
Monist, 53:159–194, 1969.
[Montague, 1973] Richard Montague. The proper treatment of quantification in ordinary
English. In Jaakko Hintikka, editor, Approaches to Natural Language: Proceedings of the
1970 Stanford Workshop on Grammar and Semantics, pages 221–242. D. Reidel Publishing
Co., Dordrecht, Holland, 1973. Reprinted in Formal Philosophy, by Richard Montague,
Yale University Press, New Haven, CT, 1974, pp. 247–270.
[Schlechta, 1997] Karl Schlechta. Nonmonotonic Logics. Springer-Verlag, Berlin, 1997.
[Shoham, 1993] Yoav Shoham. Agent oriented programming. Artificial Intelligence,
60(1):51–92, 1993.
[Stalnaker and Thomason, 1970] Robert C. Stalnaker and Richmond H. Thomason. A se-
mantic analysis of conditional logic. Theoria, 36:23–42, 1970.
[Steedman, 1998] Mark Steedman. The productions of time. Unpublished manuscript,
University of Edinburgh. Available from
http://www.cogsci.ed.ac.uk/~steedman/papers.html, 1998.
[Thomason, 1991] Richmond Thomason. Logicism, artificial intelligence, and common sense:
John McCarthy’s program in philosophical perspective. In Vladimir Lifschitz, editor,
Artificial Intelligence and Mathematical Theory of Computation, pages 449–466. Academic
Press, San Diego, 1991.
Levels and loops: the future of
artificial intelligence and neuroscience
Anthony J. Bell
Interval Research Corporation, 1801 Page Mill Road, Palo Alto, CA 94304, USA
In discussing artificial intelligence and neuroscience, I will focus on two themes. The first is the univers-
ality of cycles (or loops): sets of variables that affect each other in such a way that any feed-forward
account of causality and control, while informative, is misleading.
The second theme is based around the observation that a computer is an intrinsically dualistic entity,
with its physical set-up designed so as not to interfere with its logical set-up, which executes the computa-
tion. The brain is different. When analysed empirically at several different levels (cellular, molecular), it
appears that there is no satisfactory way to separate a physical brain model (or algorithm, or representa-
tion), from a physical implementational substrate. When program and implementation are inseparable
and thus interfere with each other, a dualistic point-of-view is impossible. Forced by empiricism into a
monistic perspective, the brain–mind appears as neither embodied by nor embedded in physical reality,
but rather as identical to physical reality.
This perspective has implications for the future of science and society. I will approach these from a
negative point-of-view, by critiquing some of our millennial culture's popular projected futures.
Keywords: artificial intelligence; neuroscience; cyclic systems; dualism; science fiction
1. INTRODUCTION
In this paper I will survey the recent history, current
status and future prospects of artificial intelligence (AI)
and neuroscience. I will attempt to relate the social moti-
vations and potential impact of the fields concerned on
society at large.
2. THE SCIENCE FICTION FUTURE
Formalities over, and given that the Millennium is a
significant enough social phenomenon that it colours
popular impressions of the future of science, it is worth
looking at what impressions a person of the year 2000
might have formed from late twentieth century popular
science books, science fiction books and films, and even
from the science pages of newspapers. Such a person
might be forgiven for thinking that the future will be
something like this.
Nano-robots will perform all molecular repairs in our
bodies, making us effectively immortal. Highly engineered
drugs, perhaps the descendants of Prozac and Ecstasy, will
take care of emotional disorders, as a side-effect solving all
social problems, so everyone will be happy (finally).
That's for the nostalgic minority who cling to living in
the primitive biological form. More cyber-aware indi-
viduals will have downloaded themselves into the `Net' and
will exist like a William Gibson character in a global
computer network which is capable of providing all pro-
tagonists with the most fantastic entertainment. Many
global problems will be solved with the demographic move
to the `Net', problems such as population, food, transporta-
tion and energy.
The `Net-heads' will have been passed on the way by the
`Worldbots', digital mechanical life-forms which will first
ease human life by performing all mundane tasks, but will
shortly after become so much more intelligent than the
unenhanced us that they will practically become `spiritual
machines', which may or may not use selfish altruism to
decide to be benign towards the human animals, and if we
are lucky, they will continue to serve us, something like
digital Bodhisattvas.
Back in the cyberworld, boundaries between individuals
will break down, and transhuman life-forms will appear,
analogously to the emergence of multicellular life in the
ocean. Implanted into robot spaceships, these life-forms
will lumber into space like the first amphibious fish
lumbered onto the land. A long time after this, perhaps
after a few galactic wars (in which the `Dark Side' may be
briefly flirted with but not joined forever), the universe will
be one huge Internet, matter everywhere drawn into the
process of computational living. The extremum of this is
called the Omega Point. (A final twist is that since the
Omega Point does not join the Dark Side, again possibly
using game theoretic reasoning, it will decide to be benign
and resurrect everyone who ever lived and give them what
they most desire. This is called the Judeo-Christian heaven
by Tipler (1995). Other references used in constructing this
version of future history are Gibson (1986), Moravec
(1990) and Kurzweil (1999).)
These amazing developments are the almost inevitable
consequences of the merging of the digital and the organic
worlds, on the threshold of which we are now standing.
Cellphones and laptop computers are only the beginning.
We might call this future the bio-informational age, in
keeping with its millennial timing, and the smoothness with
which it mixes in with elements of New Age philosophy.
Phil. Trans. R. Soc. Lond. B (1999) 354, 2013–2020
© 1999 The Royal Society
3. THE CURRENT JOB OF SCIENCE
It's a giddy picture indeed, but how much of it, if any,
will come true? If none of it is going to happen, it would
be very helpful if science could tell us why, so that we
could get on with living our real future.
The difficulty for science is that the prospect of a bio-
informational future, with its cyborg, transpersonal
themes causes us to ask questions concerning individu-
ality, consciousness, mind and machine, exactly those
questions which science has had least success in framing.
AI and neuroscience are the fields that come closest in
engineering and biology to framing such questions.
Scratch the surface of many AI researchers and neuro-
scientists (perhaps quite vigorously) and you may find
someone who started off by asking `What are we?'
The answers to this question are not that numerous.
Either we are machines, in which case AI should be
possible and neuroscience should be able to work out the
algorithm (or algorithms) that the brain is running, or
we are something else, in which case both projects will
fail in their ultimate goals, which is not to say they will
not achieve great things along the way. (One of the great
things that they might achieve is an exact picture of their
own limits.)
Either way, by examining the history and current state
of AI and neuroscience and by identifying the issues
beneath the surface of these fields, we may gather some
sense of what are the important themes playing along
science's internal frontier (disregarding for now how
different this frontier looks from outside).
4. HISTORY AND STATE OF ARTIFICIAL
INTELLIGENCE
AI's ultimate purpose is to build a robot that lives in
the world with a computer for a brain. It therefore
assumes that the essence of the living and/or thinking
process can be captured in digital computation.
The first attempts to produce AI in the 1960s involved
writing facts and rules into the machine using various
quasi-logical languages. In the 1980s this became less
popular. Rule-based systems were seen as non-robust:
they could not adapt well to small changes in circum-
stances. Also, every fact had to be programmed in by a
human. This led people to think that real-numbered,
`subsymbolic' systems were needed, and these systems had
to be able to learn facts (or learn something) themselves,
just by observing data. Historically, this view carried
within it the cybernetics view of the 1950s.
It was one short step from this shift to statistical
theories. The short step was called neural networks
(Haykin 1999); it started in 1984 (Rumelhart &
McClelland 1986) and it is not over yet. An inter-
disciplinary field with a higher than average tolerance for
speculation and free-wheeling enquiry, neural networks
were popular with students and military funders, and
often regarded with frustration by other disciplines that
shared a border. As the field became more rigorous, it re-
established its connections with mainstream AI, through
common interests in statistical machine learning. Tech-
nically speaking, the field of neural networks is content-
less. The empirical side is neuroscience; the theoretical
side is statistics and signal processing. This is perhaps
what makes it such a great field to work in.
Symbolic AI was thus subverted by a shift to statistical
learning theories. It was also subverted in two other
directions by the emergence of the fields of artificial life
(Langton 1997) and behaviour-based robotics (Arkin
1998) (or situated agents). Artificial life (or alife) is
subsymbolic in that it implicitly assumes that intelligence
is just the complex end of a simulatable life process. A
living system and its environment are typically simulated
together, often using genetic algorithms and population
dynamics to simulate evolution.
Behaviour-based robotics attempts bravely to deal with
the perceptual-motor loop of a robot in a real environ-
ment, rejecting both the alife simulated worlds and the
mainstream AI notion of a representation of the world.
Echoing Gibson (1979) in his famous debate with Marr
(1982) (Bruce & Green 1990), the `agents'-literature
focuses on complex behaviour coming from simple
mechanisms operating in tight coupling with a complex
environment, in contrast to Marr's emphasis on the feed-
forward computation of a representation from sensory
data.
Alife and behaviour-based robotics lack a structural
foundation such as that given to neural networks and
statistical machine learning by mathematics. This makes
it hard to judge progress or assess methodology in these
fields. However, on the other side, neural networks that
learn both sensory perceptions and motor actions in an
environment are extremely rare, and for a good reason: it
is difficult to build a statistical model of an environment
when the system's perceptions are transformed into
actions that affect the statistics of the input.
Furthermore, what should such an acting system do?
There is an obvious goal for a feed-forward perceptual
system: build a probability distribution of what happens.
The hidden symmetries (dependencies, redundancies) in
this distribution are the hidden structure of the world.
But in this cyclic case, when the world is at least partly
constructed by the actions of the system, the shape of this
distribution is action dependent: the system gets to
partly choose what symmetries exist, and the notion of a
hidden set of privileged symmetries is under threat. This
is post-modernism for statisticians.
At this point, most people would abandon informa-
tional, or unsupervised, goals and appeal to one of the
many specific goals which a robot system might have,
such as to find food or recharge the batteries. While these
are no doubt important, they do have an air of arbitrariness
about them that makes us uneasy: we are familiar
enough with the flux of goals in our personal experiences
to desire something more invariant to underlie action
selection.
5. QUESTIONS CURRENTLY LATENT IN ARTIFICIAL
INTELLIGENCE
Here we have identified two questions which lie
beneath the surface of the pluralistic AI of today.
The first question, to rephrase, asks why we do not
have a mathematical theory of the perception–action
cycle. Of course there is work on active perception, on
sensory–motor coordinate systems, and engineering
2014 A. J. Bell Levels and loops: the future of artificial intelligence and neuroscience
Phil. Trans. R. Soc. Lond. B (1999)
department robotics is full of mathematics. But the kind
of theory I mean is one that is as universally useful for
characterizing cyclic systems as Shannon's information
theory is for characterizing communications channels (i.e.
feed-forward systems). (Incidentally, maximizing the
channel capacity involves finding those hidden symmetries
we mentioned that exist in the probability distribution
of the input. This forms the basic goal of my own
favoured area of neural networks: unsupervised learning
(Hinton & Sejnowski 1999).)
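To make the feed-forward goal concrete, here is a minimal sketch of the quantity a channel might maximize: the mutual information between its input and its output. The binary symmetric channel, the flip probability of 0.1 and the function names are illustrative assumptions, not taken from the works cited. The sketch shows only that the information carried depends on how well the input distribution is matched to the channel, with capacity being the maximum over input distributions.

```python
import math


def mutual_information(p_x, channel):
    """Return I(X;Y) in bits, where channel[x][y] = p(y | x)."""
    n_y = len(channel[0])
    # Marginal distribution of the output, p(y) = sum_x p(x) p(y | x).
    p_y = [sum(p_x[x] * channel[x][y] for x in range(len(p_x)))
           for y in range(n_y)]
    info = 0.0
    for x, px in enumerate(p_x):
        for y in range(n_y):
            joint = px * channel[x][y]
            if joint > 0.0:
                info += joint * math.log2(joint / (px * p_y[y]))
    return info


def binary_symmetric_channel(flip):
    """Hypothetical example channel: each bit is flipped with probability `flip`."""
    return [[1.0 - flip, flip], [flip, 1.0 - flip]]


if __name__ == "__main__":
    channel = binary_symmetric_channel(0.1)
    for p_one in (0.1, 0.3, 0.5):
        bits = mutual_information([1.0 - p_one, p_one], channel)
        print(f"p(X=1) = {p_one:.1f}: I(X;Y) = {bits:.4f} bits")
```

For this symmetric channel the uniform input carries the most information, which is the sense in which a feed-forward system can be said to have a well-defined informational goal.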
Implicit in this is the second question. What would we
want such a post-Shannon system to do? What quantity
should a perception–action cycle system maximize, as a
feed-forward channel might maximize its capacity?
A third question was directed at AI researchers by
Penrose (1989), and by the hostility and controversy it
caused, you knew he had hit a weak spot in AI. Penrose
wondered whether the physical substrate of the
world, of which relativity and quantum mechanics are
our best accounts, might be sufficiently different from the
digital substrate of computers that it would render AI
impossible. Is there something in the quantum that is
necessary for mind?
Scoffing AI-philosophers characterized Penrose's position
as `we don't understand quantum mechanics and we
don't understand consciousness, so they must be the same
thing'. The derision increased when Penrose, to make his
hypothesis more specific, proposed, with Stuart
Hameroff, that quantum consciousness manifests itself
through coherent quantum effects in a network of proteins
called microtubules which form the structural skeleton of
neurons (and other cells).
Critics, distracted by the strangeness of these specific
proposals (which are not crucial to his argument), may
miss the validity of Penrose's general doubt about the
computer: that it is a particularly unusual artifact, being
deterministic, discrete time and discrete state. The whole
state of the machine at the digital level may be written
down. No natural objects seem to be of this nature. The
computer is really a physical instantiation of a model. We
know a model can compute, but can it live or think?
Functionalism (the philosophy of AI) was based on
using the computer metaphor for mind, arguing that the
brain was the hardware implementation of the `mental
program'. But Penrose's arguments were really designed
to raise doubts about this separation of physical and
mental processes. Could the brain be separated from a
supposedly finitely describable mental process running on
it? Since René Descartes, the conceptual separation has
been there in our language, but is it scientifically really
there?
Either there is a physical level at which the separation
can be performed (analogous to the level of logic gates in
computers) or functionalists have to admit that the brain
is not a machine. But the failure to detect a `logic gate
level' halfway up the brain's reductionist hierarchy may
not be the end of the argument for the functionalist, who
could still argue that if there is a computer at the bottom,
AI would be possible, at the very least with a computer
with the resources of the universe. The `universe-as-
computer' is a popular fringe-topic in physics, lying
behind an effort to find a finite discrete process such as a
cellular automaton that might underlie the known laws of
physics. But until someone succeeds in showing this, we
might be wiser to stick with R. P. Feynman, who noted
that quantum processes are not in general simulatable,
even by Turing machines (and who in the process gave
rise to the mysterious and unformed field known today as
quantum computing).
The luck (or skill) of scientists is that sometimes they do
not have to philosophize to find the answer. They can ask
questions of Nature directly. So perhaps this is a good
point to survey the history and current state of neuroscience,
because this is the discipline whose empirical
project is exactly the finite description of brain processes.
6. HISTORY AND STATE OF NEUROSCIENCE
The early landmarks in post-war neuroscience were the
Nobel prize winning work of Hubel & Wiesel (1968) for
their studies of the receptive fields of monkey visual
cortical cells, and that of Hodgkin & Huxley (1952) for
their uncovering of the mechanism and mathematics of
spiking in neurons. It has grown into a huge field with
the annual Society for Neuroscience meeting in the USA
attracting 30 000 people.
The two early Nobel prizes reflect perhaps a natural
split in the field between those working above or below
the level of the cell. Many of the great successes of the
1970s and 1980s were at the subcellular level, as the mol-
ecular biology revolution progressed, and as a result this
part of neurobiology was highly empirical and essentially
continuous with mainstream cellular, molecular and
developmental biology.
In this period, the molecular basis of neural signalling,
both in spiking and in synaptic transmission, was uncovered.
A bewildering array of ion channels, neurotransmitters
and neuromodulators was found to be engaged in the
processes of sculpting neural response properties and
controlling communication between neurons. From the
chemistry of photon absorption by photoreceptors, to the
chemistry of muscle contraction, the nervous system
apparently performed an astonishingly complicated and
coordinated series of molecular actions not qualitatively
different from those in other living cells, but somehow in
the brain this molecular dance constituted percept,
thought and action.
At and above the level of the spiking neuron, things
were slightly different. Lacking the formal structural
basis of molecular biology, neuron-level neuroscience
focused on the spike trains as signals representing neural
information. The discreteness of the spike as an
information-carrying unit was matched in biology only
by the genetic code. This led to early attempts to charac-
terize the `neural code', attempts that were revived by
Bialek and co-workers in the 1990s (Rieke et al. 1997).
(Notably, inevitably, these efforts attempt to characterize
neurons as feed-forward information channels.) Behind
these efforts is a faith in the neuron level, certainly as a
useful descriptive level, but also as a `computing level'
which molecular and biophysical processes exist to
implement. Does the goop that we see in the electron
micrographs merely exist to implement `the spiking
computer'? This is the neuroscience analogue of the functionalist
debate in AI, and I will return to it in § 7(c),
after addressing the issue of cycles in neuroscience.
7. QUESTIONS CURRENTLY LATENT IN
NEUROSCIENCE
(a) Cycles in neuroscience
The same problem with cycles presents itself in
neuroscience as in AI, but whereas the primary cycle of
concern in AI was the perception–action cycle, in
neuroscience, the cycles are everywhere.
It is interesting that the clearest stories in neuroscience
are those which at first glance most closely resemble feed-forward
systems. One example is the synapse. The spike
arrives at the presynaptic bouton, causing vesicles of
neurotransmitter to be released, which in turn cause ion
channels in the postsynaptic site to open and change the
postsynaptic electrical potential. Another example is the
early visual system, starting with the retina and moving
through thalamus into early visual cortex. The treatment
of this system as a feed-forward channel, despite massive
corticothalamic and corticocortical feedback, has enabled
information theoretic learning models the modest success
of producing qualitatively correct predictions for the form
of the static (Bell & Sejnowski 1997) and dynamic (Van
Hateren & Van der Schaaf 1998) cortical receptive fields
that were first observed by Hubel & Wiesel (1968).
However, feed-forward processing in the nervous
system is the exception rather than the rule, and often
what looks feed-forward contains complicated feedback
systems at a different level of analysis. For example, the
spikes of a cortical neuron have now been seen to extend
far into the dendritic tree, affecting, through voltage-dependent
channels, the integration of signals from
synapses. This destroys the illusion that the neuron works
like a directional `neural network' neuron, performing a
weighted sum of its input signals.
Even in the synapse and the retina there are feedbacks.
Although the (human) retina receives no neural inputs
from the brain, the brain controls gaze direction which
determines what the retina sees. Although neurotrans-
mitter does not travel backwards across synapses in most
neurons, many other molecular signals do, as the extensive
and controversial attempts to find synaptic Hebbian
learning mechanisms in long-term potentiation have
revealed.
In abstract, the lack of a theory of cycles in biology can
be seen by considering an experiment in which some vari-
able X is changed and some other variable Y is moni-
tored. What is published are the relatively rare cases
where some correlation in X and Y is observed. The
temptation then is to say that `X controls Y' and from this
to build a model of feed-forward neural information
processing (or if X is a chemical, we may market it as a
drug to control Y).
In nature, things happen differently from in the experiment.
X may rise, causing Y to rise, but then increased Y
usually causes X to diminish, directly or through some
other variables Z. These cycles of positive and negative
feedback are universal in biology and cause equilibrium
values of X and Y, or stereotypical dynamic behaviour to
occur. A neural spike is one example of a transient
dynamic caused by positive and negative feedback, where
X is the sodium current and Y the potassium current.
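The kind of loop described here can be sketched with a toy simulation. The linear rate equations, the constants and the function name below are illustrative assumptions only (this is not a model of a spike or of any real pair of currents): X is driven upwards, X excites Y, and Y in turn inhibits X.

```python
def simulate_feedback(steps=5000, dt=0.01, drive=1.0,
                      k_xy=1.0, k_yx=1.0, decay_x=0.5, decay_y=1.0):
    """Euler-integrate a two-variable loop in which X excites Y and Y inhibits X."""
    x, y = 0.0, 0.0
    trajectory = [(x, y)]
    for _ in range(steps):
        dx = drive - k_yx * y - decay_x * x  # increased Y causes X to diminish
        dy = k_xy * x - decay_y * y          # increased X causes Y to rise
        x, y = x + dt * dx, y + dt * dy
        trajectory.append((x, y))
    return trajectory


if __name__ == "__main__":
    traj = simulate_feedback()
    peak_x = max(x for x, _ in traj)
    final_x, final_y = traj[-1]
    print(f"peak X = {peak_x:.3f}")
    print(f"final (X, Y) = ({final_x:.3f}, {final_y:.3f})")
```

With these assumed constants, setting both rates to zero gives the equilibrium X = Y = 2/3. X first overshoots that value and is then pulled back by Y, a simple case of the stereotypical transient followed by equilibrium.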
Slipping into the language of probability theory, if we
desire to discover the relationship in nature, of X and Y,
we may measure their joint probability distribution
p(X, Y), and we could do so by observing X and Y under
normal operating conditions, observing a peak in the
distribution at equilibrium, and some trajectories corre-
sponding to the stereotypical dynamics of the variables.
But in trying to estimate whether X controls Y, experiments
often take the form of measuring the conditional
distribution p(Y|X) and constructing the joint distribution
through the formula p(X, Y) = p(Y|X) p(X). This
latter strategy gives the wrong answer for p(X, Y)
because (i) rather than the system controlling p(X), we
are controlling it, thus cutting the system at X, and (ii)
we have, through our choice of independent and dependent
variables, imposed on the system a direction
(X → Y) of dependency, with an implied direction of
causality that does not exist in nature.
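A small numerical sketch may help here. It assumes a deliberately simple linear loop with Gaussian noise, in which Y = cX + e_y and X = -bY + e_x hold simultaneously at equilibrium; the coefficients c = 1 and b = 2, the noise model and all function names are hypothetical choices for illustration. The `natural' samples come from the closed loop. The `clamped' samples reuse the same X values, so the marginal p(X) is unchanged, but Y is generated from the open-loop response p(Y|X) alone.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def natural_samples(n, c, b, rng):
    """Sample (X, Y) from the closed loop Y = c*X + e_y, X = -b*Y + e_x."""
    xs, ys = [], []
    for _ in range(n):
        e_x, e_y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Solving the two simultaneous equations for X and Y.
        xs.append((e_x - b * e_y) / (1.0 + b * c))
        ys.append((c * e_x + e_y) / (1.0 + b * c))
    return xs, ys


def clamped_responses(xs, c, rng):
    """Experimenter sets X; Y responds open-loop and cannot act back on X."""
    return [c * x + rng.gauss(0.0, 1.0) for x in xs]


if __name__ == "__main__":
    rng = random.Random(0)
    xs, ys = natural_samples(20000, c=1.0, b=2.0, rng=rng)
    clamped_ys = clamped_responses(xs, c=1.0, rng=rng)
    print(f"correlation in the closed loop:      {pearson(xs, ys):+.3f}")
    print(f"correlation from p(Y|X) p(X), clamped: {pearson(xs, clamped_ys):+.3f}")
```

Under these assumptions the clamped experiment reports a positive dependence of Y on X, while in the closed loop the two variables are negatively correlated, so the reconstructed joint distribution differs from the natural one even though p(X) is the same.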
There is no doubt that such experiments can still be
useful in teasing out dynamic cyclic behaviour. The
kinetics of ion channels can be identified with the aid of
voltage and current clamping techniques, but there is a
recognition in such experiments that the clamped cell is a
frozen picture of the true process. This recognition often
seems to go missing as the feedback loops get wider (`out
of sight, out of mind') and particularly as biology
becomes technology. Examples that spring to mind are
the widespread prescription of drugs that combat depression
by controlling serotonin levels, or attempts to control
ecosystems by introducing new species, or, for that
matter, the attempt to tailor many aspects of a plant's
genetic make-up to fit an industrial model of agriculture.
Anyone seriously studying or modelling metabolism or
ecosystems knows the extent to which they are dealing
with cycles, but somehow, when the results reach into the
area of medicine or its macroscopic equivalent `planet
management', the causal, feed-forward style of thinking is
what is presented, particularly to the news media and
commercial interests. Anything which does not fit the
feed-forward model is linguistically demoted to the status
of a `side-effect', to be eliminated if possible. But side-effects
are nature's way of telling the scientist that all
processes are cyclic.
(b) Interlude: biology's master control node
I cannot resist, at this point, discussing the role of
biology's master control node, the genome. Although it is
somewhat off the subject of AI and neuroscience,
arguments pointing back to the genome as the causal
factor behind animal behaviour and intelligence are so
universal in our culture, that to allow the genome special
status outside feedback cycles would be to endorse a
control-node mysticism rivalled in shape and form only
by that of the monotheistic Anglican bishops who debated
so famously with T. H. Huxley. (When science became a
greater authority on human origins than the church, the
transition hid the fact that it was a change of government
without a change in policy. Furthermore, affording the
genome special status allows the present-day church of
evolutionary psychology to rampage unchecked and, in
my opinion, the wrong lessons are then drawn from
biology.)
The genome's grand cycle with other genomes,
mediated through populations of phenotypes is the king
of all biological feedback loops. It is a trans-individual
molecular regulation loop, qualitatively similar to those
occurring within cells, with cooperation (or symbiosis;
Margulis & Sagan 1995) corresponding to the positive
feedback loop and competition for resources corre-
sponding to the negative feedback loop. Neo-Darwinists,
stuck on the negative pole, like to interpret cooperative
behaviour as `selfish' altruism (I'll scratch your back if
you scratch mine). The inverse position, on the positive
pole, is to interpret competition for resources as selfless
greediness (I'll eat you, but honestly, this is not about
me). You might consider both positions absurd, or you
might use the latter point of view as an antidote to the
dominance of the former in our culture. The point here is
that competition and cooperation have equal status and
the process of `natural selection' in which we are judged
by an external environment (more biblical parallels) is
better viewed as a complex molecular regulation loop
like any other.
The regulation loop is mediated through phenotypic
success, which brings up another loop-denying habit of
neo-Darwinists, which is to see the genome as a controller
for all aspects of the phenotype, right down to its specific
behaviour: DNA as the determining code for an
organism. There must be a particular attraction in this
idea for certain authors, because they take great pleasure
in outraging people's common sense by portraying organ-
isms as the helpless puppets of their genes (Dawkins
1990).
I will not duplicate the effort of the many authors who
have attacked the social or behavioural versions of this
notion (for example, the preposterous notion that there
could be a gene for homelessness, which was actually
considered in an editorial in Science), because this would
be to attack it at its weakest point. I'd like to attack the
notion in its strongest version: the molecular. The central
dogma of molecular biology is that `genes make proteins,
and not the other way round'.
The central dogma of molecular biology is wrong!
Sequences of DNA code for strings of amino acids,
true, but how these amino acids are assembled into
functioning proteins and which parts of the DNA are
read in the first place are both controlled by proteins, and
depend on the state of the cell and its type. It's as if there
was a bookish town (a cell) with a central library (the
genome) and people (proteins) who came in to read short
sections here and there, share with each other what they
had read, and use the knowledge to build and change the
town. Who is controlling here: the townsfolk or the
library? (Answer: neither.)
Where did the people in the town come from? If `genes
make proteins', then the library made them, but the truth
is that they were there all along. The functioning
networks of enzymes that set to work on your DNA when
you were conceived were already in place in the salty
water of your mother's egg cell. They were just the latest
instalment in a continuous epigenetic lineage that
stretches back to your primordial metabolic ancestor, a
droplet of seawater that accidentally got stuck inside a
lipid membrane with a fortuitous set of amino acids.
Nowhere in biology is it easier to make unsubstantiated
assertions than in the area known as `origin of life'. But
if the `genes make proteins' debate really comes down
to whether there was RNA (code) before proteins
(metabolism) or proteins before RNA in the first protocells
(De Duve 1991), then two factors should be considered:
(i) amino-acid chains form much more readily
than nucleic-acid chains, and (ii) it is more likely that the
first people wrote the first books, than that the first books
wrote the first people. (It is noteworthy that both neo-Darwinists
and New Testament theologians believe that
`in the beginning was the word (logos)'.) Of course, now
it is claimed there were ribozymes (RNA with the ability
to catalyse reactions), but was this metabolism evolving a
code, or a code evolving metabolism?
The outcome of this debate is not crucial. The intent
here is merely to weaken the notion of DNA as a kind of
controller of the phenotype. An equally valid (and
equally invalid) perspective has the phenotype choosing
what is read from the gene and what is done with it. In
reality, the organism and its genes are caught in a cyclic
dynamic, and if the organism decides to spend its after-
noon in a (real) library, instead of attempting to father
children, then you can be sure that the pattern of gene
expression will alter accordingly.
This argument fits with our first general theme of critiquing
feed-forward thinking in AI and neuroscience.
(c) Levels in neuroscience
Returning now to the second theme we touched on
when discussing AI, § 5 ended with a consideration of
levels of a system and functionalism. There was a challenge
to the functionalist to empirically investigate the
brain and identify a level at which the brain could be
finitely `written down', a level analogous to logic gates in
computers. The obvious candidate is the neuron level. If
we wrote down the sequence of all spikes of all neurons,
would that be enough to specify the `neural computation'?
Do molecular and biophysical processes exist to imple-
ment a `spiking computer' at the neuron level?
I believe the answer to these questions is no. While no
specific physical processes below the gate-level of a
computer interfere with the model-like operation of the
computer (unless something goes wrong), this cannot be
said at the neuron level of the brain. Molecular and
biophysical processes control the sensitivity of neurons to
incoming spikes (both synaptic efficiency and postsynaptic
responsivity), the excitability of the neuron to
produce spikes, the patterns of spikes it can produce and
the likelihood of new synapses forming (dynamic
rewiring), to list only four of the most obvious interferences
from the subneural level. Furthermore, transneural
volume effects such as local electric fields and the
transmembrane diffusion of nitric oxide have been seen to
influence, respectively, coherent neural firing, and the
delivery of energy (blood flow) to cells, the latter of
which directly correlates with neural activity.
The list could go on. I believe that anyone who
seriously studies neuromodulators, ion channels or
synaptic mechanism and is honest, would have to reject
the neuron level as a separate computing level, even
while finding it to be a useful descriptive level. Perhaps a
physicist or a neural-network theorist, in looking for an
easy theory, would still argue that the molecular level is
mere implementational detail, but in most cases this is
more a result of prejudice, supported by laziness and
ignorance. If the molecular level is unimportant for an
organism's behaviour, then how is a prokaryotic bacterium,
vastly simpler than a neuron, able to navigate, eat and
avoid toxins, all without the benefit of a nervous system?
If the neuron level is no good, are there any other
candidate levels? Several have been proposed. The theory
of neuronal groups, or cell assemblies, was another early
candidate. The apparent `noisiness' of individual spike
trains could be smoothed out by integrating over groups
of neurons coding, say, a given visual stimulus. The mean-
ingful unit of perception was seen to be the activity of the
group. In my view this idea contains a common error:
failure to appreciate that noisiness is in the eye of the
beholder, in this case the experimenter. In the case where
a stimulus is presented and that part of the neural
response which does not correlate with the stimulus is
regarded as noise, we have a situation almost as bad as
thinking French people are stupid because they produce
strange noises in response to questioning.
What about the molecular level? Say we write down
how many of each type of molecule are in each cell. Can
this capture the computation of the cell? Unfortunately
not, because the locations of the molecules are important.
Testing of enzyme reactions in bulk phase (solutions in
test-tubes) is partly responsible for an impression that in
the cell, molecules largely jitter around with Brownian
motion and sometimes bump into each other and react.
What turns out to be more likely is that most reactions
take place locally in membrane-associated protein
complexes, and the product of one reaction is passed
directly on as substrate for the next. Evidence for this
detailed spatial organization, called metabolic channel-
ling, is accumulating (Ovadi 1995). Rather than being
unreliable and `wet', much of cellular biochemistry may
already operate in what has been called the machine
phase (although of course, in this paper I am arguing
that `machine', is the wrong word), where intricately
detailed and coordinated reactions occur, not in the bulk
phase. It seems that nanotechnology already exists,
except that it is not technology in the normal sense in
which a finite model is implemented using some particular
substrate level. It is difficult to imagine human engineers
making more efficient or complex processes by top-down
manipulation of individual atoms.
We have reached the level of individual molecules, and
the functionalist might say, no doubt through gritted
teeth, that he is happy to write down the position of all
the molecules in a brain. This will still be a finite description.
If there is no evidence of submolecular interferences,
we could have a `molecular machine' to satisfy the func-
tionalist. Remember that at this molecular level, we are
looking for something as clean as a logic gate, which is a
device responding deterministically to its logical inputs,
and which is insensitive to the motions of individual elec-
trons.
At this level, things become more controversial. Mol-
ecular computing is actually an area of advanced engi-
neering research, so though it is not clear that it always
falls within the discrete-state Turing model of computa-
tion, it might seem harder to dismiss the notion that
molecules compute in nature.
If we use molecules to construct Turing-style
computing devices, then, like good functionalists, we will
have molecular computers. But what molecules do in
nature may be different. In fact, it is. There are submolecular
interferences that violate the separateness of
the `molecular machine' level, and they are quantum
effects. Two examples of this are electron transfer in
photosynthesis and the energetics of enzyme interactions
(Welch 1986). In both cases, quantum coherences are
necessary to explain the efficiency of the reactions.
But we don't even need to go as far down as quantum
effects, because proteins do not end at the edges of the
black and red balls of which ball-and-stick molecular
models are constructed. Their electrical fields extend into
the surrounding water molecules, orientating them to
form what is called structured water. Structured water is
also important in determining how enzyme reactions
occur, and how ion channels are selective to certain ions.
To argue that one piece of structured water or one
quantum coherence is a necessary detail in the functional
description of the brain would clearly be ludicrous. But if,
in every cell, molecules derive systematic functionality
from these submolecular processes, if these processes are
used all the time, all over the brain, to reflect, record and
propagate spatio-temporal correlations of molecular fluctuations,
to enhance or diminish the probabilities and
specificities of reactions, then we have a situation qualitatively
different from the logic gate. The variables lying
beneath the level of a molecular `gate' can affect the behaviour
of the gate, so the functionalist is again frustrated,
and the notion of the brain as a molecular `computer' can
be viewed as no more than an analogy, and an inaccurate
one.
To say these things is not to be a `New Age quantum
mystic'. It is to attempt to clearly state empirical obser-
vations about molecular biology and to use them to
attack the prevalent tendency to view biological organ-
isms as machines in the exact technical sense in which
computers are machines, i.e. in the sense that they are
physical instantiations of finite models which do not
permit physical interactions beneath the level of their
machine parts (e.g. the logic gate) to influence their
functionality.
It is a big leap from this argument to quantum
consciousness. There is no evidence that large-scale
macroscopic quantum coherences, such as those in superfluids
and superconductors, occur in the brain. That some
people like to make the quantum consciousness leap is
testament more to the compelling connections between
the mathematics of quantum mechanics and a holistic
non-mechanistic world-view in which mind is immanent
(Bohm 1980), than to any specific biological evidence. But
as the first scientific workshops on `quantum biology'
meet, there is a good chance that a fascinating area of
theoretical and experimental research will come about,
and that more evidence will accumulate to suggest that
functionalism cannot be used as a theory of the processes
occurring in organisms.
8. RESTATEMENT OF THE ARGUMENT
In discussing AI and neuroscience, I have focused on
two themes. The first is the universality of cycles, in other
words of sets of variables that affect each other in such a
way that any feed-forward account of causality and
control is misleading.
The second theme is based around the observation that
a computer is an intrinsically dualistic entity, with its
physical set-up designed not to interfere with its logical
set-up, which executes the computation. In empirical
investigation, we find that the brain is not a dualistic
entity. Computer and program may be two, but mind and
brain are one. The brain is thus not a machine, meaning it
is not a finite model (or computer) instantiated physically
in such a way that the physical instantiation does not inter-
fere with the execution of the model (or program).
9. THE BIO-INFORMATIONAL AGE REVISITED
What do these arguments say about the future, about
science and society and their relationships? Will the
cyber-dream take place, or should we quit AI and
neuroscience and join a hippie commune? The technical
conclusions on this seem to me to be as follows.
There will be no nanotechnological robots running
around inside our bodies, at least none that are any more
wizardly than the non-machine-like molecular complexes
that already exist. There will be no `control node' drugs
that can pin us on the right end of the sadness–happiness
spectrum, and thankfully we can drop this one-
dimensional view of the human emotions. There will be
no people living without brains, as digital patterns in the
Internet. There will be no spiritual machines, models so
advanced that they can deduce things that we find
mysterious. There will be no machines with minds.
Cyborgs seem more plausible. The extension of human
capacity through technology is already familiar to us,
and it is a small step from driving a car to operating
remote or tissue-embedded robot limbs. The process of
building new models and surrounding ourselves with
them will not be abolished in a return to some idealized
pretechnological state that never existed. Models will
merely be put in their place.
So if most of these things are not going to happen,
where does society's focus on robots, virtual reality and
the 'wired world' dream come from? I believe it is a
psychological reaction to the increasing proliferation of
models around us. When social interactions become codified
instead of open-ended, when people find themselves
in roles as producers and consumers in a vast social
machine, then the fantasy of the cyborg has already come
true. When I enter an air-conditioned building in which
the windows are all sealed and the lighting is all fluorescent,
I am walking into a model, a virtual reality.
But the more our behaviour becomes machine-like,
generated by and interpreted through the models that we
and others construct, the more we will feel disconnected
from the level below (and above) the models. We will be
less able to see that we are not machines, and that there is
no separating level at the logic gate that holds us above
our physical substrate, and no control nodes in our brain
that enable us to look down on reality. We are in the
middle of it. I think this is a lesson that science is teaching
us. If this lesson were truly to percolate into our culture
from our science, and not be perceived by science as 'the
threat of irrationality', then we would suddenly find
ourselves living in a different world.
This is why I am ultimately optimistic about prospects
for AI and neuroscience, despite my negative predictions
about the success of their ultimate goals. I. Newton's
mechanistic world-view took a blow with the arrival of
quantum physics, but almost a century later, we still have
physicists. Physics, it turns out, does not need to be tied to
mechanism (in the strict sense we have used in this paper,
quantum mechanics is non-mechanical), and neither does
biology.
Computer science, mathematics, probability theory:
these are more tied up with the building of finite models,
but they too have an intriguing role to play, for along the
border of the set of all models lurk paradox and inconsistency,
the 'universal solvents' (to use D. C. Dennett's
phrase in a situation where it applies) that dissolve
models. This is very interesting territory, first explored by
K. Gödel, who showed, remarkably, that there are true
things that can be said within a consistent model which
the model itself cannot prove. But interesting half-
dissolved models can be built along the frontier, models
that give paradox the respect it deserves. Quantum
physics is one such model. After all, paradox is not just
something to be obliterated at first sight, or ignored.
Rather, it is an information structure which tells us
exactly the shape and form of the failure of a model. (Ex
falso quodlibet is what logicians say to express their observation
that in Boolean logic, from 'true and not-true',
anything is provable. But if this were the end of the story,
then how could a Zen koan be useful, how could it be
about anything? In fact there is a whole array of non-
Boolean logics and paraconsistent logics. Some are even
used in AI, reflecting the fact that when people are asked
'Do you like Bill Clinton?' many of them want to say
'I don't know' (underdetermined) or 'I love him
and hate him at the same time' (overdetermined).)
Paradox informs us about the failure of a model in a
qualitatively different way than Bayesian theory tells us
that the observed and the estimated distributions of some
variable are different. This suggests to me that there is
something below probability theory, which, because the
Cox–Jaynes formalism of Bayesian probability theory is
founded on Boolean logic, may well be reachable by
generalizing logical structures to incorporate answers
other than yes and no.
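To make the idea of 'answers other than yes and no' concrete, the following is a minimal sketch of one well-known four-valued scheme, Belnap's logic. The choice of this particular logic is an assumption made for illustration; the text above does not name a specific one. Each truth value is modelled as a pair recording whether a statement has been 'told true' and 'told false', so that 'neither' plays the role of the underdetermined answer and 'both' the overdetermined one. All names in the code (neg, conj, designated, entails) are hypothetical helpers, not taken from any library. The sketch checks whether a contradiction entails an arbitrary statement: it does so (vacuously) with two values, but not with four.

```python
from itertools import product

# A truth value is a pair (told_true, told_false).
TRUE = (True, False)
FALSE = (False, True)
NEITHER = (False, False)  # underdetermined: "I don't know"
BOTH = (True, True)       # overdetermined: "I love him and hate him"

BOOLEAN = [TRUE, FALSE]
FOUR = [TRUE, FALSE, NEITHER, BOTH]

def neg(a):
    # Negation swaps the evidence for and against.
    return (a[1], a[0])

def conj(a, b):
    # A conjunction is told true if both parts are, told false if either is.
    return (a[0] and b[0], a[1] or b[1])

def designated(a):
    # A value counts as "at least true" if it has been told true.
    return a[0]

def entails(premise, conclusion, values):
    # The premise entails the conclusion if every valuation of (p, q)
    # that designates the premise also designates the conclusion.
    return all(designated(conclusion(p, q))
               for p, q in product(values, repeat=2)
               if designated(premise(p, q)))

def contradiction(p, q):
    return conj(p, neg(p))

def arbitrary(p, q):
    return q

# Ex falso quodlibet: does 'p and not-p' entail an unrelated q?
explosion_boolean = entails(contradiction, arbitrary, BOOLEAN)
explosion_four = entails(contradiction, arbitrary, FOUR)
print(explosion_boolean, explosion_four)
```

With only TRUE and FALSE available, no valuation satisfies the contradiction, so the entailment holds vacuously. With four values, p = BOTH satisfies 'p and not-p' while q may still be FALSE, so the contradiction stays local instead of dissolving the whole model.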
These speculations, together with the empirical argu-
ments I have made in the rest of this paper, suggest that
there is a very exciting role for AI and neuroscience to
play in the next century. As G.-C. Rota, a mathematician
and an advocate of Husserl, Heidegger and Wittgenstein,
wrote,
Even in our days of constantly predicted revolutions, it is
difficult not to be led to an optimistic conclusion. The
new sciences of the computer and the brain will validate
the philosophers' theories. But what is more important,
they will achieve a goal that philosophy has been unable
to attain. They will deal the death-stroke to the age-old
prejudices that have beset the concept of mind.
(Rota 1990, p.107)
AI and neuroscience are exactly placed where the
deaths of dualism and feed-forward thinking are sched-
uled to take place. If these disciplines choose to partici-
pate in this shift, rather than cling to concepts that are
not empirically supported, then there will be many inter-
esting PhD theses to write.
Page 8
Finally, so far I have left out one question: Will there be
a transhuman age? For this there is a strong biological
precedent in the two major steps in biological evolution:
the first, the incorporation into eukaryotic cells of
prokaryotic symbionts, and the second, the emergence of
multicellular life-forms from colonies of eukaryotes.
Hegel had a word, sublation, for the harmonic incor-
poration of components into a whole without destruction
of their individual nature, and we are all familiar with
the good feeling that comes from playing in a team.
However, those who followed up on G. W. F. Hegel's
visions helped construct the nightmarish machine-like
political state of mid-century fascism, so we are right to
feel nervous about any superorganism with a hierarchical
(i.e. feed-forward, controllable) structure. Thankfully,
unlike twentieth century broadcast media, the Internet
provides a good, non-hierarchical model for future information
flow and social creativity. It is not risking too
much to predict that it will continue to be a profound
stimulus for social change.
Will this lead, ultimately, to some form of transhuman
phase transition in the coming centuries? I believe that
something like this may happen, and that science (and
technology in some form, as with the Internet) will play a
part in this. But I believe that at least part of this devel-
opment will be a return to the past, a re-enchantment, to
a vision of life that does not view humans or their minds
as outside nature. Both our nostalgia for the past and our
millennial fascination with a global cyber-reawakening
are symptoms of the fact that we in the western world
currently live in the most individualistic culture in human
history. Our transhuman imagined science-fiction future
may be, at base, a projection which contains the diagnosis
of the present, as Jung might have observed.
Just like our private dreams, our public dreams are not to
be taken literally. They are symbolic and indicative of imbalances
in the present. The relieving news is that in correcting
these imbalances, we will create a future which is not as
alien as the science-fiction future seems. In fact, it might
look as familiar to us as something which we had forgotten.
REFERENCES
Arkin, R. C. 1998 Behavior-based robotics (intelligent robots and
autonomous agents). Cambridge, MA: MIT Press.
Bell, A. J. & Sejnowski, T. J. 1997 The independent components
of natural scenes are edge filters. Vision Res. 37, 3327–3338.
Bohm, D. 1980 Wholeness and the implicate order. London:
Routledge and Kegan Paul.
Bruce, V. & Green, P. 1990 Visual perception: physiology, psychology,
and ecology, 2nd edn. Hillsdale, NJ: Lawrence Erlbaum
Associates.
Dawkins, R. 1990 The selfish gene. Oxford University Press.
De Duve, C. 1991 Blueprint for a cell. London: Portland Press.
Gibson, J. J. 1979 The ecological approach to visual perception.
Boston, MA: Houghton Mifflin.
Gibson, W. 1986 Neuromancer. Phantasia Press.
Haykin, S. S. 1999 Neural networks: a comprehensive foundation, 2nd
edn. New Jersey: Prentice-Hall.
Hinton, G. E. & Sejnowski, T. J. 1999 Unsupervised learning:
foundations of neural computation. Cambridge, MA: MIT
Press.
Hodgkin, A. L. & Huxley, A. F. 1952 A quantitative description
of membrane current and its application to conduction and
excitation in nerve. J. Physiol. 117, 500–544.
Hubel, D. H. & Wiesel, T. N. 1968 Receptive fields and functional
architecture of monkey striate cortex. J. Physiol. 195,
215–244.
Kurzweil, R. 1999 The age of spiritual machines: when computers
exceed human intelligence. New York: Viking Press.
Langton, C. G. 1997 Artificial life: an overview. Cambridge, MA:
Bradford Books, MIT Press.
Margulis, L. & Sagan, D. 1995 What is life? London: Weidenfeld
and Nicolson.
Marr, D. 1982 Vision. New York: Freeman.
Moravec, H. 1990 Mind children: the future of robot and human intel-
ligence. Cambridge, MA: Harvard University Press.
Ovadi, J. 1995 Cell architecture and metabolic channeling. Austin, TX:
Landes; New York: Springer.
Penrose, R. 1989 The emperor's new mind. Oxford University
Press.
Rieke, F., Warland, D., de Ruyter van Steveninck, R. & Bialek,
W. 1997 Spikes: exploring the neural code. Cambridge, MA: MIT
Press.
Rota, G.-C. (ed.) 1997 Philosophy and computer science. In
Indiscrete thoughts, pp. 104–107. Boston, MA: Birkhäuser.
Rumelhart, D. E. & McClelland, J. L. 1986 Parallel distributed
processing: exploration in the microstructure of cognition.
Cambridge, MA: MIT Press.
Tipler, F. J. 1995 The physics of immortality: modern cosmology, God
and the resurrection of the dead. New York: Doubleday.
Van Hateren, J. H. & Van der Schaaf, A. 1998 Independent
component filters of natural images compared with simple
cells in primary visual cortex. Proc. R. Soc. Lond. B 265,
359–366.
Welch, G. R. (ed.) 1986 The fluctuating enzyme. Nonequilibrium
problems in the physical sciences and biology, vol. 5. New York:
Wiley.