What is a linguistic signal? Definition, meaning, description


Linguistics is the formal study of language. A signal is anything that transmits meaning, and it can be verbal or nonverbal. Actions, vocalizations, and context are among the linguistic signals that allow one person to communicate successfully with another. In fact, what is communicated also depends on manner, facial expressions, body language, and timbre of voice.

Submitted to Journal of Italian Linguistics

Verbs, nouns, and simulated language games

Domenico Parisi
Institute of Cognitive Science and Technology
National Research Council
parisi@ip.rm.cnr.it

Angelo Cangelosi
Centre for Neural and Adaptive Systems
School of Computing
University of Plymouth
acangelosi@soc.plym.ac.uk

Ilaria Falcetta
University of Rome La Sapienza
ilariafalcetta@libero.it


Abstract

The  paper  describes  some  simple  computer  simulations  that  implement
Wittgenstein’s notion of a  language game, where the meaning of a linguistic signal is
the  role  played  by  the  linguistic  signal  in  the  individual’s  interactions  with  the
nonlinguistic  and  linguistic  environment.  In  the  simulations  an  artificial  organism
interacts  at  the  sensory-motor  level  with  an  environment  and  its  behavior  is
influenced  by  the  linguistic  signals  the  individual  receives  from  the  environment
(conspecifics).  Using  this  approach  we  try  to  capture  the  distinction  between
(proto)verbs  and  (proto)nouns, where  (proto)verbs  are  linguistic  signals  that  tend  to
co-vary with  the  action with which  the  organism must  respond  to  the  sensory  input
whereas  (proto)nouns  are  linguistic  signals  that  tend  to  co-vary  with  the  particular
sensory  input  to which  the organism must  respond with  its  actions. Some  extensions
of  the  approach  to  the  analysis  of  other  parts  of  speech  ((proto)adjectives,
(proto)sentences, etc.) are also described. The paper ends with some open
questions and suggestions on how to deal with them.


1. Simulated language games

The meaning of a  linguistic signal is the manner in which the linguistic signal is used
in  the  everyday  interactions  of  speakers/hearers  with  the  world  and  the  role  the
linguistic  signal  plays  in  their  overall  behavior.  This  Wittgensteinian  definition  of
meaning, while probably correct, poses a serious problem for the study of language in
that,  although  linguistic  signals  as  sounds  or  visual  (written)  forms  are  easily
identified,  observed,  and  described,  the way  in which  linguistic  signals  are  used  by
actual  speakers/hearers  in  real  life  situations  is very difficult  to observe and describe
with  any  precision,  reliability,  and  completeness.  Therefore,  linguists,
psycholinguists,  and philosophers  tend  to  replace meanings with such poor “proxies”
as  verbal  definitions,  translations  (when  studying  linguistic  signals  in  other
languages),  or  the  limited  and  very  artificial  uses  of  linguistic  signals  in  laboratory
experiments  (e.g.,  the naming of pictures or  the decision  if  a  sequence of  letters  is a
word or a nonword). 

An  alternative  to  such  practices  is  to  adopt  Wittgenstein’s  strategy  of  studying
“language games”,  i.e.,  simplified models of  the very  complex  and diverse  roles  that
linguistic  signals  play  in our  complicated  everyday  language which may be  closer  to
the  “games  by  means  of  which  children  learn  their  native  language”  (Wittgenstein
1953, 5e) and to languages “more primitive than ours” (Wittgenstein 1953, 3e). In this
paper  we  adopt  this  Wittgensteinian  strategy  but  with  a  significant  change:  our
language  games  are  simulated  in  a  computer. We  create  artificial  organisms  which
live  in  artificial worlds  and which may  receive and produce  linguistic signals  in such
a way  that  these  linguistic  signals become  incorporated  in  their overall behavior  and
in  their  interactions with  the world. Simulated  language  games  have  two  advantages
when  they  are  compared  with  the  philosopher’s  language  games.  First,  since
simulated  language games are “objectified”  in the computer (the organisms’ behavior
can  be  actually  seen  on  the  computer  screen)  and  they  do  not  only  exist  in  the
philosopher’s mind  or  in  his/her  verbal  expressions  and  discussions with  colleagues,
they offer more degrees of  freedom  and more objectivity when one  tries  to describe,
analyze,  measure,  and  manipulate  experimentally  the  meaning  of  linguistic  signals
conceived  as  their  role  in  the  overall  behavior  of  the  artificial  organisms.  Second,
given  the  great  memory  and  computing  resources  of  the  computer,  which  greatly
exceed those of the human mind, one can progressively add new components to an
initially  very  simple  simulation  in  such  a way  that  the  language  games may  become
more and more similar to actual languages.

Recently,  computer models  have  been  used  to  simulate  the  evolutionary  emergence
of  language  in populations of interacting organisms (Cangelosi & Parisi 2002; Knight
et  al.  2000;  Steels  1997).  Various  simulation  methodologies  have  been  employed,
such  as  communication  between  rule-based  agents  (Kirby  1999),  recurrent  neural
networks (Batali 1994; Ellefson & Christiansen 2000), robotics (Kaplan 2000; Steels
& Vogt  1997),  and  internet  agents  (Steels & Kaplan  1999). Among  these,  artificial
life  neural  networks  (ALNNs:  Parisi  1997)  provide  a  useful modelling  approach  for
studying  language  (Cangelosi & Parisi 1998; Cangelosi & Harnad  in press; Parisi &
Cangelosi 2002). ALNNs are neural networks that control the behaviour of organisms
that  live  in  an  environment  and  are members  of  evolving  populations  of  organisms.
They  provide  a  unifying  methodological  and  theoretical  framework  for  cognitive
modelling  because  of  the  use  of  both  evolutionary  and  connectionist  techniques  and
the  interaction  of  the  organisms  with  a  simulated  ecology.  All  behavioral  abilities
(e.g.,  sensorimotor  skills,  perception,  categorization,  language)  are  controlled  by  the
same  neural  network.  This  permits  the  investigation  of  the  interaction  between
language and other cognitive and sensorimotor abilities. 

2. Verbs and nouns

Among linguistic signals such as words one can distinguish different classes
of  words  based  on  some  general  properties  of  the  use  of  these  different  classes  of
words  (Brown & Miller  1999). The  purpose  of  this  article  is  to  explore what neural
network models  can  contribute  to  a  better  understanding  of  the  nature  of  verbs  and
nouns  and, possibly, other parts of  speech. The distinction between verbs  and nouns
is  perhaps  the most  basic  and  universal  distinction  among different  classes of words
in human languages and a neural network treatment of verbs and nouns, if successful,
can then be extended to other parts of speech. Verbs and nouns may be distinguished
on semantic or syntactic grounds. Semantically, verbs and nouns can be distinguished
in terms of the different types of entities to which they refer. Verbs are said to refer to
actions or processes while nouns refer to objects or static entities (cf., e.g., Langacker
1987). Syntactically, verbs and nouns are distinguished in terms of the different roles
they  play,  or  the  different  contexts  in which  they  appear,  in  phrases  and  sentences.
Given  our  simplified  language  games,  in  which  almost  no  multi-component  signals
are  used  such  as  phrases  and  sentences,  the  work  to  be  reported  here  tries  to
illuminate the semantics rather than the syntax of verbs and nouns. 

We  hypothesize  that  in  the  early  stages  of  language  acquisition  in  children,  and
perhaps also in the early stages of linguistic evolution in the lineage of Homo sapiens,
words begin  to differentiate  into verbs  and nouns with verbs  referring  to actions and
nouns  to  objects. But what  does  it mean  to  refer  to  actions  or  to  objects  and, more
generally, what is it for a word to refer? Heard sounds acquire meaning or reference
(we  use  the  two  terms  interchangeably)  for  an  organism  and  therefore  become
linguistic  signals  for  the  organism  when  they  influence  the  way  in  which  the
organism responds to the input from the environment. We imagine a basic situation in
which the organism is exposed to visual input from the environment and the organism
responds  to  this  visual  input  with  some  motor  action.  Heard  sounds  are  additional
inputs  to  the  organism  which  are  physically  produced  by  the  phono-articulatory
behavior  of  some  nearby  conspecific.  If  this  additional  input  systematically
influences how the organism responds to the visual input, with specific sounds having
specific  influences on  the organism’s behavior, we  say  that  the  sounds have become
linguistic signals which have meaning or reference. 

Our  organisms  see  objects  in  the  environment  and  they  respond  by  moving  their
(single)  arm  in  order  to  execute  some  action  with  respect  to  the  objects.  An
organism’s  behavior  is  controlled  by  the  organism’s  nervous  system  which  is
modeled  using  an  artificial  neural  network. The neural network has  two distinct  sets
of  input  units  (sensory  receptors). One  set  of  input  units  encodes  the  content  of  the
organism’s  retina  (visual  input).  The  other  set  of  input  units  encodes  the  current
position  of  the  organism’s  arm  (proprioceptive  input).  The  network’s  output  units
encode  muscle  movements  which  result  in  changes  in  the  arm’s  position.
Intermediate  between  the  input  and  the  output  units  there  are  one  or more  layers  of
hidden  units. All  the  network’s  units  encode  information  in  terms of  the quantitative
state  of  activation  of  the  units.  The  neural  network  functions  as  a  succession  of
input/output  cycles  of  activity.  In  each  cycle  the  pattern  of  activation  of  the  input
units  is  transformed  into  the patterns of  activation of  the  successive  layers of hidden
units  by  the  connection  weights  linking  one  unit  to  the  next  one  until  an  output
pattern  of  activation  is  generated which  results  in  a micro-movement  of  the  arm. A
succession  of  micro-movements  is  an  action  of  the  organism  with  respect  to  the
visually  perceived  objects.  The  organism may  see  a  single  object  at  a  time  or  two
objects at  the same  time and  it may  respond by moving  its arm to reach an object or
to push the object away from itself or to pull it toward itself.
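
As an illustration only, the input/output cycle described above can be sketched in a few lines of Python. The layer sizes, the logistic activation function, the encoding of the arm as two joint angles, and the random (untrained) connection weights are all assumptions made for the sketch; the text does not specify them, and the names make_layer, activate, and cycle are hypothetical.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random connection weights (placeholder values, not trained or evolved)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

def activate(weights, inputs):
    """One layer: weighted sum of the inputs squashed by a logistic function."""
    return [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, inputs))))
            for row in weights]

def cycle(retina, arm, w_hidden, w_out, max_step=0.1):
    """One input/output cycle: visual + proprioceptive input -> micro-movement."""
    hidden = activate(w_hidden, retina + arm)
    output = activate(w_out, hidden)
    # Each output in (0, 1) becomes a small signed change of one joint angle.
    return [a + (o - 0.5) * 2.0 * max_step for a, o in zip(arm, output)]

rng = random.Random(0)
retina = [1.0, 0.0, 0.0, 1.0]   # assumed 4-cell retina (visual input)
arm = [0.5, 0.5]                # assumed two joint angles (proprioceptive input)
w_hidden = make_layer(len(retina) + len(arm), 3, rng)
w_out = make_layer(3, len(arm), rng)

trajectory = [arm]
for _ in range(5):              # an action is a succession of micro-movements
    arm = cycle(retina, arm, w_hidden, w_out)
    trajectory.append(arm)
```

With untrained weights the resulting trajectory is arbitrary; the sketch only shows how the new arm position produced by one cycle is fed back as proprioceptive input to the next.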

Now we  add  language.  Imagine  that  the  organism’s  neural  network  includes  a  third
set  of  input  units  which  may  encode  various  sounds  (auditory  input).  These  heard
sounds  tend  to  influence  the way  in which  the organism  responds  to  the visual input.
When  the  organism  hears  one  particular  sound  it  responds  to  the  visual  input  with
some  particular  action  which  may  be  different  (although  it  need  not  be)  from  the
action with which the organism would have responded to that input in the absence of
the  sound  (including  no  action  at  all).  When  a  different  sound  is  heard  by  the
organism, the organism may respond with a different action.

We  will  describe  a  number  of  simple  situations  in  which  linguistic  signals  acquire
their  meaning  in  that  they  become  part  of  the  organism’s  total  experience  in  its
environment. 

Imagine  the  following  language  game  (Cangelosi & Parisi  2001; Parisi & Cangelosi
2002). The  life of  the organism  is divided up  into episodes which are composed of a
number  of  successive  input/output  cycles.  In  each  episode  the  organism  sees  one  of
two objects, O1 and O2, which vary in their shape. Together with this visual input the
organism  receives  an  auditory  input,  a heard  sound presumably pronounced by some
conspecific  located  nearby  in  the  organism’s  environment.  There  are  only  two
possible sounds, S1 and S2, but  in any given episode the organism hears only one of
these  two  sounds. At  the  beginning  of  each  episode  the  endpoint  of  the  organism’s
arm  (the  hand)  is  already  positioned  on  the  object.  If  we  observe  the  organism’s
behavior,  we  see  that  the  organism  responds  to  the  visually  perceived  object  by
pushing  the object away  from  itself if it hears the sound S1 and by pulling the object
toward  itself  if  it  hears  the  sound S2. This  happens  independently  from whether  the
object  is O1  or O2.  In  these  circumstances, we  say  that  the  two  sounds which  are
heard  by  the  organism  are  (proto)verbs.  (In  fact  they  have  a  meaning  which  is
equivalent to the meaning of the English verbs “push” and “pull”.) S1 and S2 co-vary
with  the  action  with  which  the  organism  responds  to  the  visual  input  but  they  are
indifferent  to  the content of  the visual  input,  i.e.,  to whether the object which is seen
and which is pushed or pulled is O1 or O2. 

Imagine now another language game (Falcetta 2001). The organism sees both objects,
O1 and O2, at the same time. The two objects are located one in the left half and one
in  the  right  half  of  the  organism’s  visual  field.  Together  with  this  visual  input  the
organism hears one of  two  sounds, S3  and S4. At  the beginning of each episode  the
organism’s arm  is  in a  randomly selected position but always away from the objects.
(Notice  that  the  organism  does  not  see  its  arm.  It  is  informed  by  the  proprioceptive
input  about  the  arm’s  current  position  but  it  only  sees  the  objects.)  When  the
organism hears S3 it moves its arm and reaches object O1 whereas when it hears S4 it
reaches object O2. In these circumstances, we say that the two sounds S3 and S4 are
(proto)nouns. 

Notice that, like S1 and S2, S3 and S4 influence the action produced by the organism.
Assuming that in a given episode the object O1 is in the left hemifield and the object
O2  in  the  right  hemifield,  if  the  organism  hears  S3  it moves  its  arm  toward  the  left
portion  of  the  visual  field  and  reaches  the  object which  is  there  (O1) whereas  if  it
hears S4 it moves the arm toward the right portion of the visual field and reaches O2.
However,  in  this second  language game  the  linguistic input has a different role in the
overall  experience  of  the  organism.  While  in  the  first  language  game  the  two
linguistic  signals,  S1  and  S2,  had  the  role  of  determining  the  particular  action
executed by  the organism, pushing or pulling,  independently  from whether  the object
was O1 or O2, in this new language game there is a single action, reaching an object,
and  the  two  linguistic signals, S3 and S4, have  the  role of directing  the action of  the
organism toward one particular object rather than toward the other. 

Therefore, we characterize verbs as  linguistic signals  that co-vary with the actions of
the  organism  whereas  nouns  are  linguistic  signals  that  co-vary  with  the  particular
objects which are involved in these actions.

Since  in  the  second  language  game  the  organism  is  capable  of  only  one  action,  i.e.,
reaching  an  object with  its  arm,  there  is  no  need  for  the  language  to  specify which
action  to  choose  - which  is  the  role of verbs. The organism has only  to know which
one  of  the  currently  perceived  objects  must  be  reached,  and  providing  this
information  is  the  role  of  nouns.  But  consider  a  third,  somewhat  more  complex,
language game in which the organism is capable of two distinct actions, pushing
and pulling objects (as in our first language game), and also sees two different objects at
the  same  time  (as  in  our  second  language  game).  In  the  new  language  game  the
organism will need  to hear  two  linguistic signals, one verb and one noun,  in order to
know what  to do. The  auditory  input units will  encode one of  the  two verbs S1  and
S2 at time T0 and then one of the two nouns S3 and S4 at time T1, or vice versa. (In
this language game the temporal order of the two words in each sequence is irrelevant
but,  whatever  the  temporal  order,  to  be  able  to  appropriately  process  this  simple
(proto)sentence  the neural network will need a working memory which keeps a trace
of  the  first  word  while  hearing  the  second  word.)  In  general,  to  have  a
(proto)sentence,  one  portion  of  the  heard  sounds must  co-vary with  the  action  to be
executed and  the other portion with the object on which the action is to be executed.
Since actions can be executed on more than a single object (e.g., the action of giving
involves  two  objects:  the  object  given  and  the  person  receiving  the  object),
(proto)sentences  may  include  more  than  a  single  noun.  (For  the  emergence  of
subjects  or  agents,  cf.  the  last  section.  For  the  evolutionary  emergence  of
compositionality, cf. Cangelosi 2001.)
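
The need for a working memory in the third language game can be illustrated with a small symbolic sketch. It is not a neural network: a plain dictionary stands in for the memory trace, the lookup tables simply restate the signal pairings of the first two games, and the function name respond is hypothetical.

```python
VERBS = {"S1": "push", "S2": "pull"}   # pairings from the first language game
NOUNS = {"S3": "O1", "S4": "O2"}       # pairings from the second language game

def respond(heard_sequence):
    """Process one signal per time step, keeping a trace of earlier signals."""
    memory = {}                         # working memory shared across time steps
    for signal in heard_sequence:
        if signal in VERBS:
            memory["action"] = VERBS[signal]
        elif signal in NOUNS:
            memory["object"] = NOUNS[signal]
    if "action" in memory and "object" in memory:
        return memory["action"], memory["object"]
    return None                         # incomplete (proto)sentence

print(respond(["S1", "S4"]))            # verb first, then noun
print(respond(["S4", "S1"]))            # noun first, then verb
```

Both orders yield the same response, which mirrors the remark that temporal order is irrelevant in this game; without the memory trace the first signal would be lost by the time the second one arrives.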

We  have  defined  nouns  in  terms  of  their  role  in  directing  the  organism's  action
toward  particular  objects.  Consider,  however,  that  the  organism’s  action  can  also
consist  in what  is called “overt attention”,  i.e., movements of  the organism’s eyes or
head  that  allow  the  organism  to  visually  access  some  particular  object  -  the  object
which is specified by the noun. Normally organisms see many different objects at the
same time and by hearing a noun they select one particular object as the object which
is  to be  involved  in  the organism’s action while ignoring the other objects. However,
in  other  cases  the  organism  hears  some  particular  noun  without  seeing  the  object
which  is  indicated by  the noun.  In  these circumstances  the noun causes  the organism
to move  its  entire body  (locomoting) or particular parts of  its body  (turning  the head
or the eyes) until it finds an object with the required properties and it can execute the
expected action on the object.

To  illustrate  this  role  of  nouns  let  us  consider  a  fourth  language  game.  The
organism’s visual  field  is divided into three parts: a central portion with better seeing
capabilities  (fovea)  and  two  peripheral  portions,  on  the  left  and  on  the  right  of  the
central portion, with poorer vision. The neural network which controls the
organism’s behavior has two sets of output (motor) units, not just a single set as in the
preceding  language games. One set of motor units controls  the organism’s arm, as in
our previous simulations, while the second set of motor units controls the movements
of  the  organism’s  (single)  eye. At  the  beginning  of  each  episode  the organism  looks
straight ahead but it can move its eye either to the right or to the left. In every episode
the  organism’s  visual  field  contains  three objects with different  shapes, O3, O4,  and
O5, which  are  randomly distributed one  in  the visual  field’s central portion and each
of  the  other  two  in  one  of  the  two  peripheral  portions.  Notice,  however,  that  the
organism  can  recognize  the  shape  of  an  object  if  the  object  is  located  in  the  central
fovea but not if it is located in the peripheral portions of the visual field.

The organism  is capable of only one action using  its arm: reaching an object. Hence,
we don’t need verbs in this language game. In each episode the organism hears one of
three  linguistic  signals  (nouns):  S3,  S4,  and  S5.  If  the  organism  hears  the  linguistic
signal S3  and  the object O3  is  in  the  fovea,  the organism directly  reaches  the object
with  its arm. However, if O3 is not in the fovea the organism rotates its eye either to
the  left or  to the right. The organism continues to rotate its eye until the object O3 is
in the fovea, and at this point it reaches the object. The same is true for the other two
objects,  O4  and  O5,  and  the  other  two  linguistic  signals,  S4  and  S5.  The  new
language  game  makes  it  clear  in  what  sense  nouns  control  the  movements  of  the
organism’s  eye,  head, or  entire body  that  allow  the organism  to obtain visual  access
to  some  particular  object  contained  in  its  environment  so  that  the  organism  can
execute  some  further  action  with  respect  to  the  appropriate  object,  i.e.,  the  object
specified by the noun. 
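
The behavior expected in the fourth language game can be sketched symbolically as follows. The scene is reduced to three slots (left, centre, right), the eye fixates one slot at a time, and the search policy (check the centre, then look left, then sweep right) is an assumption of the sketch; the text only says that the eye rotates left or right until the named object is in the fovea. The function name find_and_reach is hypothetical.

```python
NOUN_TO_OBJECT = {"S3": "O3", "S4": "O4", "S5": "O5"}

def find_and_reach(scene, noun):
    """scene: objects at [left, centre, right]; only the fixated one is recognizable."""
    target = NOUN_TO_OBJECT[noun]
    gaze = 1                               # the organism starts looking straight ahead
    moves = []
    # Assumed search policy: check the centre, look left, then sweep right.
    for step in ("stay", "left", "right", "right"):
        if step == "left":
            gaze -= 1
        elif step == "right":
            gaze += 1
        if step != "stay":
            moves.append("eye " + step)
        if scene[gaze] == target:          # shape recognized in the fovea
            moves.append("reach " + target)
            return moves
    raise ValueError("target object not in the scene")

scene = ["O4", "O5", "O3"]
print(find_and_reach(scene, "S5"))         # target already in the fovea
print(find_and_reach(scene, "S3"))         # target in the right periphery
```

The noun determines when the eye movements stop and which object the arm finally reaches, while the arm action itself (reaching) is the same in every episode.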

In  the  language  games  we  have  described  we  can  distinguish  between  verbs  and
nouns  in  that  some  particular  linguistic  signal  co-varies  either  with  the  organism’s
action or with  the particular object which  is  involved in the organism’s action. In the
former case we say that the linguistic signal is a verb whereas in the latter case it is a
noun.  But  consider  a  fifth  language  game  in  which  the  organism  lives  in  an
environment which contains both edible and poisonous mushrooms (Cangelosi &
Parisi 1998). To survive and reproduce the organism must be able to approach (and
eat)  the  edible  mushrooms  and  to  avoid  the  poisonous  ones.  Notice  that  each
individual  mushroom  is  perceptually  different  from  all  other  mushrooms,  including
those belonging  to  the same category. Therefore, when it encounters a mushroom the
organism must  be  able  to both  recognize  (classify)  the mushroom  as  either  edible or
poisonous and respond with the appropriate action to the mushroom (approaching and
eating the edible mushrooms and avoiding the poisonous ones). When it encounters a
mushroom  the  organism  can  hear  one  of  two  linguistic  signals,  S6  and  S7,
presumably produced by some nearby conspecific which wants to help our organism.
Of these two linguistic signals, S6 co-varies with (all) edible mushrooms and S7 co-varies
with (all) poisonous mushrooms. Are S6 and S7 verbs or nouns? We think that
the distinction cannot be made in this language game. S6 co-varies both with one type
of action (approaching and eating the mushroom) and with one type of objects (edible
mushrooms),  and  S7  co-varies  with  both  the  other  type  of  action  (avoiding  the
mushroom)  and  the  other  type  of  objects  (poisonous  mushrooms).  Therefore,
although  S6  and  S7  are  linguistic  signals  since  they  influence  the  organism’s
behavior  (for  example  they make  the behavior more efficient),  there  is no ground  for
saying  that  they  are  either  verbs  or  nouns  because  they  co-vary  simultaneously with
both the action on the part of the organism and the type of objects to which the action
is  addressed.  It  might  be  that  this  type  of  language  game,  in  which  it  is  still
impossible  to distinguish between verbs  and nouns,  reflects  a very primitive stage of
language  such  as  the  language  of  our  earliest  language-using  ancestors  and  the
language of children between, say, 1 year and 1 year and a half of age.
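
The co-variation criterion used so far can be made concrete with a small sketch. Each episode is summarized as a (signal, object, action) triple, where the object is the one the action is directed to, and a signal is said to co-vary with a dimension when it pins down a single value of a dimension that actually varies in the game. This operational definition, the labels, and the function name classify_signals are assumptions of the sketch, and the mushrooms are represented by their category rather than by their individual appearance.

```python
from collections import defaultdict

def classify_signals(episodes):
    """Label each signal by what it co-varies with across the episodes."""
    objects, actions = defaultdict(set), defaultdict(set)
    for signal, obj, action in episodes:
        objects[signal].add(obj)
        actions[signal].add(action)
    all_objects = set().union(*objects.values())
    all_actions = set().union(*actions.values())
    labels = {}
    for signal in objects:
        # Co-variation: the signal fixes one value of a dimension that varies.
        with_action = len(all_actions) > 1 and len(actions[signal]) == 1
        with_object = len(all_objects) > 1 and len(objects[signal]) == 1
        if with_action and with_object:
            labels[signal] = "undifferentiated"
        elif with_action:
            labels[signal] = "verb"
        elif with_object:
            labels[signal] = "noun"
        else:
            labels[signal] = "none"
    return labels

game1 = [("S1", "O1", "push"), ("S1", "O2", "push"),
         ("S2", "O1", "pull"), ("S2", "O2", "pull")]
game2 = [("S3", "O1", "reach"), ("S4", "O2", "reach")]
game5 = [("S6", "edible", "approach"), ("S7", "poisonous", "avoid")]

for game in (game1, game2, game5):
    print(classify_signals(game))
```

Under these assumptions S1 and S2 come out as verbs, S3 and S4 as nouns, and S6 and S7 as undifferentiated, because in the mushroom game the action and the type of object never vary independently of each other.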

In our model nouns  co-vary with objects  and verbs with actions. However,  there are
two  types  of  objects,  natural  objects  (e.g.,  trees)  and  artificial  objects  (e.g., knives).
Organisms respond to natural objects with a variety of different actions depending on
the  circumstances  but  there  is  generally  no  particular  action  associated  with  each
natural object. An organism may respond to a tree by cutting the tree, picking
fruit from the tree, sheltering under the tree for shade, etc. In contrast, organisms
tend  to  respond  to  artificial  objects with  one  particular  action which  is  specific  for
each  of  them. A  knife  is  normally  used  to  cut,  although  a  knife  can  also  be  bought,
cleaned, put into a drawer, etc. Therefore, in a sense artificial objects are more
closely associated with specific actions than natural objects are and, from this point of view,
they  resemble  verbs.  However,  linguistic  signals  that  co-vary  with  artificial  objects
are  nouns  in  the  same way  as  linguistic  signals  that  co-vary with  natural  objects.  In
both cases the linguistic signal is used to direct the attention/action of the organism to
some particular object in the environment.

3. Adjectives and, more generally, noun modifiers

Consider  now  a  sixth,  somewhat  more  complex,  language  game.  In  the  preceding
language  games  the  different  objects  differed  only  in  their  shape.  In  the  organisms’
environment  there was only one object  for each shape, and therefore there were only
two  (or  three,  in  the  fourth  language game) objects  in all.  In  the new  language game
the  organism’s  environment  contains  four  objects. Two  objects  have  one  shape  and
the other two objects have a different shape. However, the two objects with the same
shape differ in their color: one is blue and the other one is red.

In  each  episode  the  organism  sees  two  objects  and  the  two  objects  have  the  same
shape but different  color. Hence, providing  the organism with  the noun  that  refers  to
objects of a given shape (our second language game) is useless. The organism would
not  know which  object  to  reach with  its  arm. However, we  now  introduce  two new
linguistic  signals,  S8  and  S9. When  the  organism  hears  the  sound  S8  it  reaches  the
blue  object  and  when  it  hears  the  sound  S9  it  reaches  the  red  object.  In  these
circumstances  S8  and  S9  are  (proto)adjectives. Notice  that  if  the  organism  sees  all
four objects at the same time, it will need both a noun and an adjective in sequence (a
(proto)noun phrase) to be able to identify the particular object which it is supposed to
reach.
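
A minimal sketch of how nouns and adjectives jointly narrow down the object to be reached is given below. Objects are written as (shape, color) pairs; the pairing of S3 and S4 with the shapes of O1 and O2 is carried over from the second language game as an assumption, and the function name select is hypothetical.

```python
NOUNS = {"S3": "O1", "S4": "O2"}             # nouns co-vary with shape (assumed pairing)
ADJECTIVES = {"S8": "blue", "S9": "red"}     # adjectives co-vary with color

def select(visible, signals):
    """Return the visible (shape, color) objects compatible with every heard signal."""
    candidates = list(visible)
    for s in signals:
        if s in NOUNS:
            candidates = [o for o in candidates if o[0] == NOUNS[s]]
        elif s in ADJECTIVES:
            candidates = [o for o in candidates if o[1] == ADJECTIVES[s]]
    return candidates

same_shape = [("O1", "blue"), ("O1", "red")]
all_four = same_shape + [("O2", "blue"), ("O2", "red")]

print(select(same_shape, ["S3"]))            # the noun alone leaves two candidates
print(select(same_shape, ["S8"]))            # the adjective alone singles out one
print(select(all_four, ["S3", "S8"]))        # noun + adjective: a (proto)noun phrase
```

When only the two same-shape objects are visible the noun is uninformative and the adjective suffices; when all four objects are visible neither signal alone is enough, which is why a noun and an adjective are needed in sequence.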

Adjectives have the same general role as nouns in the behavior of our organisms:
they  direct  the  attention  of  the  organism  to  particular  objects  and  guide  the
organism’s action toward those objects. So what distinguishes nouns from adjectives?
In our  simulations nouns  co-vary with  (in  common parlance,  refer  to) objects having
particular  shapes whereas adjectives co-vary with other properties of objects such as
their  color.  In  fact,  shape  appears  to  be  more  important  for  distinguishing  among
different nouns  than other properties of objects.  In psycholinguistic experiments both
children and adults generalize invented words syntactically identified as nouns to
other objects having the same shape as an initial object, even when these differ in color, size, or texture, more often
than to objects with a different shape (Landau et al. 1988), although words
syntactically  identified  as  count  nouns  show  this  tendency  more  than  words
syntactically  identified  as  mass  nouns  (Landau  et  al.  1992).  Therefore,  we
hypothesize  that,  while  both  nouns  and  adjectives  have  the  same  general  role  of
directing  the  attention/action  of  organisms  to  particular  objects  in  the  environment,
nouns  differ  from  adjectives  because  nouns  direct  the  organisms’  attention/action  to
objects  with  a  given  shape  and  adjectives  to  objects  with  a  given  color  or  size  or
some other property. 

Of  course,  there  is  nothing  special  or metaphysical  about  shape  as  contrasted  with
color  or  size  in  object  identification  except  that  objects  which  differ  in  shape  are
more likely to require different actions on the part of organisms than objects differing
in color or size.  (This may explain why other properties of objects such as those that
identify an object as an animal, e.g.,  texture, may also be  important for nouns (Jones
et  al.  1991;  1998).  Animals  generally  require  different  types  of  actions  directed
toward  them  in  contrast  to non-animals.) Shape  rather  than  color or  size  tends  to be
unique to classes of objects that require specific types of actions. Trees tend to have a
unique  shape whereas  they  do  not  have  a  unique  color  or  size. Only  trees  have  the
shape of trees but not only trees are green. All the objects which co-vary with (i.e. are
designated  by)  a  given  noun  share  a  particular  shape  which  is  not  shared  by  other
objects whereas even  if  they are all of  the same color,  like strawberries,  this color is
shared also by other objects not called “strawberries”.

Now  consider  another  language  game.  The  organism  sees  two  objects  at  the  same
time.  The  two  objects  can  be  either  the  same  object  (same  shape)  or  two  different
objects  (different shapes) but  in any case  they are  located  in different portions of the
visual field. For example, one object can be located in the left portion and one in the
right portion of the visual field. The organism hears one of two sounds, S8 and S9.
When it hears S8, the organism reaches the object located in the left portion of the
visual field whereas when it hears S9 it reaches the object located in the right portion
of  the visual  field. Notice  the difference between  this  language game and  the second
language  game  described  above.  In  that  language  game  the  organism  was  also
directed  by  language  to  go  to  the  left  portion or  the  right portion of  the visual  field.
However, when the organism heard, for example, S3 it went to the left portion of the
visual  field  if  the  object O1 was  there  but  it went  to  the  right  portion  of  the  visual
field  if  the  object  O1  was  in  the  right  hemifield.  In  other  words,  the  organism’s
behavior  was  guided  by  the  shape  of  the  objects  and  therefore  S3  and  S4  were
classified as nouns. In this new language game, on the contrary, the organism reaches
the object located in the left hemifield whether the object is O1 or O2, i.e.,
independently  from  the  shape  of  the  object. Therefore  the  new  linguistic  signals, S8
and S9, cannot be nouns. Are they adjectives? 

We  introduce  a  new  class  of  words  called  non-adjective  noun  modifiers.  Both
adjectives  and non-adjective noun modifiers  are noun modifiers but, while adjectives
tend to co-vary with more or less permanent properties of objects such as their color
or  size,  non-adjective  noun  modifiers  co-vary  with  more  temporary  properties  of
objects  such  as  the object being  located  in  the  left or  right portion of  the organism’s
visual  field.  An  object  can  be more  or  less  permanently  red  or  small  but  it  is  only
temporarily  placed,  say,  in  the  left  portion  of  the  organism’s  visual  field. Hence, S8
and  S9  are  non-adjective  noun modifiers.  (Notice  that  non-adjective  noun modifiers
tend  to be  sequences of more  than one word  (phrases) whereas  adjectives are single
words.  For  example,  the meaning  of  S8  is  roughly  equivalent  to  the meaning  of  the
English phrase “on the left”.)
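
The contrast between a noun and a non-adjective noun modifier can be illustrated by swapping the positions of the two objects. In this sketch the scene is a mapping from hemifield to object, S3 and S4 keep their meaning from the second language game, and S8 and S9 have the positional meaning they are given in this language game; the function name side_reached is hypothetical.

```python
NOUNS = {"S3": "O1", "S4": "O2"}            # co-vary with the object's shape
POSITIONS = {"S8": "left", "S9": "right"}   # co-vary with the object's current location

def side_reached(scene, signal):
    """scene: {"left": obj, "right": obj}. Return the hemifield the arm moves to."""
    if signal in POSITIONS:
        return POSITIONS[signal]            # the shape of the object is ignored
    target = NOUNS[signal]
    for side, obj in scene.items():
        if obj == target:
            return side                     # the location follows the named object
    return None

scene_a = {"left": "O1", "right": "O2"}
scene_b = {"left": "O2", "right": "O1"}     # same objects, positions swapped

print(side_reached(scene_a, "S3"), side_reached(scene_b, "S3"))
print(side_reached(scene_a, "S8"), side_reached(scene_b, "S8"))
```

The response to the noun changes side when the objects are swapped, whereas the response to the positional signal does not, which is the difference the text relies on when it denies that S8 and S9 are nouns in this game.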

To summarize, we have distinguished two large categories of linguistic signals: verbs
and  what  we  can  call  noun  phrases.  Verbs  co-vary  with  the  action with which  the
organism  responds  to  the  visual  input  largely  independently  from  the  content  of  the
visual  input.  Noun  phrases,  on  the  other  hand,  direct  the  attention/action  of  the
organism  to  particular  visually  perceived  objects  in  the  environment.  Noun  phrases
can  be  simply  nouns  or  they  can  be  sequences  of  linguistic  signals  which  almost
always  include  a  noun  accompanied  by  a  noun  modifier,  which  can  be  either  an
adjective  or  a  non-adjective  noun  modifier  (itself  a  phrase  in  many  cases).  Noun
modifiers have the same role as nouns in directing the attention/action of the
organism to the particular object which is to be involved in the organism’s action but
they  refer  to different properties of objects. Nouns  refer  to the shape of objects or to
other properties of objects  that  tend  to be more highly  correlated with  the actions of
the  organism with  respect  to  the objects. Adjectives  refer  to more or  less permanent
properties  of  objects which,  however,  are  less  highly  correlated with  the  actions  of
the organism with  respect  to  the objects. Non-adjective noun modifiers  refer  to more
temporary  or  extrinsic  properties  of  objects  such  as  their  current  position  in  the
organism’s visual field or, more generally, in space (e.g., “on the desk”).

Verbs may also be accompanied by verb modifiers, which are similar to noun
modifiers. These verb modifiers can be adverbs (single words) or adverbial phrases
(sequences of words). Verb modifiers ask the organism to execute an action in the
particular way indicated by the adverb or adverbial phrase. Consider this last
language game. It is identical to our first language game, in which
the organism can either push or pull an object. What is new is that the organism can
push or pull the object either slowly or quickly. The organism can hear two new
signals, S10 and S11, together with the verbs S1 (pull) and S2 (push). When the
organism hears S10, it pushes or pulls the object slowly, whereas when it hears
S11 it pushes or pulls the object more quickly. S10 and S11 are (proto)adverbs.
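
The division of labor between verb and adverb in this game can be sketched as follows. This is only an illustrative toy: the sign convention for pulling and pushing and the two speed values are hypothetical, not parameters of the simulations.

```python
# Toy sketch of the adverb language game (values are hypothetical).
# The verb fixes the direction of the arm movement, the adverb fixes its speed.
VERBS = {"S1": -1.0, "S2": +1.0}     # S1 = pull (assumed negative), S2 = push
ADVERBS = {"S10": 0.5, "S11": 1.0}   # S10 = slowly, S11 = quickly (assumed speeds)


def arm_velocity(verb, adverb):
    """Signed velocity of the arm: direction from the verb, magnitude from the adverb."""
    return VERBS[verb] * ADVERBS[adverb]
```

For example, `arm_velocity("S2", "S10")` is a slow push and `arm_velocity("S1", "S11")` is a quick pull: changing the adverb changes only the magnitude of the movement, never its direction.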

4. Many open questions

We  have  described  a  number  of  simple  simulated  language  games  that  are  aimed  at
clarifying  how  heard  sounds  become  linguistic  signals  and  how  different  classes  of
sounds  which  play  different  roles  in  the  organism's  experience  and  interaction with
the  environment  become  different  parts  of  speech.  These  language  games  are
simulated  in  the  sense  that we  can  construct  artificial  organisms  that  behave  in  the
ways we have described. Neural networks  respond  to  the  input,  i.e.,  they behave,  in
particular ways  because  they  have  particular  connection weights.  In  our  simulations
we use a genetic algorithm to find the appropriate connection weights which result in
the  desired  behaviors. A  genetic  algorithm  is  a  learning  procedure which  is  inspired
by  evolution  (Holland  1975).  However,  there  is  no  assumption  that  the  linguistic
abilities  (responding  appropriately  to  linguistic  signals)  of  our  organisms  are  either
entirely  genetically  inherited  (which  of  course  cannot  be  since  different  humans
speak  different  languages)  or  entirely  learned  during  life  with  no  important
genetically  inherited  basis  (which  cannot  be  since  only  humans  have  language).
Simply,  we  have  not  addressed  the  problem  of  the  origin  of  the  linguistic  abilities
exhibited by our artificial organisms.
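
To make the role of the genetic algorithm concrete, the following is a minimal sketch of how connection weights can be found by selection and mutation. It is not the procedure used in our simulations: the one-unit network, the two training cases, the fitness function, the absence of crossover, and all parameter values are assumptions chosen to keep the example short.

```python
import math
import random

# Hypothetical task: two signal input units (S1 = pull, S2 = push) and one
# output unit whose target activation is 0 for pull and 1 for push.
CASES = [((1.0, 0.0), 0.0), ((0.0, 1.0), 1.0)]


def respond(weights, inputs):
    """One sigmoid output unit; `weights` holds two connection weights and a bias."""
    total = weights[2] + sum(w * x for w, x in zip(weights, inputs))
    total = max(-60.0, min(60.0, total))  # keep math.exp within range
    return 1.0 / (1.0 + math.exp(-total))


def fitness(weights):
    """Negative squared error over all cases (0 is a perfect score)."""
    return -sum((respond(weights, x) - t) ** 2 for x, t in CASES)


def evolve(generations=50, pop_size=20, sigma=0.3, seed=0):
    """Evolve weight vectors by truncation selection and Gaussian mutation."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 4]
        # Elitism: the parents survive unchanged; the rest are mutated copies.
        offspring = [
            [w + rng.gauss(0.0, sigma) for w in rng.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
        population = parents + offspring
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

Calling `best, history = evolve()` returns the best weight vector found and the best fitness recorded at each generation. Because the best individuals are copied unchanged into the next generation, the recorded fitness can never decrease.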

Of  course, we  have  just  scratched  the  surface  of  the  problem  of  accounting  for  the
differences among the parts of speech. Let us list some open questions, in
some cases with hints as to how they might be addressed in the present framework.

(1) We  have  simulated  (some  aspects  of)  the  ability  to  understand  language,  i.e.,  to
respond appropriately to heard sounds which are linguistic signals, but we have not
said anything about the ability to produce language, i.e., to execute the phono-articulatory
motor behaviors which result in the physical production of the
appropriate  sounds/linguistic  signals. To  simulate  the  ability  to  speak  it  is  necessary
to add a further set of output units to the neural network of our organisms which will
encode phono-articulatory movements  resulting  in  the physical production of sounds.
Aside  from  that,  we  believe  that  the  basic  categories  of  words  remain  the  same:
produced  sounds  are  verbs  if  they  co-vary with  the  actions  of  the  speaker  or  of  the
hearer;  they are nouns if they co-vary with the objects (mainly identified on the basis
of  their  shape)  involved  in  the  actions  of  the  speaker  or  of  the  hearer;  they  are
adjectives if they co-vary with other properties of objects; and so on.

(2) We have simulated verbal commands but language has many other pragmatic uses
and  is  involved  in  different  types  of  speech  acts:  acts  of  information,  questions,
expressions of  intentions or desires, etc. To account  for  these other uses of  language
we will need more complicated language games and more complex social interactions
among our simulated organisms.

(3) Many verbs do not refer to actions and many nouns do not refer to concrete,
perceptually  accessible  objects.  Verbs  sometimes  co-vary  with  (i.e.,  refer  to)
processes rather than with actions (Langacker 1987). Actions are processes but many
processes are not actions of organisms (e.g., the process of snowing). Verbs referring
to  processes  which  are  not  actions  require  that  our  artificial  organisms  possess  an
ability  to  abstract  “change  of  state”  (or  even  “lack  of  change  of  state”  for  verbs
referring to states such as sleeping) in a succession of inputs even if the succession of
inputs does not reveal an action. Furthermore, verbs and nouns may not all possess
verbness  and  nounness  to  the  same  degree.  There  might  be  a  continuum  of
verbness/nounness.

(4) Language  is  often  used  in  situations  in which  the  organism  is  not  responding  to
external  (in  our  case,  visual)  input  with  external  motor  behavior  (in  our  case,  the
movements  of  the  arm).  The  organism  can  respond  to  heard  sounds  without
producing  any  external  behavior,  it  can  produce  linguistic  signals  with  no  current
input  from  the  external  environment,  and  it  can  even  use  language  purely  internally
with  no  external  input  or  external  output  of  any  kind  (thinking).  These  uses  of
language  all  involve  the  self-generation  of  input  by  a  neural  network,  both  linguistic
(imagined  sounds)  and  nonlinguistic  (imagined  actions  and  their  effects  in  the
environment)  input.  The  ability  to  self-generate  input  is what  defines mental  life  as
distinct from behavior.

(5) Nouns  and verbs,  and of  course  the other parts of speech, have properties which
are  syntactic  in  nature,  rather  than  semantic. These  syntactic  properties  derive  from
their use in sequences of words which have sequential constraints (for example, in
English, verb objects follow verbs rather than precede them) and internal structure (cf.
Cangelosi & Parisi 2002; Turner & Cangelosi 2002).

(6) Nouns can be morphologically “derived” from verbs and verbs from nouns.

(7) The kind of simple verb-noun sequences we have considered in one of our
language games represents verb-object (proto)sentences. How do verb subjects emerge in
languages? Probably the emergence of subjects in action sentences (agents) is linked
with the ability to recognize the same action whether it is performed by oneself or by other
individuals (cf. the “mirror neurons” of Rizzolatti & Arbib 1998). In these
circumstances one has to specify not only the object(s) on which the action is
executed (the verb complement(s)) but also the author of the action, i.e., the agent
(the verb’s subject).

Acknowledgements
Angelo Cangelosi’s work for this paper was partially funded by a UK Engineering
and Physical Sciences Research Council Grant (GR/N01118).

Bibliographical References

Batali,  John  (1994),  “Innate  biases  and  critical  periods:  combining  evolution  and
learning  in  the  acquisition  of  syntax”,  in  Brooks,  Rodney  & Maes,  Patti,  eds.,
Artificial Life IV, Cambridge, Mass., MIT Press (1994:160-171).

Brown,  Keith  &  Miller,  Jim  (1999),  Concise  Encyclopedia  of  Grammatical
Categories, Amsterdam, Elsevier.

Cangelosi, Angelo  (2001),  “Evolution  of  communication  and  language using  signals,
symbols and words”, IEEE Transactions on Evolutionary Computation, 5:93-101.

Cangelosi, Angelo & Harnad, Stevan (in press), “The adaptive advantage of symbolic
theft  over  sensorimotor  toil:  grounding  language  in  perceptual  categories”,
Evolution of Communication, 4(1).

Cangelosi, Angelo & Parisi, Domenico (1998), “The emergence of a ‘language’ in an
evolving population of neural networks”, Connection Science, 10:83-97.

Cangelosi, Angelo & Parisi, Domenico (2001), “How nouns and verbs differentially
affect the behavior of artificial organisms”, in Moore, Johanna D. & Stenning,
Kenneth, eds., Proceedings of the 23rd Annual Conference of the Cognitive
Science Society, Hillsdale, N.J., Erlbaum (2001:170-175).

Cangelosi,  Angelo  &  Parisi,  Domenico,  eds.  (2002),  Simulating  the  Evolution  of
Language, London, Springer.

Ellefson,  Michelle  R.  &  Christiansen,  Morten  H.  (2000),  “Subjacency  constraints
without  universal  grammar:  evidence  from  artificial  language  learning  and
connectionist  modeling”,  in  Proceedings  of  the  22nd  Annual  Conference  of  the
Cognitive Science Society, Hillsdale, N.J., Erlbaum (2000:645-650).

Falcetta,  Ilaria  (2001),  Dalle  reti  neurali  classiche  alle  reti  neurali  ecologiche:  il
significato  come  proprieta’  emergente  delle  interazioni  senso-motorie  tra
organismo e ambiente, Dissertation, University of Rome La Sapienza.

Holland,  John H.  (1975), Adaptation  in Natural  and Artificial  Systems, Ann Arbor,
Michigan, University of Michigan Press.

Jones, Susan S., Smith, Linda B., & Landau, Barbara (1991), “Object properties and
knowledge in early lexical learning”, Child Development, 62:499-512.

Jones, Susan S. & Smith, Linda B. (1998), “How children name objects with shoes”,
Cognitive Development, 13:323-334.

Kaplan, Frederik  (2000),  “Talking AIBO:  first  experimentation of verbal  interactions
with an autonomous  four-legged  robot”,  in Nijholt, A., Heylen, D. & Jokinen, K.,
eds.,  Learning  to  Behave:  Interacting  agents.  CELE-TWENTE  Workshop  on
Language Technology (2000:57-63).

Kirby,  Simon  (1999),  “Syntax  out  of  learning:  the  cultural  evolution  of  structured
communication  in a population of  induction algorithms”, in Floreano, Dario et al.,
eds., Proceedings of ECAL99 European Conference on Artificial Life, New York,
Springer (1999:694-703).

Knight, Chris, Studdert-Kennedy, Michael, & Hurford, Jim, eds. (2000), The
Evolutionary  Emergence  of  Language:  Social  Function  and  the  Origins  of
Linguistic Form, Cambridge, Cambridge University Press.

Landau,  Barbara,  Smith,  Linda  B.,  &  Jones,  Susan  S.  (1988),  “The  importance  of
shape in early lexical learning”, Cognitive Development, 2:291-321.

Landau, Barbara, Smith, Linda B., & Jones, Susan S.  (1992), “Syntactic context and
the  shape bias  in children’s and adults’  lexical  learning”, Journal of Memory and
Language, 31:807-825.

Langacker,  Ronald  W.  (1987),  Foundations  of  Cognitive  Grammar.  Volume  1:
Theoretical Prerequisites, Stanford, Cal., Stanford University.

Parisi, Domenico (1997), “An Artificial Life approach to language”, Mind and
Language, 59:121-146.

Parisi,  Domenico  &  Cangelosi,  Angelo  (2002),  “A  unified  simulation  scenario  for
language  development,  evolution,  and historical  change”,  in Cangelosi, Angelo &
Parisi, Domenico, eds., Simulating the Evolution of Language, London, Springer
(2002:255-276).

Rizzolatti,  Giacomo  &  Arbib,  Michael  A.  (1998),  “Language  within  our  grasp”,
Trends in Neurosciences, 21:188-194.

Soja,  Nancy  N.  (1992),  “Inferences  about  the  meanings  of  nouns:  the  relationship
between perception and syntax”, Cognitive Development, 29-45.

Steels,  Luc  (1997),  “The  synthetic  modeling  of  language  origins”,  Evolution  of
Communication, 1:1-34.

Steels, Luc & Kaplan, Frederik (1999), “Collective learning and semiotic dynamics”,
in Floreano, Dario  et  al.,  eds., Proceedings of ECAL99 European Conference on
Artificial Life, New York, Springer (1999:679-688).

Steels,  Luc  &  Vogt,  Paul  (1997),  “Grounding  adaptive  language  games  in  robotic
agents”,  in  Husband,  Paul  &  Harvey,  Inman,  eds.,  Proceedings  of  the  Fourth
European Conference on Artificial Life, Cambridge, Mass., MIT Press (1997:474-482).

Turner,  Huck  &  Cangelosi,  Angelo  (2002),  “Implicating  working  memory  in  the
representation  of  constituent  structure  and  the  origins  of word  order  universals”,
paper  presented  at  4th  International  Conference  on  the  Evolution  of  Language,
Boston.

Wittgenstein, Ludwig (1953), Philosophical Investigations, London, Blackwell.