Chapter 12 - Seeing
The visual code
It may seem to be a paradox to say that there are programs for seeing. The
essence of programs is that they are plans for action, whereas in the
conventional schemes of both philosophers and physiologists sensation and action
are separated, and a sensation would usually be considered to come first, before
action. We are proposing that the reverse is true and that higher animals, at
least, go around actively searching for things to see and that they 'see' mainly
those things that were expected because the program includes hypotheses and
rules for testing them.
To understand this paradox we have to explain why seeing is not like
photography, and this involves some further rather subtle considerations about
symbolism, which even today are not widely understood. When we see a red object,
what goes up the optic nerve? It is obviously not red light, so what then is it? The
physiologist replies, 'a set of nerve impulses'. But nerve impulses are not red
either. To express the situation we may say that the nerve impulses are signals
that act as symbols of red light, in the sense that they can be decoded by an
appropriate brain. Our task is then to explain what is implied by speaking in
this way of 'signals', 'symbols', and 'decoding'.
The retina
Although seeing is not like photography we shall begin by saying that the eye is
in some ways very like a camera. It has a lens and a diaphragm (the iris) and a
focusing device. What is more, the first step in the process of vision is a
photochemical change somewhat like that in a photographic plate. The retina
contains a mosaic of more than one hundred million separate receiving elements
of two sorts, the rods and cones, each of which detects a tiny part of the image
that is thrown on it by the lens, producing a minute electrical or chemical
change. Only the cones are sensitive to colours and most of them are
concentrated near the centre of the eye. Here there is a small area, the fovea,
containing only about 30 000 receptive cones. These perform nearly all the
detailed work of seeing, except in dim light. In order to see things we have
therefore continually to explore them by minute movements of the eyes around
them, examining the part we want to see by the fovea. This is of fundamental
importance for our system for thinking about vision because the program that
controls these eye movements largely determines what we see.
Seeking what to see
It is easy to see that a person moves his eyes in jerks, say when he is reading
or looking around the room. These large movements are separated by periods of
fixation. They occur at a maximum of about five a second (in rapid reading) but
often they are less frequent. The movements are ballistic and very fast
(1000°/s). Even during fixation the eyes continue to make small tremor movements
at about 50/s, with an amplitude of about 10 minutes of arc, enough to change the position of an image
on the fovea by about 30 cones. Both the large and small fast movements are now
usually called saccades (from a French word meaning the pull on the reins of a
horse). In addition the eyes make slow drifts during fixation, and they may also
follow moving targets (pursuit movements) or move to maintain stereoscopy (vergence
movements).
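The figures just quoted can be combined in a little arithmetic. The sketch below is only an illustration: it uses the numbers given in the text, treats the quoted saccade speed as constant over the whole jump (a simplification, so the duration it gives is a lower bound), and the 10° jump in the usage note is an arbitrary example, not a measured value.

```python
# Figures quoted in the text.
TREMOR_AMPLITUDE_ARCMIN = 10.0    # size of a tremor movement, minutes of arc
TREMOR_SHIFT_CONES = 30.0         # resulting image shift on the fovea, in cones
SACCADE_SPEED_DEG_PER_S = 1000.0  # speed of a large saccade
MAX_SACCADES_PER_S = 5.0          # upper rate, as in rapid reading


def cone_spacing_arcsec():
    """Visual angle per foveal cone implied by the tremor figures."""
    return TREMOR_AMPLITUDE_ARCMIN * 60.0 / TREMOR_SHIFT_CONES


def min_saccade_duration_ms(amplitude_deg):
    """Lower bound on the duration of a jump, ASSUMING constant speed."""
    return amplitude_deg * 1000.0 / SACCADE_SPEED_DEG_PER_S


def shortest_fixation_cycle_ms():
    """Time available per jump-plus-fixation at the maximum rate."""
    return 1000.0 / MAX_SACCADES_PER_S
```

On these figures each foveal cone corresponds to about 20 seconds of arc, a hypothetical 10° jump lasts at least 10 ms, and at five jumps a second each jump and fixation together occupy 200 ms, so the eye is at rest for most of the time.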
Fig 12.1
No information is received during a saccade, so the large movements divide up
the process of seeing into a discontinuous series of packages, and it can be
shown that information does in fact reach the cerebral cortex in bursts,
corresponding with them. Figure 12.1 shows the sequence of pieces of information
that might be sent from the fovea as the eye scans a pyramid. Each jump is
towards a point that is likely to be interesting. The direction that is chosen
depends on the program in the brain, which makes a forecast on the basis of the
information received. In Fig. 12.2(a and b) the interest is obviously in the
human figures, and especially their eyes. Incidentally this is one more example
of the propensity of our brain programs to direct attention to human features, a
tendency that is probably partly inborn and no doubt accentuated by life as a
social creature. In Fig. 12.2(b), when instruction was given to search for
particular features, the program was modified accordingly.
So the programs of enquiry that are learned from childhood onwards dictate what
movements are made in response to the information coming in with each jump.
Vision is a dynamic process, using a series of scans, but these are not rigidly
determined as in a television raster. They are varied according to the nature of
the scene itself and the previous experience of the individual. Moreover the
scanning does not work by converting the information in the spatial scene into a
single channel, but puts it into many parallel channels, which maintain the
spatial relations, so in a sense the original picture is reproduced on the
cortex, but modified and much expanded (p. 125).
We can thus regard all seeing as a continual search for the answers to questions
posed by the brain. The signals sent from the retina constitute 'messages'
conveying these answers. The brain then uses this information to construct a
suitable hypothesis about what is there and a program of action to meet the
situation. As a hungry boy looks around, his eyes may send signals that suggest
a fruit tree. Signals go back to the eyes to search for food and if the
returning messages indicate 'apples' he starts the climb to pick and eat them.
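The cycle of question, answer, and next question can be pictured as a small loop. What follows is a toy sketch, not a model of the brain: the scene, the table of expectations, and the function names `look` and `search` are all invented for illustration.

```python
def look(scene, position):
    """Return what the 'fovea' reports at one fixation point."""
    return scene.get(position, "nothing")


def search(scene, start, expectations, max_fixations=10):
    """Follow a chain of question -> answer -> next question.

    `expectations` maps a reported feature to the next place worth
    looking at; a feature with no entry ends the search.
    """
    position = start
    fixations = []
    for _ in range(max_fixations):
        feature = look(scene, position)
        fixations.append((position, feature))
        if feature not in expectations:
            break  # nothing further is suggested by this answer
        position = expectations[feature]
    return fixations


# Hypothetical scene and program for the hungry boy.
scene = {"field": "tree", "branches": "apples"}
expectations = {"tree": "branches"}  # a tree suggests looking for fruit
path = search(scene, "field", expectations)
```

Here `path` records two fixations, first the tree and then the apples. With a scene that contains no tree the search stops after a single look, because the answer received suggests no further question.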
Encoding in the retina
The sequence of processes involved in the act of seeing does not therefore really
begin in the retina, but involves the brain. Nevertheless it is convenient to ask
just how the retina composes its messages. The rods and cones are the
light-sensitive elements. They contain special pigments, which change when the
intensity of light falling on them varies. This change alters the electrical
potentials of the cells, so that the pattern of light thrown by the lens
produces a corresponding pattern of electrical and chemical change in the
various neurons that make up the retina (Fig. 12.3). Many of these cells are
little 'microneurons', which do not send away yes-or-no signals but produce
graded changes that increase or decrease the probability that their larger
neighbours will set off action potentials. This is a sort of analogue
computation, which finally generates discontinuous (digital) all-or-nothing
signals in the largest cells of the retina, the ganglion cells, whose axons run
to the brain. These impulses in the optic nerve fibres at each moment of
scanning a scene are the answers, in code, to the 'questions' that had been
asked at the previous moment. Of course if something quite unexpected happens it
is seen even though it had not been anticipated. The point is that what goes on
in the retina is not the recording of a 'picture', but the detection of a series
of items, which are reported to the brain. If the eyes are prevented from moving
the signals fade within a second and no picture can be seen.
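The passage from graded changes to all-or-nothing signals, and the fading of a stationary image, can be sketched in a few lines. This is a deliberately crude illustration under stated assumptions: the receptors are taken to report only the change in light between one moment and the next, the ganglion cell is given a fixed threshold instead of a probability of firing, and the weights, threshold, and light values are invented. In this toy the signal stops at once, whereas in the eye it fades over about a second.

```python
def graded_changes(previous_frame, current_frame):
    """Analogue stage: each receptor reports how much its light changed."""
    return [now - before for before, now in zip(previous_frame, current_frame)]


def ganglion_fires(changes, weights, threshold):
    """Digital stage: one all-or-nothing decision from the weighted sum.

    Positive weights excite and negative weights inhibit (ASSUMED values).
    """
    drive = sum(w * c for w, c in zip(weights, changes))
    return drive >= threshold


def spikes_over_time(frames, weights, threshold=1.0):
    """Apply both stages to each pair of successive moments."""
    return [
        ganglion_fires(graded_changes(before, now), weights, threshold)
        for before, now in zip(frames, frames[1:])
    ]


# Light on three receptors at four successive moments: an image appears
# and then stays exactly where it is (hypothetical values).
frames = [
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
]
weights = [1.0, 1.0, -0.5]
spikes = spikes_over_time(frames, weights)
```

The cell fires when the image arrives and is silent thereafter, since nothing changes on the receptors while the image stays still.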
Fig 12.2a
Fig 12.2b
Fig 12.3