Tag Archives: prediction

The Nature of Visual Experience


Many philosophers have used visual illusions as support for a representational theory of visual experience. The basic idea is that sensory input from the environment is too ambiguous for the brain to figure out much of anything on the basis of sensory evidence alone. To deal with this ambiguity, theorists have conjectured that the brain generates a series of predictions or hypotheses about the world based on the continuously incoming evidence and its accumulated knowledge (known as "priors"). On this theory, the nature of visual experience is explained by saying that what we experience is really just the prediction. So in the visual illusion above, the brain guesses that the B square is a lighter color, and therefore we experience it as lighter. The brain guesses this because stored in its memory is information about typical configurations of checkered squares under typical kinds of illumination. On this standard view, all of visual experience is a big illusion, like a virtual-reality-style Matrix.

Lately I have been deeply interested in thinking about these notions of "guessing" and "prediction". What does it mean to say that a collection of neurons predicts something? How is this possible? What does it mean for a collection of neurons to make a hypothesis? I am worried that in using these notions as our explanatory principle, we risk simply trading in metaphors instead of gaining true explanatory power. So let's examine this notion of prediction further and see if we can make sense of it in light of what we know about how the brain works.

One thought might be that predictions or guesses are really just kinds of representations. To perceive the B square as lighter is just for your brain to represent it as lighter. But what could we mean by representation? One idea comes from Jeff Hawkins' book On Intelligence. He talks about representations in terms of invariance. For Hawkins, the concepts of representation and prediction are inevitably tied into memory. To see why, consider my perception of my computer chair. I can see and recognize that my chair is my chair from a variety of visual angles. I have a memory of what my chair looks like, and the different visual angles provide evidence that matches my stored memory of the chair. The key is that my high-level memory of the chair is invariant with respect to its visual features. At lower levels of visual processing, by contrast, the neurons are tuned to respond only to low-level visual features. Some low-level neurons fire only in response to certain angles or edge configurations, so from different visual angles these low-level neurons might not respond. But at higher levels of visual processing, there must be some neurons that keep firing regardless of the visual angle, because their level of response invariance is higher. So my memory of the chair really spans a hierarchy of levels of invariance. At the highest levels of invariance, I can even predict the chair when I am not in the room. So if I am about to walk into my office, I can predict that my chair will be on the right side of the room. If I walked in and my chair was not on the right side, I would be surprised and I'd have to update my memory with a new pattern.
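The hierarchy of invariance can be sketched as a toy program. This is purely illustrative (the feature names and chair views here are my own hypothetical inventions, not anything from Hawkins): low-level "cells" fire only for specific local features that change with viewing angle, while a high-level "cell" fires for any stored view of the chair.

```python
# Toy sketch of a two-level invariance hierarchy (all names hypothetical).
# Each view of the chair activates a different set of low-level features.
CHAIR_VIEWS = {
    "front": {"edge_0deg", "edge_45deg"},
    "side": {"edge_90deg", "edge_45deg"},
    "back": {"edge_0deg", "edge_90deg"},
}

def low_level_response(stimulus_features, tuned_feature):
    """A low-level cell fires only if its preferred feature is present."""
    return tuned_feature in stimulus_features

def high_level_response(stimulus_features):
    """A high-level cell fires if the features match ANY stored chair view."""
    return any(view <= stimulus_features for view in CHAIR_VIEWS.values())

# The low-level cell tuned to "edge_90deg" is silent for the front view...
print(low_level_response(CHAIR_VIEWS["front"], "edge_90deg"))  # False
# ...but the high-level "chair" cell fires for every stored view.
print(all(high_level_response(v) for v in CHAIR_VIEWS.values()))  # True
```

The point of the sketch is just the asymmetry: the low-level responses vary from view to view, while the high-level response is invariant across all of them.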

On this account, representation and prediction are intimately tied into our memory, our stored knowledge of reality that helps us make predictions to better cope with our lives. But what is memory, really? If we are going to be neurally realistic, it seems like it is going to have to be cashed out in terms of the various dispositions of brain cells to react in certain ways. So memory is the collective dispositions of many different circuits of brain cells, particularly their synaptic activities. Dispositions can be thought of as mechanical mediations between input and output, and invariances can thus be thought of as invariances in mediation. Low-level mediation is variant with respect to the fine-grained features of the input; high-level mediation is less variant with respect to fine-grained detail. What does this tell us about visual experience? I believe the mediational view of representation offers an alternative account of illusions.

I am still working out the details of this idea, so bear with me. My current thought is that the brain's "guess" that square B is lighter can be understood dispositionally rather than intentionally. Let's imagine that we reconstruct the 2D visual illusion in the real world, so that we experience the same illusion that the B square is lighter. What would it mean for my brain to make this prediction? On the dispositional view, it would mean that in making such a prediction my brain is essentially saying "If I go over and inspect that square some more, I should expect it to be lighter". If you actually did go inspect the square and found that it is not a light square, you would have to update your memory store. However, visual illusions persist despite high-level prediction. This is because the entirety of the memory store for low-level visual processing overrides the meager alternate prediction generated at higher levels.

What about qualia? The representational view says that the qualitative features of the B square result from the square being represented as lighter. But if we understand representations as mediations, we see that representations don't have to be these spooky things with strange properties like "aboutness". Aboutness is just cashed out in terms of specificity of response. But the problem of qualia is tricky. In a way I think the "lightness" of the B square is just an illusion added "on top" of a more or less veridical acquaintance. So I feel I should resist inferring from this minor illusory augmentation that all of my visual experience is massively illusory in this way. Instead, I think we could see the "prediction" of the B square as lighter as a kind of augmentation of mediation. The brain augments the flow of mediations such that if this illusion were a real scene and someone asked you to "go step on all the light squares", you would step on the B square. For this reason, I think the phenomenal impressiveness of such illusions is amplified by their 2Dness. If it were a 3D scene, the "prediction" would take the form of possible continuations of mediated behavior in response to a task demand (e.g. finding light squares). But because it's a 2D image, the "qualia" of the B square being light takes on a special form, pressing itself upon us as a "raw visual feel" of lightness that on the surface doesn't seem to be linked to behavior. But I think if we understand the visual hierarchy of invariant mediation, and the ways in which the higher and lower levels influence each other, we don't need to conclude that all visual experience is massively illusory because we live behind a Kantian screen of representation. Understanding brain representations as mediational rather than intentional helps strip the Kantian image of its persuasive power.


Filed under Consciousness, Philosophy

Noncomputational Representation

I’ve been thinking about representations a lot lately. More specifically, I have been thinking about the possibility of noncomputational representation. On first blush, this sounds strange because representationalism has for a long time been intimately connected with the Computational Theory of Mind, which basically says that the brain is some kind of computer, and that cognition is most basically the manipulation of abstract quasi-linguaform representations by means of a low-level syntactic realizer base. I’ve never been quite sure how this is supposed to work, but the gist of it is captured by the software/hardware distinction: the mind is the software of the computing brain. Representations, in virtue of their supposed quasi-linguaform nature, are often thought of in terms of propositions. For a brain to know that P, it must have a representation or belief to the effect that P. As it commonly goes, computation is knowledge, knowledge is representational, the brain represents, the brain is a computer.

But in this post I want to explore the idea of noncomputational representation. The basic idea under question is whether we can say that the brain traffics in representations even though it is not a computer, i.e., if the brain is not a computer, does it still represent things, and if so, how and in what sense? Following Jeff Hawkins, I think it is plausible to suppose that the brain is not a digital computer. But if the brain is not computing like a computer in order to be so intelligent, what is it doing? Hawkins thinks that the secret of the brain’s intelligence is the neocortex. He thinks that the neocortex is basically a massive memory-prediction machine. Through experience, patterns and regularities in the world flow into the nervous system as neural patterns and regularities. These patterns are then stored in the brain’s neocortex as memory. It is a well-known fact that cortical memories are “stored” in the same cortical areas where the information was originally taken in and processed.

How is this possible? Hawkins’ idea is that the reason why we see memory as being “stored” in the original cortical areas is that the function of storing patterns is to aid in the prediction of future patterns. As we experience the world, the sensory details change based on things like our perspective. Take my knowledge of where my chair is in my office. After experiencing this chair from various positions in the room, I now have a memory of where the chair is in relation to the room, and I have a memory of where the room is in relation to the house, and the house in relation to the neighborhood, and the neighborhood to the city, and so on. In terms of the chair, what the memory allows me to do is to “know” things about the chair which are independent of my perspective. I can look at the chair from any perspective and recognize that it is my chair, despite each sensory profile displaying totally different patterns. How is this possible? Hawkins’ idea is that the neocortex creates an invariant representation of the chair, based on the integration of lower-level information into a higher-order representation.

What does it mean to create an invariant representation? The basic idea can be illustrated in terms of how information flows into and around the cortex. At the lowest levels, the patterns and regularities of my sensory experience of the chair are broken up into scattered and modality-specific information. The processing at the lowest levels is carried out by the lowest levels of the cortical hierarchy. Each small region at these levels has a receptive field that is very narrow and specific, such as firing only when a line sweeps across a tiny upper-right quadrant of the visual field. And of course, when the information comes into the brain it is processed by contralateral cortical areas, with the right lower cortical areas responding only to information streaming in from the left visual field, and vice versa. As the modality-specific and feature-oriented information flows up the cortical hierarchy, the receptive fields of the cells become broader and their firing patterns more stable. Whereas the lower cortical areas respond only to low-level details of the chair, the higher cortical areas stay active in the presence of the chair under any experiential condition. These higher cortical areas can thus be said to have created an invariant representation of the patterns and regularities specific to the chair. The brain is able to create these representations because the world actually is patterned and regular, and the brain is responding to this.

So what is the cash value of these invariant representations? To understand this, you have to understand how, once the information flows to the “top” of the hierarchy (ending in the hippocampus, forming long-term memories), it flows back down to the bottom. Neuroanatomists have long known that 90% of the connections at the lower cortical levels stream in from the “top”, not the “bottom”. In other words, there is a massive amount of feedback from the higher levels into the lower levels. Hawkins’ idea is that this feedback is the physical instantiation of the invariant representations aiding in prediction. Because my brain has stored a memory/representation of what the chair is “really” like, abstracted from particular sensory presentations, I am able to predict where the chair will be before I even walk into the room. If I walked into the room and the chair was on the ceiling, I would be shocked, because I have nothing in my memory about my chair, or any chair, ever being on the ceiling. Except I might have a memory of people pulling pranks by nailing furniture to ceilings, so after some shock I would “re-understand” my expectations about future perceptions of chairs, and be less surprised the next time I see my chair on the ceiling.

Hawkins thinks that it is this relation between having a good memory and the ability to predict the future based on that memory which is at the heart of intelligence. In the case of memories flowing down to the sensory cortices, the “prediction” concerns what future patterns of sensory activity will be like. For example, the brain learns Sensory Pattern A and creates a memory of this pattern throughout the cortical hierarchy. The most invariant representation in the hierarchy flows down to the lower sensory areas and re-activates Pattern A, based on the memory-based prediction about when the brain will experience Pattern A again. If the memory-prediction was accurate, the incoming pattern will match Pattern A, and the memory will be confirmed and strengthened. If the pattern that comes in is actually Pattern B, the prediction will be incongruous with the incoming information. This will cause the new pattern to shoot up the hierarchy to form a new memory, which then feeds back down to make predictions about future sensory experience. In the case of predictions flowing down into the motor cortices, the “predictions” are really motor commands. If I predict that when I walk into my office and turn right I will see my chair, and the prediction is in the form of a motor command, the prediction will actually make itself come true, provided the chair is where the brain predicted it would be. Predictive motor commands are confirmed when the prediction is accurate, and disconfirmed when it is not.
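The matching-and-updating loop just described can be sketched in a few lines. This is a toy illustration of the general idea, not Hawkins' actual model: here a "pattern" is just a tuple, "memory" is a set of stored patterns, and the class and method names are my own inventions.

```python
# Minimal sketch of a memory-prediction loop (hypothetical encoding:
# patterns are tuples, memory is a set of previously stored patterns).

class MemoryPredictionLayer:
    def __init__(self):
        self.memory = set()      # stored patterns (the "hierarchy's" memory)
        self.prediction = None   # pattern fed back down as the expectation

    def predict(self, context_pattern):
        """Feed a stored pattern back down as the expected input."""
        self.prediction = context_pattern if context_pattern in self.memory else None
        return self.prediction

    def perceive(self, incoming):
        """Compare incoming input against the prediction; learn on mismatch."""
        if incoming == self.prediction:
            return "confirmed"    # prediction matched: memory strengthened
        self.memory.add(incoming) # novel pattern shoots up to form a new memory
        return "surprised"

layer = MemoryPredictionLayer()
pattern_a = ("light", "square")
layer.memory.add(pattern_a)       # Pattern A has been learned
layer.predict(pattern_a)          # the expectation flows back down
print(layer.perceive(pattern_a))            # confirmed
print(layer.perceive(("dark", "square")))   # surprised: a new memory is formed
```

The "surprised" branch is the chair-on-the-ceiling case: the mismatched pattern is stored, so the same input produces a confirmed prediction next time.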

So, a noncomputational representation is based on the fact that the brain (particularly the neocortex) is organized as a hierarchical memory system built on neuronal patterns and regularities, which in turn are composed of synaptic mechanisms like long-term potentiation. According to Hawkins, it is this hierarchy, from bottom to top and back, which gives the brain its remarkable powers of intelligence. The intelligence of humans, for Hawkins, is really a product of having a very good memory and being able to anticipate, and hence understand, the future in incredibly complex ways. If you understand a situation, you will not be surprised, because your memory is so accurate. If you do not understand it, you cannot predict what will happen next.

An interesting feature of Hawkins’ theory is that it predicts that the neocortex is fundamentally running a single algorithm: memory-prediction. So what gives the brain its adult modularity and specialization? It is the specific nature of the patterns and regularities of each sensory modality flowing into the brain. But the common currency of the brain is patterns of neuronal activity. Thus every area of the cortex could, in principle, “handle” any other neuronal pattern. Paul Bach-y-Rita’s research on sensory substitution is highly relevant here. His research has shown that the common currency of perception is the detection and learning of sensory regularities. It has, for example, allowed blind patients to “see” by feeding camera input to an array of stimulators on the tongue. This is to be expected if the neocortex is running a single type of algorithm. What actually “wires” a cortical subregion is the type of information that streams in. Because auditory and visual data always enter the brain from unique points, it is not surprising that specialized regions of the cortex “handle” this information. But the science shows that if any region is damaged, a certain amount of plasticity allows other areas to “take over” the input. This is especially true in childhood. What Micah Allen and I have tried to show in our recent paper is that the higher-order functions of humans are based on the kinds of information available for humans to “work with”, namely, social-linguistic information. So the key to humans’ unique cognitive control is not having an evolutionarily unique executive controller in the brain. Rather, the difference is in what kinds of information can be funneled into the executive controller. For humans, a huge amount of the data streaming in is social-linguistic. Our memory-prediction systems thus operate with more complexity and specialization because of the unique social-linguistic nature of the patterns that stream into the executive. So to answer Daniel Wegner’s question of “Who is the controller of controlled processes?”, the uniqueness of “voluntary” control is based on the higher-level invariant memories being social-linguistic in nature. The uniqueness of the information guarantees that the predictions, and thus the behavior, of the human cognitive control system will be unique. So we are not different from chimps insofar as we have executive control; the difference lies in what kinds of information that control has to work with in terms of its memory and predictive capacities.


Filed under Philosophy, Psychology

A crude theory of perception: thoughts on affordances, information, and the explanatory role of representations

Perception is the reaction to meaningful information, inside or outside the body. The most basic information is information specific to affordances. An affordance is a part of reality which, in virtue of its objective structure, offers the possibility of some reaction (usually fitness-enhancing, but not necessarily so). A reaction can be understood at multiple levels of complexity and mechanism. Sucrose, in virtue of its objective structure, affords the possibility of maintaining metabolic equilibrium to a bacterial cell. Water, in virtue of its objective structure, affords the possibility of stable ground to the water strider; it does not afford the possibility of stable ground to a human being unless it is frozen. An affordance, then, is, as J.J. Gibson said, both subjective and objective at the same time: objective, because what something affords is directly related to its objective structure; subjective, because what something affords depends on how the organism reacts to it (e.g. human vs. water strider).

The objective structure of a proximal stimulus can only be considered informationally meaningful if that stimulus is structured so as to be specific to an affordance property. If a human is walking on the beach towards the ocean, the ocean will have the affordance properties it has regardless of whether the human is there to perceive information specific to them. The “success” or meaningfulness of the human’s perception of the ocean is determined by whether the proximal stimulus contains information specific to those affordance properties. A possible affordance property might be “getting you wet”, which is usually not useful, but can be extremely useful if you are suddenly caught on fire. Under normal viewing conditions, the objective structure of the ambient array of light in front of the human contains information specific to the ocean’s affordance properties, in virtue of the light reflecting off the water and traveling through the airspace. But if the beach were shrouded in a very thick fog, the ambient optic array would still stimulate the human’s senses, yet the stimulus wouldn’t be meaningful, because it would no longer convey usable information about the ocean, even though that information would be there for the taking if the fog cleared. An extreme version of “meaningless stimulus without perception” is the Ganzfeld effect. On these grounds we can recreate, without appealing to any kind of representational theory, the famous distinction between primary and secondary qualities, i.e., the distinction between mere sensory transduction of meaningless stimuli and meaningful perception.

Note too how perception is most basically “looking ahead” to the future, since an affordance property specifies the possibility of a future reaction. This can be seen in how higher animals can “scan” the environment for information specific to affordances but restrain themselves from acting on that information until the moment is right. This requires inhibition of basic action schemas, whether learned or genetically hardwired as instinct. In humans, the “range” of futural cognition is uniquely enhanced by our technology of symbols and linguistic metaphor. For instance, a human can look at a flat sheet of colored paper stuck to a refrigerator and meaningfully think about a wedding to attend one year in the future. A scientist can start a project and think about consequences ten years down the road. Humans can use metaphors like “down the road” because we have advanced spatial analogs which allow us to consciously link disparate bits of neural information specific to sensorimotor pathways into a more cohesive, narratological whole, so as to assert “top-down” control by a globally distributed executive function sensitive to social-cultural information.

This is the function which enables humans to effortlessly “time travel”, inserting distant events into the present thought stream or simulating future scenarios through conscious imagination. We can study the book in our heads of what we have done and what we will do, rehearse speech acts for a future occasion, think over what we should have said to that one person, and use external symbolic graphs to radically extend our cognitive powers. Reading and writing, for example, have utterly changed the cognitive powers of humans. Math, scientific methodology, and computer theory have catapulted humans into further levels of technological sophistication. In the last few decades, we have seen how the rise of the personal computer, the internet, and the cellphone has radically changed how humans cope with the world. We are, as Andy Clark said, natural-born cyborgs. Born into a social-linguistic milieu rich in tradition, and preinstalled with wonderful learning mechanisms that soak up useful information like sponges, newborn humans effortlessly adapt to the affordances of the simplest environmental elements (like the ground) and eventually the most advanced (a book, or a website).

So although representations are not necessary at the basic level of behavioral reaction shared by unicellular organisms (bacteria reacting to sucrose by devouring it and using it metabolically), the addition of a central nervous system allows for the storage of affordance information in representational maps. A representational map is a distributed pattern of brain activity which allows for the storage of informational patterns that can be utilized independently of the stimulus event that first brought the organism into contact with that information. For example, when a bird is looking right at a food cache, it does not need its representational memory to get at the food; it simply looks at the cache and reacts by means of a motor program for getting at the food, sparked by a recognition sequence. However, when the cache is not in sight and the bird is hungry, how does the bird get itself to the location of the cache? By means of a re-presentation of the cache’s spatial location, which was originally stored in the brain’s memory upon first caching the food. By accessing stored, memory-based information about a place even when not actually at that place, the bird is utilizing representations to boost the cognitive prowess of its nonrepresentational affordance-reaction programs. Representations are thus a form of brain-based cognitive enhancement which allows for reaction to information stored within the brain itself, rather than just contained in the external proximal stimulus. By developing the capacity to react to information stored within itself, the brain gains the capacity to organize reactions into more complicated sequences of steps, delaying and modifying reactions, storing information for later retrieval, and better predicting events farther into the future (like the bird predicting food will be at its cache even though it is miles away).
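The contrast between reacting to a present stimulus and consulting a stored representational map can be sketched as follows. This is a toy illustration only; the function names and the dictionary encoding are my own assumptions, not a claim about avian neuroscience.

```python
# Hypothetical sketch of a "representational map": stored affordance
# information the bird can react to even when the stimulus is absent.

cache_map = {}  # location -> stored contents (the bird's internal memory)

def cache_food(location, food):
    """Caching the food also writes an internal representation of where it is."""
    cache_map[location] = food

def head_toward_food(cache_in_sight, hungry):
    """In sight: react directly to the stimulus. Out of sight: consult memory."""
    if cache_in_sight is not None:
        return cache_in_sight          # direct, nonrepresentational reaction
    if hungry and cache_map:
        # No stimulus present: fall back on the stored representation.
        return next(iter(cache_map))
    return None

cache_food("oak_tree", "seeds")
print(head_toward_food(None, hungry=True))   # oak_tree: guided by memory alone
```

The two branches mark the distinction drawn above: the first is the stimulus-driven affordance reaction, the second is the representation-boosted reaction that works even when the cache is miles out of sight.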


Filed under Consciousness, Philosophy