The embodied, embedded, extended, or enacted (4E) approach to cognitive science seeks to replace two dominant views: representationalism, the idea that minds deal with the world exclusively in terms of semantic content, and internalism, the idea that the mind is located entirely in the brain. In the current paper we focus on the sensorimotor development of the infant, which makes it possible to discuss the formation of the bodily skill aspect of intelligence, the aspect most compatible with the embodied view. Further, it allows us to develop an approach without speculating too much about the development of semantic content in the human mind, which frees us to discuss the problem in enactive terms.
Our focus is on the learning of unreflective behavior, described in phenomenological terms as “skillful coping” or “ready-to-hand” behavior   . This activity is “animalistic” in nature and consists of the routine practiced behavior that we partake in when situated in familiar environments and contexts. This has also been called “motor intentional behavior” by Merleau-Ponty, as a mode of behavior that lies between the “purely reflexive” and the “properly cognitive”   . In this mode of behavior we do not reflect upon or deliberate about objects in the environment, but instead automatically utilize aspects of our environment for whatever purpose fits our current concerns or goals. The phenomenological viewpoint of skillful coping will be the backdrop of our exploration into infant behavior, which will be seen to be composed of this mode of activity almost exclusively at the sensorimotor stage of development.
This stands in opposition to the more logical, deliberative mode of behavior known as “unready-to-hand”, which allows one to “step back” from the current situation and analyze it methodically to deduce facts logically, something that is unique to human beings. While we do not cover “unready-to-hand” behavior in this paper, we can point to direct connections between the mechanisms covered here, such as internal generative models, and their “offline” use to run simulations uncoupled from the present environment. In this way the same mechanisms can be employed to transform motor content into conceptual content, making it compatible with deliberate analysis. This is also consistent with simulation-based accounts, which point to a kind of content similar to that originally experienced but with less “directness” or “richness”.
We are aiming for an account of development that can explain how a human being can develop a “sense” of how familiar situations regularly transpire and can learn how to deal with them over the course of multiple experiences so that one can react adaptively and fulfill one’s own needs, or in other words engage in “concerned absorption in the world”  . Dreyfus describes this issue with respect to how one deals with the situation of “being in a room” as follows:
“In dealing with rooms I am skilled at not coping with the dust, unless I am a janitor, and not paying attention to whether the windows are opened or closed, unless it is hot, in which case I know how to do what is appropriate. My competence for dealing with rooms determines both what I will cope with by using it, and what I will cope with by ignoring it, while being ready to use it should the appropriate occasion arise.” (Dreyfus 1993, p. 11)
Here we see an instance of the “intra-context frame problem” as discussed in  , which deals with how a mechanistic system can exhibit flexible and appropriate behavior within a context, as opposed to an inter-context frame problem which deals with how such a system can deal with innumerable such contexts in the real world. It is of note that Wheeler proposes “special-purpose adaptive couplings” (SPACs) or closed sensorimotor loops between the body/environment, to deal with the intra-context frame problem. This is not unlike the treatment we propose here which is an embodied view dependent on the development of sensorimotor behavioral structures that we refer to in this paper as “schemes”. Further, we believe that the use of a model with multiple hierarchical layers could encode appropriate behavioral schemes to solve the inter-context frame problem, thus acting as an extension or generalization of the SPAC approach.
From an enactive standpoint this only solves half the puzzle, as an adaptive agent capable of appropriate behavior in a context still must be motivated by goals, “intrinsic teleology”  , or “for-the-sake-of-which”  for it to be considered truly capable of meaningful behavior. Discussing development as a truly ground up process starting from the fetus, we incorporate the idea of purpose grounded on the maintenance of homeostasis in a “centrifugal” fashion, as higher order desires and allostasis-related behavior can all be traced back to the necessity to maintain life itself. Later on, we will discuss this further in terms of intrinsic motivations or a “need to minimize interoceptive prediction error”.
An advantage to our approach is that the ready-to-hand behaviors we discuss are encoded as “procedural memories” which are reenacted in an online fashion in response to appropriate opportunities for their execution or “affordances”  as sensed from the environment. This “direct perception” approach makes certain that the responses elicited from the organism are always contextually appropriate and situated  , which appears to be the correct approach when tackling the frame problem.
Throughout this paper, we will look at the tangible progression from the learning of sensorimotor combinations and a personal perceptual space, to the extension of encoding full behaviors which can ultimately compete against one another to create intentional dynamic behavior in the infant. We will see how an infant could learn to develop perceptual and behavioral skills from tabula rasa conditions, without assuming overly complex mechanisms built into the genetic code, but rather relying on the existing dynamics of the physical body and environment in combination with neural dynamics of the brain to allow the learning process to progress.
2. Infant Development
2.1. Fetus Development
We begin our story of development from semi-tabula rasa conditions. The fetus does not begin without any characteristics at all but is endowed genetically not only with the capability to but also the propensity to learn. What it learns initially are the associations between “actions” and their “outcomes”. In other words, basic sensorimotor behaviors are learned and their results eventually predicted, as the behaviors themselves become more “anticipatory” in nature.  explains how actions across the body must be coordinated in order to account for the complex resulting dynamics caused by each individual action. This leads to a necessity for prediction of outcomes at a local (for example limb-based) level and a global (posture-based) level in order to assure coherent movements that do not damage the body and efficiently utilize the internal and external forces to move. The nature of this prospective control is explained in terms of Tau Theory  which posits perception in terms of “action-gaps” between the current and desired states (of the body). Later we will describe much more of the perception/action functionality of organisms in terms of a similar theory known as Predictive Processing.
 describes general movements (GMs) executed from the early fetal stage up until around 5 months of age post-partum. They are not necessarily random and can contain complex sequences of activity; however, they generally require no external stimulus for activation to occur.  demonstrates how dynamical patterns of exploratory behavior such as GMs can occur spontaneously in the fetus through the use of a “minimal body” neuro-musculo-skeletal model containing requisite muscles, motor nerves, medulla, spinal cord, and the primary somatosensory (S1) and primary motor (M1) areas of the brain. Through the coupling of chaotic elements, coordinated motor patterns self-organize and remain stable due to mutual entrainment among motor elements. The included sensors and motors communicate across the dynamics of the body, forming a “coupled chaos system”. These self-organized dynamical attractors lead to emergent order in the developed coordination patterns which is dependent on the body and environment, and can be seen as a model of dynamical self-organization as seen in the developing fetus; this sort of self-organization is not necessarily dependent on cortical “commands” but is attuned to the physical dynamics external to the brain itself. The cortical model connected to this body system then learns the sensorimotor patterns as a cortical “body-image” via Hebbian learning. What this model shows us, in particular, is that complex movements could be learned by compressing the complicated brain-body-environment dynamics into maps or sensorimotor loops in the brain that could later reinitiate those movements in the right context.
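The Hebbian learning step mentioned above can be illustrated with a minimal sketch. It assumes the plain Hebb rule (weight change proportional to co-activity) with no normalization or decay, which is much simpler than the cortical model described; the function name `hebbian_update`, the learning rate, and the two-unit example are hypothetical choices for illustration only.

```python
def hebbian_update(weights, motor, sensory, rate=0.1):
    """Return new weights with w[i][j] increased by rate * motor[i] * sensory[j]."""
    return [
        [w + rate * m * s for w, s in zip(row, sensory)]
        for row, m in zip(weights, motor)
    ]


# Two motor units and two sensors (assumed toy body):
# activating motor unit 0 always excites sensor 1.
weights = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(5):
    weights = hebbian_update(weights, motor=[1.0, 0.0], sensory=[0.0, 1.0])
```

After repeated co-activation only the connection between motor unit 0 and sensor 1 has grown, which is the sense in which a sensorimotor regularity of the body becomes stored as a map.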
Compressing countless sensorimotor combinations into a more manageable set of behavioral primitives allows complex behaviors to be learned without searching an impossibly large space of sensorimotor combinations. Full behaviors can be encoded in areas such as frontostriatal/parietal cortex, leading to the encoding of action opportunities or “affordances” which can directly compete against one another in an online fashion  . The effects of this can already be seen in the fetus’s behavior of bringing the hand to the mouth: it learns to open its mouth before the hand reaches it, demonstrating that the destination of the hand is predicted at action initiation because the entire behavior is encoded as a holistic unit.
Furthermore,  describes how it could be possible for a newborn that has never seen a face before to be capable of mimicking facial gestures such as sticking out his tongue by the construction of an internal body-image, built up from associations between multiple modalities. For example learning the position of the tongue could be done indirectly by integrating one’s own motor movements to approximate the location of the hand in “body space”, which in turn could be associated with the tactile sensation of touching the tongue, itself also associated with the motor action of sticking out the tongue. Many more studies on body image construction are given in the review in  .
Learning of the structure of the body and environment, as well as the interactions between them in an autonomous way without any prior knowledge has been covered with respect to perceived sensory regularities dependent on one’s own actions, or “sensorimotor contingencies”    . In these studies involving artificial agents, perception itself is not built-in to the system, and the agent must learn how to perceive structure from the “firehose” of continuous sensory data  .
Following the spontaneous reflex and initial prospective behavior of the fetus, the newborn engages in “writhing” movements or alternating flexion and extension of limbs  . This leads to a characteristic U-curve in the complexity of behaviors engaged in by the infant after 2 months post-term, as the writhing movements disappear and are replaced by “fidgeting” of the limbs in various directions and at various speeds. After around 5 months post-term, voluntary motor activity takes over from the random movements.
The U-curve of behaviors has been linked to the temporary freezing and freeing of degrees of freedom (DOF) in the body  . This seems like a plausible explanation in light of the solutions discussed in active learning models such as R-IAC  , where the sensorimotor space is cut into separate partitions based on the learning progress of each, in order to efficiently optimize the learning of each subspace. This is a way of breaking up the problem into smaller chunks, and is related to the discussion regarding local vs. global resolution of uncertainty in  , which shows that, depending on the complexity of the task, it may be more effective to concentrate on local distinctions pair-by-pair than to try to choose the best hypothesis to test from the entire set of hypotheses. This could be an answer to the long-examined problem of harnessing the high DOF of the body during motor development  .
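The idea of directing exploration by learning progress can be sketched as follows. This is not the published R-IAC algorithm: it assumes fixed regions and a greedy choice, and the class `Region`, the window size, and the function `pick_region` are hypothetical names introduced for illustration. Learning progress is taken here to be the drop in mean prediction error between the older and newer half of a sliding window.

```python
from collections import deque


class Region:
    """One partition of the sensorimotor space (hypothetical structure)."""

    def __init__(self, name, window=4):
        self.name = name
        self.window = window
        # Keep only the most recent 2 * window prediction errors.
        self.errors = deque(maxlen=2 * window)

    def record(self, error):
        self.errors.append(error)

    def learning_progress(self):
        """Drop in mean error between the older and newer half of the window."""
        if len(self.errors) < 2 * self.window:
            return float("inf")  # regions with too little data are tried first
        errs = list(self.errors)
        older = sum(errs[: self.window]) / self.window
        newer = sum(errs[self.window:]) / self.window
        return older - newer


def pick_region(regions):
    """Greedy choice: practise where error is currently falling fastest."""
    return max(regions, key=lambda r: r.learning_progress())


mastered = Region("mastered")
learnable = Region("learnable")
noisy = Region("unlearnable")
for step in range(8):
    mastered.record(0.01)               # error already low and flat
    learnable.record(1.0 - 0.1 * step)  # error steadily falling
    noisy.record(0.9)                   # error high and flat
chosen = pick_region([mastered, learnable, noisy])
```

Under these assumptions the agent ignores both what it has already mastered and what it cannot yet learn, and concentrates on the region where its predictions are improving, which is one way the "freezing" of currently unproductive degrees of freedom could arise.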
2.2. Piaget and Infant Development
Jean Piaget did a great amount of work in cataloging and formalizing the developmental stages of infants  . He divided cognitive development into four broad stages: Sensorimotor, Preoperational, Concrete Operational and Formal Operational. We shall give special attention to the Sensorimotor Stage, which occurs during the first 18 to 24 months after birth, as it shows us how to get from simple “reflex-like” actions to more complex, manifestly intentional forms of behavior. Another aspect of Piaget’s work was to describe the exploratory behavior of children generally referred to as “babbling” or at a higher level “play”. He defined these behaviors as “circular reactions”, repetitions of certain movements that confer affective reward on the child. Through such behaviors the child learns to take control of his own body, and in this way bootstraps the ability to engage in more complex interactions with the world.
In this aspect of playful behavior, we see an inherent “directedness” or “intentionality” that motivates the child to explore his capabilities. This sort of curiosity has been previously described as a need to restore cognitive/perceptual “coherence” by reducing “uncertainty”, which is described as an inherently pleasurable or rewarding process   . Here we see an interesting parallel to the idea of uncertainty minimization related to the free energy principle  which we will cover in detail later as a governing force in motivating the behaviors of all organisms.
We see in  an account of sensorimotor intentionality that affords a developing human fetus the ability to explore and learn its action space in the limited physical environment in the womb. But how does the postnatal infant proceed to advance after being exposed to the outside world, with orders of magnitude higher levels of complexity? In addition to the comparatively limitless possibilities for action in the world, the infant now also acquires new abilities for sensation―no longer limited to the proprioceptive and tactile modalities, as his eyes, ears, and nose open up for the first time  .
While the development of skill in such an unbounded space seems like an impossible problem, humans are capable of following a structured path through various phases of development which leads them from simple sensorimotor intentionality to the kind of abstract planning and reasoning seen in adults. In order to follow this trajectory, we refer to the work of Jean Piaget to guide us through the ensuing development of the infant.
During the Sensorimotor Stage, in an analogous fashion to the developing embryo, infants can be seen constantly experimenting, learning the multimodal perceptual effects of their actions on the surrounding environment. Again we can ascribe an intentional aspect to this hypothesis testing form of behavior, as the child constructs his own sensorimotor contingency library, and continuously heightens his perceptive abilities as a result.
It follows that if the child has built up a significant library of skills (for example eye and head movements), then he would be able to truly perceive objects and understand via counterfactual prediction that an object is still present even when hidden. In this way, we can tie mastery of sensorimotor contingencies to the concept of object permanence, which indeed is one of the landmarks of the Sensorimotor Stage.
2.3. Assimilation, Accommodation, and Equilibration
As introduced by Piaget, equilibration refers to the activity of encapsulating a new aspect of the environment into the body-environment behavioral loop in order to restore the coherency lost upon the first appearance of that new feature  . This process consists of complementary assimilation and accommodation processes. Assimilation refers to the process in which an aspect of the external environment (such as an object or situation) is coupled to an internal sensorimotor structure to form a skill or “scheme”. An example is the coupling of a grasping sensorimotor reflex with a toy to form a “grasp-toy” scheme. Accommodation, in turn, consists of the process by which a pre-existing scheme is modified to incorporate a new aspect of the environment, for example, the ability to grasp a differently shaped toy.
 shows that the same concept of equilibration can also be applied at a higher level between individual sensorimotor schemes to form relations between them, leading to longer, more complex behavior. It is also shown in  how local exploration can be performed during temporary perturbation of the system, allowing not only for accommodation to take place in the original scheme but also for the branching-off of new schemes if new metastable regions are found in the search space.
This plasticity allows for the sort of continuously improved discrimination of situations as described in  , with the proficient child showing the ability to interact with more nuanced variations of the environment, such as the ability to play with a wide variety of toys each with different methods of interaction. This can also lead to the development of habits, which have been modeled previously as self-sustaining patterns of sensorimotor coordination  .
2.4. Development of Complex Behavior
So far we have looked at the role that equilibration plays in developing new basic behaviors. But how can we get from these simple unitary actions to more complex patterns of behavior that take place over longer timescales and with more than one object or set of motions in play? Further, how can the infant learn to plan entire routes of behavior to achieve a distal outcome?
Piaget also specifies an equilibration between individual schemes (relating them via a sensorimotor strategy with timing, duration, and intensity)  . Later we will talk about how high-level models or policies can be learned based on a generative model which allows for learning of sequenced actions. Equilibration on a global scale can only be achieved after tensions both within and between schemes have been resolved. Sequences of schemes can then form higher level schemes as behavior is chunked into higher order primitives   . Using sequences of sequences it is possible to expand this even further in a hierarchical fashion, allowing for time-extended behaviors to be constructed that achieve abstract or distal goals based on complex affordances for higher action  .
After learning a new set of schemes, the infant should then perceive the world differently, in terms of the newer affordances     . Here we can see how new behaviors can be both learned and absorbed into the infant’s repertoire in order to be effectively used when similar situations arise in the future.
While it is true that internal brain dynamics and generative models likely play a big part in the development of new behaviors, we should not develop an overly internalist approach. Instead, we should also focus on the role of the environment and supporting social figures such as caregivers in allowing the infant to develop his repertoire of skills in a tractable, sequential fashion.
2.5. The Role of the Environment and Scaffolding
We should not discount the role that the environment has to play in the development process. In addition to the bodily constraints which restrict the behavior of the infant to an incremental process of learning which at each stage opens up a new aspect of the environment to explore, the physical limitations of the external space also provide bounds within which to direct the process of development. For example, toys placed near the child such as rattles or stuffed animals will be noticed earlier on and much more frequently. Their affordances for gripping, shaking, and so on will lead to the development of specific sub-behaviors which influence the order of behavioral learning. It is important to focus on the limitations present at each stage of learning, as these allow the complex innumerable aspects of reality to be fenced off and reduced to a successive series of limited opportunities, each of which plays some significant part in spurring the bodily and cognitive development of the child.
Adaptation is another important aspect of the development process. Existing schemes are either expanded to assimilate more features or branched-off and accommodated to generalize to a wider range of contexts. In order to realize this sort of adaptation, it can be seen that randomness in behavior or “babbling” is required to ensure the infant acquires enough relevant experience with which to build their skill set  . There is a tension between fully random exploration which may put the infant too far outside of familiar territory to be useful (that is, to be learnable in terms of assimilation and accommodation processes) and too narrow or derivative exploration which does not afford the opportunity for fully open-ended learning. We refer back to the limitations imposed on the learning process by both the body and the surrounding environment itself as possible solutions to this dilemma. We expect that through the course of biological and cultural evolution, humans have obtained just the right balance between purely random and purely derivative exploration which relies on epigenetic, physical, and external factors in reducing exposure throughout infancy to a tractable subset of the environment while still retaining enough complexity to learn useful behavioral repertoires from.
It is important not to internalize the process of skill acquisition or focus purely on the immediate environment of the developing infant. As the child grows, his caregiver plays a huge role in what aspects of the environment he will be exposed to by engaging in “scaffolding” of the learning environment. In turn, this will also determine which sensorimotor skills develop in which order  .
In addition to selecting the objects present in the environment, the caregiver will also provide direct opportunities for learning by, for example, increasing the salience of particular toys by shaking them or allowing the child to touch them, directing the child’s attention to the toy or a single aspect of it. Imitation can also play a huge role as the caregiver demonstrates how to use a toy, or pushes the infant’s limbs through the motions required to play with it. This can be seen to rapidly advance the progress of learning new affordances   .
In addition to learning the aforementioned perception skills via scheme acquisition, the infant also learns plenty of other action-outcome contingencies such as pulling, pushing, dropping, bouncing, shaking, and so on, each of which provides a little more information about the sorts of interactions to be had in the physical environment. Furthermore, the infant may also begin to learn contingencies based purely on external physics not necessarily attributed to one’s own doing, such as gravity and friction. These observed outcomes are thought to spur the creation of an internal model which approximates real-world physics   .
Now that we have reviewed the developmental process of the infant, let us examine what attempts have been made towards modeling them and how they fared.
2.6. Previous Modeling Attempts
The principles of infant development discovered by Piaget can be seen to be fruitful sources of inspiration for constructing accounts of cognitive development   . However, previous modeling efforts have been unable to produce a tangible way to connect the tenets of Piaget’s original theory to a scalable implementation capable of developing complex (cognitive) behaviors from the ground up. Here we outline previous work in modeling stages of open-ended Piagetian development and investigate the limitations they faced.
Drescher presents one of the first attempts at computational modeling of Piagetian development, relying on a symbolized account of Piaget’s schemas, the building blocks of knowledge which are manipulated throughout the processes of assimilation, accommodation, and equilibration. Drescher’s schemas are made up of a context, an action to take in that context, and an outcome of taking the action in that context. While interesting, Drescher’s propositional approach harkens back to GOFAI, with all its inherent limitations  . Most notably, neither Drescher’s approach nor its extension of “neural schemas”  possesses the ability to generalize to the extent necessary for application in a real-world scenario. It also assumes that observations about the state of the world can be made perfectly, which is not a valid assumption in the case of real-life organisms; these require a more probabilistic view, such as the predictive processing account discussed later.
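The context-action-outcome structure described above can be written down as a small data type. This is a sketch of the general idea only, not Drescher’s implementation: the class name `Schema`, the use of sets of proposition strings, and the methods `applicable` and `predict` are assumptions made for illustration. Note that `predict` bakes in exactly the assumption criticized above, namely that the state of the world is observed perfectly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    context: frozenset  # propositions that must hold before acting
    action: str         # action to take in that context
    outcome: frozenset  # propositions expected to hold afterwards

    def applicable(self, state):
        """True if every proposition in the context holds in the state."""
        return self.context <= state

    def predict(self, state):
        """Expected next state if the action is taken (assumes perfect observation)."""
        return state | self.outcome


grasp_toy = Schema(
    context=frozenset({"hand_near_toy"}),
    action="close_hand",
    outcome=frozenset({"toy_in_hand"}),
)
state = frozenset({"hand_near_toy", "eyes_open"})
```

Because contexts and outcomes are exact symbolic matches, a schema of this kind applies or fails outright; there is no graded notion of a situation being similar enough, which is one way to see the generalization problem noted in the text.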
Law and colleagues develop an approach similar to that of the current paper, examining the developmental process from fetus to infant and applying Piagetian principles to the development of an intelligence embodied in a humanoid robot. While we feel this is a promising direction, it is limited by its use of the LCAS algorithm  to do most of the heavy lifting. The LCAS algorithm simply selects appropriate motor acts in response to stimuli based on learned stimuli-motor act mappings. In the absence of known stimuli, it performs a random action. These “sensorimotor” associations are then strengthened based on usage, similar to the “mesoscale” or habit-based approach in  . There are two issues in particular with this approach. First, the purely random “babbling” action in the absence of known stimuli does not take the dynamics of the body and environment into account, as in the account of infant GMs outlined in  . Second, the individually strengthened associations between action-outcome pairs cannot be reused or generalized via equilibration processes as in Piaget’s original theory.
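The selection scheme just described (use a learned mapping if one exists, otherwise act randomly, and strengthen associations with use) can be sketched in a few lines. This is an illustration of that verbal description only and should not be read as the LCAS algorithm itself; the class `AssociativeSelector`, its method names, and the additive strengthening rule are all hypothetical.

```python
import random


class AssociativeSelector:
    def __init__(self, motor_acts, rng=None):
        self.motor_acts = list(motor_acts)
        self.weights = {}  # (stimulus, motor_act) -> association strength
        self.rng = rng or random.Random(0)

    def select(self, stimulus):
        """Pick the most strongly associated act, or a random one if none is known."""
        known = {a: w for (s, a), w in self.weights.items() if s == stimulus}
        if not known:
            return self.rng.choice(self.motor_acts)  # random "babbling"
        return max(known, key=known.get)

    def reinforce(self, stimulus, motor_act, amount=1.0):
        """Strengthen one stimulus-act association based on usage."""
        key = (stimulus, motor_act)
        self.weights[key] = self.weights.get(key, 0.0) + amount


selector = AssociativeSelector(["reach", "kick", "look"])
first_try = selector.select("rattle")  # nothing learned yet, so random
selector.reinforce("rattle", "reach")
selector.reinforce("rattle", "reach")
selector.reinforce("rattle", "kick")
preferred = selector.select("rattle")
```

Both limitations discussed above are visible in the sketch: the fallback draws uniformly from the motor acts with no regard for body dynamics, and each table entry is independent, so nothing learned for one stimulus transfers to a similar one.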
 gives a thorough overview of the many lines of research which have attempted to model Piaget’s notion of “schemas” in particular. Many of these can be seen to suffer from the same limitations as described above.
A mechanism for chaining of sensorimotor schemas into behavioral sequences as described earlier can be found in  , which gives an account using a hierarchy of stable heteroclinic channels (SHCs) together with a generative model based on the free energy principle, and shows how metastability can be used by a dynamical system to push it through learned sequences of behavior. This account is particularly attractive due to the other work that has been done in showing the biological plausibility of the free energy principle and the power of hierarchical generative models that deal with prediction errors, which we describe later. Additionally, it allows for the separation of timescales which can implement behaviors at both the slower high level such as complete plans, and the faster low level such as individual movements, as described in the context of competition in  .
We have taken a look at various aspects of infant development and how they pertain to the initially discussed problem of open-ended learning of complex behaviors in an intractable environment. We also looked at previous attempts to model Piaget’s principles of development and the limitations they faced. Armed with this knowledge we next take a look at a set of frameworks compatible with the 4E cognition approach that more closely resonates with Piagetian development and in the process make strides towards solving the issues inherent in traditional AI and cognitive science such as the frame problem.
3. Sensorimotor Contingencies, Predictive Processing, and Free Energy
In this section, we look at three newer theories in cognitive science compatible with the 4E approach―Sensorimotor Contingency Theory, the Free Energy Principle, and Predictive Processing. We then rely on the introduced principles to develop a consistent account of infant development that could be realistically based on the known limitations and features of the human brain and body and its surrounding environment.
3.1. Sensorimotor Contingencies
Previously we have seen evidence that the development of simple reflex-like bodily behaviors into more fully formed voluntary movements hinges upon the learning of “sensorimotor mappings” in cortex which can be replayed in the correct context to re-enact previously experienced behaviors. While many models of such mappings have been attempted    , few have been able to explain how it would be possible to develop the complex properties of perception and action such as personal spaces and perceptual presence from the ground up―necessary if one is to build an agent without making too many a priori assumptions about its innate mechanistic library.
Here we focus on one theory in particular, the Sensorimotor Contingency Theory (SMCT)  , which is centered on the concept of a sensorimotor contingency (SMC), or a probabilistic account of what outcomes follow from executing certain bodily actions in certain contexts, and is a noteworthy member of the enactive tradition.
While this account shares a lot in common with the forward model approach in robotics   , the difference lies in what this theory uses those models for. In essence, instead of merely predicting the consequences of actually executed actions (for example, to compensate for sensory delays), SMCT puts an emphasis on the “counterfactual”, or outcomes that would happen if one were to execute some particular action in some particular context. This is linked to the ability to perceive entities as “perceptually present” or “subjectively veridical” by being able to predict a great variety of counterfactual outcomes of interactions with that entity. For example, when I see a tomato I can predict what I would see if I were to turn it around, and I can predict what it would feel like if I were to squeeze it in my hand. Thus, even if certain aspects of an object’s visual and tactile nature are not presently exposed to a subject, they can still be “known” to be present, due to that subject’s “mastery” of the SMCs related to that object. This is the central idea of SMCT.
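The difference between factual and counterfactual prediction can be made concrete with a simple frequency-based sketch. It assumes discrete contexts, actions, and outcomes and estimates outcome probabilities by counting, which is only one of many ways a contingency could be represented; the class `ContingencyModel` and the tomato labels are hypothetical.

```python
from collections import Counter, defaultdict


class ContingencyModel:
    def __init__(self):
        # (context, action) -> counts of outcomes observed after that action
        self.counts = defaultdict(Counter)

    def observe(self, context, action, outcome):
        """Record the outcome of an action that was actually executed."""
        self.counts[(context, action)][outcome] += 1

    def predict(self, context, action):
        """Outcome distribution for an action, whether or not it is executed now."""
        seen = self.counts.get((context, action))
        if not seen:
            return {}  # no mastery of this contingency yet
        total = sum(seen.values())
        return {outcome: n / total for outcome, n in seen.items()}


model = ContingencyModel()
for _ in range(3):
    model.observe("tomato", "rotate", "red_back")
model.observe("tomato", "rotate", "stem_side")

# Counterfactual query: the tomato is not being rotated right now.
rotate_prediction = model.predict("tomato", "rotate")
squeeze_prediction = model.predict("tomato", "squeeze")
```

On this reading, the hidden back of the tomato is "present" to the agent to the extent that `predict` returns a rich, confident distribution for actions it is not currently performing, while the empty result for squeezing marks a contingency that has not yet been mastered.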
A key aspect of equilibration theory is that only those elements of the environment which have been incorporated into a scheme can be perceived. This aligns closely with SMCT, in which it is one’s sensorimotor skills which allow one to perceive entities in the environment. It follows that undergoing equilibration can be seen, analogously to SMCT, as a process of skill accumulation by the infant.  presents a dynamical formalization which ties the two ideas of equilibration and SMCT closer together. The result is a process model which gives an explanation of the processes of assimilation and accommodation in terms of SMCT.
It is clear that an account of the cognitive development of behavior linked to SMCT would be much more powerful than something built on simpler accounts of forward models or sensorimotor maps, as it would explain not just the ability to behave, but also the perception of affordances to behave. Perhaps most importantly, it would allow us to construct an account of behavior development from blank slate conditions, as both simple and complex behaviors could be reduced to factual and counterfactual action-outcome mappings. We show how this can be achieved in Development of an Infant Revisited. First, however, it is necessary to demonstrate that SMCT can be implemented using biologically realistic mechanisms. To this end, we now investigate predictive processing, the free energy principle and their relation to SMCT.
3.2. Predictive Processing and the Free Energy Principle
Predictive processing (PP) is a computational theory which provides an explanation for many aspects of intelligent behavior―namely perception, action, and cognition. It puts emphasis on top-down “prediction” based perception as opposed to traditional theories of bottom-up “passive” perception and rests upon the idea dating back to Helmholtz (Helmholtz 1866/1962)  of the brain as an “unconscious inference engine” which constantly makes and updates predictions about the world. Prediction errors or mismatches between predictions and the actual sensory data we receive are then used to update our internal beliefs (as in Bayesian inference) and are not the content of our percepts per se. In this respect, the workings of the brain can be described in terms of the development of a probabilistic model which attempts to predict future sensory input. Already we start to see parallels between this theory and SMCT, which also deals with predicted future outcomes.
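The way a prediction error updates a belief "as in Bayesian inference" can be shown in the simplest possible case. The sketch assumes a single scalar quantity, a Gaussian prior belief, and a Gaussian sensory observation of that same quantity; these assumptions and the function name `update_belief` are ours, chosen because the update then has an exact closed form.

```python
def update_belief(prior_mean, prior_var, observation, obs_var):
    """Exact Bayesian update of a Gaussian belief given one Gaussian observation."""
    prediction_error = observation - prior_mean
    # The gain weights the error by the relative reliability of prior and senses.
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * prediction_error
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


# Prior and sensory input equally reliable: the belief moves halfway.
balanced = update_belief(prior_mean=0.0, prior_var=1.0, observation=2.0, obs_var=1.0)
# Very noisy senses: the same prediction error barely changes the belief.
noisy = update_belief(prior_mean=0.0, prior_var=1.0, observation=2.0, obs_var=99.0)
```

The percept corresponds to the updated belief rather than to the raw error, and how far the belief moves depends on how much the sensory signal is trusted relative to the prior prediction.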
The prediction errors between predicted and actual input can be used in two ways: either to update our internal belief models (perception) or to spur action which can affect the external world in such a way that it will conform to our internal predictions or expectations of it. The latter of these is known as Active Inference  . Both perception and behavior depend on the existence of the internal “generative model” which can be extended to a hierarchical generative model or HGM  , involving much more complex predictions over longer timescales as the hierarchy is ascended. Predictions are sent down the hierarchy, while prediction errors ascend the hierarchy. Different hierarchies can exist for different sensorimotor modalities, and proprioceptive predictions which descend to the lowest layers of the hierarchy produce prediction errors there which can be resolved by a physical response in the manner of a classic motor reflex arc.
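The two uses of prediction error described above can be illustrated with a deliberately minimal sketch. It assumes a single scalar belief, a single scalar world state, and an identity mapping from world state to sensation; the function names `perceive` and `act` and the update rate are illustrative choices, not part of any cited model.

```python
def perceive(belief, sensation, rate=0.5, steps=20):
    """Perception: revise the belief until it predicts the sensation."""
    for _ in range(steps):
        error = sensation - belief       # prediction error
        belief += rate * error           # the internal estimate moves toward the data
    return belief


def act(belief, world_state, rate=0.5, steps=20):
    """Action: change the world until it matches the (fixed) prediction."""
    for _ in range(steps):
        sensation = world_state          # assumption: identity sensory mapping
        error = sensation - belief       # same prediction error as above
        world_state -= rate * error      # the world moves toward the prediction
    return world_state


# Both routes drive the same error toward zero, by changing different variables.
updated_belief = perceive(belief=0.0, sensation=1.0)
updated_world = act(belief=1.0, world_state=0.0)
```

In the first case the belief ends up close to the sensation; in the second the belief is held fixed and the world state is brought close to it, which is the sense in which action fulfils predictions.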
In the free energy principle (FEP)  , we find the link between PP and its biological mechanisms, based on the idea that neuronal populations encode predictions about the external “hidden” state of the world by changing the strength of their synaptic connections so as to descend the gradient of free energy, a quantity defined as an upper bound on surprisal (in information theoretic terms) about encountered states. This idea can be traced back to early cybernetics, as  tells us that the objective of any organism is to minimize the dispersion of its external states with respect to action.
As full Bayesian inference about the world is an intractable problem, the next best thing is to attempt to represent the posterior distribution of possible hidden states given sensory input in an approximate fashion using variational Bayes techniques, and this lies at the heart of the FEP. The divergence between the approximate and the true posterior distributions cannot be calculated directly, but the free energy is equal to surprisal plus this divergence and can itself be evaluated, so it represents an upper bound on surprisal. By giving a biologically realistic account of perception and action in terms of free energy minimization, it is possible to tie PP to a plausible mechanistic account of neural activity and learning  . To summarize, updating of internal beliefs and selection of action as just described can be cast in terms of the free energy formulation, simply by performing gradient descent with respect to the free energy; that is, all perception and action as well as the learning of synaptic connections serve to decrease the free energy of the system. This, in turn, leads to behavior which reduces the surprisal encountered by the organism, which translates to behavior that keeps the organism within safe ecological bounds, or in other words, “surviving”.
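The bound referred to above can be written out explicitly. The notation here is an assumption introduced for illustration: $o$ denotes sensory input, $s$ the hidden states, $p(o, s)$ the generative model, and $q(s)$ the approximate posterior encoded by the system.

```latex
\begin{align}
F &= \mathbb{E}_{q(s)}\left[\ln q(s) - \ln p(o, s)\right] \\
  &= D_{\mathrm{KL}}\left[\,q(s)\,\|\,p(s \mid o)\,\right] - \ln p(o) \\
  &\geq -\ln p(o)
\end{align}
```

Because the Kullback-Leibler divergence is never negative, $F$ is always at least as large as the surprisal $-\ln p(o)$, and lowering $F$ with respect to $q(s)$ brings the approximate posterior closer to the true one.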
3.3. Sensorimotor Contingencies in Terms of Predictive Processing
The principle of descent down a free energy gradient encapsulates and formalizes the cybernetic idea of dispersion minimization, based on the notion of an internal generative model which predicts the hidden causes of sensory signals as they are experienced, updating its approximation of this posterior density as new information arrives. However, we are interested in expanding this idea past the “seen”, toward the “unseen” or fictive future events which would be experienced if we were to engage in a particular action―or in other words, the sensorimotor contingencies of the current (bodily and environmental) context.
Additions to PP allow it to describe exactly these fictive events, allowing the generative model to explicitly incorporate counterfactual “would-be” probabilities of how sensory input would change dependent on possible future actions.  explains in detail how such an account of SMCT allows one to explain aspects of synesthesia and its accompanying lack of perceptual presence, which had heretofore been a notable omission from the theory. This work draws heavily on the additions to the FEP outlined in (Friston et al. 2012)  , which describes a model of saccadic eye-movements which are used to test hypotheses of external causes. The saccades chosen by the model are those which are expected beforehand to be the best actions for reducing counterfactual uncertainty in a particular (visual) context. That is, by encoding counterfactual probabilities, the most “salient” action can be selected as that which will best prove or disprove the current working hypotheses regarding the hidden state. By casting extrinsic reward and intrinsic information in the same currency (negative free energy), it is then possible to compare both infotropic and reward-seeking behavior under the same framework, which at higher levels produces optimal behavior for survival, incorporating the benefits from both exploration of the environment and exploitation of its resources.
Furthermore,  develops this idea, incorporating multiple types of epistemic value: the aforementioned contingency uncertainty resolution or “ignorance reduction”, ambiguity reduction (about the current context), and risk reduction (about the current behavioral policy). This provides a fuller account of curious, epistemically valuable behavior in general.
Meanwhile,  takes a more pragmatic approach to combining SMCT and PP by investigating what computational principles can be developed that incorporate the requisite aspects of the two paradigms in a manner accessible to autonomous robots. In particular, the goal of this work is the acquisition of generic knowledge via a world model which can then be used to support adaptive behavior in novel environments. It too is capable of testing hypotheses and using the resulting information to change its beliefs and of selecting an appropriate model with which to predict external causes.
We now rely on the merits of SMCT and PP to provide an account of sensorimotor development from fetus to infant.
3.4. Development of an Infant Revisited
Previous work has been done on tying the work of Piaget to sensorimotor contingency theory  , in which Piaget’s theory of equilibration is used as a way of explaining how SMCs can be learned by humans in an open-ended fashion. Here, we attempt a mechanistic account of the development process of an infant as introduced earlier which utilizes the principles of SMCT, PP, and FEP.
According to the generative model described by Friston, as long as the infant’s HGM has prior beliefs that he will acquire the correct sensory information (fictive sampling) necessary to minimize the uncertainty about its cause, then he will perform actions that achieve this. This works because in the HGM, there is a feedback loop allowing the transmission of (prediction error) messages to the hidden controls (the subset of hidden states which the infant has control over), and allowing the receipt of indirect feedback from the hidden controls via the hidden states they affect, which in turn affect the sensory information acquired by the model again (perception). This allows the infant to actively sample the sensory data which best reveals the hidden causes of that data―or in other words to test his hypotheses about the world and minimize the uncertainty of his inferences.
For example, since the infant expects hidden controls to minimize counterfactual uncertainties about hidden states, he can internally postulate a hypothesis such as the following: “Would a previously experienced novel body configuration occur again if I selected a particular torso-turning hidden control?”
This can be seen as the reason why, upon encountering a novel body configuration state or sensory outcome, the child can be seen attempting to recreate that outcome  ; by maintaining a high prior belief that he will experience the novel state again, the child reconstructs the previous behavior that led to it. Aside from the fact that there is a complex sequence of motions leading up to the novel outcome, this can be explained in the same terms as Friston’s saccade-to-butterfly, as stated:
“similar principles should apply to active sampling of any sensory inputs. For example, they should apply to motor control when making inferences about objects causing somatosensory sensations.” (Friston et al. 2012, p. 2)
That is, the infant’s HGM holds a high prior belief about the novel body configuration experienced. This creates a proprioceptive prediction error, leading to action in the reflex arcs which bring the body into this configuration. Another way to state this is that the behavior had high salience, which in turn gave it high competitive value over competing behaviors  . Engaging in this behavior resolved a portion of uncertainty about the dynamics of the infant’s body, and so it had epistemic value. Once the contingencies connecting this behavior to its possible sensory outcomes have been learned to a certain degree, the behavior in question loses its salience due to it no longer providing epistemic value, and the babbling process continues to follow the “explorative policy” defined in the higher level, allowing the infant to select other movements which increase novelty.
Going back to the notion of selective restriction of exploratory movements required for incremental learning  , we can now point to a mechanism for achieving this: giving a low prior to movements involving certain DOF. In this case, the “competition space” of hypotheses would be effectively reduced. Not only would the infant not attempt to move in those DOF, but the sensory outcome of the babbling actually performed would also be analyzable in terms of a reduced hypothesis space, allowing for development of subareas of the sensorimotor space at a faster pace. Once enough subareas had been learned, the DOF restriction could be (epigenetically) removed, allowing the newly learned behaviors to be used together. Their coordination would lead to much more complex behavior, as is seen from 3 months onward as the infant ascends out of the U-shape of GM behavior.
3.5. Bayes Optimal Control under Uncertainty
The infant expects hidden controls to minimize counterfactual uncertainty about hidden states (in other words, it acts to reveal information about hidden states). The hidden controls selected are those which minimize the entropy or “surprise” of the counterfactual density or “fictive outcomes”. This density depends on the selected hidden control itself and the future conditional expectations about hidden states resulting from selecting that control.
By defining salience as a measure of certainty of counterfactual belief then, we can restate the above by saying that hidden controls are expected to sample salient features (thus minimizing the dispersion of the counterfactual density)―“If I were to raise my arm in this way, the uncertainty of the body state that followed would be less than that of other body states following other actions.”
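As a rough illustration of this selection rule, the sketch below scores each candidate control by the negative expected entropy of the belief about hidden states that would follow from it, and picks the highest-scoring control. It is a discrete simplification made for this example, not the continuous scheme of Friston et al. (2012); the control names, the two-state prior, and the likelihood tables are all hypothetical.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    nonzero = p[p > 0]
    return float(-(nonzero * np.log(nonzero)).sum())


def expected_posterior_entropy(prior, likelihood):
    """Average uncertainty about hidden states after a fictive outcome.

    prior[s] = p(s); likelihood[o, s] = p(o | s, u) for one candidate control u.
    """
    joint = likelihood * prior               # p(o, s | u), shape (outcomes, states)
    p_outcome = joint.sum(axis=1)            # p(o | u)
    total = 0.0
    for o, p_o in enumerate(p_outcome):
        if p_o > 0:
            total += p_o * entropy(joint[o] / p_o)   # weight H[p(s | o, u)] by p(o | u)
    return total


def most_salient_control(prior, likelihoods):
    """Salience is taken here as negative expected posterior entropy (an assumption)."""
    salience = {u: -expected_posterior_entropy(prior, lik) for u, lik in likelihoods.items()}
    return max(salience, key=salience.get), salience


# Hypothetical example: two body states, two candidate controls.
prior = np.array([0.5, 0.5])
likelihoods = {
    "raise_arm": np.array([[0.9, 0.1], [0.1, 0.9]]),    # outcome depends strongly on the state
    "stay_still": np.array([[0.5, 0.5], [0.5, 0.5]]),   # outcome says nothing about the state
}
best, salience = most_salient_control(prior, likelihoods)
```

Here `raise_arm` is selected, because its fictive outcomes would leave less uncertainty about the body state than those of `stay_still`.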
At the highest level of the HGM we can expect a sequence or policy of prior beliefs which produces predictions about the entire body that involve the coordination of all limbs and their mutual effects. Not only this, but this level could also explicitly code a counterfactual density that dictates which (whole-body) behaviors would minimize counterfactual uncertainty about fictive future states. This density in effect encodes sensorimotor contingencies at the full body level.
As we descend the stages of the hierarchy, we see similar densities, albeit more localized to particular regions or limbs, ultimately ending in the reflex arc of the spinal cord. Any prediction error which makes it to the spinal cord has now been heavily contextualized by the layers above, meaning that any simple action at this layer is already coordinated with the other innumerable muscles of the body, and is cooperating to achieve a “higher level” behavioral activity in a global sense.
3.6. An Account of Babbling
As the infant develops, he engages in semi-random babbling movements in an effort to discover his own body. How can we explain this process in terms of an infant in possession of a generative model? Friston believes that the exploration-exploitation dilemma  can be overcome by attributing to infants prior beliefs that confer Bayes optimal behavior on them if they act to minimize expected free energy of future outcomes  .
“an agent should select actions that improve learning or prediction, thus avoiding behaviors that preclude learning (either because these behaviors are already learned or because they are unlearnable).” (Friston et al. 2015, p. 21)
The key here is the use of a hierarchical generative model with separation of timescales. While it is true that pure free energy minimizing behavior on the lower scale (such as at the level of reflex-arcs) would be necessarily reactive and exploitative in character, by employing the same principles at all layers of the hierarchy, increasingly higher layers can contextualize the lower layers with ever more subtle and long-term “plan-like” behavioral biases. Since the higher levels also seek to minimize future expected free energy, it follows that the model would incorporate explorative behaviors into its policy in order to acquire more knowledge about the world and act on it in a more sophisticated manner. In other words, the infant will necessarily acquire epistemic behavior and be drawn to actively sample novel contingencies. This is the essence of “babbling” in motor and other modalities.
Babbling could work to create new behaviors as follows. While executing sensorimotor sequences, the infant encounters novel experiences (in terms of body configuration/dynamics) via perturbations which allow the behaviors to momentarily go outside of the known behavioral pattern. If such perturbations were large enough they would cause a tension of uncertainty, and this could be incorporated into the counterfactual density of the HGM and reutilized later on to generate behaviors repeatedly in that direction. These would sample the salient features of the environment or body configuration in question, thus shedding light on their causes and reducing their associated uncertainty, and providing a rewarding experience in the process. In turn, equilibration would allow a new scheme to branch off, as described in  . To an external observer, this whole process seems to be nothing more than random flailing movements. Upon deeper inspection, however, we see a clear route of intentionality, purpose, and focused learning from start to finish.
There is another form of purpose we might also look into. In  , the likelihood of selection of any particular policy (behavior) is linked not only to the minimization of uncertainty about related hidden states, but also the (stochastic) causal relation between hidden states and their sensory outcomes. In essence, the infant must not only increase his knowledge of states in the world and how his own actions influence them, but also the probabilistic relation between those states and what he actually perceives. Friston’s experiments demonstrate this in the act of rule or context learning, in which the agent must infer the currently active rule in order to make correct decisions. This leads to an agent which is capable of making explorative actions in the short term in order to minimize his long-term free energy with respect to behavior. We can imagine that a similar process is occurring during the learning process of the babbling infant.
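The dependence of policy selection on both kinds of uncertainty can be made concrete with a one-step, discrete sketch of expected free energy, written as the sum of a risk term (divergence of predicted from preferred outcomes) and an ambiguity term (expected uncertainty of the mapping from hidden states to outcomes). This decomposition is common in the active inference literature, but the code is a simplification built for this illustration; all distributions are hypothetical and assumed strictly positive so that the logarithms are defined.

```python
import numpy as np


def expected_free_energy(q_states, likelihood, preferred_outcomes):
    """One-step expected free energy of a policy (risk + ambiguity).

    q_states[s]           = q(s | policy), predicted hidden states under the policy
    likelihood[o, s]      = p(o | s), the state-outcome mapping
    preferred_outcomes[o] = p(o), prior preferences over outcomes
    """
    q_outcomes = likelihood @ q_states                        # q(o | policy)
    risk = np.sum(q_outcomes * (np.log(q_outcomes) - np.log(preferred_outcomes)))
    outcome_entropy = -np.sum(likelihood * np.log(likelihood), axis=0)   # H[p(o | s)] per state
    ambiguity = outcome_entropy @ q_states
    return float(risk + ambiguity)


def policy_probabilities(G, precision=1.0):
    """Softmax over negative expected free energy."""
    G = np.asarray(G, dtype=float)
    weights = np.exp(-precision * (G - G.min()))              # shift by the minimum for stability
    return weights / weights.sum()


# Hypothetical two-state, two-outcome example with two candidate policies.
likelihood = np.array([[0.9, 0.1], [0.1, 0.9]])
preferred = np.array([0.8, 0.2])
G_a = expected_free_energy(np.array([0.9, 0.1]), likelihood, preferred)
G_b = expected_free_energy(np.array([0.1, 0.9]), likelihood, preferred)
probs = policy_probabilities([G_a, G_b])
```

In this example the two policies have the same ambiguity, so the policy whose predicted outcomes lie closer to the preferred ones receives the higher probability.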
So far we have looked at three notable additions to the study of cognitive science: Sensorimotor Contingency Theory, the Free Energy Principle, and Predictive Processing. Based on these theories we have been able to posit an account of infant development in a manner compatible with the embodied and enactive views of cognitive science. In Learning and Responding to Affordances for Behavior, we expand upon this and focus on the complex behavioral repertoire acquired and used by an infant and how he is able to select among competing opportunities for action, something that we will see is the cornerstone of human development itself. We then complete our account by tying all of these aspects together in An Account of Learning.
4. Learning and Responding to Affordances for Behavior
In the previous two sections, we examined the developmental process of an infant and several new ideas from the field of cognitive science which could help explain how an infant learns his behavioral repertoire in terms of Piagetian sensorimotor scheme acquisition and development of an internal generative model. In this section, we focus more on the interaction between infant and environment as the infant learns to utilize his repertoire in an appropriate way. Specifically, we will examine how the infant develops the skill to perceive affordances in the environment and to select between multiple possible behaviors in any given instance, by learning to become sensitive to “solicitations”, or those affordances most relevant to his own concerns.
4.1. Competition between Behaviors
There is strong evidence to suggest that multiple opportunities for action are specified simultaneously in the primate cortex depending on the currently perceived context   . A key question is how such a primate can select the most appropriate action, given the available affordances. Not only this, but how does one avoid being simply “drawn” to everything in the immediate environment, as seen in Utilization Behavior  , and instead divert resources to “plans” over longer timescales, while avoiding short-term rewards? What separates those affordances which “solicit” behavior from those that do not   ? Furthermore, how does one select between more “abstract” or underspecified goals, with no particular implementation immediately available   ?
The emergence of behavioral sequences and their connection to short-term goals and intentions has been previously addressed from a dynamical systems point of view    . The idea of selecting behavioral sequences or policies based on prior probabilities of success has also been covered with respect to the free energy principle (Friston et al. 2014). There appears to be a hierarchy of expectations and desires at play, and in order to develop a coherent account of behavioral development we must address which possible mechanisms could implement this hierarchy. Notably, we are interested in how such a complex variety of interacting processes can be assembled and tuned to act appropriately, given a history of experience in the world. In essence, this will lead us to an account of how “ready-to-hand” behavior comes about, as the adult effortlessly traverses his surroundings, responding appropriately to the affordances present in his environment.
More specifically, we wish to find biologically realistic mechanisms that could be utilized by an infant to develop complex behavior, while also demonstrating how short-term actions can compete with behaviors that take longer to carry out. The Affordance Competition Hypothesis (ACH)  and its hierarchical extension  make great strides towards this end. Here we can see how two behaviors on different timescales can compete against one another, such as grasping a nearby small fruit vs. grasping a larger fruit some distance away. The key is in the idea of a “distributed consensus”, which allows localized behaviors, presumably represented in sensorimotor cortex, to compete against one another and against longer term “plans” for action, presumably represented in frontal cortex, via a hierarchical message passing scheme. We also see how biasing inputs from several other brain regions can provide more information for resolving the competition  , for example, information about valence or biomechanical costs  , as well as information regarding opportunities to create future affordances  . This last point shows how it is possible to bridge the gap between immediate competition of affordances and longer-term plans which incorporate upcoming opportunities. We see that in order to learn adaptive behavior, it is necessary for the infant to remember such things as what opportunities follow from what scenarios, the valence of particular outcomes, and the biomechanical costs associated with performing particular behaviors. Once learned, these sources of information can aid the distributed competition or “consensus” among various layers of sensorimotor hierarchy, triggered by the relevant contextual stimuli in the environment.
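The kind of biased competition described above can be sketched with a toy dynamical model in which each candidate behavior accumulates activity from sensory evidence and biasing inputs while inhibiting its rivals. This is an illustrative mutual-inhibition model written for this passage, not the ACH implementation; the parameter values and the fruit example numbers are assumptions.

```python
import numpy as np


def compete(evidence, bias, inhibition=1.5, rate=0.1, steps=300):
    """Let candidate behaviors compete through mutual inhibition.

    evidence[i] : sensory support for behavior i (e.g. how reachable a target is)
    bias[i]     : biasing input for behavior i (e.g. valence minus biomechanical cost)
    """
    evidence = np.asarray(evidence, dtype=float)
    bias = np.asarray(bias, dtype=float)
    activity = np.zeros_like(evidence)
    for _ in range(steps):
        rivals = activity.sum() - activity               # summed activity of the other options
        drive = evidence + bias - inhibition * rivals
        activity += rate * (-activity + np.maximum(drive, 0.0))   # leaky, rectified update
    return activity


# Hypothetical example: option 0 = nearby small fruit, option 1 = distant large fruit.
without_bias = compete(evidence=[1.0, 0.9], bias=[0.0, 0.0])
with_valence = compete(evidence=[1.0, 0.9], bias=[0.0, 0.3])
```

Without any bias the nearer option suppresses the other; adding a valence bias in favor of the larger fruit reverses the outcome, which is the role the text assigns to biasing inputs from other brain regions.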
It is possible to connect this to the idea of a hierarchical generative model which encodes beliefs about the environment―except now instead of just encoding sensory (exteroceptive) and motor (proprioceptive) components, we extend the idea to other modalities of information such as “valence” and “biomechanical cost”.  describes an account involving different types of prediction errors (PEs). In addition to “lower level” perceptual and “higher level” cognitive PEs, we are also introduced to a motivational or “signed” PE, which contains information about the valence or direction in which a prediction about the environment was incorrect―for example whether an outcome was “better than” or “worse than” expected. Such prediction errors are analogous to the reward prediction error treated by traditional reinforcement learning theory and give a normative or motivational character to behavioral outcomes, which is directly related to the ideas of cost and reward.
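A signed prediction error of this kind can be illustrated with a simple delta-rule update of the sort used in reinforcement learning. The function name, the learning rate, and the numbers are assumptions made for this example rather than details of the cited account.

```python
def update_expected_value(expected, outcome, rate=0.2):
    """Update an expected value from a signed (valenced) prediction error.

    A positive error means the outcome was better than expected,
    a negative error means it was worse than expected.
    """
    signed_error = outcome - expected
    return expected + rate * signed_error, signed_error


# Hypothetical example: the outcome (1.0) is better than the expectation (0.5).
new_expectation, error = update_expected_value(expected=0.5, outcome=1.0)
```

The sign of the error determines whether the expectation attached to the behavior is raised or lowered, which is what gives the outcome its motivational character.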
We can also rely on an HGM account to interpret the results in  , which show how performing actions can lead to better perceptual judgments about affordances. We interpret these actions as “hypothesis testing” or “infotropic” in nature, as their execution is necessary in order to update beliefs (in the internal HGM) about the applicability of the affordances in question, based on physical and internal measures of normativity.
In sum, by learning a multimodal, valenced generative model of the environment, we see how an infant could choose between simultaneously presented opportunities for behavior in complex environments by taking into account previous similar experiences and favoring those choices which previously led to rewarding or positively valenced outcomes. In this sense, the infant can be said to become “sensitive” to relevant affordances or solicitations that favor his own concerns.
4.2. Planning Behavior
We have seen how behaviors could compete and complement each other in an affordance competition scenario. We now wish to ascertain the differences between immediate “reflex-like” and long-term “planned” behavior.
Behavioral learning has been previously organized into two classes: model-based and model-free learning, and the relation of these two types of learning to the FEP has also been previously discussed   . Model-based learning relies on constructing an internal probabilistic model of external states and transitions, and calculating the reward of a particular action consists of aggregating the predicted future expected reward based on the transitions expected by taking said action. In model-free learning, simple stimulus-reward mappings are created so that any action can be taken swiftly based only on the immediate context. It has also been suggested that these two forms of learning could be the extremes of a continuum of learning approaches, each using more or less predictive behavior, or prospection, to determine the value of immediate actions  . Further, implementations of these strategies in a PP approach could be mixed by utilizing task or context-based variations on precision control, thus tuning the “gain” on each particular strategy  .
A promising way to combine this paradigm with inference based models can be found in the idea of internally generated sequences (IGSs)  . Based on the idea of sequences of activity generated in the brain during periods of vicarious trial and error in the rat (simulation of future paths at decision points), IGSs allow for planning ahead or “sampling” of future probabilities over multiple timescales. We can liken this sampling or simulation not only to model-based approaches on a larger timescale for policy sampling as described in   but also to reflex-like model-free approaches on a shorter timescale, selecting actions by sampling probability distributions of immediate outcomes. For example, in vicarious trial and error in a T-maze, a rat can serially simulate each possible choice in the maze by sampling complete policies for each direction available or “leg” of the maze. Here we can also see how model-free “habit” like behavior can be formed through repeated experience utilizing the same HGM framework, as after multiple encounters with the same environment the rat will eventually learn to associate the perceptual (context) cues with the correct decision directly  , and no longer have to go through the costly process of simulating. Pezzulo describes this process in detail as follows:
“When a given action plan has reliably yielded reward in the past, this prior information has a high precision. Under this condition, inference may be avoided to save energetic costs and enable rapid action. The agent can directly drive behavior using the cached action values of the model-free system.” (Pezzulo et al. 2014, p. 8)
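The arbitration described in this passage can be sketched as a simple rule: when the cached values for a context are held with high precision, act on them directly; otherwise fall back on simulation. Everything in the sketch, including the threshold, the dictionary layout, and the stand-in `simulate_t_maze` function, is a hypothetical illustration rather than the mechanism proposed by Pezzulo et al.

```python
def choose_action(context, cached_values, cached_precision, simulate, threshold=0.9):
    """Use cached (habitual) values when they are precise enough, else plan.

    cached_values[context]    : dict mapping action -> cached value
    cached_precision[context] : confidence in those cached values, between 0 and 1
    simulate(context)         : costly model-based evaluation, returns action -> value
    """
    if cached_precision.get(context, 0.0) >= threshold:
        values, mode = cached_values[context], "habit"
    else:
        values, mode = simulate(context), "planned"
    best_action = max(values, key=values.get)
    return best_action, mode


def simulate_t_maze(context):
    """Stand-in for serially simulating each leg of a T-maze (assumed values)."""
    return {"left": 0.2, "right": 0.8}


cached_values = {"familiar_junction": {"left": 0.1, "right": 0.9}}
cached_precision = {"familiar_junction": 0.95}

habit_choice = choose_action("familiar_junction", cached_values, cached_precision, simulate_t_maze)
novel_choice = choose_action("novel_junction", cached_values, cached_precision, simulate_t_maze)
```

At the familiar junction the cached values are used without simulation, while at the novel junction the agent has to simulate before choosing.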
We thus see how the training of a hierarchical generative model could constitute the platform required for an infant to develop longer sequences of behaviors or planned behaviors directed towards a distal goal, with sensorimotor schemes and sequences of schemes/higher order schemes taking on the role of “action” and “policy” respectively in terms of the HGM. However, what this account is still lacking is precisely how the present environmental/bodily context can be perceived in order to reliably generate appropriate behavior without storing an individual stimulus-response mapping for each perturbation in context as would be necessary in models such as that posited by  . Here we must turn to the issue of categorization.
4.3. Model Selection and Categorization
When optimizing generative models during the process of model selection   , it can be seen that in the case of choosing among models with the same accuracy, those with lesser complexity should be prioritized. This is due to the fact that Bayesian model evidence, a benchmark for the usefulness of the model, can be expressed (in log form) as accuracy minus complexity  . There is a necessity for organisms to use their finite mental resources to respond in an appropriate and efficient manner to contingencies in the environment, specifically based on a finite, limited repertoire of possible mappings from context to behavior. This necessitates the construction of generative models which generalize over specific instances to produce “categories”, or broader definitions of stimuli to which the organism can respond. Here we obtain a glimpse at how an infant could possibly learn to operate in response to the constant deluge of information that bombards his sensory apparatuses after leaving the womb. It is this discretization of a continuous flow of information into individual invariances  that makes possible any such context-behavior mappings as seen in adaptive behavior.
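The relation between evidence, accuracy, and complexity can be stated in variational terms. The notation is an assumption introduced for illustration: $m$ is a candidate model, $o$ the sensory data, $s$ the hidden states, $q(s)$ the approximate posterior, and $F(m)$ the free energy of model $m$.

```latex
\begin{equation}
\ln p(o \mid m) \;\geq\; -F(m)
  \;=\; \underbrace{\mathbb{E}_{q(s)}\left[\ln p(o \mid s, m)\right]}_{\text{accuracy}}
  \;-\; \underbrace{D_{\mathrm{KL}}\left[\,q(s)\,\|\,p(s \mid m)\,\right]}_{\text{complexity}}
\end{equation}
```

The bound becomes an equality when $q(s)$ matches the true posterior. For two models with equal accuracy, the one whose posterior beliefs depart less from its priors (lower complexity) therefore has the higher approximate evidence.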
 shows how a categorization of “feature” causes can lead to a more efficient and desirable model when the part of the environment to be categorized consistently displays a fixed set of features together in a given context. For example, the model can simply code for a specific category like “carnivore” and sidestep the process of calculating the probabilities over separate features such as teeth, claws, body shape, and so on, as these share a high covariance and need not be perceived independently in order to achieve the task of something like predator detection in the wild. This also allows the organism to generalize to new situations in which an unidentified entity can be classified as “predator” without requiring the necessary experience to identify what species that animal is per se, offering an adaptive advantage to perception and response. Categorization can also be extended to hierarchies of categories and sub-categories, giving all the hallmarks of “concept generation”, and is likely to be heavily related to the development of semantic concepts in higher cognition.
There is a tight connection between this sort of reductive activity and the Bayesian model reduction described in  , which points to the constant process of reducing internal models to minimize their complexity and explain latent causes of the world in simpler terms, and has been related to the pruning of neural connections during sleep  . Such a model reduction is describable in terms of (long-term) free energy minimization by expressing free energy in terms of complexity caused by “redundant parameters” of the model, and when posited as a mechanism of the brain becomes the biological equivalent of Occam’s razor. The benefit of constructing a simpler model to explain events is not only metabolic in nature but also allows for more abstract or “context-free” accounts of events which can be recycled over more situations and thus increases the utility of the model. Further, such generalized or reduced models could then be branched into sub-models later on in the development process as the infant learns to discriminate between contexts which are prima facie similar but lead to different rewards or outcomes. Here we see a strong parallel to the scheme branching that occurs during equilibration  , and in fact, the contextualization of reduced models could very well be the mechanism that implements this branching process.
When a full hierarchy of relevant categories has been constructed, it would also make it easier to represent policies and plans at various timescales, with macro-scale “situational contexts” guiding the selection of policies, such as in the classic “script” selection procedures for attending a restaurant, going grocery shopping, etc.  , and shorter timescale “immediate contexts” guiding dynamical ongoing interaction with the present environment and its affordances. At any scale the process of detecting relevant context and acting appropriately towards it has been described as “reducing tension” or “tending towards optimal grip”  , and the ensuing interaction between hierarchical levels of affordance, internal concerns, and the external world is a key focus of research on action selection. We discuss this further below.
4.4. Intentional Directedness of Behavior
We have talked about behavioral selection in terms of mechanisms which could allow for its implementation in the brain. But we should also consider the phenomenological implications of such a selection process, especially as this topic is given great attention in the enactive literature. A key example of this is the “maximal grip” (also known as optimal grip) described by Merleau-Ponty  . It is described as the state of least tension that we constantly aim towards as we select our behaviors in the world. The term comes from a particular example of this process, grasping an object, whereby we unreflectively try to use the optimal grip possible on it. The same can be said about our other behaviors, as our “unfolding motor intentions” attempt to receive expected responses from the world, or in other words, attempt to minimize free energy in terms of prediction error. In fact, a free energy account of optimal grip has indeed been made with the “Skilled Intentionality Framework”   .
The idea of optimal grip makes possible another key idea of Merleau-Ponty’s, that of the intentional arc. This is the circular feedback loop between perception and action which occurs when the solicitations of the environment are too numerous to allow for (successful) unreflective action, and require practice in order to narrow down responses to be directed at those solicitations that are most beneficial to the organism. This is an account of skill acquisition, or the constant improvement of context discrimination that allows one to progress from a novice to an expert, as laid out in  . In essence, this is a longer-term development originating from the need to maintain an optimal grip (or minimal prediction error) on the world. Once acquired, such expert skill allows for the world-to-mind causation of behavior that is characteristic of skillful coping.
The intentionality inherent in such optimal grip seeking is observed not only in the burgeoning infant but also in the adult, as he engages in more complex behaviors such as sociocultural practices. The manifestation of optimal grip seeking behavior has been examined at many levels of abstraction, from the singular action of gripping a fork to the social cooperative action of collaborative architectural design. For Bruineberg & Rietveld, it is not just the unreflective part of our behavior that constitutes responsiveness to affordances. The definition is also expanded to account for more complex sociocultural scenarios and practices that have traditionally been thought to be based on logical deliberation or higher cognition.
What are the benefits of achieving optimal grip on the environment? Any animal that does so can be maximally flexible when spontaneous occurrences require an immediate response. The desire to achieve an optimal grip can be felt explicitly by humans as a real tension: when something is not in clear view we try to get a better look at it; when we are unsure if a friend will arrive from the left or the right we stand facing forward in order to be able to detect both cases. We seek to maximize not only the information currently available to us, but also the counterfactual richness of the situation. By doing so we feel “in control” of the situation.
The fact is that by acting in this way we actually are increasing our control of the situation: by increasing its counterfactual richness we at the same time increase the sensorimotor contingencies, and thus the affordances for action, available to us. The more affordances available to us at any given moment, the more chances for success we have, and any planning of action would necessarily take the ensuing richness of future states into account prospectively, as in the example given by Bruineberg of climbers searching for footholds that allow for more flexible behavior in the long run.
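The climbing example can be illustrated with a toy computation in which "counterfactual richness" is approximated simply as the number of distinct holds that remain reachable within a few moves. This is only a sketch under that assumption; the wall layout, the hold names, and the functions `reachable` and `choose_next_hold` are hypothetical and are not taken from the cited work.

```python
from typing import Dict, List, Set

# Hypothetical climbing wall: each hold maps to the holds reachable from it.
WALL: Dict[str, List[str]] = {
    "start": ["a", "b"],
    "a": ["c"],
    "b": ["c", "d", "e"],
    "c": ["top"],
    "d": ["top"],
    "e": ["d"],
    "top": [],
}


def reachable(wall: Dict[str, List[str]], hold: str, depth: int) -> Set[str]:
    """Return the holds that can be reached from `hold` in 1..depth moves."""
    frontier = {hold}
    seen: Set[str] = set()
    for _ in range(depth):
        # Expand one move outward, keeping only holds not counted before.
        frontier = {nxt for h in frontier for nxt in wall[h]} - seen
        seen |= frontier
    return seen


def choose_next_hold(wall: Dict[str, List[str]], hold: str, depth: int = 2) -> str:
    """Pick the neighbouring hold that keeps the most future options open."""
    return max(wall[hold], key=lambda nxt: len(reachable(wall, nxt, depth)))


if __name__ == "__main__":
    print(choose_next_hold(WALL, "start"))
```

Here the hold `b` is preferred over `a` because it leaves four holds within two moves rather than two, which is one crude way of expressing that prospective action favors states with a richer field of affordances.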
We can connect the idea of optimal grip to the notion of inherent curiosity. As we develop from a zygote we constantly experiment and seek to improve our predictive abilities regarding our environment. In a sense, this is also a form of optimal grip, but on a longer timescale. Understanding the limitations of our own body and the surrounding environment leads us to better position ourselves to respond to new situations and allows us to plan ahead more efficiently (using prospective action) to maximize the available affordances and the accuracy of our own beliefs.
An interesting aspect of this is that we can tie the ideas of tendency to optimal grip and pre-reflective “sensorimotor intentionality” to free energy minimization, which seeks to constantly decrease our prediction errors and reduce the tension of “incorrect beliefs” about the environment. In this way we can point to a clear mechanism based on brain-body-environment dynamics that could implement the propensity and capability to behave in a curious, goal-directed fashion in the world, developed from an intrinsic teleology which stems from our status as a purposeful organism  , extending from the developing infant to the formed adult as “a grouping of lived-through meanings which moves towards its equilibrium.” (Merleau-Ponty 1962, p. 153.)
4.5. Embodied Decision Making and Emotion
We previously looked at how various kinds of predictions and prediction errors could be incorporated into a multimodal HGM in order to take into account things like biomechanical cost and valence when making a decision. However, by putting too much emphasis on the generative model per se we run the risk of an overly internalized account of cognition. A more enactive approach should consider the part that the body and environment themselves play towards the decision making process. Indeed, while the previously offered evidence shows that information regarding cost and reward certainly plays a part in behavioral selection, encoding all possible variables in each scenario seems like an impossible task. An alternative approach is to rely on the inherent dynamics of the body itself to give feedback during the decision process, or to make “embodied decisions”  .
Throughout multiple experiences, an infant can begin to associate specific situations with specific emotions and be “drawn to” or “solicited by” particular affordances depending on the state of the environment and his own concerns at that time. After much practice he becomes an “expert” in the Dreyfus sense, and it will be those (and only those) characteristics of the environment that would best fulfill his needs that will beckon him, as each affordance is experienced as a state of bodily readiness for action, which can be interpreted as either a “prediction for future action”, a “situational appraisal”, an “affective state” or simply an “emotion”. By getting “bodily ready” for a situation, the infant manages resources in a manner appropriate for the following actions, by readying specific subsets of muscles, tuning autonomic systems such as digestion and the immune system (for instance in high stress situations), and subconsciously developing plans for action. This sort of prospective action affords the infant an adaptive advantage and could be the selected trait which allowed predictive processing to develop to the extent that it did in humans.
We believe this complex modulatory behavior in conjunction with the generative model is the key to understanding “background coping” or “affordances on the horizon”    , the nuanced milieu within which we effortlessly traverse the environment and skillfully cope with the familiar external surroundings in order to fulfill our internal needs. Further research will be required to tease apart the bodily contribution of affective states and their influence on the operations of inference based perception, action, and learning.
We see that modulation via affective information or “emotion” could play a large role in the biasing of certain actions in certain contexts. But what are the possible mechanisms behind this? There is evidence that suggests it is possible to frame emotion itself in terms of PP and active inference. On this account, subjective emotional states can be generated via “active interoceptive inference”, which works in the same way as proprioceptive active inference except that the content generated by higher levels is “emotional” in nature and is an inference regarding the cause of interoceptive (physiological state) as opposed to exteroceptive input. The subjective emotional experience can be described as both internal physiological conditions or “bodily state” and its cognitive “appraisal”. Put more precisely, top-down interoceptive predictions are sent down the HGM whereas interoceptive prediction errors are sent upwards. As with perception and active inference, these prediction errors can be resolved by either updating beliefs about internal physiological states or by triggering autonomic reflexes to regulate the homeostatic condition, with each solution biased by corresponding precision weighting on lower or higher levels respectively.
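The two routes for resolving an interoceptive prediction error can be sketched with a single scalar variable. The following is a deliberately minimal illustration, not the model used in the cited literature: it assumes one physiological quantity, one belief about it, and a simple rule in which the relative precision of the sensory (lower) level and the prior (higher) level splits the correction between belief updating and an autonomic reflex. The function name `interoceptive_step` and all parameter names are hypothetical.

```python
from typing import Tuple


def interoceptive_step(
    state: float,
    belief: float,
    sensory_precision: float,
    prior_precision: float,
    rate: float = 0.5,
) -> Tuple[float, float]:
    """One step of a toy precision-weighted interoceptive inference.

    `state` is the actual physiological value, `belief` the predicted value.
    High sensory precision favours revising the belief (perception);
    high prior precision favours changing the body (autonomic reflex).
    This split is an illustrative assumption, not an established equation.
    """
    error = state - belief  # interoceptive prediction error
    total = sensory_precision + prior_precision
    belief_gain = sensory_precision / total
    reflex_gain = prior_precision / total

    new_belief = belief + rate * belief_gain * error  # update the prediction
    new_state = state - rate * reflex_gain * error    # regulate the body
    return new_state, new_belief


if __name__ == "__main__":
    # Strong prior (e.g. a homeostatic set point): the body does most of the moving.
    state, belief = 2.0, 5.0
    for _ in range(5):
        state, belief = interoceptive_step(state, belief, 1.0, 9.0)
        print(round(state, 3), round(belief, 3))
```

With a strong prior the physiological state is pulled towards the predicted value, which corresponds to reflexive regulation, whereas reversing the two precisions makes the belief move towards the sensed state instead. In both cases the remaining prediction error shrinks by the same factor at each step.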
Utilizing this framework for describing emotion becomes extremely useful as it can be combined with the exteroceptive and proprioceptive HGMs in a multimodal network which starts to resemble something very “human” in nature. Higher levels can integrate all three of these sources to create complex predictions about future states that reflect exteroceptive, proprioceptive, and interoceptive experience in previously encountered similar conditions. It also explains such innately “human” aspects such as how looking at a work of art (exteroceptive cue) can elicit an emotional reaction (interoceptive response).
This idea of a multimodal HGM has been expanded upon to show how an interoceptive prediction error such as “hunger” can be reduced indirectly through cascading proprioceptive prediction errors that result in the activation of reflex arcs for approaching and consuming food (allostatic behavior) or directly through interoceptive prediction errors leading to metabolic activity such as burning fat stores (autonomic behavior). In this way hunger is said to “contextualize” the resulting behavior, which itself can be understood as a belief about counterfactual actions which will avoid low preference outcomes such as “starvation”.
Building on this, further abstraction of interoceptive “drives” can lead to complex sociocultural behaviors such as going to a restaurant which consists of many temporally extended or separated events each of which only indirectly relates back to the physiological origin of “hunger” related prediction errors. In this way, goals are said to be “autonomized” from their originating drives  . Further, balancing of the precisions of each level of the HGM also affects how each homeostatic need will affect the ensuing behavior. For example, a prediction error involving hunger can produce either low-level autonomic control and anticipatory regulation of the internal milieu or high-level food seeking and consumption, both of which are forms of active interoceptive inference that are directed either internally or externally respectively    .
We can thus trace the development of the infant from primary desires of food, water, and shelter to complex behaviors such as manipulating and drinking from a bottle, communication with the mother, and playing. In this way, we can also trace various behaviors back to primary purpose or teleological value, which is a key aspect of the enactive view of cognition and is seen as necessary for “sense-making” or the meaningful behavior engaged in by organisms in relation to their environment. Further, we can connect the practices taught to the infant by the caregiver to the more “top-down” view of meaning provided to us by human culture itself as we are “marinated in statistics of an optimal world” during development. This form of “baton-passing” of societal value allows us to set our prior beliefs of how to act without having to go through the dangerous process of trial and error in the harsh external environment in each individual lifetime. However, it is of note that these “bottom-up” and “top-down” sources of purpose may not be the full picture. Indeed, a key future direction of enactive research would be to explore in more detail the idea of a dynamical purpose hierarchy which incorporates the concerns of not only the species and culture but also the individual.
In the final section, we re-examine the learning process of the infant in detail based on the frameworks and theories discussed throughout this paper and look for a coherent thread which ties brain, body, and environmental factors together to result in the development of an intelligent, bodily capable infant human being.
5. An Account of Learning
In light of the previous sections, we can now formulate an account of how a developing infant could acquire complex, appropriate behavioral patterns which allow him to thrive in the multifaceted sociomaterial environment he grows up in. Driven by the tensions and affective states felt by the child from the time before he even leaves the womb, the infant learns to act in a way that reduces his interoceptive prediction error. The learning processes of model parameter optimization (strengthening of synaptic connections) and model selection (pruning of connections) then allow the infant’s internal HGM to progressively represent the contingencies of his world more optimally.
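The distinction between parameter optimization and model selection can be illustrated with a toy learner. The sketch below is an assumption-laden stand-in, not the free energy scheme itself: it uses a simple delta rule on a linear predictor to play the role of "strengthening connections", and magnitude-based thresholding to play the role of "pruning". The functions `train` and `prune`, the learning rate, the threshold, and the synthetic data are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]


def train(samples: Sequence[Sample], n_inputs: int,
          rate: float = 0.1, epochs: int = 200) -> List[float]:
    """Parameter optimization: adjust weights to reduce prediction error."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in samples:
            prediction = sum(w * x for w, x in zip(weights, inputs))
            error = target - prediction
            # Delta rule: connections that help explain the error are strengthened.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights


def prune(weights: Sequence[float], threshold: float = 0.05) -> List[float]:
    """Model selection: remove connections that contribute almost nothing."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


# Synthetic "world": the outcome depends on the first input only.
rng = random.Random(0)
samples: List[Sample] = []
for _ in range(50):
    x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    samples.append((x, 2.0 * x[0]))

weights = prune(train(samples, n_inputs=2))

if __name__ == "__main__":
    print(weights)
```

After training, the connection from the irrelevant second input is close to zero and is removed, leaving a simpler model that predicts the same contingencies. This is the sense in which pruning can make a model represent its world "more optimally" without any further gain in accuracy.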
First by engaging in spontaneous voluntary behavior or GMs, then leading to more voluntary purposive behavior, the infant begins to gain more control over his body, allowing him to fulfill his own desires and as time goes on be less reliant on external help. For example, from a very early age the infant learns to suck his thumb in lieu of feeding, which in turn leads to the ability to suck on a bottle or pacifier through the processes of assimilation and accommodation. Here we see burgeoning autonomization of goals, from behavior more directly related to the reduction of homeostatic drives such as feeding to abstractions such as sucking on a pacifier. This, in turn, can lead to behaviors on a longer timescale such as crying in the absence of his pacifier then obtaining it and sucking on it, or seeking the pacifier before using it himself. Freed from direct stimulus-response like behavior, the infant can build longer sequences of behavior, and in conjunction with the development of his physical body can acquire more complex sensorimotor routines. Each routine in turn reveals a new part of the world with which to interact, as the landscape of affordances for the infant broadens and allows further development of intelligence to proceed in an incremental fashion.
Through accommodation of schemes first between infant and environment and second between the existing repertoire and new schemes, constant recalibration can allow the infant’s behavioral pattern to evolve in coordination with the new aspects of the environment that incrementally open up due to the development of the infant’s body and at the will of an external caregiver. Further calibration is also made between the internal processors of each sensory modality in the infant’s brain, and so associative mappings are made to construct body and environment images or schemas during the process of “skill attunement”. We posit that this process is undertaken mechanistically as the construction of a hierarchical generative model covering the spectrum of sensory inputs, one that infers hidden states and captures the physical and structural statistics of the body and surrounding environment at ever greater levels of the hierarchy. It does so in a manner that does not necessarily represent external entities per se, but does afford the infant behavior that becomes successively adapted to the environment, or in other words more skillful or “expert like” in nature.
The infant perceives affordances directly and in particular, is solicited to that “field” of affordances which more adequately meets his concerns as his ability for affordance perception advances during development. That is, throughout the learning process he doesn’t just acquire bodily skills but instead is involved in a much more interactive process of evolution between brain, body, and environment. Through constant experimentation from fidgeting to babbling, stemming from an innate desire for improvement, the infant’s advancing bodily abilities lead to more discriminative perceptive abilities, which in turn engender the opportunity to develop more skills and open up even more of the physical environment for exploration. This is the intuition behind the intentional arc, and it neatly ties together the equilibration of sensorimotor skills with the purpose-driven nature of free energy minimization and active inference. It allows us to see a clear intentionality present in the manner that the infant skillfully copes with the world, as he constantly puts himself in optimal grip with the situation and through practice and habit acquires rapid responses to complex scenarios perceived in terms of opportunities for action.
We have given an embodied, enactive account of infant development which speaks to both biological mechanistic notions of uncertainty reduction and phenomenological accounts of intentional behavior, as well as enactive accounts of intrinsic teleology. Not only this, but we have shown how to get to very human-like behavior in the burgeoning infant from the humble beginnings of primary sensorimotor intentionality in the fetus. By constant exploration of one’s bodily and environmental limits, the sensorimotor contingency based generative models required for perceiving and forecasting the world are built up incrementally, with ever higher layers of abstraction allowing for the counterfactual effects of behavior to be perceived “directly” from the environment as affordances for action. Purpose, possibility, and prediction abound throughout all levels of behavior from the very first cluster of cells, and it is because of this that we believe an embodied, enactive account of development is important for fully explaining not just how but why we develop in the way we do.
I have no competing interests.
No funding was received for this work.
Kole Harvey: entire study and paper.
Permission to Carry out Fieldwork
Does not apply.
 Pezzulo, G., Barsalou, L.W., Cangelosi, A., Fischer, M.H., McRae, K. and Spivey, M.J. (2011) The Mechanics of Embodiment: A Dialog on Embodiment and Computational Modeling. Frontiers in Psychology, 2, 1-21.
 Froese, T. and Ziemke, T. (2009) Enactive Artificial Intelligence: Investigating the Systemic Organization of Life and Mind. Artificial Intelligence, 173, 466-500.
Di Paolo, E.A., Rohde, M. and De Jaegher, H. (2010) Horizons for the Enactive Mind: Values, Social Interaction, and Play. In: Stewart, J., Gapenne, O. and Di Paolo, E.A., Eds., Enaction: Towards a New Paradigm for Cognitive Science, MIT Press, Cambridge, MA.
 Dreyfus, H.L., Wrathall, M.A. and Malpas, J.E. (2000) Heidegger, Coping, and Cognitive Science. Essays in Honor of Hubert L. Dreyfus.
 Barsalou, L.W. (2009) Simulation, Situated Conceptualization, and Prediction. Philosophical Transactions of the Royal Society B: Biological Sciences, 364, 1281-1289.
 Wheeler, M. (2008) Cognition in Context: Phenomenology, Situated Robotics and the Frame Problem. International Journal of Philosophical Studies, 16, 323-349.
Delafield-Butt, J.T. and Gangopadhyay, N. (2013) Sensorimotor Intentionality: The Origins of Intentionality in Prospective Agent Action. Developmental Review, 33, 399-425.
 Kuniyoshi, Y. and Sangawa, S. (2006) A Neural Model for Exploration and Learning of Embodied Movement Patterns. Biological Cybernetics, 95, 589-605.
Cisek, P. (2007) Cortical Mechanisms of Action Selection: The Affordance Competition Hypothesis. Philosophical Transactions of the Royal Society B: Biological Sciences, 362, 1585-1599.
 Fuke, S., Ogino, M. and Asada, M. (2007) Body Image Constructed from Motor and Tactile Images with Visual Information. International Journal of Humanoid Robotics, 4, 347-364.
 Schillaci, G., Hafner, V.V. and Lara, B. (2016) Exploration Behaviors, Body Representations, and Simulation Processes for the Development of Cognition in Artificial Agents. Frontiers in Robotics and AI, 3, 1-18.
Laflaquière, A., O’Regan, J.K., Argentieri, S., Gas, B. and Terekhov, A.V. (2015) Learning Agent’s Spatial Configuration from Sensorimotor Invariants. Robotics and Autonomous Systems, 71, 49-59.
 Le Clec’H, G., Gas, B. and O’Regan, J.K. (2016) Acquisition of a Space Representation by a Naive Agent from Sensorimotor Invariance and Proprioceptive Compensation. International Journal of Advanced Robotic Systems, 13, 1-15.
 Law, J., Lee, M., Hulse, M. and Tomassetti, A. (2011) The Infant Development Timeline and Its Application to Robot Shaping. Adaptive Behavior, 19, 335-358.
 Taga, G., Takaya, R. and Konishi, Y. (1999) Analysis of General Movements of Infants towards Understanding of Developmental Principle for Motor Control. IEEE International Conference on Systems, Man, and Cybernetics, 5, 678-683.
 Baranès, A. and Oudeyer, P.Y. (2009) R-IAC: Robust Intrinsically Motivated Exploration and Active Learning. IEEE Transactions on Autonomous Mental Development, 1, 155-169.
 Trevarthen, C.B. (1986) Neuroembryology and the Development of Perceptual Mechanisms. In: Falkner, F. and Tanner, J.M., Eds., Postnatal Growth Neurobiology, Springer, Boston, MA, 301-383.
 Kamii, C. (1986) The Equilibration of Cognitive Structures: The Central Problem of Intellectual Development. Jean Piaget, Terrance Brown, Kishore Julian Thampy. American Journal of Education, 94, 574-577.
 Buhrmann, T. and Di Paolo, E. (2017) The Sense of Agency—A Phenomenological Consequence of Enacting Sensorimotor Schemes. Phenomenology and the Cognitive Sciences, 16, 207-236.
 Di Paolo, E.A., Barandiaran, X.E., Beaton, M. and Buhrmann, T. (2014) Learning to Perceive in the Sensorimotor Approach: Piaget’s Theory of Equilibration Interpreted Dynamically. Frontiers in Human Neuroscience, 8, 1-16.
 Rabinovich, M.I., Varona, P., Tristan, I. and Afraimovich, V.S. (2014) Chunking Dynamics: Heteroclinics in Mind. Frontiers in Computational Neuroscience, 8, 1-10. https://doi.org/10.3389/fncom.2014.00022
 Franchak, J.M., van der Zalm, D.J. and Adolph, K.E. (2010) Learning by Doing: Action Performance Facilitates Affordance Perception. Vision Research, 50, 2758-2765. https://doi.org/10.1016/j.visres.2010.09.019
 Tessitore, G. and Borriello, M. (2009) How Direct Is Perception of Affordances? A Computational Investigation of Grasping Affordances. Proceedings of the ICCM Conference.
Wokke, M.E., Knot, S.L., Fouad, A. and Ridderinkhof, K.R. (2016) Conflict in the Kitchen: Contextual Modulation of Responsiveness to Affordances. Consciousness and Cognition, 40, 141-146.
 Ishak, S., Franchak, J.M. and Adolph, K.E. (2014) Perception-Action Development from Infants to Adults: Perceiving Affordances for Reaching through Openings. Journal of Experimental Child Psychology, 117, 92-105.
 Ramstead, M.J.D., Veissière, S.P.L. and Kirmayer, L.J. (2016) Cultural Affordances: Scaffolding Local Worlds through Shared Intentionality and Regimes of Attention. Frontiers in Psychology, 7, 1090.
 Pezzulo, G. and Cisek, P. (2016) Navigating the Affordance Landscape: Feedback Control as a Process Model of Behavior and Cognition. Trends in Cognitive Sciences, 20, 414-424.
MacDorman, K.F. (2000) Responding to Affordances: Learning and Projecting a Sensorimotor Mapping. Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065), San Francisco, CA, 24-28 April 2000, 3253-3259.
 Law, J., Shaw, P., Lee, M. and Sheldon, M. (2014) From Saccades to Grasping: A Model of Coordinated Reaching through Simulated Development on a Humanoid Robot. IEEE Transactions on Autonomous Mental Development, 6, 93-109.
 Shaw, P., Law, J. and Lee, M. (2015) Representations of Body Schemas for Infant Robot Development. Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob), Providence, RI, 13-16 August 2015, 123-128.
 Dearden, A. and Demiris, Y. (2005) Learning Forward Models for Robots. In: Kaelbling, L.P. and Saffiotti, A., Eds., IJCAI 2005, Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, 30 July-5 August 2005, 1440-1445.
 Seth, A.K. (2014) A Predictive Processing Theory of Sensorimotor Contingencies: Explaining the Puzzle of Perceptual Presence and Its Absence in Synesthesia. Cognitive Neuroscience, 5, 97-118.
 Friston, K.J., Lin, M., Frith, C.D., Pezzulo, G., Hobson, J.A. and Ondobaka, S. (2017) Active Inference, Curiosity and Insight. Neural Computation, 29, 2633-2683.
 Friston, K., Rigoli, F., Ognibene, D., Mathys, C., Fitzgerald, T. and Pezzulo, G. (2015) Active Inference and Epistemic Value. Cognitive Neuroscience, 6, 187-224.
 Cisek, P. and Kalaska, J.F. (2005) Neural Correlates of Reaching Decisions in Dorsal Premotor Cortex: Specification of Multiple Direction Choices and Final Selection of Action. Neuron, 45, 801-814.
 Archibald, S.J., Mateer, C.A. and Kerns, K.A. (2001) Utilization Behavior: Clinical Manifestations and Neurological Mechanisms. Neuropsychology Review, 11, 117-130.
Rietveld, E. (2012) Context-Switching and Responsiveness to Real Relevance. In: Kiverstein, J. and Wheeler, M., Eds., Heidegger and Cognitive Science: New Directions in Cognitive Science and Philosophy, Palgrave Macmillan, Basingstoke, 105-135.
 Basso, D. (2013) Planning, Prospective Memory, and Decision-Making: Three Challenges for Hierarchical Predictive Processing Models. Frontiers in Psychology, 3, 623.
 Newell, K.M., Liu, Y.T. and Mayer-Kress, G. (2003) A Dynamical Systems Interpretation of Epigenetic Landscapes for Infant Motor Development. Infant Behavior and Development, 26, 449-472.
 Liu, Y.-T., Mayer-Kress, G. and Newell, K.M. (2006) Qualitative and Quantitative Change in the Dynamics of Motor Learning. Journal of Experimental Psychology Human Perception & Performance, 32, 380-393.
Cisek, P. and Kalaska, J.F. (2010) Neural Mechanisms for Interacting with a World Full of Action Choices. Annual Review of Neuroscience, 33, 269-298.
 Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., O’Doherty, J. and Pezzulo, G. (2016) Active Inference and Learning. Neuroscience & Biobehavioral Reviews, 68, 862-879.
 Pezzulo, G., Rigoli, F. and Friston, K. (2015) Active Inference, Homeostatic Regulation and Adaptive Behavioural Control. Progress in Neurobiology, 134, 17-35.
 Pezzulo, G., Rigoli, F. and Chersi, F. (2013) The Mixed Instrumental Controller: Using Value of Information to Combine Habitual Choice and Mental Simulation. Frontiers in Psychology, 4, 1-15.
Pezzulo, G., van der Meer, M.A.A., Lansink, C.S. and Pennartz, C.M.A. (2014) Internally Generated Sequences in Learning and Executing Goal-Directed Behavior. Trends in Cognitive Sciences, 18, 647-657.
 Bruineberg, J. and Rietveld, E. (2014) Self-Organization, Free Energy Minimization, and Optimal Grip on a Field of Affordances. Frontiers in Human Neuroscience, 8, 1-14.
 Dreyfus, H.L. (2002) Intelligence without Representation—Merleau-Ponty’s Critique of Mental Representation the Relevance of Phenomenology to Scientific Explanation. Phenomenology and the Cognitive Sciences, 1, 367-383.
Rietveld, E. and Brouwers, A.A. (2017) Optimal Grip on Affordances in Architectural Design Practices: An Ethnography. Phenomenology and the Cognitive Sciences, 16, 545-564.
 Pezzulo, G. and Ognibene, D. (2012) Proactive Action Preparation: Seeing Action Preparation as a Continuous and Proactive Process. Motor Control, 16, 386-424.
 Phelps, E.A., Lempert, K.M. and Sokol-Hessner, P. (2014) Emotion and Decision Making: Multiple Modulatory Neural Circuits. Annual Review of Neuroscience, 37, 263-287.
 Pezzulo, G. and Castelfranchi, C. (2009) Thinking as the Control of Imagination: A Conceptual Framework for Goal-Directed Systems. Psychological Research PRPF, 73, 559-577.
 McGann, M. (2007) Enactive Theorists Do It on Purpose: Toward an Enactive Account of Goals and Goal-Directedness. Phenomenology and the Cognitive Sciences, 6, 463-483.