Attendees & Sessions
(Note: This is a preliminary schedule of sessions. Assignment of speakers to session and order within session are subject to change. Only the presenter is listed for each talk. The full list of research co-authors, full titles, and abstracts may be found by consulting the list of abstracts, or by clicking on the presenter's name.)
Learning and Memory
Stephan Lewandowsky: Knowledge partitioning in categorization
Actions, Agents, and Learning
Low-Level Visual Processes
Neural Network Modeling
(Note: All presentations listed below are tentative. No decision has yet been made as to which presentations will be spoken and which will be posters. That information will be posted when available.)
Knowledge Partitioning in Categorization
Knowledge partitioning is a theoretical concept that holds that knowledge is not always integrated, but may often be
separated into different independent parcels that may contain mutually contradictory information (e.g., Lewandowsky & Kirsner, 2000; Lewandowsky, Kalish, & Ngang, 2002). One characteristic of knowledge
partitioning is that once people choose to rely on a parcel to solve a problem, they ignore knowledge contained in other parcels. We applied this concept to categorization and examined whether people will create
independent knowledge parcels if a common complex categorization
problem is presented in different contexts. We report several experiments that consistently identify the presence of knowledge
partitioning. The results showed that people learn different strategies for categorization in different contexts, and that the strategy used in one context is unaffected by knowledge that is demonstrably present in other contexts.
Direct experience of any kind implicitly activates related memories, and this activation affects our ability to remember what was actually experienced. We model this problem by asking people to study a list of familiar words that activate related memories, called associates (as SPACE activates earth, universe, etc.). These associates are densely or sparsely connected to each other as a result of prior experience. Our findings indicate that dense connections among a studied word’s associates facilitate its cued recall and recognition. The question is why? How do unconsciously activated connections among related memories affect memory for the actual event? Spreading activation models assume that activation spreads to, among, and from a word’s associates, and that the return of activation is what strengthens the studied word’s representation. Spread models predict that connections among a word’s associates will have a greater effect on memory when there are more pre-existing connections back to the studied word. In contrast, an activation at a distance model assumes such strengthening is produced by the synchronous activation of the target’s associates. The distance model predicts that the total number of connections is what is important, not their direction. These predictions were tested by manipulating both the number of connections among a studied word’s associates and the number of connections returning to the studied word from its associates. For stronger and weaker cues and for younger and older participants, connections among the associates facilitated recall regardless of how many associates were linked to the studied word. Such additivity is inconsistent with spreading activation models and provides support for the distance model. Somewhat like force particles, unconsciously activated memories may obey a principle of simultaneity.
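The opposing predictions of the two model classes can be illustrated with a toy calculation (a sketch for exposition only; the function names and numbers are invented, not taken from the reported modeling):

```python
# Toy contrast between a spreading-activation account and an
# "activation at a distance" account of how connections among a studied
# word's associates affect memory. Values are purely illustrative.

def spread_model_strength(links_among_associates, links_back_to_target):
    # Spread models: strengthening depends on activation returning to the
    # target, so connections back to the studied word gate the effect of
    # connections among the associates (an interactive prediction).
    return links_among_associates * links_back_to_target

def distance_model_strength(links_among_associates, links_back_to_target):
    # Distance model: synchronous activation of the associates matters,
    # so only the total number of connections counts, not their direction
    # (an additive prediction).
    return links_among_associates + links_back_to_target

# Manipulate both factors, as in the reported experiments.
for among in (2, 8):
    for back in (1, 4):
        print(among, back,
              spread_model_strength(among, back),    # interactive
              distance_model_strength(among, back))  # additive
```

The observed additivity (associate connectivity helping recall regardless of links back to the target) matches the second function's pattern, not the first.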
High and low frequency words were studied 0, 1, 3, 6, or 12 times each, in either plural or singular form. At test, participants gave judgments of frequency of a word in its exact plurality (0 representing a judgment of 'new'). This is a variant of the 'registration without learning' effect first studied by Hintzman, Curran, and Oppy (1992). The REM model of recognition memory (Shiffrin & Steyvers, 1997) applied to this paradigm predicts a performance advantage for low frequency words, a mirror effect for dissimilar foils, but a change in the mirror effect for similar foils (foils differing only in plurality from studied words). These predictions were upheld qualitatively, but a good quantitative fit required an extension of the single-process REM model to include a dual process in which recall could be used to reject similar foils.
In associative recognition, subjects must discriminate between test pairs composed of items studied together (intact) or studied separately (rearranged). A number of investigators have asked whether this task is accomplished through global familiarity or cued recall. In this talk, we are interested in an orthogonal issue: how information about a studied pair is stored in memory. Various models propose that the stored representation of a studied pair consists of separate traces for the individual items (each with relational information), a single trace consisting of the simple juxtaposition of the two items (e.g. REM, MINERVA), or a single trace containing relational information linking the two items (e.g. TODAM, CHARM). We address these possibilities by presenting word-word, face-face, and word-face pairs and varying the number of each type of pair. Results show that performance is governed by the number of pairs of the same type and not by the total list length. Furthermore, performance is harmed when the study list requires switching between types of pairs, relative to a list with only one pair type. A model based on the REM framework will be presented, and implications for other global familiarity models will be discussed.
The effects of observational vs. feedback training (Experiment 1) and immediate vs. delayed feedback (Experiment 2) on rule-based and information-integration category learning are examined. The training and delay manipulations had no effect on the accuracy of responding or on the distribution of best-fitting models in the rule-based category-learning task. However, observational training and delayed feedback resulted in less accurate responding in the information-integration category-learning task, and in an increase in the use of rule-based strategies to solve the information-integration task. These results suggest that rule-based and information-integration categories are learned by different category-learning systems that are mediated by different neural circuits.
Many studies have implicated lateral prefrontal cortex in working memory. Recent results reinforce this idea and provide evidence that various subcortical structures may also help mediate working memory function (e.g., head of the caudate nucleus, globus pallidus, medial dorsal nucleus of the thalamus). We will describe a computational model in which working memory is mediated by parallel, prefrontal cortical-thalamic loops. Activation reverberates in these loops because prefrontal cortical excitation of the head of
the caudate leads to disinhibition of the thalamus (by inhibiting the globus pallidus, which tonically inhibits the thalamus). Switching from one loop to another is mediated by dopamine release from the substantia nigra and the ventral tegmental area. The "cells" in the model mimic the behavior of real cells at the same time that the
entire network effectively mimics human spatial working memory
behavior. In particular, the model successfully accounts for the
results of a variety of different single-cell recording studies and
human behavioral data in spatial delayed matching-to-sample studies.
The idea that brain oscillations could play a useful role in human memory is suggested by a number of computational models of neural networks. Because of the difficulties faced in
measuring human brain activity with high spatial and temporal resolution, these physiological models never made contact with actual human data, and were thus of little relevance to human cognitive science. The possibility of analyzing intracranial recordings taken while neurosurgical patients performed cognitive tasks allowed us to close this gap between physiological models and psychological data. Using this method, we found that cortical oscillations occur predominantly in the theta band (4-8 Hz) and appear to be related to the cognitive demands of both spatial and non-spatial memory tasks. I will describe oscillatory correlates of performance in Sternberg's (1966) memory scanning paradigm, a standard measure of verbal working memory. Recordings from many regions of cortex reveal widespread increases in theta power during all phases of the task, but this "theta gating" effect is not modulated by memory load. In contrast, gamma power increases reliably with memory load at a smaller number of brain locations. These two findings have implications for neurobiological models of working memory that are close cousins of the standard cognitive models of this task.
Neuropsychological evidence has revealed that several kinds of learning (e.g., category learning) depend upon the integrity of the striatum. Striatal structures are major targets of ascending dopamine (DA) projection systems, raising questions about the role that dopaminergic neurotransmission might play in learning within the striatum. Two candidate functions (reinforcement; modulation of competition between neurons) are considered via neural network modelling studies, and these functions are speculatively mapped onto D1-like and D2-like DA receptor subtypes respectively. Simulated individual differences in the hypothetical DA system functions are compared with behavioural evidence of individual differences in learning task performance.
Behavioral phenomena underlying category-specific semantic deficits have played an important role in the development of theories of semantic memory organization. Extending this line of research, we identified seven behavioral trends with respect to the categories that tend to be relatively impaired or preserved together in patients with category-specific semantic deficits. The main hypothesis is that, given the numerous sources of variation that exist in patient testing, these consistent trends must arise because a number of factors converge. An account of these factors is provided using a large set of semantic feature production norms plus other norming and corpus data. We show that the primary category-specific deficits data can be accounted for by the confluence of the following factors: (1) knowledge types, with analyses based on a taxonomy that is grounded in current imaging data regarding the types of knowledge that are stored in
various brain regions; (2) distinguishing features; (3) feature
distinctiveness; (4) confusability of a concept with its nearest
neighbors, in terms of both visual and overall semantic similarity; (5)
visual complexity; (6) concept familiarity; and (7) concept name
frequency. We conclude that these factors combine to produce category-specific deficits because they are important aspects of the organization of semantic memory.
The drug midazolam causes dense, but temporary, anterograde amnesia. We consider two hypotheses about the way in which midazolam affects memory: either midazolam causes the hippocampus to store less information in memory, or it causes the hippocampus to store episodic information less accurately. The REM model (Shiffrin & Steyvers, 1997) can predict the
effects of midazolam, study time, and normative word-frequency on
both yes-no and remember-know recognition memory (Hirshman, et al., in press). According to the current REM model, storing information less accurately is necessary and sufficient to predict the data, but
storing less information is neither necessary nor sufficient.
This talk starts from the general assumption that concepts are not abstract entities but are grounded in situated action. I will present the results of different studies, performed either with experimental methods or with neural network simulations. The studies show that thematic relations, particularly spatial and action relations, play a major role not only in children's but also in adults' categorization; that even superordinate and abstract concepts activate instances and are situated; and that the way we interact with objects and the perspective we adopt influence the way we group and
Theories of actions and events are investigated in various research disciplines, in particular physics, philosophy, artificial intelligence and computer science, linguistics, and psychology. These schools have different scientific approaches and goals, but their terminologies, concepts, and questions nevertheless overlap. In this presentation we will describe an integrated, interdisciplinary model of action descriptions and performance, with a view towards intelligent, communicating agents. The model is based on fundamental aspects of actions and events drawn from philosophical theories and studies, but integrates basic physical knowledge and a computer-science-oriented view. In this respect, the model describes the ontological status of actions in terms of temporal aspects, the agency involved, and the objects affected by the event or action. This description is then further developed into a classification of time-related events and actions. Linguistic and AI theories lead to a categorization of actions in terms of their general purpose, e.g. communicative actions, mental actions, and social actions, and - with physics - to a detailed description of actions in terms of their physical effects. The model also suggests three levels of description, with the lowest level relating to the physical realization of an action (Robotics), the intermediate level relating to a semantic characterization of the action (Linguistics), and the highest level describing actions with a view to their pragmatic aspects and long-term or higher-goal intentions and achievements (Psychology, Philosophy). Action descriptions on these levels also reflect issues and concepts from Artificial Intelligence and Cognitive Science. Further, the model integrates views from philosophy and psychology in the particular context of differentiating the aspects of goals, needs, and desires. This integrative work led overall to an action taxonomy and a general agent architecture.
The focus of the action taxonomy is on providing descriptions of action concepts, whereas the general agent architecture is a framework for modeling the integrated process of sensing, perception, knowledge, motives, planning and acting.
The taxonomy of actions has been partially tested in an interactive help system for Unix. A computer model of the general architecture with an emphasis on the study of interactions between (internal) goals related to needs and desires, and (external) task-related goals has been investigated further in a student project and a detailed description and implementation is currently under development.
According to an 'external grounding' theory of meaning, a concept's meaning
depends on its connection to the external world. By a 'conceptual web' account, a concept's meaning depends on its relations to other concepts within the same system. We explore one aspect of meaning, the identification of matching concepts across systems (e.g. people, theories, or cultures). We present a computational algorithm that uses only within-system similarity relations to find between-system translations. While illustrating the sufficiency of a conceptual web account for translating between systems, simulations also indicate powerful synergistic interactions between intrinsic, within-system information and extrinsic information. Applications of the algorithm to issues in object recognition, shape analysis, automatic translation, human analogy and comparison making,
pattern matching, neural network interpretation, and statistical analysis will be discussed.
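The core idea of a within-system translation algorithm can be sketched as a search for the correspondence that best preserves similarity relations. The brute-force version below is an illustrative stand-in for the authors' algorithm, with invented similarity matrices:

```python
# Find a translation between two concept systems using ONLY each system's
# internal similarity relations (no external grounding). Brute force over
# permutations; feasible only for small systems.
from itertools import permutations

def translate(sim_a, sim_b):
    """Return the mapping of A-concepts onto B-concepts whose within-system
    similarities agree best (least summed squared mismatch)."""
    n = len(sim_a)
    best, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        cost = sum((sim_a[i][j] - sim_b[perm[i]][perm[j]]) ** 2
                   for i in range(n) for j in range(n))
        if cost < best_cost:
            best, best_cost = perm, cost
    return best

# Two "systems" with identical relational structure; system B lists its
# concepts in the shuffled order (a2, a0, a1) relative to system A.
sim_a = [[1.0, 0.9, 0.1],
         [0.9, 1.0, 0.2],
         [0.1, 0.2, 1.0]]
sim_b = [[1.0, 0.1, 0.2],
         [0.1, 1.0, 0.9],
         [0.2, 0.9, 1.0]]
print(translate(sim_a, sim_b))  # recovers the shuffled correspondence
```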
The mass of an object cannot be seen. In certain
circumstances, however, observers are good at identifying the heavier
of two colliding balls. The theory of direct perception maintains that observers are able to detect mathematically complex patterns in the visual field that accurately specify relative mass without invoking high-level processing (Runeson et al., 2000). In contrast,
constructivists argue that observers use imperfect, rudimentary cues
that must be augmented through cognitive processes (Gilden & Proffitt,
1989). This research explores a number of new models of relative mass
perception that speak to the larger issue of what type of information
organisms can utilize to perform complex tasks. First, high-level
optical patterns consistent with the direct perception approach are
considered. Each pattern is sufficient to accurately determine mass
judgments and is embedded in a simple perceptual model. Second, a more constructivist, similarity-based model is developed in which the
detection of relative mass is viewed as a categorization task. Each
collision resides in a multidimensional space, the dimensions of which
represent perceptual aspects of the collisions. Similarity between
collisions is a decreasing function of distance in the space. Relative mass judgments are based on the similarity of the test
collision to all learned collisions in which Ball 1 or Ball 2 was
heavier. Experiments contrasting these models will be discussed.
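The similarity-based model described above can be sketched in exemplar-model form. The dimensions, exemplar values, and sensitivity parameter below are invented for illustration:

```python
import math

# Each collision is a point in a perceptual space; the judged heavier ball
# is chosen by summed similarity to stored "Ball 1 heavier" vs
# "Ball 2 heavier" collisions, with similarity a decreasing function of
# distance in the space.

def similarity(x, y, c=2.0):
    # c is an invented sensitivity parameter.
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-c * dist)

def p_ball1_heavier(test, ball1_exemplars, ball2_exemplars):
    s1 = sum(similarity(test, e) for e in ball1_exemplars)
    s2 = sum(similarity(test, e) for e in ball2_exemplars)
    return s1 / (s1 + s2)

# Two hypothetical perceptual dimensions (e.g. exit-speed ratio, scatter).
ball1 = [(0.8, 0.2), (0.7, 0.3)]   # learned collisions, Ball 1 heavier
ball2 = [(0.2, 0.8), (0.3, 0.7)]   # learned collisions, Ball 2 heavier
print(p_ball1_heavier((0.75, 0.25), ball1, ball2))  # well above 0.5
```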
According to Barsalou (1999), conceptual representations consist of perceptual symbols. Perceptual symbols are records of the neural states that underlie perception. The neural systems that are used in perception are also used for conceptual knowledge. Perceptual symbols can capture any aspect of perceived experience, such as vision, audition, touch, smell, taste, and proprioception. In perception there is a cost associated with modality shifts. We investigated whether this also holds for conceptual representations. In a property verification task in which all stimuli were words, a critical trial (e.g., eggplant-purple) was preceded by a trial in the same modality (gemstone-glittering) or a different modality (marble-cool). The results showed that responses were slower after a modality shift. These results provide evidence for embodied theories of cognition.
Spatial models for semantic representation (e.g. LSA) are currently popular for explaining and predicting the similarity between words. Such models have been criticized for being unable to relate the meaning of words to their perception in the world (the symbol grounding problem). We propose a probabilistic generative model for semantic representation that naturally leads to models for symbol grounding by considering how words are used in describing visual scenes. The model learns a set of cross-modal topics, with each topic representing separate probability distributions over words and visual objects. We train the model on a
database of 30,000+ images that includes a verbal description for each
image (e.g., "A vacationer wading in the ocean, Cancun Mexico") as well as a list of visual objects that appear in the image (e.g., "WATER", "BEACH", "SKY"). The model found 500 interpretable topics that describe, among other things, beaches, mountains, various animals, cities, and gardens. We show how the model predicts which visual objects would be found in an image given only a verbal description, or which words might describe a scene given only a set of visual objects. In addition, the model can generate imageability ratings for sets of words by calculating the likelihood of generating visual descriptors for those words.
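As a rough sketch of the cross-modal topic idea, the toy model below predicts visual objects from a verbal description via a small set of hand-built topics. The topics, words, and probabilities are invented; the real model learns its topics from the image database:

```python
# Each topic holds separate distributions over words and visual objects.
# Given a description, infer a topic mixture, then mix object distributions.
topics = {
    "beach": {"words":   {"ocean": .4, "wading": .3, "sand": .3},
              "objects": {"WATER": .5, "BEACH": .3, "SKY": .2}},
    "city":  {"words":   {"street": .5, "taxi": .3, "ocean": .2},
              "objects": {"BUILDING": .6, "CAR": .3, "SKY": .1}},
}

def predict_objects(description):
    # p(topic | words) proportional to the product of p(word | topic),
    # with a uniform topic prior and a small floor for unseen words.
    weights = {}
    for name, t in topics.items():
        w = 1.0
        for word in description:
            w *= t["words"].get(word, 1e-6)
        weights[name] = w
    z = sum(weights.values())
    # p(object | words) = sum over topics of p(object | t) * p(t | words)
    objs = {}
    for name, t in topics.items():
        for obj, p in t["objects"].items():
            objs[obj] = objs.get(obj, 0.0) + p * weights[name] / z
    return objs

pred = predict_objects(["ocean", "wading"])
print(max(pred, key=pred.get))  # the description favors the beach topic
```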
Language can be thought of as a set of processing cues that guide the comprehender's perceptual simulations of the described events. I will discuss the results of several experiments that demonstrate that perceptual information is routinely activated during language processing. These findings are not predicted by traditional amodal theories of language processing.
We have developed a novel neural network model and computational theory for updating post-synaptic efficacy using cooperative recurrent inputs and feed-forward retinal inputs. The model is used for contour integration, key-point detection, and feature tracking of visual data. We also show analytically and empirically the importance of sub-sampling at the retina and resolution enhancement at the cortical level for reliable coding of visual inputs. Our neural network model performs such operations.
We present the results of several forced-choice
perceptual identification studies. On each trial, a masked target presentation followed one or two sequentially presented primes. Each prime presentation consisted of a single word presented both above and below fixation. We separately manipulated prime duration and vertical eccentricity between the primes. Neither, one, or both of the choice words repeated a prime word. In keeping with the result of Huber, Shiffrin, Quach, and Lyle (in press), short prime presentations produced a preference for repeated words whereas longer prime presentations produced a small preference against repeated words. In some cases these effects were modulated by eccentricity, with more centrally fixated primes producing larger preference effects. In one experiment, the same prime word was first presented for a long duration with a large eccentricity, followed by a brief, near threshold, more central presentation. Surprisingly, there was a strong preference to choose such a prime, despite the overall extended prime duration. All conditions were quantitatively handled with the ROUSE model of Huber, Shiffrin, Lyle, and Ruys (2001), which includes the offsetting components of source confusion and discounting. In particular, the sequential prime
presentation result was explained by assuming the brief central
presentation resulted in additional source confusion, but not additional discounting.
Using a forced-choice testing procedure, Huber, Shiffrin, Lyle, and Ruys (2001) measured identification accuracy for briefly flashed and masked target words. These target words immediately followed the presentation of prime words that were either identical or dissimilar. There was a preference for repeated choice words following short prime exposures and a preference against repeated choice words following longer prime exposures. Huber and O'Reilly (in press) proposed that persistent prime activation explains the results with short durations and accommodated prime activation explains the switch with longer durations. They presented a neural network model with a modified activation function that diminishes beyond the initial peak value due to synaptic depression. According to the theory, accommodation of identified items through synaptic depression is the mechanism by which perceptual systems clear activation, allowing for accurate perception of subsequent items. To test the theory, we used a 128-electrode array to measure event-related potentials (ERPs) to repeated and novel words following short and long prime presentations. In accord with model predictions, persistence was observed regardless of prime duration, for visual areas. Also in accord with model predictions, persistence was observed to switch to accommodation as a function of prime duration, for higher perceptual areas.
We have modeled various aspects of face perception using connectionist networks. We have shown how developmentally reasonable constraints can lead to a "face expert" network, how expertise with one domain can lead to faster learning
of expertise in another domain (e.g., why the Fusiform Gyrus might get recruited for Greeble processing if it is already a face expert), and how disparate theories of facial expression recognition can be resolved in a single model. In the latter domain, we have shown how a single model can accommodate categorical perception theories as well as "dimensional" theories of facial expression
perception, and how a single model can explain the apparent
independence between identity and expression processing without positing separate representations for these. We review these results in this presentation.
This talk will demonstrate the utility of advanced data mining and information visualization techniques to support science and technology management. Large amounts of publication, patent, and grant data can be analyzed, correlated, and visualized to map the semantic space of researchers, publications, funding, etc. The resulting visualizations can be used to objectively identify major research areas, experts, institutions, grants, publications, and journals in a research area of interest. In addition, they can help identify interconnections, the import and export of research between fields, the dynamics (speed of growth, diversification) of scientific fields, scientific and social networks, and the impact of strategic and applied research funding programs, among other things. This knowledge is of interest not only to funding agencies but also to companies, researchers, and society.
Multidimensional scaling (MDS) is a method that associates distances with similarities for a set of stimuli, producing a representational space in which the stimuli are points. These spaces are geometrically flat: distances behave as if measured in a plane. An example of an alternative, curved geometry is points on a sphere, with distances measured along the surface. The present work demonstrates that MDS flattens curvature that might be present in psychological data, and returns a distorted representation. We describe new tools to uncover curvature in data, present evidence for curvature in facial expression space, and discuss implications.
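Why a flat MDS solution must distort curved data can be seen in a small worked example with four equally spaced points on a sphere's equator (an illustration only, not the authors' curvature-detection tools):

```python
import math

# Great-circle distances on a unit sphere cannot all be reproduced by
# straight-line distances in a flat space, so MDS must distort them.

def great_circle(p, q):
    # p, q given as (latitude, longitude) in radians on a unit sphere.
    return math.acos(math.sin(p[0]) * math.sin(q[0]) +
                     math.cos(p[0]) * math.cos(q[0]) * math.cos(p[1] - q[1]))

# Four equally spaced points around the equator.
pts = [(0.0, i * math.pi / 2) for i in range(4)]
d_adjacent = great_circle(pts[0], pts[1])  # quarter circumference
d_opposite = great_circle(pts[0], pts[2])  # half circumference
print(d_opposite / d_adjacent)
# The opposite distance is exactly twice the adjacent distance. In a flat
# space, a point whose distances to the two ends of a length-2d segment
# are both d must lie ON the segment, which would force all four points
# onto one line and contradict the remaining pairwise distances. A flat
# MDS solution therefore necessarily distorts this spherical configuration.
```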
Cross-language data suggest that reading acquisition develops more rapidly and accurately in children who are learning to read and spell consistent orthographies (e.g., German, Italian) than in children who are learning to read and spell less consistent orthographies (e.g., English). In this study, we investigate whether or not current connectionist learning models (Plaut et al., 1996; Zorzi et al., 1998) correctly predict cross-language differences in learning rate.
Environmental-noise effects on cognitive performance latencies are modelled. Predictions address published data from a prominent proof-reading paradigm. Modelling provision is made for: the rate of inspecting character segments within a line of prose; opposite effects of noise-stress and practice on inspection rate; and approximate sizes of search segments. Multiple tests of fit are supportive. The model proffers formal explanations for non-additive effects of stress and practice. Certain mathematical properties, along with selective sensitivity to experimental manipulations, bolster parameter interpretation. Finally, Bayesian methods yield individualized parameter estimates.
Recent research on numerical processing in animals and humans converges on the view that knowledge of numbers constitutes a domain-specific cognitive ability, with a specific neural substrate located in the left and right inferior parietal cortices. A far more complex issue, however, is the nature of these representations. One popular view is that numbers are represented as a compressed mental number line, which obeys the Weber-Fechner logarithmic law (e.g., Dehaene, 1992). I will present neural network simulations of basic numerical abilities (number comparison, single-digit addition) designed to investigate this issue in a more formal way. Results show that linear representations of numerical magnitude
(“numerosity”, Zorzi & Butterworth, 1999) provide the best account of
human performance. Numerosity representations exploit the basic property of cardinal meaning.
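As a rough illustration of the contrast at issue, both a logarithmic number line and a linear magnitude code with scalar variability predict the classic size and distance effects in number comparison; the sketch below (with an invented Weber fraction `w`) shows the shared ordering of pair difficulty:

```python
import math

def log_distance(a, b):
    # Compressed number line: discriminability tracks log-scale separation.
    return abs(math.log(a) - math.log(b))

def linear_discriminability(a, b, w=0.15):
    # Linear magnitudes with noise proportional to size (scalar
    # variability); w is an invented Weber fraction.
    return abs(a - b) / (w * math.hypot(a, b))

for a, b in [(2, 3), (8, 9), (2, 9)]:
    print(a, b, round(log_distance(a, b), 2),
          round(linear_discriminability(a, b), 2))
# 2 vs 3 is easier than 8 vs 9 under both accounts (size effect), and
# 2 vs 9 is easiest of all (distance effect), so behavioral simulations
# are needed to tell the two representations apart.
```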
This AI starts from a clean slate and, using only the most basic programming functions, defines a comprehensive system that can be realized with today’s technology. A foundation of control, memory, sensory input, and logic functions are brought together by an XML-based linking structure. The result is an AI that can exhibit normal mental processes, either human or animal. It will be able to learn and will show a level of consciousness. More importantly, however, it will have the potential to grow beyond the scope of its human designers, a thought that is both exciting and frightening.
Does learning words start as associations and become something else? In a recent study, Woodward and Hoyne (1999) showed that 13-month-olds readily associate both words and non-linguistic sounds with object categories. In contrast, 20-month-olds associate words, but not non-linguistic sounds, with object categories. Are children learning what forms count as words? If so, just what defining features are they learning? The research reported here addressed these questions in a network-simulation study and in empirical tests with children. Our results support the idea that, although what younger and older children are doing may seem different, their early mapping of both words and non-words to objects and their progress in limiting word forms can be explained by associative processes: at the start, at the end, and as the mechanism of change.
Compositionality is the property of natural language and symbolic thought whereby the interpretation of an expression consisting of some combination of words (or symbols) is a function of the interpretations of those words (symbols). All linguistic descriptions, from cognitive grammar to Chomskyan minimalism, rely on some form of compositionality. Yet these accounts tell us nothing about how children come to master composition or how human language got to be compositional in the first place. For compositionality to be grounded, we would need an account of how it emerges out of what is available to infants. In this talk, I'll discuss what it would take to get quasi-compositionality out of sub-symbolic representations and mechanisms. I'll focus on the processes involved in the emergence of syntactic roles (such as SUBJECT), proposing a form of relational competitive learning as a fundamental mechanism.
A computational model of story comprehension is presented, in which story situations are represented distributively as points in a high-dimensional 'situation-state space'. This state space organizes itself on the basis of a constructed microworld description. From the same description, causal/temporal world knowledge is extracted. The distributed representation of story situations is more flexible than Golden and Rumelhart's (1993) localist representation. Limitations concerning the story situations and world knowledge that can be represented in the Golden and Rumelhart model are solved by the distributed representation. A story taking place in the microworld corresponds to a trajectory in the situation-state space. During comprehension of a story, world knowledge is applied to the story trajectory. This results in an adjusted trajectory, reflecting the inference of propositions that are likely to be true. The results of simulations correspond to empirical data concerning reading time and story recall.
My work explores the use of a mathematical framework, with psychologically meaningful parameter interpretations, to derive and compare different preferential choice models. These models should accurately describe behavior in realistic environments: response measures, process measures, and choice properties. Additionally, I propose the use of appropriate parameter restrictions and specifications to model individual differences. I will review the essential characteristics of a mathematical framework for representing a wide variety of preferential choice models. Then, I will introduce a procedure that allows for the modeling of a large number of different weighting schemes, initial biases, memory assumptions, computation processes, thresholds, constraints, and other individual characteristics within this framework.
In order to diagnose a particular disease, physicians construct a 'patient profile' (based on risk factors, symptoms, and signs) and compare it to hundreds of 'disease profiles' established in the physician's mind through knowledge and experience. The disease profile closest to the patient profile is selected as the most probable diagnosis. Our objective is to develop a computer-based medical diagnosis tool that respects this medical reasoning and the 'matching process' performed by the physician's mind. The normalized Hamming distance from fuzzy set theory is well suited to comparing a patient profile to different disease profiles. The formula has been adapted to medical reasoning through the use of exclusion criteria and a sanction system, and medical data have been stored with a scoring system that enables the use of the Hamming distance. The result is a software program useful to both patients and physicians: identifying information, risk factors, symptoms, and signs are collected, and a list of suspected diseases is returned with the probability of each disease. Fuzzy sets offer a very 'physiologic' approach to medical reasoning; combined with other medical principles, they become a powerful diagnostic tool enabling better medical diagnosis.
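The profile-matching step can be sketched as follows (a minimal illustration; the disease names, findings, and membership values are invented, and this is not the authors' software):

```python
# Patient and disease profiles are fuzzy sets over the same findings,
# compared by normalized Hamming distance; the nearest disease profile
# is the most probable diagnosis.

def normalized_hamming(patient, disease, findings):
    return sum(abs(patient.get(f, 0.0) - disease.get(f, 0.0))
               for f in findings) / len(findings)

findings = ["chest_pain", "fever", "cough", "smoker"]
diseases = {
    "disease_A": {"chest_pain": 1.0, "smoker": 0.8},
    "disease_B": {"fever": 1.0, "cough": 0.9},
}
patient = {"chest_pain": 0.9, "smoker": 1.0}

# Rank diseases by closeness of profile to the patient.
ranked = sorted(diseases,
                key=lambda d: normalized_hamming(patient, diseases[d], findings))
print(ranked[0])  # the profile closest to the patient's
```

The exclusion criteria and sanction system mentioned in the abstract would modify these raw distances; they are omitted here.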
We consider decision problems in which a decision maker must choose a subset from a set of objects such that the value of the chosen subset is maximum (or satisfactorily high). In many real-world situations the value of a subset of objects is not the simple sum of the values of its individual elements. We model the value-structure over a set of objects by a weighted hypergraph, where each hyperedge represents the added value of the combination of its vertices. We define two general problems for a given value-structure and some integer p: (1) the General Choice Subset (GCS) problem asks for a subset whose value is at least p; (2) the General Rejected Subset (GRS) problem asks for a subset such that the total negative value removed from the set of objects is at least p. Both problems, GCS and GRS, turn out to be computationally intractable. To characterize their difficulty, we study a taxonomy of special cases of GCS and GRS and examine their computational complexity (classical and parameterized).
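The value-structure can be made concrete with a small sketch, assuming the natural reading that a subset's value is the sum of the weights of all hyperedges fully contained in it. The brute-force GCS search below is exponential and only illustrates the problem statement, consistent with the abstract's intractability claim; it is not a proposed algorithm.

```python
from itertools import combinations

def subset_value(subset, hyperedges):
    """Value of a subset under a weighted hypergraph.

    hyperedges: iterable of (vertices, weight) pairs; a hyperedge
    contributes its weight only if all of its vertices are chosen.
    """
    chosen = set(subset)
    return sum(w for verts, w in hyperedges if set(verts) <= chosen)

def gcs_brute_force(objects, hyperedges, p):
    """Find any subset with value >= p by exhaustive search (exponential)."""
    for size in range(len(objects) + 1):
        for combo in combinations(objects, size):
            if subset_value(combo, hyperedges) >= p:
                return set(combo)
    return None
```

Negative hyperedge weights capture combinations that detract from the total, which is what makes the GRS variant meaningful.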
Traditional process models of old-new recognition have not addressed
differences in accuracy and response time between individual stimuli.
Two new process models of recognition are presented, and applied to
response time and accuracy data from a series of old-new recognition
experiments. The first model was derived from a feature-sampling
account of the time course of categorization, whereas the second
model is a generalization of a random-walk model of categorization.
The experiments employed a new technique, which yielded reliable
individual-stimulus data through repeated presentation of structurally
equivalent items. The model applications showed that the random-walk
model provided the best account of the results. The implications of
the results for process models of recognition are discussed.
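The random-walk idea behind the second model can be sketched as follows: evidence accumulates step by step toward an "old" or a "new" boundary, so both the response and the response time fall out of a single process. This is a generic illustration under assumed parameter names (drift, threshold, step size), not the specific model fitted in the experiments.

```python
import random

def random_walk_recognition(drift, threshold=5.0, step=0.5, max_steps=10_000, rng=None):
    """Simulate one random-walk recognition trial.

    Evidence starts at 0 and moves +step or -step each time unit;
    drift shifts the up-step probability toward the 'old' boundary.
    Returns the response and the number of steps (a response-time proxy).
    """
    rng = rng or random.Random()
    evidence, steps = 0.0, 0
    while abs(evidence) < threshold and steps < max_steps:
        p_up = 0.5 + drift  # positive drift favours the 'old' boundary
        evidence += step if rng.random() < p_up else -step
        steps += 1
    return ("old" if evidence >= threshold else "new"), steps
```

Because individual stimuli can differ in drift, the same mechanism yields stimulus-level predictions for both accuracy and response time, which is exactly what individual-stimulus data can test.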
It is well known that a violation of the independence assumption can have profound negative implications for
standard statistical hypothesis testing. We show to what extent specific types of serial correlations invalidate analysis of variance,
conditional on the experimental randomization scheme. We discuss the
merits and drawbacks of three methods for dealing with serial correlations. The first method increases the required significance level for detecting a difference, the second uses a time-series-based whitening procedure, and the third requires that the serial correlations be attributable to simple trends.
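The second approach can be sketched in its simplest form: estimate a lag-1 (AR(1)) autocorrelation from the series and subtract the predicted carry-over from each observation, leaving approximately independent residuals. This is a minimal textbook prewhitening step, not the specific procedure the talk evaluates.

```python
def estimate_phi(x):
    """Lag-1 autocorrelation estimate of a series."""
    mean = sum(x) / len(x)
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, len(x)))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def prewhiten(x):
    """Remove estimated AR(1) dependence: w_t = x_t - phi * x_(t-1).

    Returns the whitened series (one element shorter) and the
    estimated autocorrelation phi.
    """
    phi = estimate_phi(x)
    return [x[t] - phi * x[t - 1] for t in range(1, len(x))], phi
```

Standard tests can then be run on the whitened series, at the cost of one lost observation and of assuming the AR(1) form is adequate.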
Model selection methods are designed to infer which of a set of models most likely generated a sample of data. The
success of any one selection method is not fixed, but depends greatly on the similarity of the models themselves and the data on which they are evaluated. The interrelatedness of models and selection methods makes it desirable to know how well various selection methods are likely to perform across a range of testing situations; knowing this helps justify which method is safe to use. Landscaping is a technique that evaluates selection methods over a large number of conditions, creating a selection landscape in which the methods can be compared and the computational distinguishability of the models can be assessed. Application examples of
forgetting models and categorization models will be presented.
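A single cell of such a landscape can be sketched for two stock forgetting models, a power and an exponential function: generate data from one model at a given parameter setting and noise level, fit both by grid search, and record whether the selection criterion recovers the generating model. The model forms, grid, and sum-of-squared-error criterion here are illustrative assumptions, not the talk's actual models or selection methods.

```python
import math
import random

def power_forget(t, a, b):
    return a * (t + 1) ** (-b)

def exp_forget(t, a, b):
    return a * math.exp(-b * t)

def best_sse(model, data, grid):
    """Smallest summed squared error over a small parameter grid."""
    return min(
        sum((model(t, a, b) - y) ** 2 for t, y in data)
        for a in grid for b in grid
    )

def landscape_cell(true_model, a, b, times, noise_sd, rng, grid):
    """One landscape cell: does SSE-based selection recover the generator?

    Returns True when the power model fits the simulated data better,
    so sweeping (a, b, noise_sd) fills in a recovery landscape.
    """
    data = [(t, true_model(t, a, b) + rng.gauss(0, noise_sd)) for t in times]
    return best_sse(power_forget, data, grid) < best_sse(exp_forget, data, grid)
```

Repeating this over a grid of generating parameters and noise levels produces the landscape on which selection methods, and the distinguishability of the models, can be compared.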