To vary length and strength in a recognition study without letting the criterion vary across conditions, we used a single long study list with exemplars of many categories spaced throughout the list. Across categories we varied the number of exemplars (category-length), the number of spaced repetitions of an exemplar (item-strength), and the number of repetitions of other exemplars in the category of the test item (category-strength). Distractors, prototypes, and targets (of varying strengths) were given frequency and confidence ratings. Performance rose with item-strength but did not vary with category-length or category-strength. Hits and false alarms rose with category-length but remained constant across variations in category-strength. We suggest that distributions of familiarity do not change (much) with changes in the strength of other items, but grow when additional items are studied. This pattern of results was predicted qualitatively by the SAM model presented in Shiffrin, Ratcliff and Clark (1990), and an instantiation of that model was fit to the data. The list and category results together provide strong constraints for recognition models, as illustrated by an analysis of two models recently developed to handle the list findings; these models mispredict the category findings.