Date: September 23, 2020, 3:00–4:30 pm
Location: online
Title: The “A Day in the Life” project: A preliminary report
Speaker: Garrison W. Cottrell – CSE Department – UCSD, La Jolla, CA
Abstract:
The goal of this research project is to create a model of the human visual system with anatomical and experiential constraints. The anatomical constraints implemented in the model so far include a foveated retina, the log-polar transform between the retina and V1, and the bifurcation between the central and peripheral pathways of the visual system (Wang & Cottrell, 2017).

The experiential constraint consists of a realistic training set that models human visual experience. The dataset most often used to train deep networks is ImageNet, a highly unrealistic collection of 1.2M images in 1,000 categories. The categories read almost as a parody of Jorge Luis Borges’ Celestial Emporium of Benevolent Knowledge, including (among more common ones) “abacus”, “lens cap”, “whiptail lizard”, “ptarmigan”, “abaya”, “viaduct”, “maypole”, “monastery”, and 120 dog breeds. Any network trained on these categories becomes a dog expert, as well as a system that knows some very idiosyncratic categories; only a small fraction of the human population are dog experts.

The goal of the “A Day in the Life” project is to collect a more realistic dataset of what humans observe and fixate upon in daily life. Using a wearable eye tracker paired with an Intel RealSense scene camera that provides depth information, we are recording data from subjects as they go about their day. We then use a deep network to segment and label the objects that are fixated. The aim is a training set faithful to the distribution of what individuals actually look at, in terms of frequency, dwell time, and distance. Training a visual system model on these data should yield representations that more closely mimic those developed in visual cortex. The data should also be useful in vision science, since frequency, probably the most important variable in psycholinguistics, has rarely been manipulated in human visual processing experiments for lack of norms.
Here we report some initial results from this project. To forestall disappointment, I will mention here that the dataset is not yet ready for prime time.
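As background for the anatomical constraints mentioned above: the log-polar transform maps Cartesian image coordinates to (log radius, angle), so that sampling is dense at the center of gaze and sparse in the periphery, qualitatively like the retina-to-V1 mapping. The sketch below is a minimal toy illustration of that idea (it is not the model's implementation; the ring/wedge counts and nearest-neighbor sampling are illustrative assumptions):

```python
import numpy as np

def log_polar_sample(img, n_rings=32, n_wedges=64):
    """Resample an image onto a log-polar grid centered on the image
    center. Rings are log-spaced in radius, so resolution is highest
    near the center (the 'fovea') and falls off toward the periphery."""
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    # Log-spaced radii from 1 pixel out to the image border.
    radii = np.exp(np.linspace(0.0, np.log(r_max), n_rings))
    # Evenly spaced polar angles around the full circle.
    thetas = np.linspace(0.0, 2.0 * np.pi, n_wedges, endpoint=False)
    rr, tt = np.meshgrid(radii, thetas, indexing="ij")
    # Convert each (radius, angle) back to pixel coordinates and
    # sample with nearest-neighbor lookup.
    ys = np.clip(np.round(cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return img[ys, xs]

img = np.arange(64 * 64, dtype=float).reshape(64, 64)
out = log_polar_sample(img, n_rings=16, n_wedges=32)
print(out.shape)  # (16, 32): one row per ring, one column per wedge
```

A useful side effect of this representation is that rotation and scaling about the center of gaze become approximate translations along the wedge and ring axes, respectively, which is one motivation for using it in models of early vision.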
Biography of Speaker:
Garrison W. (Gary) Cottrell is a Professor of Computer Science and Engineering and the Director of the Interdisciplinary Ph.D. Program in Cognitive Science at UC San Diego. He was a founding PI of the Perceptual Expertise Network and directed the Temporal Dynamics of Learning Center for over 10 years. Professor Cottrell’s research is strongly interdisciplinary. His main interests are Cognitive Science and Computational Cognitive Neuroscience. He focuses on building working models of cognitive processes and using them to explain psychological or neurological processes. In recent years, he has focused on anatomically inspired deep learning models of the visual system. He has also worked on unsupervised feature learning (modeling precortical and cortical coding), face & object processing, visual salience, and visual attention. His other interest is applying AI to problems in other areas of science. Most recently he has been using deep learning to elucidate the structure of small (natural product) molecules in collaboration with Bill Gerwick at the Scripps Institution of Oceanography. He received his Ph.D. in 1985 from the University of Rochester under James F. Allen (thesis title: A connectionist approach to word sense disambiguation). He then did a postdoc with David E. Rumelhart at the Institute of Cognitive Science at UCSD until 1987, when he joined the CSE Department.