Large-scale Image Memorability

Aditya Khosla, Akhil Raju, Antonio Torralba, Aude Oliva
Massachusetts Institute of Technology

Progress in estimating visual memorability has been limited by the small scale and lack of variety of benchmark data. Here, we introduce a novel experimental procedure to objectively measure human memory, allowing us to build LaMem, the largest annotated image memorability dataset to date (containing 60,000 images from diverse sources). Using Convolutional Neural Networks (CNNs), we show that fine-tuned deep features outperform all other features by a large margin, reaching a rank correlation of 0.64, near human consistency (0.68). Analysis of the responses of the high-level CNN layers shows which objects and regions are positively and negatively correlated with memorability, allowing us to create memorability maps for each image and providing a concrete method for image memorability manipulation. This work demonstrates that one can now robustly estimate the memorability of images from many different classes, positioning memorability and deep memorability features as prime candidates to estimate the utility of information for cognitive systems.

 

News: LaMem and MemNet are now available for download!


Download our paper

Please cite the following paper if you use this service:
Understanding and Predicting Image Memorability at a Large Scale
A. Khosla, A. S. Raju, A. Torralba and A. Oliva
International Conference on Computer Vision (ICCV), 2015
DOI 10.1109/ICCV.2015.275

Image Memorability API

Usage: http://memorability.csail.mit.edu/cgi-bin/image.py?url=IMG_URL

Replace IMG_URL with the URL of the image to be scored.

Example: http://memorability.csail.mit.edu/cgi-bin/image.py?url=http://memorability.csail.mit.edu/imgs/1.jpg

Notice: Please do not overload our server by sending repeated queries in a short period of time. This is a free service for academic research and educational purposes only, and it comes with no guarantee of any kind. For questions or comments regarding this API, or about potential commercial applications, please contact Aditya Khosla.
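
As an illustration of the usage pattern above, here is a minimal Python sketch that builds query URLs and spaces out requests. It is not an official client. The helper names (build_query_url, fetch_raw_responses), the 5-second default delay, and the timeout are all assumptions chosen for this example. The response format is not documented on this page, so the sketch returns the raw response text and does not try to parse it.

```python
import time
import urllib.parse
import urllib.request

# Endpoint taken from the "Usage" line on this page.
API_ENDPOINT = "http://memorability.csail.mit.edu/cgi-bin/image.py"


def build_query_url(image_url: str) -> str:
    """Return the API query URL for one image (hypothetical helper).

    ':' and '/' are left unescaped so simple image URLs match the
    documented example; other reserved characters (such as '&' or '?')
    are percent-encoded so they are not read as extra query parameters.
    """
    encoded = urllib.parse.quote(image_url, safe=":/")
    return f"{API_ENDPOINT}?url={encoded}"


def fetch_raw_responses(image_urls, delay_seconds=5.0, timeout=30.0):
    """Query the API once per image, pausing between requests.

    The delay is an assumed, conservative value; the page only asks
    users not to query repeatedly in a short period of time.
    Returns a dict mapping each image URL to the raw response text.
    """
    results = {}
    for index, image_url in enumerate(image_urls):
        if index > 0:
            time.sleep(delay_seconds)  # be gentle with the shared server
        query_url = build_query_url(image_url)
        with urllib.request.urlopen(query_url, timeout=timeout) as response:
            # Response format is undocumented here, so keep it as text.
            results[image_url] = response.read().decode("utf-8", errors="replace")
    return results


if __name__ == "__main__":
    # Only builds and prints the query URL; no network request is made.
    print(build_query_url("http://memorability.csail.mit.edu/imgs/1.jpg"))
```

Calling fetch_raw_responses with a short list of image URLs issues one request per image with a pause in between. For larger batches, consider increasing delay_seconds further.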


Acknowledgements

We thank Wilma Bainbridge, Phillip Isola and Hamed Pirsiavash for helpful discussions. This work is supported by a National Science Foundation grant (1532591), the McGovern Institute Neurotechnology Program (MINT), the MIT Big Data Initiative at CSAIL, research awards from Google and Xerox, and a hardware donation from Nvidia. We would also like to thank Adam Conner-Simons for media outreach. To keep up to date on other CSAIL research, find CSAIL on Facebook and Twitter.