ImageNet classification with deep convolutional neural networks

Authors:
Alex Krizhevsky

Google Inc

Google Inc
View Profile

,
Ilya Sutskever

Google Inc

Google Inc
View Profile

,
Geoffrey E. Hinton

OpenAI

OpenAI
View Profile

Authors Info & Claims

Communications of the ACM Volume 60 Issue 6June 2017pp 84–90https://doi.org/10.1145/3065386

Published:24 May 2017Publication History

Communications of the ACM

Abstract

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, respectively, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully connected layers we employed a recently developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.

References

Bell, R., Koren, Y. Lessons from the netflix prize challenge. ACM SIGKDD Explor. Newsl. 9, 2 (2007), 75--79. Google ScholarDigital Library
Berg, A., Deng, J., Fei-Fei, L. Large scale visual recognition challenge 2010. www.image-net.org/challenges. 2010.Google Scholar
Breiman, L. Random forests. Mach. Learn. 45, 1 (2001), 5--32. Google ScholarDigital Library
Cireşan, D., Meier, U., Masci, J., Gambardella, L., Schmidhuber, J. High-performance neural networks for visual object classification. Arxiv preprint arXiv:1102.0183, 2011.Google Scholar
Cireşan, D., Meier, U., Schmidhuber, J. Multi-column deep neural networks for image classification. Arxiv preprint arXiv:1202.2745, 2012.Google Scholar
Deng, J., Berg, A., Satheesh, S., Su, H., Khosla, A., Fei-Fei, L. In ILSVRC-2012 (2012).Google Scholar
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In CVPR09 (2009).Google Scholar
Fei-Fei, L., Fergus, R., Perona, P. Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories. Comput. Vision Image Understanding 106, 1 (2007), 59--70. Google ScholarDigital Library
Fukushima, K. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 36, 4 (1980), 193--202.Google ScholarCross Ref
Griffin, G., Holub, A., Perona, P. Caltech-256 object category dataset. Technical Report 7694, California Institute of Technology, 2007.Google Scholar
He, K., Zhang, X., Ren, S., Sun, J. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015.Google Scholar
Hinton, G., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012).Google Scholar
Jarrett, K., Kavukcuoglu, K., Ranzato, M.A., LeCun, Y. What is the best multi-stage architecture for object recognition? In International Conference on Computer Vision (2009). IEEE, 2146--2153.Google ScholarCross Ref
Krizhevsky, A. Learning multiple layers of features from tiny images. Master's thesis, Department of Computer Science, University of Toronto, 2009.Google Scholar
Krizhevsky, A. Convolutional deep belief networks on cifar-10. Unpublished manuscript, 2010.Google Scholar
Krizhevsky, A., Hinton, G. Using very deep autoencoders for content-based image retrieval. In ESANN (2011).Google Scholar
LeCun, Y., Boser, B., Denker, J., Henderson, D., Howard, R., Hubbard, W., Jackel, L., et al. Handwritten digit recognition with a back-propagation network. In Advances in Neural Information Processing Systems (1990). Google ScholarDigital Library
LeCun, Y. Une procedure d'apprentissage pour reseau a seuil asymmetrique (a learning scheme for asymmetric threshold networks). 1985.Google Scholar
LeCun, Y., Huang, F., Bottou, L. Learning methods for generic object recognition with invariance to pose and lighting. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004, CVPR 2004. Volume 2 (2004). IEEE, II--97. Google ScholarDigital Library
LeCun, Y., Kavukcuoglu, K., Farabet, C. Convolutional networks and applications in vision. In Proceedings of 2010 IEEE International Symposium on Circuits and Systems (ISCAS) (2010). IEEE, 253--256.Google ScholarCross Ref
Lee, H., Grosse, R., Ranganath, R., Ng, A. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In Proceedings of the 26th Annual International Conference on Machine Learning (2009). ACM, 609--616. Google ScholarDigital Library
Linnainmaa, S. Taylor expansion of the accumulated rounding error. BIT Numer. Math. 16, 2 (1976), 146--160.Google ScholarCross Ref
Mensink, T., Verbeek, J., Perronnin, F., Csurka, G. Metric learning for large scale image classification: Generalizing to new classes at near-zero cost. In ECCV -- European Conference on Computer Vision (Florence, Italy, Oct. 2012).Google Scholar
Nair, V., Hinton, G.E. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (2010). Google ScholarDigital Library
Pinto, N., Cox, D., DiCarlo, J. Why is real-world visual object recognition hard? PLoS Comput. Biol. 4, 1 (2008), e27.Google ScholarCross Ref
Pinto, N., Doukhan, D., DiCarlo, J., Cox, D. A high-throughput screening approach to discovering good forms of biologically inspired visual representation. PLoS Comput. Biol. 5, 11 (2009), e1000579.Google ScholarCross Ref
Rumelhart, D.E., Hinton, G.E., Williams, R.J. Learning internal representations by error propagation. Technical report, DTIC Document, 1985.Google Scholar
Russell, BC, Torralba, A., Murphy, K., Freeman, W. Labelme: A database and web-based tool for image annotation. Int. J. Comput Vis. 77, 1 (2008), 157--173. Google ScholarDigital Library
Sánchez, J., Perronnin, F. High-dimensional signature compression for large-scale image classification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011 (2011). IEEE, 1665--1672. Google ScholarDigital Library
Simard, P., Steinkraus, D., Platt, J. Best practices for convolutional neural networks applied to visual document analysis. In Proceedings of the Seventh International Conference on Document Analysis and Recognition. Volume 2 (2003), 958--962. Google ScholarDigital Library
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015), 1--9.Google ScholarCross Ref
Turaga, S., Murray, J., Jain, V., Roth, F., Helmstaedter, M., Briggman, K., Denk, W., Seung, H. Convolutional networks can learn to generate affinity graphs for image segmentation. Neural Comput. 22, 2 (2010), 511--538. Google ScholarDigital Library
Werbos, P. Beyond regression: New tools for prediction and analysis in the behavioral sciences, 1974.Google Scholar

Index Terms

ImageNet classification with deep convolutional neural networks

Recommendations

ImageNet classification with deep convolutional neural networks
NIPS'12: Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% ...
Read More
A dyadic multi-resolution deep convolutional neural wavelet network for image classification

For almost the past four decades, image classification has gained a lot of attention in the field of pattern recognition due to its application in various fields. Given its importance, several approaches have been proposed up to now. In this paper, we ...
Read More
Automatic Fish Species Classification Using Deep Convolutional Neural Networks
Abstract
In this paper, we presented an automated system for identification and classification of fish species. It helps the marine biologists to have greater understanding of the fish species and their habitats. The proposed model is based on deep ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
Communications of the ACM Volume 60, Issue 6
June 2017
93 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/3098997
Editor:
Moshe Y. Vardi
Association for Computing Machinery, New York, NY
Issue’s Table of Contents
Copyright © 2017 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 24 May 2017
Check for updates
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 24,656
  Total Citations
  View Citations
- 205,503
  Total Downloads
- Downloads (Last 12 months)50,364
- Downloads (Last 6 weeks)6,223
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format .

View HTML Format

ImageNet classification with deep convolutional neural networks

Communications of the ACM

Abstract

References

Cited By

Index Terms

Recommendations

ImageNet classification with deep convolutional neural networks

A dyadic multi-resolution deep convolutional neural wavelet network for image classification

Automatic Fish Species Classification Using Deep Convolutional Neural Networks