Abstract
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, respectively, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully connected layers we employed a recently developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
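The top-1 and top-5 error rates quoted above can be illustrated with a short sketch. This is not the authors' evaluation code: the function name `top_k_error` and the toy scores are hypothetical, and it assumes the usual convention that a prediction counts as correct when the true label is among the k highest-scoring classes.

```python
from typing import Sequence


def top_k_error(scores: Sequence[Sequence[float]],
                labels: Sequence[int],
                k: int) -> float:
    """Fraction of examples whose true label is NOT among the k top-scoring classes.

    Hypothetical helper for illustration; `scores[i][c]` is the model's score
    for class c on example i, and `labels[i]` is the true class index.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("need at least one example")
    if k < 1:
        raise ValueError("k must be at least 1")

    misses = 0
    for row, label in zip(scores, labels):
        # Rank class indices by descending score; the sort is stable, so
        # tied scores are resolved in favour of the lower class index.
        ranked = sorted(range(len(row)), key=lambda c: -row[c])
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Toy data (assumed, not from the paper): 3 examples, 4 classes.
toy_scores = [
    [0.10, 0.60, 0.20, 0.10],
    [0.50, 0.30, 0.15, 0.05],
    [0.05, 0.15, 0.30, 0.50],
]
toy_labels = [1, 1, 0]

top1 = top_k_error(toy_scores, toy_labels, k=1)
top2 = top_k_error(toy_scores, toy_labels, k=2)
```

With 1000 classes, the same function with `k=1` and `k=5` would give the two metrics reported in the abstract; top-5 error can never exceed top-1 error, because the top-5 set always contains the top-1 prediction.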