On Taxonomies for Multi-class Image Categorization

Binder, Alexander; Müller, Klaus-Robert; Kawanabe, Motoaki

doi:10.1007/s11263-010-0417-8

Open access
Published: 15 January 2011

Volume 99, pages 281–301 (2012)

International Journal of Computer Vision

Alexander Binder^1,2, Klaus-Robert Müller^1 & Motoaki Kawanabe^1,2

Abstract

We study the problem of classifying images into a given, pre-determined taxonomy. This task can be elegantly translated into the structured learning framework. However, despite its power, structured learning has known limits in scalability due to its high memory requirements and slow training process. We propose an efficient approximation of the structured learning approach by an ensemble of local support vector machines (SVMs) that can be trained efficiently with standard techniques. A first theoretical discussion and experiments on toy data shed light on why taxonomy-based classification can outperform taxonomy-free approaches and why an appropriately combined ensemble of local SVMs might be of high practical use. Further empirical results on subsets of Caltech256 and VOC2006 data show that our local SVM formulation can effectively exploit the taxonomy structure and thus outperforms standard multi-class classification algorithms, while achieving results on par with taxonomy-based structured algorithms at a significantly decreased computing time.
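The abstract does not spell out how the outputs of the local SVMs are combined, so the following is only a minimal sketch of one plausible scheme, not the authors' method. It assumes a toy taxonomy (the `PARENT` dictionary), one real-valued score per non-root node standing in for the output of a local binary classifier (the `scores` values are made up), and a combination rule that averages the scores along each root-to-leaf path. All names here are hypothetical.

```python
# Hypothetical toy taxonomy: child -> parent (the root has parent None).
PARENT = {
    "root": None,
    "animal": "root",
    "vehicle": "root",
    "cat": "animal",
    "dog": "animal",
    "car": "vehicle",
    "bus": "vehicle",
}


def path_to_root(node):
    """Return the nodes from `node` upward, excluding the root itself."""
    path = []
    while PARENT[node] is not None:
        path.append(node)
        node = PARENT[node]
    return path


def leaves():
    """Leaves are the nodes that never appear as a parent."""
    parents = set(PARENT.values())
    return [n for n in PARENT if n not in parents]


def predict_leaf(local_scores):
    """Pick the leaf whose root-to-leaf path has the highest mean score.

    `local_scores` maps every non-root node to the output of an assumed
    local binary classifier for that node. Averaging is an illustrative
    choice, not taken from the paper.
    """
    def combined(leaf):
        path = path_to_root(leaf)
        return sum(local_scores[n] for n in path) / len(path)

    return max(leaves(), key=combined)


# Made-up scores: "car" has the best leaf-level score, but its parent
# "vehicle" scores poorly, so the path average favours an animal leaf.
scores = {
    "animal": 0.9, "vehicle": -0.7,
    "cat": 0.2, "dog": 0.4, "car": 0.8, "bus": -0.5,
}
prediction = predict_leaf(scores)
```

In this toy setting a flat classifier looking only at leaf scores would choose "car", whereas the path-averaged rule lets evidence at the inner nodes override a single confident but isolated leaf score. This is one way a taxonomy could influence the final decision.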

Author information

Authors and Affiliations

Dep. Computer Science, Machine Learning Group, Berlin Institute of Technology, Franklinstr. 28/29, 10587, Berlin, Germany
Alexander Binder, Klaus-Robert Müller & Motoaki Kawanabe
Dep. Intelligent Data Analysis, Fraunhofer FIRST, Kekuléstr. 7, 12489, Berlin, Germany
Alexander Binder & Motoaki Kawanabe

Corresponding author

Correspondence to Alexander Binder.

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

Binder, A., Müller, KR. & Kawanabe, M. On Taxonomies for Multi-class Image Categorization. Int J Comput Vis 99, 281–301 (2012). https://doi.org/10.1007/s11263-010-0417-8

Received: 30 March 2010
Accepted: 17 December 2010
Published: 15 January 2011
Issue Date: September 2012
DOI: https://doi.org/10.1007/s11263-010-0417-8
