Multi-label classification using error correcting output codes

Kajdanowicz, Tomasz; Kazienko, Przemysław

doi:10.2478/v10006-012-0061-2

Article - details

Journal

International Journal of Applied Mathematics and Computer Science

2012 | 22 | 4 | 829-840

Article title

Multi-label classification using error correcting output codes

Authors

Tomasz Kajdanowicz, Przemysław Kazienko

Content

Full texts:

http://matwbn.icm.edu.pl/ksiazki/amc/amc22/amc2244.pdf [remote]

Title variants

Publication languages

EN

Abstracts

EN

A framework for multi-label classification extended by Error Correcting Output Codes (ECOCs) is introduced and empirically examined in the article. The solution treats the base multi-label classifiers as a noisy channel and applies ECOCs to recover from the classification errors made by individual classifiers. The framework was examined through exhaustive studies over combinations of three distinct classification algorithms and four ECOC methods employed in the multi-label classification problem. The experimental results revealed that (i) the Bose-Chaudhuri-Hocquenghem (BCH) code matched with any multi-label classifier results in better classification quality; (ii) the accuracy of the binary relevance classification method strongly depends on the coding scheme; (iii) the label power-set and the RAkEL classifier consume the same time for computation irrespective of the coding utilized; (iv) in general, these two classifiers are not suitable for ECOCs because they are not capable of benefiting from ECOC correcting abilities; (v) the all-pairs code combined with binary relevance is not suitable for datasets with larger label sets.
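
The noisy-channel idea described in the abstract can be illustrated with a minimal sketch. It assumes a systematic Hamming(7,4) code, chosen here only because it is short; this record does not state whether the article uses that code, and the names `encode` and `decode` are hypothetical. A 4-label vector is encoded into a 7-bit codeword, each codeword bit is imagined to be predicted by its own binary classifier, and one misprediction is simulated by flipping a bit before decoding.

```python
# Illustrative sketch only: Hamming(7,4) stands in for the ECOCs studied in the article.
# Each row gives the three parity contributions of one label bit.
P = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]
# Columns of the parity-check matrix: one per codeword position (4 label bits + 3 parity bits).
H_COLUMNS = P + [(1, 0, 0), (0, 1, 0), (0, 0, 1)]


def encode(labels):
    """Append three parity bits to a 4-element binary label vector."""
    parity = [sum(labels[i] * P[i][j] for i in range(4)) % 2 for j in range(3)]
    return list(labels) + parity


def decode(predicted):
    """Correct at most one flipped bit in a 7-bit prediction and return the 4 labels."""
    syndrome = tuple(
        sum(predicted[k] * H_COLUMNS[k][j] for k in range(7)) % 2 for j in range(3)
    )
    corrected = list(predicted)
    if syndrome in H_COLUMNS:  # a nonzero syndrome identifies the flipped position
        corrected[H_COLUMNS.index(syndrome)] ^= 1
    return corrected[:4]


labels = [1, 0, 1, 1]        # true label set of one instance
codeword = encode(labels)    # targets for seven hypothetical binary classifiers
noisy = list(codeword)
noisy[2] ^= 1                # simulate one classifier making an error
assert decode(noisy) == labels
```

With this particular code, any single erroneous bit among the seven predictions is recovered; two or more errors exceed its correcting ability.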

Keywords

EN

machine learning, supervised learning, multi-label classification, error-correcting output codes, ECOC, ensemble methods, binary relevance, framework

Publisher

University of Zielona Gora Press

Journal

International Journal of Applied Mathematics and Computer Science

Year

2012

Volume

22

Issue

4

Pages

829-840

Physical description

Dates

published

2012

received

2011-09-10

revised

2012-04-12

Contributors

author

Tomasz Kajdanowicz

Institute of Informatics, Wrocław University of Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland

author

Przemysław Kazienko

Institute of Informatics, Wrocław University of Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland

Document type

Bibliography

Identifiers

DOI

10.2478/v10006-012-0061-2

YADDA identifier

bwmeta1.element.bwnjournal-article-amcv22z4p829bwm
