Employing different loss functions for the classification of images via supervised learning

Boţ, Radu; Heinrich, André; Wanka, Gert

doi:10.2478/s11533-013-0342-5

Artykuł - szczegóły

Czasopismo

Open Mathematics

2014 | 12 | 2 | 381-394

Tytuł artykułu

Employing different loss functions for the classification of images via supervised learning

Autorzy

Radu Boţ , André Heinrich , Gert Wanka

Treść / Zawartość

Pełne teksty:

http://www.degruyter.com/view/j/math.2014.12.issue-2/s11533-013-0342-5/s11533-013-0342-5.pdf [zdalny]

Języki publikacji

EN

Abstrakty

EN

Supervised learning methods are powerful techniques to learn a function from a given set of labeled data, the so-called training data. In this paper the support vector machines approach is applied to an image classification task. Starting with the corresponding Tikhonov regularization problem, reformulated as a convex optimization problem, we introduce a conjugate dual problem to it and prove that, whenever strong duality holds, the function to be learned can be expressed via the dual optimal solutions. Corresponding dual problems are then derived for different loss functions. The theoretical results are applied by numerically solving a classification task using high dimensional real-world data in order to obtain optimal classifiers. The results demonstrate the excellent performance of support vector classification for this particular problem.

Słowa kluczowe

EN

Machine learning Tikhonov regularization Conjugate duality Image classification

Kategorie tematyczne

Wydawca

De Gruyter Open

Czasopismo

Open Mathematics

Rocznik

2014

Tom

12

Numer

2

Strony

381-394

Opis fizyczny

Daty

wydano

2014-02-01

online

2013-11-21

Twórcy

autor

Radu Boţ

radu.bot@univie.ac.at

University of Vienna

autor

André Heinrich

andre.heinrich@mathematik.tu-chemnitz.de

Chemnitz University of Technology

autor

Gert Wanka

gert.wanka@mathematik.tu-chemnitz.de

Chemnitz University of Technology

Bibliografia

[1] Aronszajn N., Theory of reproducing kernels, Trans. Amer. Math. Soc., 1950, 68, 337–404 http://dx.doi.org/10.1090/S0002-9947-1950-0051437-7
[2] Boţ R.I., Conjugate Duality in Convex Optimization, Lecture Notes in Econom. and Math. Systems, 637, Springer, Berlin-Heidelberg, 2010
[3] Boţ R.I., Csetnek E.R., Heinrich A., On the convergence rate improvement of a primal-dual splitting algorithm for solving monotone inclusion problems, SIAM J. Optim., 2013, 23(4), 2011–2036 http://dx.doi.org/10.1137/12088255X
[4] Boţ R.I., Lorenz N., Optimization problems in statistical learning: Duality and optimality conditions, European J. Oper. Res., 2011, 213(2), 395–404 http://dx.doi.org/10.1016/j.ejor.2011.03.021
[5] Boyd S., Parikh N., Chu E., Peleato B., Eckstein J., Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trendsc in Machine Learning, 2010, 3(1), 1–122 http://dx.doi.org/10.1561/2200000016
[6] Chapelle O., Haffner P., Vapnik V.N., Support vector machines for histogramm-based image classification, IEEE Trans. Neural Networks, 1999, 10(5), 1055–1064 http://dx.doi.org/10.1109/72.788646
[7] Geiger C., Kanzow C., Theorie und Numerik Restringierter Optimierungsaufgaben, Springer-Lehrbuch Masterclass, Springer, Berlin-New York, 2002 http://dx.doi.org/10.1007/978-3-642-56004-0
[8] Joachims T., Learning to Classify Text Using Support Vector Machines, Kluwer Internat. Ser. Engrg. Comput. Sci., 668, Kluwer, Boston, 2002 http://dx.doi.org/10.1007/978-1-4615-0907-3
[9] Kim K., Financial time series forecasting using support vector machines, Neurocomputing, 2003, 55(1–2), 307–319 http://dx.doi.org/10.1016/S0925-2312(03)00372-2
[10] Lal T.N., Chapelle O., Schölkopf B., Combining a filter method with SVMs, In: Feature Extraction, Studies in Fuzziness and Soft Computing, 207, Springer, Berlin, 2006, 439–445 http://dx.doi.org/10.1007/978-3-540-35488-8_21
[11] Mota J.F.C., Xavier J.M.F., Aguiar P.M.Q., Püschel M., D-ADMM: A communication-efficient distributed algorithm for separable optimization, preprint available at http://arxiv.org/abs/1202.2805
[12] Noble W.S., Support vector machine application in computational biology, In: Kernel Methods in Computational Biology, MIT Press, 2004, 71–92
[13] Pedroso J.P., Murata N., Support vector machines with different norms: motivation, formulations and results, Pattern Recognition Lett., 2001, 22(12), 1263–1272 http://dx.doi.org/10.1016/S0167-8655(01)00071-X
[14] Rifkin R.M., Lippert R.A., Value regularization and Fenchel duality, J. Mach. Learn. Res., 2007, 8(3), 441–479
[15] Rockafellar R.T., Convex Analysis, Princeton Math. Ser., 28, Princeton University Press, Princeton, 1970
[16] Ruschhaupt M., Huber W., Poustka A., Mansmann U., A compendium to ensure computational reproducibility in high-dimensional classification tasks, Stat. Appl. Genet. Mol. Biol., 2004, 3, #37
[17] Shawe-Taylor J., Cristianini N., Kernel Methods for Pattern Analysis, Cambridge University Press, Cambridge, 2004 http://dx.doi.org/10.1017/CBO9780511809682
[18] Sra S., Nowozin S., Wright S.J., Optimization for Machine Learning, MIT Press, Cambridge-London, 2011
[19] Steinwart I., How to compare different loss functions and their risks, Constr. Approx., 2007, 26(2), 225–287 http://dx.doi.org/10.1007/s00365-006-0662-3
[20] Stoppiglia H., Dreyfus G., Dubois R., Oussar Y., Ranking a random feature for variable and feature selection, Journal of Machine Learning Research, 2003, 3, 1399–1414
[21] Suykens J.A.K., Vandewalle J., Least squares support vector machine classifiers, Neural Processing Letters, 1999, 9(3), 293–300 http://dx.doi.org/10.1023/A:1018628609742
[22] Tibshirani R., Regression shrinkage and selection via the lasso, J. Roy. Statist. Soc. Ser. B, 1996, 58(1), 267–288
[23] Tibshirani R., Saunders M., Rosset S., Zhu J., Knight K., Sparsity and smoothness via the fused lasso, J. R. Stat. Soc. Ser. B Stat. Methodol., 2005, 67(1), 91–108 http://dx.doi.org/10.1111/j.1467-9868.2005.00490.x
[24] Tikhonov A.N., Arsenin V.Ya., Solutions of Ill-posed Problems, Scripta Series in Mathematics, Winston & Sons, Washington DC, 1977
[25] Van Gestel T., Baesens B., Garcia J., Van Dijcke P., A support vector machine approach to credit scoring, Banken Financiewezen, 2003, 2, 73–82
[26] Vapnik V.N., The Nature of Statistical Learning Theory, Springer, New York, 1995 http://dx.doi.org/10.1007/978-1-4757-2440-0
[27] Vapnik V.N., Statistical Learning Theory, Adapt. Learn. Syst. Signal Process. Commun. Control, John Wiley & Sons, New York, 1998
[28] Varma S., Simon R., Bias in error estimation when using cross-validation for model selection, BMC Bioinformatics, 2006, 7, #91
[29] Wahba G., Spline Models for Observational Data, CBMS-NSF Regional Conf. Ser. in Appl. Math., 59, SIAM, Philadelphia, 1990 http://dx.doi.org/10.1137/1.9781611970128
[30] Xiang D.-H., Zhou D.-X., Classification with Gaussians and convex loss, J. Mach. Learn. Res., 2009, 10, 1447–1468
[31] Ying Y., Huang K., Campbell C., Sparse metric learning via smooth optimization, In: Advances in Neural Information Processing Systems, 22, NIPS Foundation, 2009, 2205–2213
[32] MATLAB 8.0, The MathWorks, 2012, Natick, Massachusetts

Identyfikatory

DOI

10.2478/s11533-013-0342-5

Identyfikator YADDA

bwmeta1.element.doi-10_2478_s11533-013-0342-5

Artykuł - szczegóły

Czasopismo

Open Mathematics

Tytuł artykułu

Employing different loss functions for the classification of images via supervised learning

Autorzy

Treść / Zawartość

Warianty tytułu

Języki publikacji

Abstrakty

Słowa kluczowe

Kategorie tematyczne

Wydawca

Czasopismo

Rocznik

Tom

Numer

Strony

Opis fizyczny

Daty

Twórcy

Bibliografia

Typ dokumentu

Bibliografia

Identyfikatory

Identyfikator YADDA