2015 | 3 | 1 |
Article title

A classification method for binary predictors combining similarity measures and mixture models

Abstract
In this paper, a new supervised classification method dedicated to binary predictors is proposed. Its originality lies in combining a model-based classification rule with similarity measures through the introduction of a new family of exponential kernels. Some links are established between existing similarity measures when applied to binary predictors. A new family of measures is also introduced to unify some of the existing literature. The performance of the new classification method is illustrated on two real datasets (verbal autopsy data and handwritten digit data) using 76 similarity measures.
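To make the ingredients named in the abstract concrete, the following minimal sketch computes two classical similarity measures for binary vectors (Jaccard and simple matching) from the usual 2x2 contingency counts, and then wraps a similarity in an exponential function. The function names, the bandwidth parameter `sigma`, and in particular the form `exp((s - 1) / sigma)` are illustrative assumptions; the abstract does not give the paper's actual kernel family, so this should not be read as the authors' definition.

```python
import math


def contingency(x, y):
    """Count agreements and disagreements between two binary vectors.

    Returns (a, b, c, d): a = both 1, b = only x is 1,
    c = only y is 1, d = both 0.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = len(x) - a - b - c
    return a, b, c, d


def jaccard(x, y):
    """Jaccard similarity a / (a + b + c); joint absences are ignored."""
    a, b, c, _ = contingency(x, y)
    # Convention chosen here: two all-zero vectors are fully similar.
    return a / (a + b + c) if (a + b + c) > 0 else 1.0


def simple_matching(x, y):
    """Simple matching similarity (a + d) / (a + b + c + d)."""
    a, b, c, d = contingency(x, y)
    return (a + d) / (a + b + c + d)


def exp_kernel(x, y, similarity, sigma=1.0):
    """Hypothetical exponential kernel built on a similarity in [0, 1].

    ASSUMPTION: the form exp((s - 1) / sigma) is for illustration only
    and is not taken from the paper.
    """
    return math.exp((similarity(x, y) - 1.0) / sigma)


x = [1, 1, 0, 0, 1]
y = [1, 0, 0, 1, 1]
print(jaccard(x, y))              # 2 / (2 + 1 + 1)
print(simple_matching(x, y))      # (2 + 1) / 5
print(exp_kernel(x, y, jaccard))  # exp(0.5 - 1)
```

Passing the similarity as a function argument keeps the kernel wrapper independent of the measure, which matches the abstract's setting where many similarity measures (76 in the reported experiments) are compared under one classification rule.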
Affiliations
  • Inria Grenoble Rhône-Alpes & LJK, France
  • LERSTAD-UGB, Saint-Louis, Sénégal
  • URMITE-IRD, Dakar, Sénégal