CONTENTS
Introduction
1. Divergence of probability measures
  1.1. Divergence of probability measures connected with two-class classification problems
  1.2. Concentration curve and its link with the Neyman-Pearson curve
  1.3. Divergence ordering $⪯_{NP}$
2. Link between divergence and inequality
  2.1. Initial inequality axioms
  2.2. The Lorenz curve for nonnegative random variables
  2.3. Inequality ordering $⪯_{L}$
  2.4. Inequality versus divergence
  2.5. Ratio variables
3. Link between divergence and dependence
  3.1. Preliminary remarks
  3.2. Dependence ordering $⪯_{D}$
  3.3. Orderings related to $⪯_{D}$
4. Link between divergence and proportional representation
  4.1. Formulation of the problem and definition of the ordering $⪯_{x}$
  4.2. Minimal elements for $⪯_{x}$
  4.3. Maximal elements for $⪯_{x}$
5. Directed concentration of probability measures
  5.1. Directed concentration curve
  5.2. Grade transformation of a random variable
  5.3. Correlation and ratio curves
  5.4. Directed departure from proportionality
6. Numerical measures relating to divergence
  6.1. Numerical inequality measures
  6.2. Numerical measures of divergence
  6.3. Numerical measures of directed divergence
  6.4. Numerical measures of dependence
  6.5. Numerical measures of departures from proportional representation
References
Index of symbols
Institute of Computer Science, Polish Academy of Sciences, P.O. Box 22, J. Ordona 21, 01-237 Warszawa, Poland
References
S. M. Ali and S. D. Silvey (1965), Association between random variables and the dispersion of a Radon-Nikodym derivative, J. Roy. Statist. Soc. Ser. B, 27, 100-107.
S. M. Ali and S. D. Silvey (1965), A further result on the relevance of the dispersion of a Radon-Nikodym derivative to the problem of measuring association, ibid., 108-110.
S. M. Ali and S. D. Silvey (1966), A general class of coefficients of divergence of one distribution from another, ibid., 28, 131-142.
B. C. Arnold (1987), Majorization and the Lorenz Order: a Brief Introduction, Lecture Notes in Statist. 43, Springer.
M. Baliński and H. P. Young (1982), Fair Representation, Yale Univ. Press, New Haven.
R. C. Blitz and J. A. Brittain (1964), An extension of the Lorenz diagram to the correlation of two variables, Metron 23, 137-143.
H. Block, A. Sampson and T. Savits (eds.) (1990), Topics in Statistical Dependence, IMS Lecture Notes Monograph Ser., Inst. Math. Statist., Hayward.
Z. Bondarczuk, T. Kowalczyk, E. Pleszczyńska and W. Szczesny (1994), Evaluating departures from fair representation, Appl. Stochastic Models Data Anal., to appear.
T. Bromek, T. Kowalczyk and E. Pleszczyńska (1988), Measurement scales in evaluation of stochastic dependence, in: S. Das Gupta and J. K. Ghosh (eds.), Proc. Internat. Conf. on Advances in Multivariate Statistical Analysis, Indian Statistical Institute, Calcutta, 83-96.
T. Bromek and T. Kowalczyk (1990), A decision approach to ordering stochastic dependence, in: A. Sampson (ed.), Topics in Statistical Dependence, IMS Lecture Notes Monograph Ser., Inst. Math. Statist., Hayward, 103-109.
M. Chandra and N. D. Singpurwalla (1981), Relationship between some notions which are common to reliability and economics, Math. Oper. Res. 6, 113-121.
D. M. Cifarelli and E. Regazzini (1987), On a general definition of concentration function, Sankhyā Ser. B 49, 307-319.
A. Ciok, T. Kowalczyk, E. Pleszczyńska and W. Szczesny (1994), Inequality measures in data analysis, Archiwum Informatyki Teoretycznej i Stosowanej, to appear.
A. Ciok, T. Kowalczyk and W. Szczesny (1992), Comparing methods of fair representation, IPI PAN, preprint, 718.
O. D. Duncan and B. Duncan (1955), A methodological analysis of segregation indexes, Amer. Sociological Rev. 20, 210-217.
J. Fellman (1976), The effect of transformations on Lorenz curves, Econometrica 44 (4), 823-824.
G. S. Fields and J. C. H. Fei (1978), On inequality comparisons, Econometrica 46, 303-316.
S. Fogelson (1933), Miary koncentracji i ich zastosowania [Measures of concentration and their applications], Kwart. Statyst. 10(1), 149-197.
J. E. Foster (1985), Inequality measurement, in: Proc. Sympos. Appl. Math. 33, 31-68.
V. Gafrikova and T. Kowalczyk (1994), Links between measuring divergence and inequality, Metron, to appear.
D. M. Grove (1980), A test of independence against a class of ordered alternatives in a 2 × C contingency table, J. Amer. Statist. Assoc. 75, 454-459.
H. Joe (1985), An ordering of dependence for contingency tables, Linear Algebra Appl. 70, 89-103.
H. Joe (1987), Majorization, randomness and dependence for multivariate distributions, Ann. Probab. 15, 1217-1225.
H. Joe (1990), Majorization and divergence, J. Math. Anal. Appl. 148, 287-305.
B. Klefsjö (1984), Reliability interpretations of some concepts from economics, Naval Res. Logist. 31, 301-308.
T. Kowalczyk (1977), General definition and sample counterparts of monotonic dependence functions of bivariate distributions, Math. Oper. Statist. Ser. Statist. 8, 351-365.
T. Kowalczyk (1990), On measuring heterogeneity in m × k contingency tables, in: Proc. DIANA III, Conference of Discriminant Analysis and Other Methods of Data Classification, Bechyne, 111-121.
T. Kowalczyk and J. Mielniczuk (1990), Neyman-Pearson curves, properties and estimation, preprint 683, IPI PAN.
T. Kowalczyk and E. Pleszczyńska (1977), Monotonic dependence functions of bivariate distributions, Ann. Statist. 5, 1221-1227.
T. Kowalczyk, E. Pleszczyńska and W. Szczesny (1991), Evaluation of stochastic dependence, in: Statistical Inference: Theory and Practice, Theory Decis. Lib. Ser. B: Math. Statist. Methods 17, Reidel, 106-136.
E. L. Lehmann (1959), Testing Statistical Hypotheses, Wiley, New York.
E. L. Lehmann (1966), Some concepts of dependence, Ann. Math. Statist. 37, 1137-1153.
R. Lerman and S. Yitzhaki (1984), A note on the calculation and interpretation of the Gini index, Econom. Lett. 15, 363-368.
C. R. Rao (1982), Diversity and dissimilarity coefficients: a unified approach, Theoret. Population Biol. 21, 24-43.
A. Raveh (1989), Gini correlation as a measure of monotonicity and two of its usages, Comm. Statist. Theory Methods 18 (4), 1415-1423.
E. Regazzini (1990), Concentration comparisons between probability measures, Istituto per le Applicazioni della Matematica e dell'Informatica, preprint 90.15, Milano.
M. Scarsini (1990), An ordering of dependence, in: A. Sampson (ed.), Topics in Statistical Dependence, IMS Lecture Notes Monograph Ser., Inst. Math. Statist., Hayward, 403-414.
E. Schechtman and S. Yitzhaki (1987), A measure of association based on Gini's mean difference, Comm. Statist. Theory Methods 16 (1), 207-231.
W. Szczesny (1991), On the performance of a discriminant function, J. Classification 8, 201-215.
T. Taguchi (1987), On the structure of multivariate concentration - some relationships among the concentration surface and two variate mean difference and regressions, Comput. Statist. Data Anal. 6, 307-334.
M. J. White (1986), Segregation and diversity measures in population distribution, Population Index 52, 198-221.