A complete algorithm is presented for the sharpening of imprecise information, based on the methodology of kernel estimators and the Bayes decision rule, including conditioning factors. The use of the Bayes rule with a nonsymmetrical loss function makes it possible to account for the different consequences of under- and overestimating a sharp value (a real number), as well as to minimize potential losses. The conditional approach yields a more precise result by using information entered as the assumed (e.g. current) values of conditioning factors of continuous and/or binary types. The nonparametric methodology of statistical kernel estimators frees the investigated procedure from arbitrary assumptions concerning the forms of the distributions characterizing both the imprecise information and the conditioning random variables. The concept presented here is universal and can be applied to a wide range of tasks in contemporary engineering, economics, and medicine.
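A well-known property underlying this kind of sharpening is that, under an asymmetric linear loss (cost a per unit of underestimation, cost b per unit of overestimation), the expected loss is minimized at the quantile of order a/(a+b) of the estimated distribution. A minimal one-dimensional sketch, assuming a Gaussian kernel, a fixed bandwidth h, and a bisection-based quantile search (the function names are illustrative, and the paper's full procedure additionally handles conditioning factors):

```python
import math

def kde_cdf(x, data, h):
    # Gaussian-kernel-smoothed empirical CDF at point x
    return sum(0.5 * (1 + math.erf((x - xi) / (h * math.sqrt(2))))
               for xi in data) / len(data)

def sharpen(data, loss_under, loss_over, h=0.5):
    """Sharp value minimizing expected asymmetric linear loss:
    the quantile of order loss_under / (loss_under + loss_over)."""
    target = loss_under / (loss_under + loss_over)
    lo, hi = min(data) - 5 * h, max(data) + 5 * h
    for _ in range(60):          # bisection on the smooth, monotone CDF
        mid = (lo + hi) / 2
        if kde_cdf(mid, data, h) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

With a symmetric loss the result is the median of the smoothed distribution; making underestimation costlier shifts the sharp value upward, as the abstract's loss-minimization argument suggests.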
The paper deals with the issue of reducing the dimension and size of a data set (random sample) for exploratory data analysis procedures. The algorithm investigated here is based on a linear transformation to a space of smaller dimension that preserves, as far as possible, the distances between particular elements. The elements of the transformation matrix are computed using the metaheuristic of parallel fast simulated annealing. Moreover, those data set elements whose location changes significantly relative to the others are eliminated or assigned reduced importance. The presented method can be applied universally in a wide range of data exploration problems, offering flexible customization, applicability in a dynamic data environment, and performance comparable to or better than principal component analysis. Its positive features were verified in detail for the domain's fundamental tasks of clustering, classification, and detection of atypical elements (outliers).
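The core idea, a distance-preserving linear map found by annealing, can be illustrated with a plain sequential sketch (the paper uses parallel fast simulated annealing and additionally down-weights displaced elements; the names, the raw-stress objective, and the cooling schedule below are illustrative assumptions):

```python
import math, random

def pairwise_dists(pts):
    n = len(pts)
    return [[math.dist(pts[i], pts[j]) for j in range(n)] for i in range(n)]

def project(A, pts):
    # apply the linear map A (k rows of length d) to each d-dimensional point
    return [[sum(a * x for a, x in zip(row, p)) for row in A] for p in pts]

def stress(A, pts, D):
    # sum of squared differences between original and projected distances
    P = project(A, pts)
    n = len(pts)
    return sum((math.dist(P[i], P[j]) - D[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def anneal_projection(pts, k, iters=2000, temp=1.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    d = len(pts[0])
    D = pairwise_dists(pts)
    A = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(k)]
    cur = best = stress(A, pts, D)
    best_A = [row[:] for row in A]
    for _ in range(iters):
        i, j = rng.randrange(k), rng.randrange(d)
        old = A[i][j]
        A[i][j] += rng.gauss(0, 0.1)          # perturb one matrix element
        new = stress(A, pts, D)
        if new < cur or rng.random() < math.exp((cur - new) / temp):
            cur = new                          # accept (Metropolis rule)
            if new < best:
                best, best_A = new, [row[:] for row in A]
        else:
            A[i][j] = old                      # reject, restore
        temp *= cooling
    return best_A, best
```

Running the annealing loop reduces the stress relative to the random initial matrix, i.e. the projected pairwise distances move closer to the original ones.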
The aim of this paper is to provide a gradient clustering algorithm in its complete form, suitable for direct use without requiring deeper statistical knowledge. The values of all parameters are calculated effectively using optimization procedures. Moreover, an illustrative analysis of the meaning of particular parameters is presented, followed by the effects of possible modifications with respect to their initially assigned optimal values. The proposed algorithm does not demand strict assumptions regarding the desired number of clusters, which allows the obtained number to be better suited to the real structure of the data. Moreover, a feature specific to the algorithm is the possibility of influencing the proportion between the number of clusters in areas where data elements are dense and in their sparse regions. Finally, by detecting one-element clusters, the algorithm identifies atypical elements, enabling their elimination or possible assignment to bigger clusters, thus increasing the homogeneity of the data set.
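The gradient idea can be illustrated in one dimension by mean-shift-style ascent on a kernel density estimate: each element climbs to a local mode, elements reaching the same mode form one cluster, and one-element clusters mark atypical elements. A minimal sketch under those assumptions (the names are hypothetical, and the bandwidth h is fixed here rather than calculated by the paper's optimizing procedures):

```python
import math

def mean_shift_step(x, data, h):
    # one gradient-ascent (mean-shift) step toward the nearest KDE mode
    w = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data]
    return sum(wi * xi for wi, xi in zip(w, data)) / sum(w)

def gradient_cluster(data, h=0.4, steps=100, merge_tol=0.01):
    modes, labels = [], []
    for x in data:
        for _ in range(steps):
            x_new = mean_shift_step(x, data, h)
            if abs(x_new - x) < 1e-8:
                break
            x = x_new
        for k, m in enumerate(modes):
            if abs(x - m) < merge_tol:   # same mode -> same cluster
                labels.append(k)
                break
        else:
            modes.append(x)              # a new mode, i.e. a new cluster
            labels.append(len(modes) - 1)
    # one-element clusters flag atypical elements
    outliers = [i for i, k in enumerate(labels) if labels.count(k) == 1]
    return labels, modes, outliers
```

On data with two dense groups and one isolated element, the isolated element climbs to its own mode and is reported as a one-element cluster, matching the outlier-detection behavior described above.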