In this paper we introduce a method of classification based on data probes. Data points are considered as point masses in space and a probe is simply a particle that is launched into the space. As the probe passes by data clusters, its trajectory will be influenced by the point masses. We use this information to help us to find vertical trajectories. These are trajectories in the input space that are mapped onto the same value in the output space and correspond to the data classes.
This paper presents an application of methods from the machine learning domain to solving the task of DNA sequence recognition. We present an algorithm that learns to recognize groups of DNA sequences sharing common features such as sequence functionality. We demonstrate application of the algorithm to find splice sites, i.e., to properly detect donor and acceptor sequences. We compare the results with those of reference methods that have been designed and tuned to detect splice sites. We also show how to use the algorithm to find a human readable model of the IRE (Iron-Responsive Element) and to find IRE sequences. The method, although universal, yields results which are of quality comparable to those obtained by reference methods. In contrast to reference methods, this approach uses models that operate on sequence patterns, which facilitates interpretation of the results by humans.
The paper presents an application of rough sets and statistical methods to feature reduction and pattern recognition. The presented description of rough sets theory emphasizes the role of rough sets reducts in feature selection and data reduction in pattern recognition. The overview of methods of feature selection emphasizes feature selection criteria, including rough set-based methods. The paper also contains a description of the algorithm for feature selection and reduction based on the rough sets method proposed jointly with Principal Component Analysis. Finally, the paper presents numerical results of face recognition experiments using the learning vector quantization neural network, with feature selection based on the proposed principal components analysis and rough sets methods.
The paper is focused on the problem of multi-class classification of composite (piecewise-regular) objects (e.g., speech signals, complex images, etc.). We propose a mathematical model of composite object representation as a sequence of independent segments. Each segment is represented as a random sample of independent identically distributed feature vectors. Based on this model and a statistical approach, we reduce the task to a problem of composite hypothesis testing of segment homogeneity. Several nearest-neighbor criteria are implemented, and for some of them the well-known special cases (e.g., the Kullback-Leibler minimum information discrimination principle, the probabilistic neural network) are highlighted. It is experimentally shown that the proposed approach improves the accuracy when compared with contemporary classifiers.
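As a rough illustration of the Kullback-Leibler special case mentioned above, the sketch below assigns a segment to the class whose reference distribution has the smallest KL divergence from the segment's own empirical distribution. It is a simplification under stated assumptions, not the paper's method: each segment carries a single scalar feature, distributions are estimated with fixed-width histograms, and the names `histogram`, `kl_divergence` and `classify` are hypothetical.

```python
import math

def histogram(samples, n_bins, lo, hi, eps=1e-6):
    """Smoothed, normalized histogram of scalar samples on [lo, hi)."""
    counts = [0] * n_bins
    for s in samples:
        b = min(n_bins - 1, max(0, int((s - lo) / (hi - lo) * n_bins)))
        counts[b] += 1
    # eps keeps every bin strictly positive so the logarithm below is defined
    total = len(samples) + eps * n_bins
    return [(c + eps) / total for c in counts]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def classify(segment_hist, class_hists):
    """Return the class label with minimum KL divergence from the segment."""
    return min(class_hists,
               key=lambda name: kl_divergence(segment_hist, class_hists[name]))

if __name__ == "__main__":
    models = {
        "low": histogram([0.05, 0.1, 0.2, 0.25, 0.3], 4, 0.0, 1.0),
        "high": histogram([0.7, 0.8, 0.9, 0.95], 4, 0.0, 1.0),
    }
    segment = histogram([0.1, 0.2, 0.15, 0.3], 4, 0.0, 1.0)
    print(classify(segment, models))
```

A composite object would then be handled segment by segment, with the per-segment divergences combined into one decision; how they are combined is left open here.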
We give a special normal form for a non-semiquadratic hyperbolic CR-manifold M of codimension 2 in ℂ⁴, i.e., a construction of coordinates where the equation of M satisfies certain conditions. The coordinates are determined up to a linear coordinate change.
The feature selection problem often occurs in pattern recognition and, more specifically, in classification. Although patterns may contain a large number of features, some of them can prove irrelevant, redundant or even detrimental to classification accuracy. It is therefore important to remove such features, which reduces the dimensionality of the problem and may improve classification accuracy. This paper presents an approach to dimensionality reduction in which differential evolution acts as a wrapper that explores the solution space. The solutions, which are subsets of the whole feature set, are evaluated using the k-nearest neighbour algorithm. High-quality solutions found during the execution of differential evolution fill an archive. A final solution is obtained by conducting k-fold cross-validation on the archive solutions and selecting the best one. Experimental analysis is conducted on several standard test sets. The classification accuracy of the k-nearest neighbour algorithm using the full feature set is compared with its accuracy using only the subset provided by the proposed approach and with the subsets provided by other optimization algorithms used as wrappers. The analysis shows that the proposed approach successfully determines good feature subsets which may increase classification accuracy.
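The wrapper idea can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the classic DE/rand/1/bin scheme on real vectors thresholded at 0.5 to obtain feature masks, scores each mask by leave-one-out k-NN accuracy, and omits the archive and the final k-fold cross-validation step. The function names, the synthetic data and all parameter values are hypothetical.

```python
import random

def knn_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of k-NN restricted to features where mask is True."""
    idx = [j for j, m in enumerate(mask) if m]
    if not idx:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        dists = sorted(
            (sum((X[i][j] - X[n][j]) ** 2 for j in idx), y[n])
            for n in range(len(X)) if n != i
        )
        votes = [label for _, label in dists[:k]]
        if max(set(votes), key=votes.count) == y[i]:
            correct += 1
    return correct / len(X)

def to_mask(vec):
    """Decode a real-valued DE vector into a boolean feature mask."""
    return [v > 0.5 for v in vec]

def de_feature_selection(X, y, pop_size=10, generations=15, F=0.5, CR=0.9, seed=0):
    rng = random.Random(seed)
    d = len(X[0])
    pop = [[rng.random() for _ in range(d)] for _ in range(pop_size)]
    fit = [knn_accuracy(X, y, to_mask(p)) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # DE/rand/1 mutation from three distinct individuals other than i
            a, b, c = rng.sample([p for n, p in enumerate(pop) if n != i], 3)
            j_rand = rng.randrange(d)  # guarantees at least one mutated gene
            trial = [
                min(1.0, max(0.0, a[j] + F * (b[j] - c[j])))
                if (rng.random() < CR or j == j_rand) else pop[i][j]
                for j in range(d)
            ]
            f = knn_accuracy(X, y, to_mask(trial))
            if f >= fit[i]:  # greedy one-to-one replacement
                pop[i], fit[i] = trial, f
    best = max(range(pop_size), key=lambda n: fit[n])
    return to_mask(pop[best]), fit[best]

def make_data(n=40, seed=1):
    """Synthetic two-class data: 2 informative features followed by 4 noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        informative = [label * 2.0 + rng.gauss(0, 0.5) for _ in range(2)]
        noise = [rng.gauss(0, 2.0) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_data()
    mask, acc = de_feature_selection(X, y)
    print("selected features:", mask, "LOO accuracy:", acc)
    print("full-set LOO accuracy:", knn_accuracy(X, y, [True] * len(X[0])))
```

Because the fitness is measured on the same data that guides the search, the reported accuracy is optimistic; the cross-validation on archived solutions described in the abstract addresses exactly that risk.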
We classify the maximal irreducible periodic subgroups of $\mathrm{PGL}(q, \mathbb{F})$, where $\mathbb{F}$ is a field of positive characteristic $p$ transcendental over its prime subfield, $q \neq p$ is prime, and $\mathbb{F}^\times$ has an element of order $q$. That is, we construct a list of irreducible subgroups $G$ of $\mathrm{GL}(q, \mathbb{F})$ containing the centre $\mathbb{F}^\times 1_q$ of $\mathrm{GL}(q, \mathbb{F})$, such that $G/\mathbb{F}^\times 1_q$ is a maximal periodic subgroup of $\mathrm{PGL}(q, \mathbb{F})$, and if $H$ is another group of this kind then $H$ is $\mathrm{GL}(q, \mathbb{F})$-conjugate to a group in the list. We give criteria for determining when two listed groups are conjugate, and show that a maximal irreducible periodic subgroup of $\mathrm{PGL}(q, \mathbb{F})$ is self-normalising.
In this paper we propose a strategy learning model for autonomous agents based on classification. In the literature, the most commonly used learning method in agent-based systems is reinforcement learning. In our opinion, classification can be considered a good alternative. This type of supervised learning can be used to generate a classifier that allows the agent to choose an appropriate action for execution. Experimental results show that this model can be successfully applied for strategy generation even if rewards are delayed. We compare the efficiency of the proposed model and reinforcement learning using the farmer-pest domain and configurations of various complexity. In complex environments, supervised learning can improve the performance of agents much faster than reinforcement learning. If an appropriate knowledge representation is used, the learned knowledge may be analyzed by humans, which allows tracking the learning process.
The paper presents a new approach to fuzzy classification in the case of missing data. Rough-fuzzy sets are incorporated into logical-type neuro-fuzzy structures and a rough-neuro-fuzzy classifier is derived. Theorems which allow determining the structure of the rough-neuro-fuzzy classifier are given. Several experiments illustrating the performance of the rough-neuro-fuzzy classifier working in the case of missing features are described.
We introduce a new n-ary λ similarity classifier that is based on a new n-ary λ-averaging operator in the aggregation of similarities. This work is a natural extension of earlier research on similarity based classification in which aggregation is commonly performed by using the OWA-operator. So far λ-averaging has been used only in binary aggregation. Here the λ-averaging operator is extended to the n-ary aggregation case by using t-norms and t-conorms. We examine four different n-ary norms and test the new similarity classifier with five medical data sets. The new method seems to perform well when compared with the similarity classifier.
DNA microarrays provide a new technique of measuring gene expression, which has attracted a lot of research interest in recent years. It was suggested that gene expression data from microarrays (biochips) can be employed in many biomedical areas, e.g., in cancer classification. Although several methods of classification, both new and existing, have been tested, the selection of a proper (optimal) set of genes whose expression levels can serve for classification is still an open problem. Recently we have proposed a new recursive feature replacement (RFR) algorithm for choosing a suboptimal set of genes. The algorithm uses the support vector machines (SVM) technique. In this paper we use the RFR method for finding suboptimal gene subsets for tumor/normal colon tissue classification. The obtained results are compared with the results of applying other methods recently proposed in the literature. The comparison shows that the RFR method is able to find the smallest gene subset (only six genes) that gives no misclassifications in leave-one-out cross-validation for a tumor/normal colon data set. In this sense the RFR algorithm outperforms all other investigated methods.
Evolutionary computation is a discipline that has been emerging for at least 40 or 50 years. All methods within this discipline are characterized by maintaining a set of possible solutions (individuals) and making them evolve successively towards fitter solutions, generation after generation. Examples of evolutionary computation paradigms are the broadly known Genetic Algorithms (GAs) and Estimation of Distribution Algorithms (EDAs). This paper contributes to the further development of this discipline by introducing a new evolutionary computation method based on the learning and later simulation of a Bayesian classifier in every generation. In the method we propose, at each iteration the selected group of individuals of the population is divided into different classes depending on their respective fitness values. Afterwards, a Bayesian classifier (either naive Bayes, seminaive Bayes, tree augmented naive Bayes or a similar one) is learned to model the corresponding supervised classification problem. The simulation of this Bayesian classifier provides the individuals that form the next generation. Experimental results are presented to compare the performance of this new method with different types of EDAs and GAs. The problems chosen for this purpose are combinatorial optimization problems which are commonly used in the literature.
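The generation loop described above can be sketched as follows. This is a toy illustration under stated assumptions rather than the paper's algorithm: the problem is OneMax on bit strings, the selected half of the population is split into just two fitness classes, the classifier is naive Bayes with Laplace smoothing, new individuals are drawn by sampling a class from the learned prior and then each bit from its class-conditional probability, and one elite individual is carried over. All names and parameter values are hypothetical.

```python
import random

def onemax(bits):
    """Toy fitness: number of ones in the bit string."""
    return sum(bits)

def learn_naive_bayes(groups):
    """Estimate class priors and P(bit_j = 1 | class) from classes of bit strings."""
    total = sum(len(g) for g in groups)
    priors = [len(g) / total for g in groups]
    n_bits = len(groups[0][0])
    cond = [
        # Laplace smoothing keeps every probability strictly inside (0, 1)
        [(sum(ind[j] for ind in g) + 1) / (len(g) + 2) for j in range(n_bits)]
        for g in groups
    ]
    return priors, cond

def sample(priors, cond, rng):
    """Simulate the classifier: draw a class, then each bit given that class."""
    c = rng.choices(range(len(priors)), weights=priors)[0]
    return [1 if rng.random() < p else 0 for p in cond[c]]

def evolve(n_bits=20, pop_size=60, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=onemax)
    history = [onemax(best)]
    for _ in range(generations):
        ranked = sorted(pop, key=onemax, reverse=True)
        selected = ranked[: pop_size // 2]            # truncation selection
        half = len(selected) // 2
        groups = [selected[:half], selected[half:]]   # two fitness classes
        priors, cond = learn_naive_bayes(groups)
        # New generation: sampled individuals plus the elite from before
        pop = [sample(priors, cond, rng) for _ in range(pop_size - 1)] + [best]
        best = max(pop, key=onemax)
        history.append(onemax(best))
    return best, history

if __name__ == "__main__":
    best, history = evolve()
    print("best fitness per generation:", history)
```

Swapping `learn_naive_bayes` and `sample` for a richer model, such as a tree augmented naive Bayes, would change only those two functions; the surrounding loop stays the same.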