In this article we discuss a simple hash function based upon properties of a well-known combinatorial design called quasigroups. Quasigroups are equivalent to the more familiar Latin squares, and one of their most important properties is that all possible elements of a given quasigroup occur with equal probability. Actual implementations are based on …
Large sparse matrices play an important role in many modern information retrieval methods. These methods, such as clustering and latent semantic indexing, perform a huge number of computations with such matrices, so their implementation should be designed very carefully. In this paper we discuss three implementations of sparse matrices. The first one is …
In this article we present a new compression method, called WLZW, which is a word-based modification of classic LZW. The modification is similar to the approach used in the HuffWord compression algorithm. The algorithm is two-phase, and the compression ratio achieved is fairly good, on average 22%-20% (see [2, 3]). Moreover, the table of words, which is side …
In this study we discuss a method for the evolution of quasigroups with desired properties based on genetic algorithms. Quasigroups are a well-known combinatorial design equivalent to the more familiar Latin squares. One of their most important properties is that all possible elements of a given quasigroup occur with equal probability. The quasigroups are …
Quasigroups are a well-known combinatorial design equivalent to the more familiar Latin squares. Since all possible elements of a quasigroup occur with equal probability, quasigroups could be an interesting tool for information assurance and network security. Most previous implementations of quasigroups were based on a look-up table of the quasigroup, on a system of …
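The equivalence between quasigroups and Latin squares mentioned in these abstracts can be illustrated with a minimal sketch. It assumes the simplest possible example, the cyclic quasigroup defined by addition modulo n, and is not the construction used in the papers above; the function names `cyclic_quasigroup` and `is_latin_square` are hypothetical.

```python
from typing import List


def cyclic_quasigroup(n: int) -> List[List[int]]:
    """Build the look-up table of the operation a * b = (a + b) mod n.

    This is an assumed toy example, chosen only because its table is
    easy to verify by hand.
    """
    return [[(a + b) % n for b in range(n)] for a in range(n)]


def is_latin_square(table: List[List[int]]) -> bool:
    """Check that every symbol 0..n-1 occurs exactly once in each row and column."""
    n = len(table)
    symbols = set(range(n))
    # Each row must have length n and contain every symbol.
    if not all(len(row) == n and set(row) == symbols for row in table):
        return False
    # Each column must contain every symbol as well.
    return all({table[r][c] for r in range(n)} == symbols for c in range(n))
```

Because each symbol appears exactly once per row and per column, every element occurs equally often in the table, which is the property the abstracts refer to.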
The paper addresses the problem of clustering large, high-dimensional datasets. We propose a two-phase combined method that takes the high dimensionality into account and exploits a standard clustering algorithm. The first step of the method is a learning phase using an artificial neural network, specifically a self-organizing map, which we find as a …
This paper describes an approach based on clustering. The basic version of our system, given in [5], allows us to expand a query through a special index. Hierarchical agglomerative clustering of the whole document collection generates the index. Retrieving topic development is a specific problem. Standard methods of IR do not allow us such kind of …
Today there are many universal compression algorithms, but in most cases it is better to use a specific algorithm for specific data: JPEG for images, MPEG for movies, etc. For textual documents there are special methods based on the PPM algorithm or methods with non-character access, e.g. word-based compression. In the past, several papers describing variants of …
N-grams are used in some applications that search text documents, especially in cases when one must work with phrases, e.g. in plagiarism detection. An n-gram is a sequence of n terms (or, more generally, tokens) from a document. We get a set of n-grams by moving a sliding window from the beginning to the end of the document. During the extraction we must remove …
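The sliding-window extraction described in this abstract can be sketched as follows. This is a minimal illustration that assumes whitespace tokenization and does no filtering; the function name `word_ngrams` is hypothetical, and the removal step mentioned in the truncated abstract is not modelled here.

```python
from typing import List, Tuple


def word_ngrams(text: str, n: int) -> List[Tuple[str, ...]]:
    """Return all n-grams of whitespace-separated tokens, in document order."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    tokens = text.split()  # assumed tokenizer: split on whitespace
    # Slide a window of width n from the first token to the last full window.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```

A document with fewer than n tokens yields an empty list, since no full window fits.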