Christian Buchta

A maximal vector of a set is one which is not less than any other vector in all components. We derive a recurrence relation for computing the average number of maximal vectors in a set of n vectors in d-space under the assumption that all $(n!)^d$ relative orderings are equally probable. Solving the recurrence shows that the average number of maxima is O((ln n)(More)
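The definition above can be illustrated with a small Monte Carlo sketch. It does not implement the recurrence from the abstract; it only estimates the average number of maxima empirically. The function names (`dominates`, `maximal_vectors`, `average_maxima`) are hypothetical, and the sketch assumes that "all $(n!)^d$ relative orderings are equally probable" can be modeled by drawing an independent uniform random permutation for each of the d coordinates.

```python
import random


def dominates(u, v):
    """True if u is at least v in every component and differs from v."""
    return u != v and all(a >= b for a, b in zip(u, v))


def maximal_vectors(vectors):
    """Return the vectors that no other vector dominates (quadratic scan)."""
    return [v for v in vectors if not any(dominates(u, v) for u in vectors)]


def average_maxima(n, d, trials=200, seed=0):
    """Estimate the mean number of maximal vectors among n vectors in d-space.

    Assumption: each coordinate is an independent uniform random permutation
    of 0..n-1, so every combination of relative orderings is equally likely.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        columns = [rng.sample(range(n), n) for _ in range(d)]
        vectors = list(zip(*columns))  # row i holds the d coordinates of vector i
        total += len(maximal_vectors(vectors))
    return total / trials
```

For d = 1 the estimate is exactly 1, since a single ordering has one largest element; for larger d the estimate can be compared against the growth rate stated in the abstract.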
Write $\mathcal{K}^d$ for the set of all convex bodies (convex compact sets with nonempty interior) in $\mathbb{R}^d$. Define $\mathcal{K}_1^d$ as the set of those $K \in \mathcal{K}^d$ with $\operatorname{vol} K = 1$. Fix $K \in \mathcal{K}_1^d$ and choose points $x_1, \ldots, x_n \in K$ randomly, independently, and according to the uniform distribution on $K$. Then $K_n = \operatorname{conv}\{x_1, \ldots, x_n\}$ is a random polytope in $K$.(More)
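The construction of the random polytope can be sketched in the plane. As an illustrative assumption, K is taken to be the unit square (a convex body of area 1, so d = 2), and the convex hull is computed with Andrew's monotone chain algorithm; the names `convex_hull` and `random_polytope` are hypothetical and not taken from the paper.

```python
import random


def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Vertices of the convex hull in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def random_polytope(n, seed=None):
    """Hull of n independent uniform points in the unit square (assumed K)."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return convex_hull(points)
```

Calling `len(random_polytope(n))` for increasing n gives a rough feel for how slowly the number of vertices of $K_n$ grows.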
Seriation, i.e., finding a linear order for a set of objects given data and a loss or merit function, is a basic problem in data analysis. Caused by the problem’s combinatorial nature, it is hard to solve for all but very small sets. Nevertheless, both exact solution methods and heuristics are available. In this paper we present the package seriation which(More)
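To make the combinatorial nature of the problem concrete, here is a minimal brute-force sketch. It is not the API of the seriation package. It assumes the data are given as a symmetric dissimilarity matrix and uses the length of the path through the objects as the loss, which is only one of many possible loss functions; `path_length` and `seriate_brute_force` are hypothetical names.

```python
from itertools import permutations


def path_length(order, dist):
    """Sum of dissimilarities between neighbours in the given linear order."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))


def seriate_brute_force(dist):
    """Try all n! orders and return one with minimal path length.

    Feasible only for very small n, which is exactly the limitation
    the abstract points out.
    """
    n = len(dist)
    return min(permutations(range(n)), key=lambda order: path_length(order, dist))
```

Because the search visits all n! permutations, the running time becomes impractical beyond roughly ten objects, which is why heuristics are needed for larger sets.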
Mining frequent itemsets and association rules is a popular and well researched approach for discovering interesting relationships between variables in large databases. The R package arules presented in this paper provides a basic infrastructure for creating and manipulating input data sets and for analyzing the resulting itemsets and rules. The package(More)
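The two basic quantities behind frequent itemsets and association rules, support and confidence, can be sketched in a few lines. This is not the arules interface (arules is an R package); it is an illustrative exhaustive enumeration in Python, with transactions assumed to be given as a list of `frozenset` objects and all function names hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=3):
    """Enumerate all itemsets up to max_size whose support reaches min_support.

    Exhaustive enumeration for illustration only; real miners prune the
    search space instead of testing every candidate.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, max_size + 1):
        for candidate in combinations(items, k):
            s = support(candidate, transactions)
            if s >= min_support:
                result[frozenset(candidate)] = s
    return result


def confidence(lhs, rhs, transactions):
    """Confidence of the rule lhs -> rhs: support(lhs and rhs) / support(lhs)."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)
```

A rule is usually reported only if both its support and its confidence exceed user-chosen thresholds.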
This paper describes the ecosystem of R add-on packages developed around the infrastructure provided by the package arules. The packages provide comprehensive functionality for analyzing interesting patterns including frequent itemsets, association rules, frequent sequences and for building applications like associative classification. After discussing the(More)
We introduce a generic simulation framework suitable for agent-based simulations featuring the support of heterogeneous agents, hierarchical scheduling and flexible specification of design parameters. One key aspect of this framework is the design specification: we use an XML-based format which is simple-structured yet still enables the design of flexible(More)
Identifying the language used will typically be the first step in most natural language processing tasks. Among the wide variety of language identification methods discussed in the literature, the ones employing the Cavnar and Trenkle (1994) approach to text categorization based on character n-gram frequencies have been particularly successful. This paper(More)
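The character n-gram approach mentioned above can be sketched as follows. This is a simplified reading of the Cavnar and Trenkle idea, not the implementation described in the paper: each text is reduced to a ranked list of its most frequent character n-grams, and a document is assigned to the language whose profile has the smallest "out-of-place" rank distance. The function names, the underscore padding, and the defaults `max_n=5` and `top=300` are assumptions chosen for illustration.

```python
from collections import Counter


def ngram_profile(text, max_n=5, top=300):
    """Map the `top` most frequent character n-grams (n = 1..max_n) to ranks."""
    counts = Counter()
    for word in text.lower().split():
        padded = "_" + word + "_"  # mark word boundaries
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending frequency; break ties alphabetically for determinism.
    ranked = sorted(counts, key=lambda g: (-counts[g], g))
    return {gram: rank for rank, gram in enumerate(ranked[:top])}


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from lang_profile get a fixed penalty."""
    max_penalty = len(lang_profile)
    return sum(
        abs(rank - lang_profile[gram]) if gram in lang_profile else max_penalty
        for gram, rank in doc_profile.items()
    )


def identify(text, lang_profiles):
    """Return the key of the language profile closest to the text's profile."""
    doc = ngram_profile(text)
    return min(lang_profiles, key=lambda lang: out_of_place(doc, lang_profiles[lang]))
```

In practice the language profiles would be built from sizeable training corpora; with only a sentence or two per language the ranking is too noisy to be reliable.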