The measure of similarity between objects is a very useful tool in many areas of computer science, including information retrieval. SimRank is a simple and intuitive measure of this kind, based on a graph-theoretic model. SimRank is typically computed iteratively, in the spirit of PageRank. However, existing work on SimRank lacks accuracy estimation of…
Sedna is an XML database system being developed by the MODIS team at the Institute for System Programming of the Russian Academy of Sciences. Sedna implements XQuery and its data model, exploiting techniques developed specifically for this language. This paper describes the main choices made in the design of Sedna, sketches its most advanced techniques, and…
We present a novel method for key term extraction from text documents. In our method, a document is modeled as a graph of semantic relationships between the terms of that document. We exploit the following remarkable feature of the graph: the terms related to the main topics of the document tend to bunch up into densely interconnected subgraphs or communities,…
The modern XML query language XQuery includes advanced facilities both to query and to transform XML data. An XQuery query optimizer should be able to optimize any query. For "querying" queries, almost all techniques inherited from SQL-oriented DBMSs may be applied. The XQuery transformation facilities are XML-specific and have no counterparts in…
Today it is widely recognized that optimization based on rewriting leads to faster query execution. The role of query rewriting grows significantly when a query defined in terms of some view is processed. Using views is a good idea for building flexible virtual data integration systems with declarative query support. At present, such systems tend to…
Micro-blogging is a new form of social communication that encourages users to share information about anything they are seeing or doing, a practice facilitated by the ability to post brief text messages through a variety of devices. Twitter, the most popular micro-blogging tool, is exhibiting rapid growth: up to 11% of online Americans are using…
With the emergence of mobile devices constantly connected to the Internet, the nature of user-generated data has changed on most Web 2.0 sites. Today, people produce and share data more often and the lifespan of the data is shorter. Analyzing this data leads to new requirements for analytical systems: real-time processing and database-intensive workloads.…
We demonstrate Blognoon, a semantic blog search engine with a focus on topic exploration and navigation. Blognoon provides concept search instead of traditional keyword search and improves ranking by identifying the main topics of posts. It enhances navigation over the Blogosphere with faceted interfaces and recommendations.