Maxim N. Grinev

The measure of similarity between objects is a very useful tool in many areas of computer science, including information retrieval. SimRank is a simple and intuitive measure of this kind, based on a graph-theoretic model. SimRank is typically computed iteratively, in the spirit of PageRank. However, existing work on SimRank lacks accuracy estimation of ...
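To illustrate the iterative computation the SimRank abstract above refers to, here is a minimal sketch of the commonly cited SimRank recurrence: a node is fully similar to itself, and two distinct nodes are similar to the extent that their in-neighbors are similar. The function name `simrank`, the `in_neighbors` input format, the decay factor of 0.8, and the fixed iteration count are illustrative assumptions. The sketch does not implement the accuracy estimation that the abstract mentions.

```python
from itertools import product


def simrank(in_neighbors, decay=0.8, iterations=10):
    """Naive iterative SimRank (illustrative sketch, not an optimized version).

    in_neighbors: dict mapping each node to a list of its in-neighbors.
                  Assumption: every in-neighbor also appears as a key.
    decay:        decay factor C in (0, 1); 0.8 is an assumed default.
    iterations:   number of fixed-point iterations to run (assumed default).
    """
    nodes = list(in_neighbors)
    # Iteration 0: a node is similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}

    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # Nodes without in-neighbors have zero similarity to others.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors, scaled by C.
            total = sum(sim[(i, j)] for i in ia for j in ib)
            new_sim[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


# Tiny example graph: node "u" points to both "a" and "b".
scores = simrank({"u": [], "a": ["u"], "b": ["u"]})
print(scores[("a", "b")])
```

Each iteration of this naive version visits every node pair and every pair of their in-neighbors, so it is only practical for small graphs.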
We present a novel method for key term extraction from text documents. In our method, a document is modeled as a graph of semantic relationships between terms of that document. We exploit the following remarkable feature of the graph: the terms related to the main topics of the document tend to bunch up into densely interconnected subgraphs or communities, ...
Sedna is an XML database system being developed by the MODIS team at the Institute for System Programming of the Russian Academy of Sciences. Sedna implements XQuery and its data model exploiting techniques developed specially for this language. This paper describes the main choices made in the design of Sedna, sketches its most advanced techniques, and ...
Micro-blogging is a new form of social communication that encourages users to share information about anything they are seeing or doing, the motivation facilitated by the ability to post brief text messages through a variety of devices. Twitter, the most popular micro-blogging tool, is exhibiting rapid growth [3]: up to 11% of online Americans are using ...
The modern XML query language called XQuery includes advanced facilities both to query and to transform XML data. An XQuery query optimizer should be able to optimize any query. For “querying” queries almost all techniques inherited from SQL-oriented DBMS may be applied. The XQuery transformation facilities are XML-specific and have no counterparts in other ...
We discuss the problem of information overload in social media streams. We identify two groups of approaches to solve the problem. The first group is based on filtering social media streams. These methods are already quite mature and successfully used in practice. The second group of approaches proposes completely different paradigms for information sharing ...
We present a native XML database management system, Sedna, which is implemented from scratch as a full-featured database management system for storing large amounts of XML data. We believe that the key contribution of this system is an improved schema-based clustering storage strategy efficient for both XML querying and updating, and powered by a novel ...
Today it is widely recognized that optimization based on rewriting leads to faster query execution. The role of query rewriting grows significantly when a query defined in terms of some view is processed. Using views is a good idea for building flexible virtual data integration systems with declarative query support. At present such systems tend to ...
With the emergence of mobile devices constantly connected to the Internet, the nature of user-generated data has changed on most Web 2.0 sites. Today, people produce and share data more often and the lifespan of the data is shorter. Analyzing this data leads to new requirements for analytical systems: real-time processing and database-intensive workloads. ...