A Modified Tripartite Model for Document Representation in Internet Sociology

Mikhail Alexandrov, Vera Danilova, Xavier Blanco
Seven years ago Peter Mika (Yahoo! Research) proposed a tripartite model of actors, concepts and instances for document representation in the study of social networks. We propose a modified model, where instead of document authors we consider textual mentions of persons and institutions as actors. This representation proves to be more appropriate for the solution of a range of Internet Sociology tasks. In the paper we describe experiments with the modified model and provide some background on… 
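The representation described in the abstract can be illustrated with a minimal sketch. It assumes that mentions of persons and institutions have already been extracted from each document, and that concepts are simply keywords assigned to the document; the names `Annotation`, `build_triples`, `actor_concept_weights` and the sample data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One hyperedge of the tripartite structure (hypothetical naming)."""
    actor: str     # person or institution mentioned in the text (modified model)
    concept: str   # keyword or topic associated with the document
    instance: str  # document identifier


def build_triples(docs):
    """Build (actor, concept, instance) triples.

    docs maps a document id to a pair (mentioned_actors, concepts).
    Assumption: mentions and concepts are already extracted.
    """
    triples = set()
    for doc_id, (actors, concepts) in docs.items():
        for actor in actors:
            for concept in concepts:
                triples.add(Annotation(actor, concept, doc_id))
    return triples


def actor_concept_weights(triples):
    """Fold the triples into a two-mode actor-concept graph.

    The weight of a pair is the number of distinct documents
    in which the actor and the concept occur together.
    """
    return Counter((t.actor, t.concept) for t in triples)


# Invented sample data, for illustration only.
docs = {
    "d1": ({"Person A", "Institution X"}, {"protest"}),
    "d2": ({"Person A"}, {"protest", "election"}),
}
triples = build_triples(docs)
weights = actor_concept_weights(triples)
```

Here an actor is whoever is mentioned in a document rather than whoever wrote it, which is the only point where this sketch departs from an author-based tripartite representation.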
1 Citation
Courses Select Textbooks: Comparison of Two Methods
Two IR methods are shown: a spreading activation method (SAM), which uses a semantic network related to textbooks, and a coverage-based method (CBM), which uses a simple formal comparison of vocabularies. The methods rely on the criterion of term specificity for building the vocabulary of textbooks and on the normalized measure of network activation.


Socio-Political Event Extraction Using a Rule-Based Approach
The aim of the present work is to test a group of predefined patterns and rules to obtain sets of automatically filled scenario templates for socio-political events (case study: protests), and to apply clustering algorithms.
Evaluation of thematic structure of multidisciplinary documents and document flows
The technology was implemented in a document recognizer system that solves the following tasks: evaluation of the contribution of each domain to a document; distribution of a document flow by domain; and selection of a possible leader (the most representative document) in each group.
An Approach to Clustering Abstracts
The preliminary experiments show that abstracts cannot be clustered with the same quality as full texts, though the achieved quality is adequate for many applications. Accordingly, the authors support Makagonov's proposal that digital libraries should provide document images of full texts of the papers (and not only abstracts) for open access via Internet, in order to help in search, classification, clustering, selection, and proper referencing of the books.
Use of a Weighted Topic Hierarchy for Document Classification
A statistical method of document classification driven by a hierarchical topic dictionary is proposed; the method uses a dictionary with a simple structure and is insensitive to inaccuracies in the dictionary.
On the Nature of Structure and Its Identification
A new and lucid structure measure is introduced: the so-called weighted partial connectivity, Λ, whose maximization defines a graph's structure. This results in a new splitting theorem concerning the well-known minimum cut splitting measure.
A Survey of Multilingual Event Extraction from Text
This paper focuses on language-specific event type identification methods for mono- and multilingual detection of socio-political events, and describes the systems that cover this functionality.
Analysis of Clustering Algorithms for Web-Based Search
This paper presents results of a comprehensive analysis of clustering algorithms in connection with document categorization, relating to exemplar-based, hierarchical, and density-based clustering algorithms.
Introduction to information retrieval
This groundbreaking new textbook, written from a computer science perspective by three leading experts in the field, teaches web-era information retrieval from basic concepts, including web search and the related areas of text classification and text clustering.
Pattern Recognition and Machine Learning
Probability distributions, linear models for regression, linear models for classification, neural networks, graphical models, mixture models and EM, sampling methods, continuous latent variables, and sequential data are studied.