Automatic metadata generation using associative networks

Michael A. Rodriguez, Johan Bollen, Herbert Van de Sompel
In spite of its tremendous value, metadata is generally sparse and incomplete, thereby hampering the effectiveness of digital information services. [...] The proposed method operates through two distinct phases. Occurrence and co-occurrence algorithms first generate an associative network of repository resources, leveraging existing repository metadata.
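
The first phase described above, building an associative network of resources from existing metadata, might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the function name `build_associative_network`, the record layout, and the choice of a plain shared-value count as the edge weight are all hypothetical, and the toy records are invented.

```python
from collections import defaultdict
from itertools import combinations


def build_associative_network(records):
    """Return {resource: {neighbor: weight}} from shared metadata values.

    `records` maps a resource id to a dict of metadata fields, each holding
    a set of values. Two resources are linked once for every (field, value)
    pair they share, so the edge weight is a simple co-occurrence count.
    This weighting is an assumption made for illustration only.
    """
    # Invert the records: (field, value) -> resources carrying that value.
    index = defaultdict(set)
    for resource, fields in records.items():
        for field, values in fields.items():
            for value in values:
                index[(field, value)].add(resource)

    # Every pair of resources sharing a value gains one unit of edge weight.
    network = defaultdict(lambda: defaultdict(int))
    for resources in index.values():
        for a, b in combinations(sorted(resources), 2):
            network[a][b] += 1
            network[b][a] += 1
    return {node: dict(neighbors) for node, neighbors in network.items()}


# Hypothetical toy repository, invented for illustration only.
records = {
    "doc1": {"author": {"A. Smith"}, "keyword": {"metadata", "networks"}},
    "doc2": {"author": {"A. Smith"}, "keyword": {"metadata"}},
    "doc3": {"author": {"B. Jones"}, "keyword": {"networks"}},
}

network = build_associative_network(records)
```

In this toy data, `doc1` and `doc2` share an author and a keyword, so their edge is weighted more heavily than the edge between `doc1` and `doc3`, which share only one keyword. A resource with sparse metadata would end up with few or weak links under this scheme.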


Scholar metadata and knowledge generation with human and artificial intelligence
This article proposes innovative and economic methods of generating knowledge‐based structural metadata (structural keywords) using a combination of natural language processing‐based machine‐learning techniques and human intelligence.
Web-based citation parsing, correction and augmentation
BibAll is proposed, which is capable of correcting the parsing results of content-based CME methods and augmenting citation metadata by leveraging relevant bibliographic data from digital repositories and cited-by publications on the Web.
Scientific Referential Metadata Creation with Information Retrieval and Labeled Topic Modeling
Evaluation results show that the cyberlearning referential metadata retrieved via meta-search and statistical relevance ranking can effectively help students better understand the essence of scientific keywords and publications.
An interactive metadata model for structural, descriptive, and referential representation of scholarly output
The ScholarWiki system utilizes machine‐learning techniques that can automatically produce self‐enhanced metadata by learning from the structural metadata that scholars contribute, which will add intelligence to automatically enhance and update the publication of metadata Wiki pages.
Experimenting with tagging and context for collaborative MPEG-7 metadata
A detailed analysis of the use of the multimedia tagging tools used in the experiment is contributed to show the relationships between user behaviours, resultant outcomes of these behaviours, and subsequent implications for future collaborative multimedia MPEG-7 tagging tools.
Automatic Metadata Generation for Fish Specimen Image Collections
The ability of computational methods to enhance the digital library services associated with the tens of thousands of digitized specimens stored in open-access repositories world-wide is demonstrated.
Generating metadata for cyberlearning resources through information retrieval and meta-search
Evaluation results show that the cyberlearning referential metadata retrieved via meta‐search and statistical relevance ranking can help students better understand the essence of scientific keywords and publications.
An Evidential Path Logic for Multi-Relational Networks
This article presents a non-bivalent, non-axiomatic, evidential logic and reasoner that is an algebraic ring over a multi-relational network and two binary operations that can be composed to perform various forms of inference.
Decentralized Information Dissemination in Multidimensional Semantic Social Overlays
An algorithm for information dissemination in decentralized networks is proposed and investigated. It uses only local information (that is, information associated with a node and its neighbors) to disseminate a piece of information, with the goal of maximum coverage and minimum spamming.
Publisher Names in Bibliographic Data: An Experimental Authority File and a Prototype Application
A project to build a database of authorized names for major publishers worldwide, using ISBN prefix data to cluster bibliographic records by publisher; the resulting database contains thousands of variant forms of each publisher's name and data about their publishing output.


Folksonomies-Cooperative Classification and Communication Through Shared Metadata
This paper examines user-generated metadata as implemented and applied in two web services designed to share and organize digital media to better understand grassroots classification.
Knowledge-based metadata extraction from PostScript files
A system, based on a novel spatial/visual knowledge principle, for extracting metadata from scientific papers stored as PostScript files that embeds the general knowledge about the graphical layout of a scientific paper to guide the metadata extraction process.
Metadata Extraction and Harvesting
The conclusion is that integrating extraction and harvesting methods will be the best approach to creating optimal metadata, and more research is needed to identify when to apply which method.
Metadata propagation in the Web using co-citations
A semi-automatic method for propagating metadata that selects a reduced number of documents to be manually qualified and propagates the given metadata values to the other documents belonging to the same cluster.
Automatic document metadata extraction using support vector machines
It is found that discovery and use of the structural patterns of the data and domain based word clustering can improve the metadata extraction performance and an appropriate feature normalization also greatly improves the classification performance.
Automatic Metadata Generation for Web Pages Using a Text Mining Approach
A machine learning approach to automatically generate semantic metadata for Web pages that adopts the self-organizing map algorithm to cluster training Web pages and conducts a text mining process to discover some semantic descriptions about the Web pages.
A dynamic feature generation system for automated metadata extraction in preservation of digital materials
S. Mao, Jongwoo Kim, G. Thoma. First International Workshop on Document Image Analysis for Libraries, 2004. Proceedings. 2004.
A dynamic feature updating system is described in which the features used for labeling a current journal issue are generated from previous issues with similar layout style, which can adapt to possible style variations among different issues of the same journal.
Searching the web by constrained spreading activation
Grammar-based random walkers in semantic networks
Usage patterns of collaborative tagging systems
A dynamic model of collaborative tagging is presented that predicts regularities in user activity, tag frequencies, kinds of tags used, bursts of popularity in bookmarking and a remarkable stability in the relative proportions of tags within a given URL.