CiteSeerX: AI in a Digital Library Search Engine

  title={CiteSeerX: AI in a Digital Library Search Engine},
  author={Jian Wu and K. Williams and Hung-Hsuan Chen and Madian Khabsa and Cornelia Caragea and Suppawong Tuarob and Alexander G. Ororbia and Douglas Jordan and Prasenjit Mitra and C. Lee Giles},
  booktitle={AI Mag.},
  • Jian Wu, K. Williams, +7 authors C. Lee Giles
  • Published in AI Mag. 2014
  • Computer Science
  • CiteSeerX is a digital library search engine providing access to more than five million scholarly documents with nearly a million users and millions of hits per day. We present key AI technologies used in the following components: document classification and de-duplication, document and citation clustering, automatic metadata extraction and indexing, and author disambiguation. These AI technologies have been developed by CiteSeerX group members over the past 5–6 years. We show the usage status…
    70 Citations
    CiteSeerX: 20 years of service to scholarly big data
    • 5
    A Supervised Learning Approach To Entity Matching Between Scholarly Big Datasets
    • 4
    Information Extraction for Scholarly Document Big Data
    • 1
    ParsRec: A Novel Meta-Learning Approach to Recommending Bibliographic Reference Parsers
    • 7
    A Data Cleaning Method for CiteSeer Dataset
    • 3
    • 1
    • Highly Influenced


    CiteSeerX: A Scholarly Big Dataset
    • 46
    Scholarly big data information extraction and integration in the CiteSeerχ digital library
    • 45
    CiteSeer: an automatic citation indexing system
    • 789
    TableSeer: automatic table metadata extraction and searching in digital libraries
    • 150
    • Highly Influential
    A Web Service for Scholarly Big Data Information Extraction
    • 19
    A figure search engine architecture for a chemistry digital library
    • 24
    Information extraction from research papers using conditional random fields
    • 224