Juan Pablo Fernández Ramírez

SeerSuite is a framework for scientific and academic digital libraries and search engines built by crawling scientific and academic documents from the web, with a focus on providing reliable, robust services. In addition to full-text indexing, SeerSuite supports autonomous citation indexing and automatically links references in research articles to ...
We present a preliminary study of the evolution of a crawling strategy for an academic document search engine, in particular CiteSeerX. CiteSeerX actively crawls the web for academic and research documents, primarily in computer and information sciences, and then performs unique information extraction and indexing, extracting information such as OAI metadata, ...
The CiteSeerX digital library stores and indexes research articles in Computer Science and related fields. Although its main purpose is to make it easier for researchers to search for scientific information, CiteSeerX has proven to be a powerful resource in many data mining, machine learning and information retrieval applications that use The metadata ...
Cyberinfrastructure, or e-science, has become crucial for scientific progress, and open source systems have greatly facilitated design and implementation. In chemistry, the growth of data has been explosive, and timely and effective information and data access is critical. We discuss our ChemXSeer (funded by NSF Chemistry) architecture, a portal and search ...
We report on the Gaussian file search system designed as part of the ChemXSeer digital library. Gaussian files are produced by the Gaussian software [4], a software package used for calculating molecular electronic structure and properties. The output files are semi-structured, allowing relatively easy access to the Gaussian attributes and metadata. Our ...