Extraction of Structure and Content from the Edgar Database: A Template-Based Approach

Yu Cong, Miklos A. Vasarhelyi, Alexander Kogan

This paper presents a template-based approach to extracting data from the EDGAR database. A set of heuristic-based templates configures the trainable system so that each type of EDGAR filing is processed under a single configuration. Such configurability is highly desirable because it adds expandability and flexibility to the system. The template-based approach also enables the system to extract both structural information and content from the filings in the EDGAR database. The ability to…
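
To make the idea of one template per filing type concrete, here is a minimal sketch in Python. It is not the authors' system: the `Template` class, the `extract` function, the regular expressions, and the sample text are all hypothetical, and the trainable, heuristic parts described in the abstract are not modeled. The sketch only shows how a single configuration for one filing type could yield both structure (the ordered section names) and content (the text under each section).

```python
import re
from dataclasses import dataclass


@dataclass
class Template:
    """Hypothetical template: one configuration per filing type."""
    filing_type: str
    section_patterns: dict  # section name -> regex matching its heading line


def extract(text, template):
    """Return (structure, content) for a filing, driven only by the template.

    structure: section names in the order their headings appear
    content:   section name -> text found under that heading
    """
    compiled = {
        name: re.compile(pattern, re.IGNORECASE)
        for name, pattern in template.section_patterns.items()
    }
    structure = []
    content = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        # Find the first section whose heading pattern matches this line.
        matched = next(
            (name for name, rx in compiled.items() if rx.match(stripped)),
            None,
        )
        if matched is not None:
            current = matched
            structure.append(matched)
            content[matched] = []
        elif current is not None:
            content[current].append(line)
    joined = {name: "\n".join(lines).strip() for name, lines in content.items()}
    return structure, joined


# Illustrative template for one filing type; the patterns are assumptions.
TEMPLATE_10K = Template(
    filing_type="10-K",
    section_patterns={
        "business": r"item\s+1\.\s+business",
        "risk_factors": r"item\s+1a\.\s+risk factors",
    },
)

# Made-up sample text, not taken from any real filing.
SAMPLE = """ITEM 1. BUSINESS
We make widgets.
ITEM 1A. RISK FACTORS
Demand may fall.
"""

structure, content = extract(SAMPLE, TEMPLATE_10K)
```

Supporting another filing type in this sketch would mean writing another `Template` instance rather than changing `extract`, which is the kind of configurability the abstract describes.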


