Joint optimization of wrapper generation and template detection

@inproceedings{Zheng2007JointOO,
  title={Joint optimization of wrapper generation and template detection},
  author={Shuyi Zheng and Ruihua Song and Ji-Rong Wen and Di Wu},
  booktitle={KDD},
  year={2007}
}
Many websites have large collections of pages generated dynamically from an underlying structured source like a database. The data of a category are typically encoded into similar pages by a common script or template. In recent years, some value-added services, such as comparison shopping and vertical search in a specific domain, have motivated the research of extraction technologies with high accuracy. Almost all previous works assume that input pages of a wrapper induction system conform to a…

Citations

Publications citing this paper (53 citations in total).

Reconstruction of web forms for efficient web search

  • 2009 Proceeding of International Conference on Methods and Models in Computer Science (ICM2CS)
  • 2009
CITES METHODS
HIGHLY INFLUENCED

Querying Large Collections of Semistructured Data

CITES BACKGROUND & METHODS
HIGHLY INFLUENCED

TEXT: Automatic Template Extraction from Heterogeneous Web Pages

  • IEEE Transactions on Knowledge and Data Engineering
  • 2011
CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

Web Data Extraction Based on Tree Structure Analysis and Template Generation

  • 2010 International Conference on E-Product E-Service and E-Entertainment
  • 2010
CITES BACKGROUND & METHODS
HIGHLY INFLUENCED

Incorporating site-level knowledge to extract structured data from web forums

  • WWW
  • 2009
CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

CITATION STATISTICS

  • 6 Highly Influenced Citations

References

Publications referenced by this paper.