Extracting Attributes and Their Values from Web Pages

@inproceedings{Yoshida2002ExtractingAA,
  title={Extracting Attributes and Their Values from Web Pages},
  author={Minoru Yoshida},
  year={2002}
}
We propose a method for extracting attributes and their values from Web pages. The method relies on word distributions estimated from plain Web pages. The key idea is to estimate these word distributions by consulting ontologies built from HTML tables. In a series of experiments, we show that the estimated word distributions are useful for extracting attributes and their values from various kinds of HTML representations other than tables.
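
The abstract gives no algorithmic detail, so the following is only a rough illustration of the general idea, not the paper's method. It assumes a toy ontology (the hypothetical `ONTOLOGY` dictionary) of the kind that might be collected from HTML table cells, counts the words seen under each attribute, and then guesses which attribute a piece of non-table text most likely belongs to. The add-one smoothing, the function names, and the sample data are all assumptions made for this sketch.

```python
import math
from collections import Counter

# Hypothetical toy ontology: attribute -> values, as might be read from table cells.
ONTOLOGY = {
    "name": ["Alice Smith", "Bob Jones"],
    "hobby": ["tennis", "reading books"],
}


def build_distributions(ontology):
    """Count the words observed in the values of each attribute."""
    value_words = {attr: Counter() for attr in ontology}
    for attr, values in ontology.items():
        for value in values:
            value_words[attr].update(value.lower().split())
    # Vocabulary size over all value words, used for smoothing.
    vocab = set()
    for counts in value_words.values():
        vocab.update(counts)
    return value_words, len(vocab)


def smoothed_log_prob(word, counts, vocab_size, alpha=1.0):
    """Add-alpha smoothed log probability of a word under one distribution."""
    total = sum(counts.values())
    return math.log((counts[word] + alpha) / (total + alpha * vocab_size))


def score_value(text, attr, value_words, vocab_size):
    """Log-likelihood of the text under the word distribution of one attribute."""
    words = text.lower().split()
    return sum(smoothed_log_prob(w, value_words[attr], vocab_size) for w in words)


def guess_attribute(text, value_words, vocab_size):
    """Return the attribute whose value distribution best explains the text."""
    return max(
        value_words,
        key=lambda attr: score_value(text, attr, value_words, vocab_size),
    )


if __name__ == "__main__":
    value_words, vocab_size = build_distributions(ONTOLOGY)
    for fragment in ["Bob Smith", "tennis"]:
        print(fragment, "->", guess_attribute(fragment, value_words, vocab_size))
```

A practical system would need far larger ontologies and a way to tell attribute labels apart from values. This sketch only scores candidate values against per-attribute distributions.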
This paper has 17 citations.