Finding Structure and Characteristics of Web Documents for Classification


Many web documents that contain the same type of information have a similar structure. In this paper, we examine the problem of finding the structure of web documents and present a hierarchical structure to represent the relations among text data in the web documents. Due to the loose standard of web page publishing, different authors can use different…


