Tetsuya Nakatoh

Learn More
Search results generated by searchable databases are served dynamically and far larger than the static documents on the Web. These results pages have been referred to as the Deep Web. We need to extract the target data in results pages to integrate them on different searchable databases. We propose a test bed for information extraction from search results.(More)
A Deep Web wrapper is a program that extracts contents from search results. We propose a new automatic wrapper generation algorithm which discovers a repetitive pattern from search results. The repetitive pattern is expressed by token sequences which consist of HTML tags, plain texts and wild-cards. The algorithm applies a string matching with mismatches to(More)