This paper reports a new general framework for focused web crawling based on "relational subgroup discovery". Predicates are used explicitly to represent the relevance clues of unvisited pages…
2011 Second International Conference on Digital…
2011
This paper presents a new edge-counting-based method that uses WordNet to compute similarity. The method achieves a similarity that perfectly fits human ratings and effectively simulates the…
Automatic text classification is one of the most important tools in Information Retrieval. This paper presents a novel text classifier using positive and unlabeled examples. The primary challenge of…
This paper surveys the existing methods of learning from positive and unlabeled examples. We divide the existing methods into three families and review the main algorithms of each. The first…
Many real-world classification applications fall into the class of positive and unlabeled learning problems. Almost all existing techniques are based on the two-step strategy. This paper proposes…
This paper investigates a new approach for training text classifiers when only a small set of positive examples is available together with a large set of unlabeled examples. The key feature of this…
Ontology plays an important role in locating domain-specific Deep Web content. This paper therefore presents a novel framework, WFF, for efficiently locating domain-specific Deep Web databases based…