We discuss the size-bias inherent in several chemical similarity coefficients when used for the similarity searching or diversity selection of compound collections. Limits to the upper bounds of 14 standard similarity coefficients are investigated, and the results are used to identify some exceptional characteristics of a few of the coefficients. An…
This report explains our plagiarism detection method, which uses a fuzzy semantic-based string similarity approach. The algorithm was developed in four main stages. The first is pre-processing, which includes tokenisation, stemming and stop-word removal. The second is retrieving a list of candidate documents for each suspicious document using shingling and Jaccard…
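The candidate-retrieval stage named above (shingling followed by a Jaccard comparison) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the word-level shingle size `k`, and the `threshold` value are all hypothetical, and the pre-processing stage (stemming, stop-word removal) is reduced here to lowercasing and whitespace splitting.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (contiguous word windows) in text."""
    words = text.lower().split()  # stand-in for real tokenisation/stemming
    if len(words) < k:
        # Short texts yield a single shingle holding all their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def candidate_documents(suspicious, corpus, k=3, threshold=0.1):
    """Return (doc_id, score) pairs scoring at least threshold, best first.

    corpus maps a document id to its text; k and threshold are assumed values.
    """
    target = shingles(suspicious, k)
    scored = [
        (doc_id, jaccard(target, shingles(text, k)))
        for doc_id, text in corpus.items()
    ]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Documents that survive the threshold would then be passed to the later, more expensive comparison stages; a lower threshold trades speed for recall.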
Features are considered the cornerstone of text summarization. The most important issue is which features to consider in the summarization process. Including all the features may not be an optimal solution. Therefore, other methods need to be deployed. In this paper, five random features are used and…
Many different types of similarity coefficients have been described in the literature. Since different coefficients take into account different characteristics when assessing the degree of similarity between molecules, it is reasonable to combine them to further optimize the measures of similarity between molecules. This paper describes experiments in which…
Problem statement: The aim of automatic text summarization systems is to select the most relevant information from an abundance of text sources. The rapid daily growth of data on the internet makes achieving this aim a big challenge. Approach: In this study, we incorporated fuzzy logic with swarm intelligence, so that risks, uncertainty, ambiguity…
  • Siriporn Chimphlee, Naomie Salim, Mohd Salihin Bin Ngadiman, Witcha Chimphlee, Surat Srinoy
  • 2007
Discovering user access patterns from web access logs is increasingly important for building adaptive web servers that respond to an individual user's behavior. The variety of user behaviors in accessing information also grows, which has a great impact on network utilization. In this paper, we present a rough set clustering to cluster web…