Data Set Used
…intelligent plagiarism when ideas are presented in different words.
We discuss the size-bias inherent in several chemical similarity coefficients when used for the similarity searching or diversity selection of compound collections. Limits to the upper bounds of 14 standard similarity coefficients are investigated, and the results are used to identify some exceptional characteristics of a few of the coefficients. An…
This report explains our plagiarism detection method, which uses a fuzzy semantic-based string similarity approach. The algorithm was developed in four main stages. The first is pre-processing, which includes tokenisation, stemming and stop-word removal. The second is retrieving a list of candidate documents for each suspicious document using shingling and Jaccard…
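The candidate-retrieval stage named in this abstract pairs shingling with a Jaccard measure. The following is a minimal sketch of that general idea, not the report's implementation. It assumes word-level shingles of size `k = 3` and a tiny illustrative stop-word list, and it omits stemming. The function names, the `threshold` parameter and its default value are hypothetical.

```python
import re

# Illustrative stop-word list (an assumption; a real system would use a fuller one).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "and", "to"}


def preprocess(text):
    """Lowercase, tokenise on alphanumeric runs, and drop stop words.

    Stemming is deliberately left out of this sketch.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def shingles(tokens, k=3):
    """Return the set of contiguous k-token shingles (as tuples)."""
    if len(tokens) < k:
        # Too short for a full shingle: fall back to one shingle, or none.
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def candidates(suspicious, corpus, k=3, threshold=0.1):
    """Rank source documents whose shingle sets overlap the suspicious text.

    `corpus` maps a document name to its raw text. Only documents scoring
    at least `threshold` (a hypothetical cut-off) are returned.
    """
    target = shingles(preprocess(suspicious), k)
    scored = [
        (name, jaccard(target, shingles(preprocess(text), k)))
        for name, text in corpus.items()
    ]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    corpus = {
        "d1": "A quick brown fox jumps over a lazy dog",
        "d2": "Completely unrelated sentence about chemistry",
    }
    print(candidates("The quick brown fox jumps over the lazy dog", corpus))
```

Documents that pass the threshold would then go on to the later, finer-grained comparison stages; the set-based score here only narrows the search.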
The scoring mechanism of the text features is the sole means of determining the key ideas in a text to be presented as its summary. Treating all text features with the same level of importance can be considered the main cause of low-quality summaries. In this paper, we introduce a novel text summarization model based on swarm…
As the Internet helps us cross cultural borders by providing diverse information, plagiarism issues are bound to arise. As a result, plagiarism detection is increasingly in demand to overcome this issue. Different plagiarism detection tools have been developed based on various detection techniques. Nowadays, the fingerprint matching technique plays an important…
Features are considered the cornerstone of text summarization. The most important issue is which features to consider in the summarization process. Including all the features may not be an optimal solution, so other methods need to be deployed. In this paper, five random features are used and…
A Neural Network Model (NNM), a Hidden Markov Model (HMM) and a Regression Model (RM) are developed to predict the spread of dengue outbreaks in Malaysia. The case study covered dengue case data from Selangor, which includes seven mukims and eight administrative districts, for the years 2004 and 2005. The specific criteria considered are location, time (weeks) and…