Generating Javanese Stopwords List using K-means Clustering Algorithm

Text processing in Information Retrieval (IR) uses text documents as its primary data source. However, not all words in a text document are useful. Words that appear frequently but carry little meaning are called stopwords [1], and they are stored in a stopword list, also called a stopword database (corpus) [2][3]. Stopword removal depends on this corpus to remove unnecessary words from the text [4]. The stopword list must be in the same language as the text being processed [1][5]. Various stopword lists have been…
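
As an illustration of the list-based stopword removal described above, the following is a minimal sketch. It assumes simple whitespace tokenization. The function name `remove_stopwords` and the small `STOPWORDS` set are hypothetical: the entries are common Javanese function words chosen for illustration, not a list taken from this paper.

```python
# Hypothetical stopword list: illustrative Javanese function words,
# NOT the list generated in the paper.
STOPWORDS = {"lan", "ing", "sing", "iku", "karo"}


def remove_stopwords(text, stopwords=STOPWORDS):
    """Return the lowercase tokens of `text` that are not in `stopwords`."""
    # Assumption: whitespace tokenization is sufficient for this sketch.
    tokens = text.lower().split()
    return [token for token in tokens if token not in stopwords]


if __name__ == "__main__":
    print(remove_stopwords("Aku lan kowe lunga ing pasar"))
```

Because the lookup is a plain set membership test, the result depends entirely on the stopword list supplied, which is why the list has to match the language of the text.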

