Data Classification with k-fold Cross Validation and Holdout Accuracy Estimation Methods with 5 Different Machine Learning Techniques

  • Kaushika Pal, Biraj V. Patel
  • Published 1 March 2020
  • Computer Science
  • 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC)
Classification of documents is measured in terms of accuracy by comparing the actual labels with the predicted labels for the classes. Many machine learning techniques can be used to build a classifier, and it is difficult to predict manually which technique should be used for classification, especially when working on Indic languages, for which there is no reliable method that gives good results. Such areas still need to be explored by processing documents with natural… 
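The two accuracy estimation methods named in the title can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the excerpt does not name the five techniques, so `MajorityClassifier` is a hypothetical stand-in for a real model, and the helper names, the 70/30 holdout split, `k=5`, and the toy data are all assumptions made for the example.

```python
import random
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of predicted labels that match the actual labels."""
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


def holdout_split(n, test_fraction=0.3, seed=0):
    """Shuffle indices 0..n-1 once and cut them into (train, test)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(round(n * (1 - test_fraction)))
    return idx[:cut], idx[cut:]


def kfold_splits(n, k=5, seed=0):
    """Yield k (train, test) pairs; every index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def evaluate(model, X, y, train, test):
    """Train on the train indices, then score accuracy on the test indices."""
    model.fit([X[i] for i in train], [y[i] for i in train])
    predicted = model.predict([X[i] for i in test])
    return accuracy([y[i] for i in test], predicted)


# Toy data (assumed for illustration): 20 documents, two classes.
X = [[i] for i in range(20)]
y = ["a"] * 12 + ["b"] * 8

# Holdout: a single train/test split gives a single accuracy estimate.
train, test = holdout_split(len(X))
holdout_acc = evaluate(MajorityClassifier(), X, y, train, test)

# k-fold: average the accuracy over k different test folds.
fold_accs = [
    evaluate(MajorityClassifier(), X, y, tr, te)
    for tr, te in kfold_splits(len(X), k=5)
]
kfold_acc = sum(fold_accs) / len(fold_accs)

print(f"holdout accuracy: {holdout_acc:.2f}")
print(f"5-fold accuracy:  {kfold_acc:.2f}")
```

Holdout trains once and is cheap, but its estimate depends on which documents happen to land in the test set. k-fold trains k times and averages the results, so every document is used for testing exactly once.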

Emotion Detection based on Column Comments in Material of Online Learning using Artificial Intelligence
A mobile application is built that detects emotion from column comments in online learning material, using artificial intelligence to classify the comment text and determine the students' emotions.
Mapping Gully Erosion Variability and Susceptibility Using Remote Sensing, Multivariate Statistical Analysis, and Machine Learning in South Mato Grosso, Brazil
In Brazil, the development of gullies constitutes widespread land degradation, especially in the state of South Mato Grosso, where fighting against this degradation has become a priority for policy
Artificial intelligent based smart system for safe mining during foggy weather
This paper presents an intelligent vision enhancement system for continuing opencast mining operations during foggy weather; it integrates hardware and software to provide multistage safety features that distinguish it from existing systems.
Sky Imager-Based Forecast of Solar Irradiance Using Machine Learning
Compared to the state-of-the-art computationally heavy algorithms proposed in the literature, this approach achieves competitive results with much less computational complexity for both nowcasting and forecasting up to 4 h ahead of time.
3D PBV-Net: An automated prostate MRI data segmentation method
SStaGCN: Simplified stacking based graph convolutional networks
This paper proposes a novel GCN called SStaGCN (Simplified stacking based GCN) by utilizing the ideas of stacking and aggregation, which is an adaptive general framework for tackling heterogeneous graph data.
International Journal of Innovative Technology and Exploring Engineering (IJITEE)
Turning a large quantity of fly ash (FA) and bottom ash (BA) into unfired solid bricks is the objective of this study. Five brick mixtures were designed with a constant water-to-binder ratio of 0.35.


Multiclass patent document classification
The obstacles associated with the imbalanced data were mitigated by adding pseudo-synthetic data wherever appropriate, which resulted in a superior SVM classifier based model.
Supervised learning Methods for Bangla Web Document Categorization
Empirical results show that all four methods for categorizing Bangla documents perform satisfactorily, with SVM attaining good results on high-dimensional and relatively noisy document feature vectors.
An Efficient Hindi Text Classification Model Using SVM
A Hindi text classification model is proposed that accepts a set of known Hindi documents, preprocesses them at the document, sentence, and word levels, extracts features, and trains an SVM classifier, which then classifies a set of unknown Hindi documents.
Punjabi Poetry Classification: The Test of 10 Machine Learning Algorithms
Results for Punjabi poetry classification revealed that 4 machine learning algorithms, namely Hyperpipes (HP), K-nearest neighbour (KNN), Naive Bayes (NB) and Support Vector Machine (SVM), with an accuracy of 50.63%, 52.75% and 58.79% respectively, outperformed all other machine learning algorithms under the test.
Classification of children stories in hindi using keywords and POS density
This paper proposes a framework for classifying Hindi stories into three genres (fable, folk-tale, and legend) using keyword and part-of-speech (POS) based features.
Nowadays, managing vast amounts of documents in digital form is very important in text mining applications. Text categorization is the task of automatically sorting a set of documents into
Children story classification based on structure of the story
  • Harikrishna D. M., K. S. Rao
  • Computer Science
  • 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI)
  • 2015
The main part of the stories has the highest classification accuracy compared to introduction and climax parts of the story, and a framework for story classification using keyword and Part-of-speech (POS) based features is proposed.
Automated Analysis of Bangla Poetry for Classification and Poet Identification
This work makes use of semantic (word) features to perform subject-based classification of Bangla poems, and various stylistic as well as semantic features for poet identification, and uses a Multiclass SVM classifier to classify Tagore’s collection of poetry into four categories.
A survey of data mining algorithms and techniques that could be employed with intelligent computing systems is presented, giving a basic conception of data mining along with its prominent algorithms and the classification of its techniques.
Model for Classification of Poems in Hindi Language Based on Ras
The developed model will classify poems into the Shringar, Hasya, Adbhuta, Shanta, Raudra, Veera, Karuna, Bhayanaka, and Vibhasta rasas, and will use a mix of part-of-speech-based features and emotional