Hierarchical Semi-supervised Classification with Incomplete Class Hierarchies

Bhavana Dalvi, Aditya Kumar Mishra, William W. Cohen
Proceedings of the Ninth ACM International Conference on Web Search and Data Mining
In an entity classification task, topic or concept hierarchies are often incomplete. Previous work by Dalvi et al. [12] has shown that in non-hierarchical semi-supervised classification tasks, the presence of such unanticipated classes can cause semantic drift for seeded classes. The Exploratory Learning method [12] was proposed to solve this problem; however, it is limited to the flat classification task. This paper builds such exploratory learning methods for hierarchical classification tasks…

Constrained Semi-supervised Learning in the Presence of Unanticipated Classes

This thesis argues that many AKBC tasks which have previously been addressed separately can be viewed as instances of a single abstract problem: multiview semi-supervised learning with an incomplete class hierarchy. It presents a generic EM framework for solving this abstract task.

Efficient Path Prediction for Semi-Supervised and Weakly Supervised Hierarchical Text Classification

A path cost-sensitive learning algorithm is proposed that exploits unlabeled and weakly labeled data and introduces path constraints into learning to incorporate the structural information of the class hierarchy.

Integrated Framework for Improving Large-Scale Hierarchical Classification

  • Azad Naik, H. Rangwala
  • Computer Science
  • 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)
  • 2017
This paper proposes an integrated framework to address the aforementioned issues for improving large-scale HC, and experimental evaluations on various image and text datasets show improved performance.

Learning Taxonomy Adaptation in Large-scale Classification

A multi-class, hierarchical, data-dependent bound on the generalization error of classifiers deployed in large-scale taxonomies is proposed, along with a technique for modifying a given taxonomy through pruning that leads to a lower value of the upper bound than the original taxonomy.

Revisiting Semi-Supervised Learning with Graph Embeddings

On a large and diverse set of benchmark tasks, including text classification, distantly supervised entity extraction, and entity classification, the proposed semi-supervised learning framework shows improved performance over many of the existing models.

User-Centric Ontology Population

This work proposes a methodology to perform user-centric ontology population that efficiently includes human-in-the-loop at each step, and supports the alignment of concepts in the user’s conceptualization with concepts of the target ontologies, using a novel hierarchical classification approach.

Automatic Ontology Construction

This chapter presents extensive research in the area of ontology construction and observes that constructing an ontology automatically is challenging because of unstructured text and ambiguities in English text.

A Neural Network-Powered Cognitive Method of Identifying Semantic Entities in Earth Science Papers

A novel approach that simulates the cognitive process of how human beings read Earth science articles, and automatically identifies semantic entities from the articles, strengthened by a neural network-based method to identify implicitly cited dataset entities based on the context.

Teaching-to-Learn and Learning-to-Teach for Few Labeled Classification

A framework of co-training style algorithms is proposed to efficiently improve the performance of these algorithms, with specific algorithms for efficient solving in different tasks; it performs better on UCI datasets and also achieves better performance in content-based image retrieval.

Genome sequence-based virus taxonomy using machine learning

This thesis applies machine learning techniques to classify the NCBI reference sequences of virus model species into seven Baltimore Classes, four host groups or hundreds of ICTV hierarchical classes, and provides a systematic experimental framework for applying machine learning techniques to virus taxonomy.



Classifying entities into an incomplete ontology

Experiments show that the Hierarchical Exploratory EM approach improves seed class F1 by up to 21% when compared to its semi-supervised counterpart and, along the way, adds newly discovered classes into the hierarchy.

Hierarchical document categorization with support vector machines

A novel hierarchical classification method that generalizes Support Vector Machine learning and that is based on discriminant functions that are structured in a way that mirrors the class hierarchy is proposed.

Coupled semi-supervised learning for information extraction

This paper characterizes several ways in which the training of category and relation extractors can be coupled, and presents experimental results demonstrating significantly improved accuracy as a result.

Automatic Gloss Finding for a Knowledge Base using Ontological Constraints

This paper proposes GLOFIN, a hierarchical semi-supervised learning algorithm for this problem which makes effective use of limited amounts of supervision and available ontological constraints, and demonstrates its effectiveness through extensive experiments on real-world datasets.

Discovering Hierarchical Structure for Sources and Entities

This paper models the concept of hierarchy using a set of latent binary features and proposes a generative model that assigns those latent features to sources and entities in order to maximize the probability of the observed containment.

Semantic Taxonomy Induction from Heterogenous Evidence

This work proposes a novel algorithm for inducing semantic taxonomies that flexibly incorporates evidence from multiple classifiers over heterogenous relationships to optimize the entire structure of the taxonomy, using knowledge of a word's coordinate terms to help in determining its hypernyms, and vice versa.

Discovering Relations between Noun Categories

This work proposes an approach to automatically discovering relevant relations, given a large text corpus plus an initial ontology defining hundreds of noun categories, and concludes this is a useful approach to semi-automatic extension of the ontology for large-scale information extraction systems such as NELL.

Multi-View Hierarchical Semi-supervised Learning by Optimal Assignment of Sets of Labels to Instances

This paper proposes an optimization-based method to tackle semi-supervised learning in the presence of multiple views. It uses linear programming and mixed integer linear programming formulations along with the EM framework to find consistent class assignments given the scores in each data view.

Latent Variable Models of Concept-Attribute Attachment

A set of Bayesian methods for automatically extending the WordNet ontology with new concepts and annotating existing concepts with generic property fields, or attributes is presented.

Bayesian models for Large-scale Hierarchical Classification

A set of Bayesian methods to model hierarchical dependencies among class labels using multivariate logistic regression, where the parent-child relationships are modeled by placing a hierarchical prior over the children nodes centered around the parameters of their parents.