• Corpus ID: 233444252

Analysis of Legal Documents via Non-negative Matrix Factorization Methods

@article{Budahazy2021AnalysisOL,
  title={Analysis of Legal Documents via Non-negative Matrix Factorization Methods},
  author={Ryan Budahazy and Lu Cheng and Yihuan Huang and Andrew Johnson and Pengyu Li and Joshua Vendrow and Zhoutong Wu and Denali Molitor and Elizaveta Rebrova and Deanna Needell},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.14028}
}
The California Innocence Project (CIP), a clinical law school program aiming to free wrongfully convicted prisoners, evaluates thousands of pieces of mail containing new requests for assistance and corresponding case files. Processing and interpreting this large amount of information presents a significant challenge for CIP officials, one that topic modeling techniques can help address. In this paper, we apply the Non-negative Matrix Factorization (NMF) method and implement various offshoots of it… 
Guided Semi-Supervised Non-negative Matrix Factorization on Legal Documents
TLDR
This paper proposes a method, namely Guided Semi-Supervised Non-negative Matrix Factorization (GSSNMF), that performs both classification and topic modeling by incorporating supervision from both pre-assigned document class labels and user-designed seed words.
Guided Semi-Supervised Non-Negative Matrix Factorization
TLDR
This paper proposes a novel method, namely Guided Semi-Supervised Non-negative Matrix Factorization (GSSNMF), that performs both classification and topic modeling by incorporating supervision from both pre-assigned document class labels and user-designed seed words.

References

Language (Technology) is Power: A Critical Survey of “Bias” in NLP
TLDR
A greater recognition of the relationships between language and social hierarchies is urged, encouraging researchers and practitioners to articulate their conceptualizations of “bias” and to center work around the lived experiences of members of communities affected by NLP systems.
Non-negative Matrix Factorization Meets Word Embedding
TLDR
This paper proposes a new model which successfully integrates a word embedding model, word2vec, into an NMF framework so as to leverage the semantic relationships between words.
Neural Word Embedding as Implicit Matrix Factorization
TLDR
It is shown that using a sparse Shifted Positive PMI word-context matrix to represent words improves results on two word similarity tasks and one of two analogy tasks, and it is conjectured that this stems from the weighted nature of SGNS's factorization.
Document clustering based on non-negative matrix factorization
TLDR
This paper proposes a novel document clustering method based on the non-negative factorization of the term-document matrix of the given document corpus. The method surpasses latent semantic indexing and spectral clustering not only in the easy and reliable derivation of document clustering results, but also in document clustering accuracy.
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
TLDR
This work empirically demonstrates that its algorithms significantly reduce gender bias in embeddings while preserving their useful properties, such as the ability to cluster related concepts and to solve analogy tasks.
The Relationships Among Various Nonnegative Matrix Factorization Methods for Clustering
  • Tao Li, C. Ding
  • Computer Science
    Sixth International Conference on Data Mining (ICDM'06)
  • 2006
TLDR
This study presents an overview and summary of various matrix factorization algorithms, theoretically analyzes the relationships among them, and answers several previously unaddressed yet important questions for matrix factorizations, including the interpretation and normalization of cluster posterior and the benefits and evaluation of simultaneous clustering.
Bias in word embeddings
TLDR
A new technique for bias detection for gendered languages is developed and used to compare bias in embeddings trained on Wikipedia and on political social media data, and it is proved that existing biases are transferred to further machine learning models.
Semi-Supervised Nonnegative Matrix Factorization
TLDR
This work presents semi-supervised NMF (SSNMF), which jointly incorporates the data matrix and the (partial) class label matrix into NMF, and develops multiplicative updates for SSNMF to minimize a sum of weighted residuals.
Learning the parts of objects by non-negative matrix factorization
TLDR
An algorithm for non-negative matrix factorization is demonstrated that is able to learn parts of faces and semantic features of text and is in contrast to other methods that learn holistic, not parts-based, representations.
Hierarchical online NMF for detecting and tracking topic hierarchies in a text stream