Analysis of Legal Documents via Non-negative Matrix Factorization Methods
@article{Budahazy2021AnalysisOL,
  title={Analysis of Legal Documents via Non-negative Matrix Factorization Methods},
  author={Ryan Budahazy and Lu Cheng and Yihuan Huang and Andrew Johnson and Pengyu Li and Joshua Vendrow and Zhoutong Wu and Denali Molitor and Elizaveta Rebrova and Deanna Needell},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.14028}
}
The California Innocence Project (CIP), a clinical law school program aiming to free wrongfully convicted prisoners, evaluates thousands of pieces of mail containing new requests for assistance and corresponding case files. Processing and interpreting this large amount of information presents a significant challenge for CIP officials, a task that topic modeling techniques can successfully aid. In this paper, we apply the Non-negative Matrix Factorization (NMF) method and implement various offshoots of it…
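As a rough illustration of the basic NMF step the abstract names, the sketch below factors a tiny term-document matrix into non-negative topic and weight matrices. It is not the paper's implementation: the vocabulary, the counts in `x`, and the helper names (`matmul`, `transpose`, `nmf`, `frobenius_error`) are all hypothetical, and it assumes the classic multiplicative updates for the Frobenius objective (Lee and Seung, listed in the references) rather than any of the offshoots the paper studies.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def nmf(x, rank, iters=500, seed=0, eps=1e-9):
    """Approximate x (terms x documents) by w @ h with non-negative factors.

    Uses multiplicative updates for the Frobenius-norm objective.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(x), len(x[0])
    # Strictly positive random initialisation keeps every update well defined.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
    return w, h


def frobenius_error(x, w, h):
    """Frobenius norm of the residual x - w @ h."""
    approx = matmul(w, h)
    return sum((xv - av) ** 2
               for x_row, a_row in zip(x, approx)
               for xv, av in zip(x_row, a_row)) ** 0.5


# Hypothetical toy corpus: rows are terms, columns are four short documents.
vocab = ["appeal", "evidence", "dna", "parole", "hearing", "release"]
x = [
    [2, 1, 0, 0],  # appeal
    [3, 2, 0, 0],  # evidence
    [2, 3, 0, 0],  # dna
    [0, 0, 3, 2],  # parole
    [0, 0, 2, 3],  # hearing
    [0, 0, 1, 2],  # release
]

w, h = nmf(x, rank=2)

# Each column of w is a topic; list its three highest-weighted terms.
for k in range(2):
    ranked = sorted(range(len(vocab)), key=lambda i: w[i][k], reverse=True)
    print(f"topic {k}:", [vocab[i] for i in ranked[:3]])
print("reconstruction error:", round(frobenius_error(x, w, h), 3))
```

In this layout each column of `w` can be read as a topic over the vocabulary and each column of `h` as one document's topic weights. The rank of 2 is chosen by hand for the toy data; a real corpus would need a weighted representation such as TF-IDF and a more careful choice of rank.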
2 Citations
Guided Semi-Supervised Non-negative Matrix Factorization on Legal Documents
- Computer Science, ArXiv
- 2022
This paper proposes a method, namely Guided Semi-Supervised Non-negative Matrix Factorization (GSSNMF), that performs both classification and topic modeling by incorporating supervision from both pre-assigned document class labels and user-designed seed words.
Guided Semi-Supervised Non-Negative Matrix Factorization
- Computer Science, Algorithms
- 2022
This paper proposes a novel method, namely Guided Semi-Supervised Non-negative Matrix Factorization (GSSNMF), that performs both classification and topic modeling by incorporating supervision from both pre-assigned document class labels and user-designed seed words.
References
Language (Technology) is Power: A Critical Survey of “Bias” in NLP
- Psychology, ACL
- 2020
A greater recognition of the relationships between language and social hierarchies is urged, encouraging researchers and practitioners to articulate their conceptualizations of “bias” and to center work around the lived experiences of members of communities affected by NLP systems.
Non-negative Matrix Factorization Meets Word Embedding
- Computer Science, SIGIR
- 2017
This paper proposes a new model which successfully integrates a word embedding model, word2vec, into an NMF framework so as to leverage the semantic relationships between words.
Neural Word Embedding as Implicit Matrix Factorization
- Computer Science, NIPS
- 2014
It is shown that using a sparse Shifted Positive PMI word-context matrix to represent words improves results on two word similarity tasks and one of two analogy tasks, and it is conjectured that this stems from the weighted nature of SGNS's factorization.
Document clustering based on non-negative matrix factorization
- Computer Science, SIGIR
- 2003
This paper proposes a novel document clustering method based on the non-negative factorization of the term-document matrix of the given document corpus that surpasses the latent semantic indexing and the spectral clustering methods not only in the easy and reliable derivation of document clustering results, but also in document clustering accuracy.
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
- Computer Science, NIPS
- 2016
This work empirically demonstrates that its algorithms significantly reduce gender bias in embeddings while preserving their useful properties, such as the ability to cluster related concepts and to solve analogy tasks.
The Relationships Among Various Nonnegative Matrix Factorization Methods for Clustering
- Computer Science, Sixth International Conference on Data Mining (ICDM'06)
- 2006
This study presents an overview and summary of various matrix factorization algorithms, theoretically analyzes the relationships among them, and answers several previously unaddressed yet important questions for matrix factorizations, including the interpretation and normalization of cluster posterior and the benefits and evaluation of simultaneous clustering.
Bias in word embeddings
- Computer Science, FAT*
- 2020
A new technique for bias detection for gendered languages is developed and used to compare bias in embeddings trained on Wikipedia and on political social media data, and it is proved that existing biases are transferred to further machine learning models.
Semi-Supervised Nonnegative Matrix Factorization
- Computer Science, IEEE Signal Processing Letters
- 2010
This work presents semi-supervised NMF (SSNMF), which jointly incorporates the data matrix and the (partial) class label matrix into NMF, and develops multiplicative updates for SSNMF to minimize a sum of weighted residuals.
Learning the parts of objects by non-negative matrix factorization
- Computer Science, Nature
- 1999
An algorithm for non-negative matrix factorization is demonstrated that is able to learn parts of faces and semantic features of text and is in contrast to other methods that learn holistic, not parts-based, representations.
Hierarchical online NMF for detecting and tracking topic hierarchies in a text stream
- Computer Science, Pattern Recognit.
- 2018