- Benjamin Recht, Christopher Ré, Stephen J. Wright, Feng Niu
- NIPS
- 2011

Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine learning tasks. Several researchers have recently proposed schemes to parallelize SGD, but all require performance-destroying memory locking and synchronization. This work aims to show using novel theoretical analysis, algorithms, and…

- Feng Niu, Christopher Ré, AnHai Doan, Jude W. Shavlik
- PVLDB
- 2011

Over the past few years, Markov Logic Networks (MLNs) have emerged as a powerful AI framework that combines statistical and logical reasoning. It has been applied to a wide range of data management problems, such as information extraction, ontology matching, and text mining, and has become a core technology underlying several major AI projects. Because of…

- Joseph M. Hellerstein, Christopher Ré, +8 authors Arun Kumar
- PVLDB
- 2012

MADlib is a free, open source library of in-database analytic methods. It provides an evolving suite of SQL-based algorithms for machine learning, data mining and statistics that run at scale within a database engine, with no need for data import/export to other tools. The goal is for MADlib to eventually serve a role for scalable database systems that is…

- Christopher Ré, Nilesh N. Dalvi, Dan Suciu
- 2007 IEEE 23rd International Conference on Data…
- 2007

Modern enterprise applications are forced to deal with unreliable, inconsistent and imprecise information. Probabilistic databases can model such data naturally, but SQL query evaluation on probabilistic databases is difficult: previous approaches have either restricted the SQL queries, or computed approximate probabilities, or did not scale, and it was…

- Victor Bittorf, Benjamin Recht, Christopher Ré, Joel A. Tropp
- NIPS
- 2012

This paper describes a new approach for computing nonnegative matrix factorizations (NMFs) with linear programming. The key idea is a data-driven model for the factorization, in which the most salient features in the data are used to express the remaining features. More precisely, given a data matrix X, the algorithm identifies a matrix C that satisfies X…

- Christopher Ré, Julie Letchner, Magdalena Balazinska, Dan Suciu
- SIGMOD Conference
- 2008

A major problem in detecting events in streams of data is that the data can be imprecise (e.g., RFID data). However, current state-of-the-art event detection systems such as Cayuga [14], SASE [46] or SnoopIB [1] assume the data is *precise*. Noise in the data can be captured using techniques such as hidden Markov models. Inference on these models…

- Benjamin Recht, Christopher Ré
- Math. Program. Comput.
- 2013

This paper develops Jellyfish, an algorithm for solving data-processing problems with matrix-valued decision variables regularized to have low rank. Particular examples of problems solvable by Jellyfish include matrix completion problems and least-squares problems regularized by the nuclear norm or γ₂-norm. Jellyfish implements a projected incremental…

- Arvind Arasu, Christopher Ré, Dan Suciu
- 2009 IEEE 25th International Conference on Data…
- 2009

We present a declarative framework for collective deduplication of entity references in the presence of constraints. Constraints occur naturally in many data cleaning domains and can improve the quality of deduplication. An example of a constraint is "each paper has a unique publication venue"; if two paper references are duplicates, then their associated…

- Jihad Boulos, Nilesh N. Dalvi, Bhushan Mandhani, Shobhit Mathur, Christopher Ré, Dan Suciu
- SIGMOD Conference
- 2005

MystiQ is a system that uses probabilistic query semantics [3] to find answers in large numbers of data sources of less than perfect quality. There are many reasons why data originating from many different sources may be of poor quality, and therefore difficult to query: the same data item may have different representations in different sources; the…

- John R. Frank, Max Kleiman-Weiner, +4 authors Ian Soboroff
- TREC
- 2012

The Knowledge Base Acceleration track in TREC 2012 focused on a single task: filter a time-ordered corpus for documents that are highly relevant to a predefined list of entities. KBA differs from previous filtering evaluations in two primary ways: the stream corpus is >100x larger than previous filtering collections, and the use of entities as topics…