- Eric P. Xing, Qirong Ho, +7 authors Yaoliang Yu
- IEEE Transactions on Big Data
- 2015

How can one build a distributed framework that allows efficient deployment of a wide spectrum of modern advanced machine learning (ML) programs for industrial-scale problems using Big Models (100s of billions of parameters) on Big Data (terabytes or petabytes)? Contemporary parallelization strategies employ fine-grained operations and scheduling beyond the…

- Wei Dai, Jinliang Wei, +5 authors Eric P. Xing
- ArXiv
- 2013

A major bottleneck to applying advanced ML programs at industrial scales is the migration of an academic implementation, often specialized for a small, well-controlled computer platform such as desktop PCs and small lab-clusters, to a big, less predictable platform such as a corporate cluster or the cloud. This poses enormous challenges: how does one train…

- Jianfei Chen, Jun Zhu, Zi Wang, Xun Zheng, Bo Zhang
- NIPS
- 2013

Logistic-normal topic models can effectively discover correlation structures among latent topics. However, their inference remains a challenge because of the non-conjugacy between the logistic-normal prior and multinomial topic mixing proportions. Existing algorithms either make restricting mean-field assumptions or are not scalable to large-scale…

- Jinhui Yuan, Fei Gao, +6 authors Wei-Ying Ma
- WWW
- 2015

When building large-scale machine learning (ML) programs, such as massive topic models or deep neural networks with up to trillions of parameters and training examples, one usually assumes that such massive tasks can only be attempted with industrial-sized clusters with thousands of nodes, which are out of reach for most practitioners and academic…

- Seunghak Lee, Jin Kyu Kim, Xun Zheng, Qirong Ho, Garth A. Gibson, Eric P. Xing
- NIPS
- 2014

Distributed machine learning has typically been approached from a data parallel perspective, where big data are partitioned to multiple workers and an algorithm is executed concurrently over different data subsets under various synchronization schemes to ensure speed-up and/or correctness. A sibling problem that has received relatively less attention is how…

- Jinhui Yuan, Fei Gao, +6 authors Wei-Ying Ma
- ArXiv
- 2014

When building large-scale machine learning (ML) programs, such as massive topic models or deep networks with up to trillions of parameters and training examples, one usually assumes that such massive tasks can only be attempted with industrial-sized clusters with thousands of nodes, which are out of reach for most practitioners or academic researchers. We…

- Jin Kyu Kim, Qirong Ho, +4 authors Eric P. Xing
- EuroSys
- 2016

Machine learning (ML) algorithms are commonly applied to big data, using distributed systems that partition the data across machines and allow each machine to read and update all ML model parameters, a strategy known as data parallelism. An alternative and complementary strategy, model parallelism, partitions the model parameters for non-shared parallel…

- Yaoliang Yu, Xun Zheng, Micol Marchetti-Bowick, Eric P. Xing
- AISTATS
- 2015

Regularization has played a key role in deriving sensible estimators in high dimensional statistical inference. A substantial amount of recent work has argued for nonconvex regularizers in favor of their superior theoretical properties and excellent practical performance. In a different but analogous vein, nonconvex loss functions are promoted because of…

Topic models have played a pivotal role in analyzing large collections of complex data. Besides discovering latent semantics, supervised topic models (STMs) can make predictions on unseen test data. By marrying with advanced learning techniques, the predictive strengths of STMs have been dramatically enhanced, such as max-margin supervised topic models,…

Supervised topic models with a logistic likelihood have two issues that potentially limit their practical use: 1) response variables are usually over-weighted by document word counts; and 2) existing variational inference methods make strict mean-field assumptions. We address these issues by: 1) introducing a regularization constant to better balance the…