Estimating the Fundamental Limits is Easier Than Achieving the Fundamental Limits

@article{Jiao2019EstimatingTF,
title={Estimating the Fundamental Limits is Easier Than Achieving the Fundamental Limits},
author={Jiantao Jiao and Yanjun Han and Irena Fischer-Hwang and Tsachy Weissman},
journal={IEEE Transactions on Information Theory},
year={2019},
volume={65},
pages={6704--6715}
}
• Jiantao Jiao, Yanjun Han, Irena Fischer-Hwang, Tsachy Weissman
• Published 2019
• Computer Science, Mathematics
• IEEE Transactions on Information Theory
We show through case studies that it is easier to estimate the fundamental limits of data processing than to construct the explicit algorithms to achieve those limits. Focusing on binary classification, data compression, and prediction under logarithmic loss, we show that in the finite space setting, when it is possible to construct an estimator of the limits with vanishing error with $n$ samples, it may require at least…
11 Citations

Minimax Redundancy for Markov Chains with Large State Space
• Mathematics, Computer Science
• 2018 IEEE International Symposium on Information Theory (ISIT)
• 2018
It is shown that, for Markov sources whose relaxation time is at least $1+ \frac{(2+c)}{\sqrt{k}}$, the phase transition for the number of samples required to achieve vanishing compression redundancy is precisely $\Theta(k^{2})$.
Empirical Estimation of Information Measures: A Literature Guide
• S. Verdú
• Computer Science, Mathematics
• Entropy
• 2019
While those quantities are of central importance in information theory, universal algorithms for their estimation are increasingly important in data science, machine learning, biology, neuroscience, economics, language, and other experimental sciences.
Complex image recognition algorithm based on immune random forest model
• Computer Science
• Soft Comput.
• 2020
A complex image recognition algorithm based on an immune random forest model is proposed, and the experimental results show that the proposed algorithm has high recognition efficiency and higher robustness.
Reliability Analysis of Concurrent Data based on Botnet Modeling
• Computer Science
• 2020 Fourth International Conference on Inventive Systems and Control (ICISC)
• 2020
Reliability analysis of concurrent data based on Botnet modeling shows acceptable performance, and the clustering variance method can effectively solve the difficulty of detecting botnets.
Visual Analysis and Mining of Knowledge Graph for Power Network Data Based on Natural Language Processing
• Computer Science
• 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC)
• 2020
Visual analysis and mining of a knowledge graph for power network data based on natural language processing is proposed in this study, and the experimental results demonstrate its effectiveness.
Simultaneous localization and mapping of medical burn areas based on binocular vision and capsule networks
• Computer Science
• Soft Comput.
• 2020
The paper proposes that binocular vision use a stereo matching algorithm to calculate the position deviation between two images, so as to obtain the 3D geometric information of the object.
Value and Strategy of Anime Elements in the Propaganda of COVID-19 Epidemic Situation based on Computer Vision Assisted Systems
• J. Deng
• Computer Science
• 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC)
• 2020
The value and strategy of anime elements in the smart propaganda of the COVID-19 epidemic situation based on computer vision assisted systems are analyzed, and computer vision and computer systems are combined to construct the intelligent propaganda strategy.
A City Monitoring System Based on Real-Time Communication Interaction Module and Intelligent Visual Information Collection System
• Computer Science
• Neural Processing Letters
• 2020
Real-time communication technology and computer vision acquisition technology are used to build a city monitoring system, and the experimental results show that this method has strong timeliness and a good monitoring effect.
Speech analysis software reuse technology based on architecture and construction
The proposed methodology gives novel understandings and solutions to the existing challenges and pays attention to whether each feature model fusion method can handle and how to deal with the inconsistent input.
Intelligent Crime Prevention and Control Big Data Analysis System Based on Imaging and Capsule Network Model
• Computer Science
• Neural Processing Letters
• 2020
A smart crime prevention and control big data analysis system based on machine Internet of Things and industrial object system is proposed, and the experimental results show that the proposed method has a higher data collection rate and crime prevention and control efficiency.

References

SHOWING 1-10 OF 56 REFERENCES
Maximum Likelihood Estimation of Functionals of Discrete Distributions
• Mathematics, Computer Science
• IEEE Transactions on Information Theory
• 2017
The worst case squared error risk incurred by the maximum likelihood estimator (MLE) in estimating the Shannon entropy is described and it is established that the MLE achieves the minimax optimal rate regardless of the alphabet size.
Estimating Learnability in the Sublinear Data Regime
• Computer Science, Mathematics
• NeurIPS
• 2018
It is often possible to accurately estimate this "learnability" even when given an amount of data that is too small to reliably learn any accurate model, as well as to establish that these sample complexities are optimal, to constant factors.
Variational Minimax Estimation of Discrete Distributions under KL Loss
In the sparse-data limit c → 0, it is found that the Dirichlet-Bayes (add-constant) estimator with parameter scaling like -c log(c) optimizes both the upper and lower bounds, suggesting an optimal choice of the "add-constant" parameter in this regime.
Estimating the unseen: an n/log(n)-sample estimator for entropy and support size, shown optimal via new CLTs
• Mathematics, Computer Science
• STOC '11
• 2011
We introduce a new approach to characterizing the unobserved portion of a distribution, which provides sublinear-sample estimators achieving arbitrarily small additive constant error for a class of…
Minimax Estimation of Functionals of Discrete Distributions
• Mathematics, Computer Science
• IEEE Transactions on Information Theory
• 2015
The minimax rate-optimal mutual information estimator yielded by the framework leads to significant performance boosts over the Chow-Liu algorithm in learning graphical models and the practical advantages of the schemes for the estimation of entropy and mutual information.
Minimax rate-optimal estimation of KL divergence between discrete distributions
• Computer Science, Mathematics
• 2016 International Symposium on Information Theory and Its Applications (ISITA)
• 2016
A minimax rate-optimal estimator is constructed which is adaptive in the sense that it does not require the knowledge of the support size nor the upper bound on the likelihood ratio, and the effective sample size enlargement phenomenon holds.
Risk bounds for statistical learning
• Mathematics
• 2007
We propose a general theorem providing upper bounds for the risk of an empirical risk minimizer (ERM). We essentially focus on the binary classification framework. We extend Tsybakov's analysis of the…
Estimating the Unseen
• Computer Science, Mathematics
• J. ACM
• 2017
This work can be seen as introducing a robust, general, and theoretically principled framework that, for many practical applications, essentially amplifies the sample size by a logarithmic factor; it is expected that it may be fruitfully used as a component within larger machine learning and statistical analysis systems.
Minimax Rates of Entropy Estimation on Large Alphabets via Best Polynomial Approximation
• Mathematics, Computer Science
• IEEE Transactions on Information Theory
• 2016
It is shown that the minimax mean-square error is within universal multiplicative constant factors of $\left(\frac{k}{n \log k}\right)^{2} + \frac{\log^{2} k}{n}$ if $n$ exceeds a constant factor of $k/\log k$; otherwise, there exists no consistent estimator.
The Power of Linear Estimators
• Mathematics, Computer Science
• 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science
• 2011
The main result is that for any property in this broad class of practically relevant distribution properties, there exists a near-optimal linear estimator, and a practical and polynomial-time algorithm for constructing such estimators for any given parameters.
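
Several of the references above concern estimating Shannon entropy and discrete distributions from samples. As a rough illustration only (not code from any of the listed papers), the plug-in maximum likelihood entropy estimate and the add-constant (Dirichlet-Bayes) distribution estimate they discuss might be sketched in Python as follows. The function names `plugin_entropy` and `add_constant_probs`, and the parameter name `beta`, are hypothetical choices made for this sketch.

```python
import math
from collections import Counter


def plugin_entropy(samples):
    """Plug-in (maximum likelihood) estimate of Shannon entropy, in nats.

    Replaces each unknown probability with its empirical frequency.
    """
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def add_constant_probs(samples, alphabet, beta):
    """Add-constant estimate of a distribution over a known finite alphabet.

    Each symbol receives probability (count + beta) / (n + beta * k),
    where k is the alphabet size. `beta` is an assumed tuning parameter.
    """
    n = len(samples)
    k = len(alphabet)
    counts = Counter(samples)
    return {a: (counts[a] + beta) / (n + beta * k) for a in alphabet}


if __name__ == "__main__":
    data = ["a", "a", "b", "b"]
    print(plugin_entropy(data))
    print(add_constant_probs(["a", "a", "b"], ["a", "b", "c"], 1.0))
```

The plug-in estimate ignores symbols that never appear in the sample, whereas the add-constant estimate assigns every symbol in the alphabet a nonzero probability; the choice of `beta` here is arbitrary and would need to be tuned for any real use.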