Estimating the Fundamental Limits is Easier Than Achieving the Fundamental Limits

  • Jiantao Jiao, Yanjun Han, Irena Fischer-Hwang, Tsachy Weissman
  • IEEE Transactions on Information Theory
We show through case studies that it is easier to estimate the fundamental limits of data processing than to construct the explicit algorithms to achieve those limits. Focusing on binary classification, data compression, and prediction under logarithmic loss, we show that in the finite space setting, when it is possible to construct an estimator of the limits with vanishing error with $n$ samples, it may require at least…
Minimax Redundancy for Markov Chains with Large State Space
It is shown that, for Markov sources whose relaxation time is at least $1+ \frac{(2+c)}{\sqrt{k}}$, the phase transition for the number of samples required to achieve vanishing compression redundancy is precisely $\Theta(k^{2})$.
Empirical Estimation of Information Measures: A Literature Guide
  • S. Verdú
  • Computer Science, Mathematics
  • Entropy
  • 2019
While those quantities are of central importance in information theory, universal algorithms for their estimation are increasingly important in data science, machine learning, biology, neuroscience, economics, language, and other experimental sciences.
Complex image recognition algorithm based on immune random forest model
A complex image recognition algorithm based on an immune random forest model is proposed, and the experimental results show that the proposed algorithm has high recognition efficiency and higher robustness.
Reliability Analysis of Concurrent Data based on Botnet Modeling
Reliability analysis of concurrent data based on botnet modeling shows acceptable performance, and the clustering variance method can effectively address the difficulty of detecting botnets.
Visual Analysis and Mining of Knowledge Graph for Power Network Data Based on Natural Language Processing
Visual analysis and mining of a knowledge graph for power network data based on natural language processing is proposed in this study, and the experimental results demonstrate its effectiveness.
Simultaneous localization and mapping of medical burn areas based on binocular vision and capsule networks
The paper proposes that binocular vision use a stereo matching algorithm to calculate the position deviation between two images, so as to obtain the 3D geometric information of the object.
Value and Strategy of Anime Elements in the Propaganda of COVID-19 Epidemic Situation based on Computer Vision Assisted Systems
  • J. Deng
  • Computer Science
  • 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC)
  • 2020
The value and strategy of anime elements in the smart propaganda of the COVID-19 epidemic situation based on computer vision assisted systems are analyzed, and computer vision and computer systems are combined to construct the intelligent propaganda strategy.
A City Monitoring System Based on Real-Time Communication Interaction Module and Intelligent Visual Information Collection System
Real-time communication technology and computer vision acquisition technology are used to build a city monitoring system, and the experimental results show that this method has strong timeliness and a good monitoring effect.
Speech analysis software reuse technology based on architecture and construction
The proposed methodology gives novel understandings of and solutions to the existing challenges, and pays attention to whether each feature model fusion method can handle inconsistent input and how to deal with it.
Intelligent Crime Prevention and Control Big Data Analysis System Based on Imaging and Capsule Network Model
A smart crime prevention and control big data analysis system based on machine Internet of Things and industrial object system is proposed, and the experimental results show that the proposed method has a higher data collection rate and higher crime prevention and control efficiency.


Maximum Likelihood Estimation of Functionals of Discrete Distributions
The worst case squared error risk incurred by the maximum likelihood estimator (MLE) in estimating the Shannon entropy is described and it is established that the MLE achieves the minimax optimal rate regardless of the alphabet size.
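As a minimal sketch of the plug-in (maximum likelihood) entropy estimator this entry refers to, the Python snippet below evaluates the Shannon entropy of the empirical distribution. The function name `mle_entropy` and the sample data are illustrative assumptions and do not come from the paper.

```python
import math
from collections import Counter

def mle_entropy(samples):
    """Plug-in (MLE) estimate of Shannon entropy, in nats.

    Hypothetical helper: computes the entropy of the empirical
    distribution of `samples`; it is not code from the cited paper.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)  # empirical symbol counts
    # Entropy of the empirical distribution p_hat(x) = count(x) / n.
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# Usage: two symbols observed equally often give log(2) nats.
estimate = mle_entropy(["a", "b", "a", "b"])
```

The estimator only sees symbols that appear in the sample, which is why its behavior on large alphabets is a question of interest.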
Estimating Learnability in the Sublinear Data Regime
It is often possible to accurately estimate this "learnability" even when given an amount of data that is too small to reliably learn any accurate model, as well as to establish that these sample complexities are optimal, to constant factors.
Variational Minimax Estimation of Discrete Distributions under KL Loss
In the sparse-data limit c → 0, it is found that the Dirichlet-Bayes (add-constant) estimator with parameter scaling like -c log(c) optimizes both the upper and lower bounds, suggesting an optimal choice of the "add-constant" parameter in this regime.
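For orientation, an add-constant estimator can be sketched in Python as below. This is only an illustration: the function name `add_constant_estimate` and the argument `beta` are assumptions, `beta` is left as a free parameter, and the snippet does not implement the parameter scaling described in this entry.

```python
from collections import Counter

def add_constant_estimate(samples, alphabet, beta):
    """Add-constant estimate: (count(x) + beta) / (n + k * beta).

    Hypothetical helper; `beta` is a free smoothing parameter chosen
    by the caller, not the scaling rule from the cited paper.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    counts = Counter(samples)  # missing symbols count as 0
    n = len(samples)
    k = len(alphabet)
    return {x: (counts[x] + beta) / (n + k * beta) for x in alphabet}

# Usage: the unseen symbol "c" still receives positive probability.
est = add_constant_estimate(["a", "a", "b"], ["a", "b", "c"], beta=1.0)
```

Because every symbol gets positive mass, the KL divergence from the true distribution to the estimate stays finite, which is what makes this family a natural candidate under KL loss.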
Estimating the unseen: an n/log(n)-sample estimator for entropy and support size, shown optimal via new CLTs
We introduce a new approach to characterizing the unobserved portion of a distribution, which provides sublinear-sample estimators achieving arbitrarily small additive constant error for a class of…
Minimax Estimation of Functionals of Discrete Distributions
The minimax rate-optimal mutual information estimator yielded by the framework leads to significant performance boosts over the Chow-Liu algorithm in learning graphical models; the practical advantages of the schemes for the estimation of entropy and mutual information are also highlighted.
Minimax rate-optimal estimation of KL divergence between discrete distributions
A minimax rate-optimal estimator is constructed which is adaptive in the sense that it does not require the knowledge of the support size nor the upper bound on the likelihood ratio, and the effective sample size enlargement phenomenon holds.
Risk bounds for statistical learning
We propose a general theorem providing upper bounds for the risk of an empirical risk minimizer (ERM). We essentially focus on the binary classification framework. We extend Tsybakov's analysis of the…
Estimating the Unseen
This work can be seen as introducing a robust, general, and theoretically principled framework that, for many practical applications, essentially amplifies the sample size by a logarithmic factor; it is expected that it may be fruitfully used as a component within larger machine learning and statistical analysis systems.
Minimax Rates of Entropy Estimation on Large Alphabets via Best Polynomial Approximation
It is shown that the minimax mean-square error is within universal multiplicative constant factors of $\left(\frac{k}{n \log k}\right)^{2} + \frac{\log^{2} k}{n}$ if $n$ exceeds a constant factor of $\frac{k}{\log k}$; otherwise, there exists no consistent estimator.
The Power of Linear Estimators
  • G. Valiant, Paul Valiant
  • Mathematics, Computer Science
  • 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science
  • 2011
The main result is that for any property in this broad class of practically relevant distribution properties, there exists a near-optimal linear estimator, and a practical and polynomial-time algorithm for constructing such estimators for any given parameters.