# Estimating the Fundamental Limits is Easier Than Achieving the Fundamental Limits

@article{Jiao2019EstimatingTF, title={Estimating the Fundamental Limits is Easier Than Achieving the Fundamental Limits}, author={Jiantao Jiao and Yanjun Han and Irena Fischer-Hwang and Tsachy Weissman}, journal={IEEE Transactions on Information Theory}, year={2019}, volume={65}, pages={6704-6715} }

We show through case studies that it is easier to estimate the fundamental limits of data processing than to construct the explicit algorithms to achieve those limits. Focusing on binary classification, data compression, and prediction under logarithmic loss, we show that in the finite space setting, when it is possible to construct an estimator of the limits with vanishing error with $n$ samples, it may require at least…

#### 11 Citations

Minimax Redundancy for Markov Chains with Large State Space

- Mathematics, Computer Science
- 2018 IEEE International Symposium on Information Theory (ISIT)
- 2018

It is shown that, for Markov sources whose relaxation time is at least $1+ \frac{(2+c)}{\sqrt{k}}$, the phase transition for the number of samples required to achieve vanishing compression redundancy is precisely $\Theta(k^{2})$.

Empirical Estimation of Information Measures: A Literature Guide

- Computer Science, Mathematics
- Entropy
- 2019

While those quantities are of central importance in information theory, universal algorithms for their estimation are increasingly important in data science, machine learning, biology, neuroscience, economics, language, and other experimental sciences.

Complex image recognition algorithm based on immune random forest model

- Computer Science
- Soft Comput.
- 2020

A complex image recognition algorithm based on immune random forest model is proposed and the experimental results show that the proposed algorithm has high recognition efficiency and higher robustness.

Reliability Analysis of Concurrent Data based on Botnet Modeling

- Computer Science
- 2020 Fourth International Conference on Inventive Systems and Control (ICISC)
- 2020

Reliability analysis of concurrent data based on Botnet modeling shows acceptable performance and the clustering variance method can effectively solve the difficulty of the detection of botnets.

Visual Analysis and Mining of Knowledge Graph for Power Network Data Based on Natural Language Processing

- Computer Science
- 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC)
- 2020

Visual analysis and mining of knowledge graph for power network data based on the natural language processing is proposed in this study and the experimental results have proven the effectiveness.

Simultaneous localization and mapping of medical burn areas based on binocular vision and capsule networks

- Computer Science
- Soft Comput.
- 2020

The paper proposes that binocular vision use a stereo matching algorithm to calculate the position deviation between two images, so as to obtain the 3D geometric information of the object.

Value and Strategy of Anime Elements in the Propaganda of COVID-19 Epidemic Situation based on Computer Vision Assisted Systems

- Computer Science
- 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC)
- 2020

The value and strategy of anime elements in the smart propaganda of the COVID-19 epidemic situation based on computer vision assisted systems are analyzed, and computer vision and computer systems are combined to construct the intelligent propaganda strategy.

A City Monitoring System Based on Real-Time Communication Interaction Module and Intelligent Visual Information Collection System

- Computer Science
- Neural Processing Letters
- 2020

Real-time communication technology and computer vision acquisition technology are used to build a city monitoring system and the experimental results show that this method has strong timeliness and good monitoring effect.

Speech analysis software reuse technology based on architecture and construction

- Computer Science
- 2021

The proposed methodology gives novel understandings and solutions to the existing challenges and pays attention to whether each feature model fusion method can handle and how to deal with the inconsistent input.

Intelligent Crime Prevention and Control Big Data Analysis System Based on Imaging and Capsule Network Model

- Computer Science
- Neural Processing Letters
- 2020

A smart crime prevention and control big data analysis system based on machine Internet of Things and industrial object system is proposed and the experimental results show that the proposed method has a higher data collection rate and crime prevention and control efficiency.

#### References

Maximum Likelihood Estimation of Functionals of Discrete Distributions

- Mathematics, Computer Science
- IEEE Transactions on Information Theory
- 2017

The worst case squared error risk incurred by the maximum likelihood estimator (MLE) in estimating the Shannon entropy is described and it is established that the MLE achieves the minimax optimal rate regardless of the alphabet size.
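
For orientation, the MLE of Shannon entropy mentioned in this summary is commonly computed as the plug-in estimate: the entropy of the empirical distribution. The following is a minimal sketch under that assumption; the function name `plugin_entropy` is hypothetical and is not taken from the cited paper.

```python
import math
from collections import Counter


def plugin_entropy(samples):
    """Plug-in (MLE) estimate of Shannon entropy, in nats.

    Assumption: `samples` is a non-empty sequence of hashable symbols
    drawn i.i.d. from a discrete distribution.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    # Empirical frequencies c / n stand in for the unknown probabilities.
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())
```

For example, `plugin_entropy("aabb")` evaluates the entropy of the empirical distribution (1/2, 1/2), which is log 2 nats.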

Estimating Learnability in the Sublinear Data Regime

- Computer Science, Mathematics
- NeurIPS
- 2018

It is often possible to accurately estimate this "learnability" even when given an amount of data that is too small to reliably learn any accurate model, as well as to establish that these sample complexities are optimal, to constant factors.

Variational Minimax Estimation of Discrete Distributions under KL Loss

- Computer Science, Mathematics
- NIPS
- 2004

In the sparse-data limit $c \to 0$, it is found that the Dirichlet-Bayes (add-constant) estimator with parameter scaling like $-c \log(c)$ optimizes both the upper and lower bounds, suggesting an optimal choice of the "add-constant" parameter in this regime.
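
As a rough illustration of the add-constant estimator named in this summary, the sketch below adds a fixed pseudo-count to every symbol before normalizing. The function name `add_constant_estimate` and the parameter name `beta` are hypothetical, and how `beta` should be chosen (for instance the scaling described above) is left to the caller.

```python
from collections import Counter


def add_constant_estimate(samples, alphabet, beta):
    """Add-constant (Dirichlet-Bayes) estimate of a discrete distribution.

    Assumptions: `alphabet` lists every possible symbol exactly once and
    `beta` > 0 is the pseudo-count added to each symbol.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    n = len(samples)
    k = len(alphabet)
    counts = Counter(samples)
    # Each symbol gets (count + beta) / (n + k * beta), so the result sums to 1.
    return {a: (counts[a] + beta) / (n + k * beta) for a in alphabet}
```

With `beta = 1` this reduces to Laplace smoothing: `add_constant_estimate("aab", "abc", 1.0)` assigns the unseen symbol `"c"` probability 1/6.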

Estimating the unseen: an n/log(n)-sample estimator for entropy and support size, shown optimal via new CLTs

- Mathematics, Computer Science
- STOC '11
- 2011

We introduce a new approach to characterizing the unobserved portion of a distribution, which provides sublinear-sample estimators achieving arbitrarily small additive constant error for a class of…

Minimax Estimation of Functionals of Discrete Distributions

- Mathematics, Computer Science
- IEEE Transactions on Information Theory
- 2015

The minimax rate-optimal mutual information estimator yielded by the framework leads to significant performance boosts over the Chow-Liu algorithm in learning graphical models and the practical advantages of the schemes for the estimation of entropy and mutual information.

Minimax rate-optimal estimation of KL divergence between discrete distributions

- Computer Science, Mathematics
- 2016 International Symposium on Information Theory and Its Applications (ISITA)
- 2016

A minimax rate-optimal estimator is constructed which is adaptive in the sense that it does not require the knowledge of the support size nor the upper bound on the likelihood ratio, and the effective sample size enlargement phenomenon holds.

Risk bounds for statistical learning

- Mathematics
- 2007

We propose a general theorem providing upper bounds for the risk of an empirical risk minimizer (ERM). We essentially focus on the binary classification framework. We extend Tsybakov's analysis of the…

Estimating the Unseen

- Computer Science, Mathematics
- J. ACM
- 2017

This work can be seen as introducing a robust, general, and theoretically principled framework that, for many practical applications, essentially amplifies the sample size by a logarithmic factor; it is expected that it may be fruitfully used as a component within larger machine learning and statistical analysis systems.

Minimax Rates of Entropy Estimation on Large Alphabets via Best Polynomial Approximation

- Mathematics, Computer Science
- IEEE Transactions on Information Theory
- 2016

It is shown that the minimax mean-square error is within universal multiplicative constant factors of $\left(\frac{k}{n \log k}\right)^{2} + \frac{\log^{2} k}{n}$ if $n$ exceeds a constant factor of $k/\log k$; otherwise, there exists no consistent estimator.

The Power of Linear Estimators

- Mathematics, Computer Science
- 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science
- 2011

The main result is that for any property in this broad class of practically relevant distribution properties, there exists a near-optimal linear estimator, and a practical and polynomial-time algorithm for constructing such estimators for any given parameters.