Optimal entropy estimation on large alphabets via best polynomial approximation
Abstract
Consider the problem of estimating the Shannon entropy of a distribution on $k$ elements from $n$ independent samples. We show that the minimax mean-square error is within universal multiplicative constant factors of $\left( \frac{k}{n \log n} \right)^{2} + \frac{\log^2 k}{n}$. This implies the recent result of Valiant--Valiant [1] that the minimal sample size for consistent entropy estimation scales as $\Theta\left( \frac{k}{\log k} \right)$. The apparatus of best polynomial approximation plays a key role in both the minimax lower bound and the construction of optimal estimators.
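The achievability side of such results typically combines a bias-corrected plug-in rule for frequently observed symbols with an unbiased estimate of a polynomial approximation of $-x \log x$ for rarely observed ones, with both the polynomial degree and the count threshold growing like $\log n$. The following Python sketch illustrates that two-regime scheme under stated assumptions: the constants `c0`, `c1`, the Chebyshev-interpolation proxy for the true minimax (Remez) approximation, and the function name are illustrative choices, not the paper's exact construction.

```python
import numpy as np

def poly_entropy_estimate(counts, n, c0=1.0, c1=1.0):
    """Illustrative two-regime entropy estimator (nats).

    counts: array of symbol counts summing to n.
    Assumption: c0, c1 are illustrative tuning constants, not the
    paper's prescribed values.
    """
    L = int(np.floor(c0 * np.log(n)))   # polynomial degree ~ log n
    thresh = c1 * np.log(n)             # count threshold ~ log n
    b = 2 * thresh / n                  # approximation interval [0, b]

    # Proxy for the best degree-L uniform approximation of -x log x on
    # [0, b]: interpolate at Chebyshev nodes (the exact minimax
    # polynomial would require the Remez algorithm).
    t = np.cos(np.pi * (np.arange(L + 1) + 0.5) / (L + 1))  # nodes on [-1, 1]
    xs = (t + 1) / 2 * b                                    # mapped to [0, b]
    ys = np.where(xs > 0, -xs * np.log(xs), 0.0)
    coeffs = np.polynomial.polynomial.polyfit(xs, ys, L)    # a_0, ..., a_L

    H = 0.0
    for N in counts:
        if N > thresh:
            # Large counts: plug-in with a first-order (Miller-Madow
            # style) bias correction.
            p = N / n
            H += -p * np.log(p) + 1 / (2 * n)
        else:
            # Small counts: unbiased estimate of sum_m a_m p^m, using
            # E[N(N-1)...(N-m+1)] = n(n-1)...(n-m+1) p^m for
            # N ~ Binomial(n, p); the falling factorial vanishes
            # automatically once m exceeds N.
            est = coeffs[0]
            num, den = 1.0, 1.0
            for m in range(1, L + 1):
                num *= (N - m + 1)
                den *= (n - m + 1)
                est += coeffs[m] * num / den
            H += est
    return H

# Example: estimate the entropy of a uniform distribution on k symbols.
rng = np.random.default_rng(0)
k, n = 10_000, 5_000
samples = rng.integers(0, k, size=n)
counts = np.bincount(samples, minlength=k)
print(poly_entropy_estimate(counts, n), "vs true", np.log(k))
```

The small-count branch is what the lower bound is matched against: the degree-$\Theta(\log n)$ approximation error of $-x \log x$ near zero is exactly the quantity that best polynomial approximation controls, which is why the same apparatus appears in both directions of the minimax bound.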