# technical reports

M. Opper mo216 at newton.cam.ac.uk
Mon Nov 17 04:39:30 EST 1997

Two new papers are now available:

Opper, M and D. Haussler (1997)
"Worst case prediction over sequences under log loss"
in {\em The Mathematics of Information Coding, Extraction
and Distribution}, Springer Verlag,
Edited by G. Cybenko, D. O'Leary and J. Rissanen.

ftp://ftp.cse.ucsc.edu/pub/ml/OHWCpaper.ps   (180K postscript)

Abstract:
We consider the game of sequentially assigning probabilities
to future data based on past observations under logarithmic loss.
We are not making probabilistic assumptions about the generation
of the data, but consider a situation where a
player tries to minimize his loss relative to the loss of
the (with hindsight) best distribution
from a target class for the worst sequence of data.
We give bounds on the minimax regret in terms of the
metric entropies of the target class with respect to
suitable distances between distributions.

D. Haussler and M. Opper (1997)
"Metric Entropy and Minimax Risk in Classification"
{\em Lecture Notes in Computer Science: Studies in Logic
and Computer Science, a selection of essays in honor of Andrzej Ehrenfeucht}
Vol. 1261, 212-235 (1997)
Eds. J. Mycielski, G. Rozenberg and A. Salomaa

ftp://ftp.cse.ucsc.edu/pub/ml/Andrzejpaper.ps (245k postscript)

Abstract:
We apply recent results on the minimax risk in density estimation to the
related problem of pattern classification.
The notion of loss we seek to minimize is an information theoretic
measure of how well we can predict the classification of future examples,
given the classification of previously seen examples. We give an asymptotic
characterization of the minimax risk in terms of the metric entropy
properties of the class of distributions that might be generating
the examples. We then use these results to characterize the minimax
risk in the special case of noisy two-valued classification problems in terms
of the Assouad density and the Vapnik-Chervonenkis dimension.