About: Introduced by Kiefer and Wolfowitz /cite{KW56}, the nonparametric maximum likelihood estimator (NPMLE) is a widely used methodology for learning mixture odels and empirical Bayes estimation. Sidestepping the non-convexity in mixture likelihood, the NPMLE estimates the mixing distribution by maximizing the total likelihood over the space of probability measures, which can be viewed as an extreme form of overparameterization. In this paper we discover a surprising property of the NPMLE solution. Consider, for example, a Gaussian mixture model on the real line with a subgaussian mixing distribution. Leveraging complex-analytic techniques, we show that with high probability the NPMLE based on a sample of size $n$ has $O(/log n)$ atoms (mass points), significantly improving the deterministic upper bound of $n$ due to Lindsay /cite{lindsay1983geometry1}. Notably, any such Gaussian mixture is statistically indistinguishable from a finite one with $O(/log n)$ components (and this is tight for certain mixtures). Thus, absent any explicit form of model selection, NPMLE automatically chooses the right model complexity, a property we term /emph{self-regularization}. Extensions to other exponential families are given. As a statistical application, we show that this structural property can be harnessed to bootstrap existing Hellinger risk bound of the (parametric) MLE for finite Gaussian mixtures to the NPMLE for general Gaussian mixtures, recovering a result of Zhang /cite{zhang2009generalized}.   Goto Sponge  NotDistinct  Permalink

An Entity of Type : fabio:Abstract, within Data Space : covidontheweb.inria.fr associated with source document(s)

AttributesValues
type
value
  • Introduced by Kiefer and Wolfowitz /cite{KW56}, the nonparametric maximum likelihood estimator (NPMLE) is a widely used methodology for learning mixture odels and empirical Bayes estimation. Sidestepping the non-convexity in mixture likelihood, the NPMLE estimates the mixing distribution by maximizing the total likelihood over the space of probability measures, which can be viewed as an extreme form of overparameterization. In this paper we discover a surprising property of the NPMLE solution. Consider, for example, a Gaussian mixture model on the real line with a subgaussian mixing distribution. Leveraging complex-analytic techniques, we show that with high probability the NPMLE based on a sample of size $n$ has $O(/log n)$ atoms (mass points), significantly improving the deterministic upper bound of $n$ due to Lindsay /cite{lindsay1983geometry1}. Notably, any such Gaussian mixture is statistically indistinguishable from a finite one with $O(/log n)$ components (and this is tight for certain mixtures). Thus, absent any explicit form of model selection, NPMLE automatically chooses the right model complexity, a property we term /emph{self-regularization}. Extensions to other exponential families are given. As a statistical application, we show that this structural property can be harnessed to bootstrap existing Hellinger risk bound of the (parametric) MLE for finite Gaussian mixtures to the NPMLE for general Gaussian mixtures, recovering a result of Zhang /cite{zhang2009generalized}.
subject
  • Statistical inference
  • Cluster analysis
  • Convex analysis
  • Machine learning
  • Mathematical and quantitative methods (economics)
  • Mathematical optimization
  • Probability distribution fitting
  • Convex optimization
  • Nonparametric statistics
  • M-estimators
  • Maximum likelihood estimation
  • Probabilistic models
  • Latent variable models
  • Robust statistics
  • Nonparametric Bayesian statistics
  • Experiment (probability theory)
  • Systems of probability distributions
part of
is abstract of
is hasSource of
Faceted Search & Find service v1.13.91 as of Mar 24 2020


Alternative Linked Data Documents: Sponger | ODE     Content Formats:       RDF       ODATA       Microdata      About   
This material is Open Knowledge   W3C Semantic Web Technology [RDF Data]
OpenLink Virtuoso version 07.20.3229 as of Jul 10 2020, on Linux (x86_64-pc-linux-gnu), Single-Server Edition (94 GB total memory)
Data on this page belongs to its respective rights holders.
Virtuoso Faceted Browser Copyright © 2009-2025 OpenLink Software