n <- 4
(X <- rt(n, df = 6))
[1] -0.08058779 0.28044078 1.19011050 -1.25212790
Let \(X_1, \dots, X_n \sim P_\theta\) i.i.d. Let \(\hat\theta = \hat\theta (X_1, \dots, X_n)\) be an estimator for \(\theta\).
One often wants to evaluate the variance \(\operatorname{Var} [\hat\theta]\) to quantify the uncertainty of \(\hat\theta\).
The bootstrap is a powerful, broadly applicable method for this purpose: it is nonparametric and can deal with small \(n\).
Now, if historical data \(X_1=(Y_1,Z_1),\ldots,X_n=(Y_n,Z_n)\) are available, then we can estimate \(\lambda_\text{opt}\) by \[ \hat{\lambda}_\text{opt} = \frac{\widehat{\text{Var}[Y]}-\widehat{\text{Cov}[Y,Z]}}{\widehat{\text{Var}[Y]}+\widehat{\text{Var}[Z]}-2\widehat{\text{Cov}[Y,Z]}} \] where
We generated 1000 samples from the population. The first three are
Here: \[ \widehat{\text{Std}[\hat{\lambda}_\text{opt}]} \approx .077, \qquad \bar{\lambda}_\text{opt} \approx .331 \ \ (\approx \lambda_\text{opt} = \frac13 = .333) \] and the distribution of \(\hat{\lambda}_\text{opt}\) is described by
(This could also be used to estimate quantiles of \(\hat{\lambda}_\text{opt}\).)
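The Monte Carlo approach above can be sketched in R as follows. This is only an illustration under stated assumptions: the population (independent \(Y \sim N(0,1)\) and \(Z \sim N(0,2)\), which gives \(\lambda_\text{opt} = \frac13\) under the formula above), the sample size `n = 50` and the helper name `lambda_hat` are hypothetical choices, not taken from the original example.

```r
# Plug-in estimator of lambda_opt from paired observations (y, z),
# following the formula given above.
lambda_hat <- function(y, z) {
  vy  <- var(y)
  vz  <- var(z)
  cyz <- cov(y, z)
  (vy - cyz) / (vy + vz - 2 * cyz)
}

set.seed(1)
n <- 50    # assumed sample size (hypothetical)
M <- 1000  # number of samples drawn from the population

# Assumed population (hypothetical): Y ~ N(0, 1) and Z ~ N(0, 2), independent,
# so that lambda_opt = 1 / (1 + 2) = 1/3.
lambda_hats <- replicate(M, {
  y <- rnorm(n, sd = 1)
  z <- rnorm(n, sd = sqrt(2))
  lambda_hat(y, z)
})

mean(lambda_hats)                       # Monte Carlo mean of the estimator
sd(lambda_hats)                         # Monte Carlo estimate of Std[lambda_hat]
quantile(lambda_hats, c(0.025, 0.975))  # Monte Carlo quantiles of lambda_hat
```

This requires sampling repeatedly from the population, which is not possible when only a single sample is available.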
This provides \[ \widehat{\text{Std}[\hat{\lambda}_\text{opt}]}^* \approx .079 \] and the distribution of \(\hat{\lambda}_\text{opt}\) is described by
(This could again be used to estimate quantiles of \(\hat{\lambda}_\text{opt}\).)
The Monte Carlo and bootstrap results are close: \(\widehat{\text{Std}[\hat{\lambda}_\text{opt}]}\approx 0.077\) and \(\widehat{\text{Std}[\hat{\lambda}_\text{opt}]}^*\approx 0.079\).
Above, each bootstrap sample \((X_1^{*b},\ldots,X_n^{*b})\) is obtained by sampling uniformly with replacement from the original sample \((X_1,\ldots,X_n)\).
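The resampling step can be sketched in R as follows. The observed sample is simulated here only to make the sketch self-contained; the population, `n = 50` and the helper name `lambda_hat` are hypothetical. Since each \(X_i = (Y_i, Z_i)\) is a pair, the pairs are resampled jointly by drawing indices.

```r
# Plug-in estimator of lambda_opt, following the formula given earlier.
lambda_hat <- function(y, z) {
  vy  <- var(y)
  vz  <- var(z)
  cyz <- cov(y, z)
  (vy - cyz) / (vy + vz - 2 * cyz)
}

set.seed(1)
n <- 50                      # assumed sample size (hypothetical)
y <- rnorm(n, sd = 1)        # hypothetical observed sample, Y-component
z <- rnorm(n, sd = sqrt(2))  # hypothetical observed sample, Z-component

B <- 1000                    # number of bootstrap samples
lambda_star <- replicate(B, {
  # Draw n indices uniformly with replacement: one bootstrap sample of pairs
  idx <- sample(n, size = n, replace = TRUE)
  lambda_hat(y[idx], z[idx])
})

sd(lambda_star)                         # bootstrap estimate of Std[lambda_hat]
quantile(lambda_star, c(0.025, 0.975))  # bootstrap quantiles of lambda_hat
```

Only the single observed sample is used here, in contrast with the Monte Carlo approach, which needs repeated draws from the population.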
Possible uses:
Possible uses when \(T\) is an estimator of \(\theta\):
Bootstrap estimates of \(\mathbb E [\bar{X}]\) and \(\text{Var}[\bar{X}]\) are then given by
The practical sessions will explore how well such estimates behave.
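As an illustration, a hand-written bootstrap for the sample mean might look as follows. It uses the usual definitions \(\frac1B\sum_{b=1}^B \bar{X}^{*b}\) for the mean and the empirical variance of \(\bar{X}^{*1},\ldots,\bar{X}^{*B}\) for the variance; the choice \(B = 1000\), the seed and the name `xbar_star` are assumptions made for this sketch.

```r
set.seed(1)
n <- 4
X <- rt(n, df = 6)  # original sample, generated as at the start of the section

B <- 1000           # number of bootstrap samples (assumed value)

# For each b, resample n values with replacement from X and compute the mean
xbar_star <- replicate(B, mean(sample(X, size = n, replace = TRUE)))

mean(xbar_star)  # bootstrap estimate of E[Xbar]
var(xbar_star)   # bootstrap estimate of Var[Xbar] (divides by B - 1)
```

Note that `var` divides by \(B-1\) rather than \(B\); for large \(B\) the difference is negligible.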
The boot function

A better strategy is to use the boot function.
The boot function typically takes 3 arguments:
data: the original sample
statistic: a user-defined function with the statistic to bootstrap
R: the number \(B\) of bootstrap samples to consider
If the statistic is the mean, then a suitable user-defined function is
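A minimal sketch of such a function, assuming that boot refers to `boot()` from the R package boot with its default settings, in which case the statistic receives the data and a vector of resampled indices. The name `boot.mean` is a hypothetical choice.

```r
# Statistic for boot(): first argument is the original data,
# second argument is the vector of indices defining a bootstrap sample.
boot.mean <- function(x, indices) {
  mean(x[indices])
}

# Quick check without boot(): the sample shown at the start of the section
X <- c(-0.08058779, 0.28044078, 1.19011050, -1.25212790)
boot.mean(X, seq_along(X))   # identity indices: same as mean(X)
boot.mean(X, c(1, 1, 2, 4))  # mean of one particular bootstrap sample
```

Passing indices rather than resampled values lets the same pattern work for multivariate data, where rows are resampled jointly.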
The bootstrap estimate of \(\text{Var}[\bar{X}]\) is then
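A minimal sketch of this computation, assuming the R package boot is installed and that the statistic is defined with the (data, indices) signature expected by `boot()` by default. The name `boot.mean`, the seed and \(B = 1000\) are assumptions made for this sketch.

```r
library(boot)

# User-defined statistic: mean of the bootstrap sample selected by 'indices'
boot.mean <- function(x, indices) {
  mean(x[indices])
}

set.seed(1)
X <- rt(4, df = 6)  # original sample, generated as at the start of the section

# data = original sample, statistic = user-defined function, R = B
res <- boot(data = X, statistic = boot.mean, R = 1000)

res$t0           # statistic on the original sample, i.e. mean(X)
var(res$t[, 1])  # bootstrap estimate of Var[Xbar]
```

The component `t` of the returned object stores the \(B\) bootstrap replicates of the statistic, one per row.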