37 Propriedades dos estimadores de máxima verossimilhança e de momentos

37.1 Teorema (da invariância do EMV)

Seja \(\boldsymbol{X}_n\) a.a. de \(X\sim f_\theta, \theta \in \Theta.\) Considere \(g:\Theta \rightarrow \mathbb{R}^k\) uma função. Se existe o EMV para “\(\theta\)”, então \(g(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n))\) é o EMV para \(g(\theta)\).

37.1.1 Exemplo Bernoulli

\(X \sim \mathrm{Ber}(\theta),\theta \in (0,1)\). \(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n) = \bar{X}\).

Se \(g(\theta) = \theta^2\), então \(g(\bar{X}) = \bar{X}^2\) é o EMV para \(\theta^2\).

Se \(g(\theta) = 1-\theta\), então \(g(\bar{X}) = 1 - \bar{X}\) é o EMV para \(1-\theta\).

37.1.2 Exemplo Poisson

\(X \sim \mathrm{Pois}(\theta), \theta > 0\). \(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n) = \bar{X}\).

Se \(g(\theta) = P_\theta(X\geq 3)\), então \[ \begin{aligned} g(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)) &= P_{\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)}(X\geq 3) \\ &= 1 -\left(+\mathrm{e}^{-\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)} + \mathrm{e}^{- \hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)} \hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n) + \mathrm{e}^{-\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)} \frac{\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)^2}{2!}\right) \end{aligned} \]

Se \(g(\theta) = E_\theta(X^2)\), então \(g(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)) = E_{\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)}(X^2) = \hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n) + (\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n))^2\)

Se \(g(\theta) = \frac{\sqrt{\mathrm{Var}_\theta(X)}}{E_\theta(X)} = \frac{\sqrt{\theta}}{\theta} = \frac{1}{\sqrt{\theta}}\), então \(g(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)) = \frac{1}{\sqrt{\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)}}\) é o EMV para \(g(\theta)\).

37.2 Teorema (dos estimadores assintoticamente normais)

Dizemos que \(T(\boldsymbol{X}_n)\) é um estimador para “\(\theta\)” assintoticamente normal se, e somente se, existir \(\mathrm{V}_\theta\) não negativo tal que \[ \sqrt{n} (T(\boldsymbol{X}_n) - \theta) \stackrel{\mathcal{D}}{\rightarrow} N(0,\mathrm{V}_\theta), \forall \theta \in \Theta. \] ou seja, converge em distribuição para uma distribuição Normal de média \(0\) e variância \(\mathrm{V}_\theta\) para todo \(\theta\) no espaço paramétrico.

37.3 Teorema (do limite central para o EMV)

Seja \(\boldsymbol{X}_n\) a.a. de \(X\sim f_\theta\) tal que as condições de regularidade \(C_1:C_9\) estejam satisfeitas.

Portanto, com \(g\) diferenciável com derivada contínua tal que \(g'(\theta) \neq 0\), \[ \sqrt{n}(g(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)) - g(\theta)) \stackrel{\mathcal{D}}{\rightarrow} N(0, g'(\theta)^2 I_1(\theta)^{-1}), \forall \theta \in \Theta \]

Logo, pelo teorema de Slutsky \[ \sqrt{n}I_1(\hat{\theta}_{\mathrm{MV}})\frac{(g(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)) - g(\theta))}{\sqrt{g'(\hat{\theta}_{\mathrm{MV}})^2}} \stackrel{\mathcal{D}}{\rightarrow} N(0, 1), \forall \theta \in \Theta \]

EMVs são assintoticamente eficientes

Note que \[ \begin{aligned} E_\theta(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)) &\stackrel{n\uparrow \infty}{\rightarrow} \theta \\ n \mathrm{Var}_\theta(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)) &\stackrel{n\uparrow \infty}{\rightarrow} I_1(\theta)^{-1} \end{aligned} \] Ou seja, estimadores de máxima verossimilhança são assintoticamente eficientes sob as condições de regularidade.

A variância assintótica do estimador de máxima verossimilhança é denotada por \[ \mathrm{Var}_{\theta}^{(a)} (\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)) = \frac{I_1(\theta)^{-1}}{n} \]

Notação

\[ \begin{cases} \hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n) &\stackrel{a}{\approx} N(\theta, I_n(\theta)) \\ g(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)) &\stackrel{a}{\approx} N(g(\theta), g'(\theta)^2I_n(\theta)^{-1}) \\ \end{cases} \]

Podemos calcular probabilidades aproximadas mesmo sem saber a distribuição exata, para \(\hat{\theta}_{\mathrm{MV}}\)

37.3.1 Exemplo

Se \(\theta = 3\), então calcule:

\[ \begin{aligned} P_{\theta}(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n)\leq t) &= P_{\theta}(\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n) - \theta \leq t - \theta) \\ &= P_{\theta}\left(\frac{\hat{\theta}_{\mathrm{MV}}(\boldsymbol{X}_n) - \theta}{\sqrt{I_n(\theta)^{-1}}} \leq \frac{t - \theta}{\sqrt{I_n(\theta)^{-1}}}\right) \\ &\stackrel{n\ \text{grande}}{\approx} P_{\theta}(N(0,1) \leq I_n(\theta)^{\frac{1}{2}}(t-\theta)) \\ &= P_{\theta}(N(0,1) \leq I_n(\theta)^{\frac{1}{2}}(t-3)) \end{aligned} \]

37.4 Teorema (do limite central para o estimador do método de momentos)

Seja \(\boldsymbol{X}_n\) a.a. de \(X\) tal que \(\Theta \subseteq \mathbb{R}\) e \(E_\theta(|X|^k) < \infty\) para algum \(k \in \mathbb{N} \setminus \{0\}\). Então, o estimador \(\hat{\theta}_{\mathrm{MM}}(\boldsymbol{X}_n)\) tal que

\[ E_{\hat{\theta}_{\mathrm{MM}}(\boldsymbol{X}_n)}(X^k) = \frac{1}{n} \sum X_i^k \]

é assintoticamente normal se as seguintes condições estiverem satisfeitas:

\(E_\theta[(X^k - E_\theta(X^k))^2] < \infty\)
\(h(\theta) = \frac{\partial E_\theta(X^k)}{\partial \theta} \neq 0, \forall \theta \in \Theta.\)

Com isso, \[ \sqrt{n}(\hat{\theta}_{\mathrm{MM}}(\boldsymbol{X}_n) - \theta) \stackrel{\mathcal{D}}{\rightarrow} N(0, V_\theta) \] em que \[ V_\theta = \frac{E_\theta[(X^k - E_\theta(X^k))^2]}{\left(\frac{\partial E_\theta(X^k)}{\partial \theta}\right)^2} = \frac{\mathrm{Var}_\theta (X^k)}{h(\theta)^2} \]

Além disso, se \(g : \Theta \rightarrow \mathbb{R}\) for diferenciável com derivada contínua tal que \(g'(\theta) \neq 0\), então \[ \sqrt{n}(g(\hat{\theta}_{\mathrm{MM}}(\boldsymbol{X}_n) - \theta) - g(\theta) \stackrel{\mathcal{D}}{\rightarrow} N\left(0,g'(\theta)^2 \frac{\mathrm{Var}_\theta (X^k)}{h(\theta)^2}\right) \]

Notação

\[ \begin{cases} \hat{\theta}_{\mathrm{MM}}(\boldsymbol{X}_n) &\stackrel{a}{\approx} N\left(\theta,\frac{\mathrm{Var}_\theta (X^k)}{n h(\theta)^2}\right) \\ g(\hat{\theta}_{\mathrm{MM}}(\boldsymbol{X}_n)) &\stackrel{a}{\approx} N\left(g(\theta), g'(\theta)^2 \frac{\mathrm{Var}_\theta (X^k)}{n h(\theta)^2}\right) \\ \end{cases} \]

37.5 Algoritmos

Seja \(X\sim \mathrm{Beta}(\theta, 1)\).

using Distributions, Random, Plots, StatsPlots, StatsBase, LaTeXStrings


Random.seed!(10)
n = 50
M = 10_000

MVs = zeros(M)
MMs = zeros(M)
MV(obs) = -n/sum(log.(obs))
MM(obs) = mean(obs)/(1-mean(obs))
theta0 = rand(Uniform(0,100))
dsim = Beta(theta0, 1)
for i in 1:M
  xsim = rand(dsim, n)
  MVs[i] = MV(xsim)
  MMs[i] = MM(xsim)
end


VMV(theta) = theta^2/n
normMV = Normal(theta0, sqrt(VMV(theta0)))
VMM(theta) = theta*(theta+1)^2/((theta+2)*n)
normMM = Normal(theta0, sqrt(VMM(theta0)))

histmv = histogram(MVs, normalize=:pdf, label="", color=:blue,
                   title="Histograma EMVs")
plot!(normMV, label="Aprox. Norm.", color=:green)
vline!([theta0], label=L"\theta", color=:orange)
histmm = histogram(MMs, normalize=:pdf, label="", color=:tomato,
                   title="Histograma EMMs")
plot!(normMM, label="Aprox. Norm.", color=:blue)
vline!([theta0], label=L"\theta", color=:lightgrey)
p = plot(histmv, histmm)

display(p)
qMM = quantile(normMM, 0.25)
qMV = quantile(normMV, 0.25)
println("theta0: $theta0")
println("Viés simulado MM: $(mean(MMs)-theta0)")
println("EQM simulado MM: $(n *mean((MMs .- theta0).^2))")
println("Viés simulado MV: $(mean(MVs) - theta0)")
println("EQM simulado MV: $(n * mean((MVs .- theta0).^2))")
println("P(NormalMM > $(qMM)) = $(round(ccdf(normMM, qMM), digits=2))")
println("P(NormalMV > $(qMV)) = $(round(ccdf(normMM, qMV),digits=2))")
println("Real MM > 2.5 = $(mean(MMs .> qMM))")
println("Real MV > 2.5 = $(mean(MVs .> qMV))")

theta0: 30.104498710537307
Viés simulado MM: 0.5943591309955814
EQM simulado MM: 968.056873503618
Viés simulado MV: 0.612391481120099
EQM simulado MV: 968.182881983385
P(NormalMM > 27.2314280176146) = 0.75
P(NormalMV > 27.23291320813612) = 0.75
Real MM > 2.5 = 0.783
Real MV > 2.5 = 0.7843