The details below are for those interested in how geex is organized. It is not necessary for using geex.
The design of geex
starts with the key to M-estimation, the estimating function:
\[ \psi(O_i, \theta) . \]
geex
composes \(\psi\) with two R functions: the “outer” estFUN
and the “inner” psiFUN
. In pseudocode, \(\psi(O_i, \theta) =\):
estFUN <- function(O_i){
psiFUN <- function(theta){
psi(O_i, theta)
}
return(psiFUN)
}
The reason for composing the \(\psi\) function in this way is that in order to do estimation (finding roots) and inference (computing the empirical sandwich variance estimator), \(\psi\) needs to be function of \(\theta\). M-estimation theory gives the following instructions:
With \(\hat{\theta}\) in hand, the quantity \(B_i\) is simple to compute. The computational challenges of M-estimation, then, are finding roots of \(G_m\) and calculating the derivative \(A_i\). By composing \(\psi\) of two functions in geex
, one can first do all the manipulations of \(O_i\) (data) that are independent of \(\theta\). In a sense, estFUN
“fixes” the data so that numerical routines only need deal with \(\theta\) in psiFUN
.
Before describing the mechanics of how geex
finding roots of \(G_m\) and computes derivatives of \(\psi\), let’s look at the m_estimation_basis
S4
object which forms the basis of all computations in geex
.
An m_estimation_basis
object, at a minimum needs two objects: an estFUN
and a data.frame
. Let’s use a simple estFUN
that estimates the mean and variance of Y1
in the geexex
dataset.
library(geex)
library(dplyr)
myee <- function(data){
Y1 <- data$Y1
function(theta){
c(Y1 - theta[1],
(Y1 - theta[1])^2 - theta[2])
}
}
Now we can create a basis:
And look at what this object contains:
## [1] ".data" ".units" ".weights" ".psiFUN_list"
## [5] ".GFUN" ".control" ".estFUN" ".outer_args"
## [9] ".inner_args"
Two slots are worth examining. First, .psiFUN_list
is a list
of function
s:
## $`1`
## function (theta)
## {
## c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2])
## }
## <environment: 0x7f9a64794a88>
##
## $`2`
## function (theta)
## {
## c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2])
## }
## <bytecode: 0x7f9a646142a8>
## <environment: 0x7f9a64782e08>
This object is essentially equivalent to:
From this list of functions, we can compute \(A_i\), and by summing across the list, form \(G_m\). The latter is found in:
## function (theta)
## {
## psii <- lapply(psi_list, function(psi) {
## do.call(psi, args = append(list(theta = theta), object@.inner_args))
## })
## compute_sum_of_list(psii, object@.weights)
## }
## <environment: 0x7f9a64b3d968>
Now that we have \(G_m\) as a function of theta
, we can found its roots using a root-finding algorithm such as rootSolve::multiroot
:
## $root
## [1] 5.044563 10.041239
##
## $f.root
## [1] -2.131628e-14 4.654055e-13
##
## $iter
## [1] 4
##
## $estim.precis
## [1] 2.433609e-13
Within geex
this is done with the estimate_GFUN_roots
function. To illustrate, I first need to update the .control
slot in mybasis
with starting values for multiroot
.
mycontrol <- new('geex_control', .root = setup_root_control(start = c(1, 1)))
mybasis@.control <- mycontrol
roots <- mybasis %>%
estimate_GFUN_roots()
roots
## $root
## [1] 5.044563 10.041239
##
## $f.root
## [1] -2.131628e-14 -2.238210e-13
##
## $iter
## [1] 4
##
## $estim.precis
## [1] 1.225686e-13
Note that is bad form to assign S4
slot with someS4object@aslot <- something
, but I do so here because I have not created a generic function for setting the .control
slot.
In the last section, we found \(\hat{\theta}\), which we now use to compute the \(A_i\) and \(B_i\) matrices.
geex
uses the numDeriv::jacobian
function to numerically evaluate derivatives. For example, \(A_1 = - (\partial \psi(O_1, \theta)/\partial \theta)|_{\theta = \hat{\theta}}\) for this example is:
## [,1] [,2]
## [1,] 1.000000 0
## [2,] -2.752514 1
geex
performs this operation for each \(i = 1, \dots, m\) to yield a list of \(A_i\) matrices. Then summing across this list yields \(A = \sum_i A_i\). The estimate_sandwich_matrices
function computes the list of \(A_i\), \(B_i\) and \(A\) and \(B\):
mats <- mybasis %>%
estimate_sandwich_matrices(.theta = roots$root)
# Compare to the numDeriv computation above
grab_bread_list(mats)[[1]]
## [,1] [,2]
## [1,] 1.000000 0
## [2,] -2.752514 1
Finally, computing \(\hat{\Sigma} = A^{-1} B (A^{-1})^{\intercal}\) is accomplished with the compute_sigma
function.
## [,1] [,2]
## [1,] 0.10041239 0.03667969
## [2,] 0.03667969 2.49219638
m_estimate
All of the operations described above are wrapped and packaged in the m_estimate
function:
## An object of class "geex"
## Slot "call":
## m_estimate(estFUN = myee, data = geexex, root_control = setup_root_control(start = c(0,
## 0)))
##
## Slot "basis":
## An object of class m_estimation_basis
## psi:
## {
## Y1 <- data$Y1
## function(theta) {
## c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2])
## }
## }
## Data:
## Y1 Y2 X1 Y3 W1 Z1 X2 Y4
## 1 3.6683066 2.0281718 4.949316 16.34576 4.823768 8.921782 0 0.09273926
## 2 10.4524548 1.6432966 7.851962 25.68742 7.884845 13.909474 0 1.01672736
## 3 3.1234106 2.8526264 4.729075 16.10831 4.709346 9.014695 0 0.49399039
## 4 8.3715025 2.5133652 2.564395 10.57997 2.786091 6.733378 0 1.24322433
## 5 -0.8319749 3.0182030 4.782347 16.46401 4.811590 9.290492 0 0.69520599
## 6 3.3987763 0.9785209 5.335713 18.32577 5.415370 10.322199 0 0.95220138
## Y5
## 1 1
## 2 1
## 3 0
## 4 0
## 5 1
## 6 1
## Units:
## Slot "rootFUN_results":
## $root
## [1] 5.044563 10.041239
##
## $f.root
## [1] -2.131628e-14 4.654055e-13
##
## $iter
## [1] 4
##
## $estim.precis
## [1] 2.433609e-13
##
##
## Slot "sandwich_components":
## An object of class sandwich_components
## A (bread matrix):
## [,1] [,2]
## [1,] 1.00000e+02 0
## [2,] -1.65139e-11 100
## B (meat matrix):
## [,1] [,2]
## [1,] 1004.1239 366.7969
## [2,] 366.7969 24921.9638
##
## Slot "GFUN":
## function ()
## NULL
## <bytecode: 0x7f9a5e800ba0>
##
## Slot "corrections":
## list()
##
## Slot "estimates":
## [1] 5.044563 10.041239
##
## Slot "vcov":
## [,1] [,2]
## [1,] 0.10041239 0.03667969
## [2,] 0.03667969 2.49219638