Helper function for prediction-powered inference for OLS estimation
Arguments
- X_l
(matrix): n x p matrix of covariates in the labeled data.
- Y_l
(vector): n-vector of labeled outcomes.
- f_l
(vector): n-vector of predictions in the labeled data.
- X_u
(matrix): N x p matrix of covariates in the unlabeled data.
- f_u
(vector): N-vector of predictions in the unlabeled data.
- w_l
(vector, optional): n-vector of sample weights for the labeled data. Defaults to a vector of ones.
- w_u
(vector, optional): N-vector of sample weights for the unlabeled data. Defaults to a vector of ones.
Value
(list): A list containing the following:
- est
(vector): vector of PPI OLS regression coefficient estimates.
- se
(vector): vector of standard errors of the coefficients.
- rectifier_est
(vector): vector of the rectifier OLS regression coefficient estimates.
Details
Prediction Powered Inference (Angelopoulos et al., 2023) https://www.science.org/doi/10.1126/science.adi6000
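The point estimate can be sketched in a few lines of base R. This is a minimal illustration of the estimator described in the paper, not the package's implementation. It assumes unit weights and takes the rectifier to be the OLS fit of f_l - Y_l on X_l (the sign convention is an assumption). The helper name ppi_ols_sketch and the simulated data are hypothetical, and standard errors are omitted.

```r
# Hypothetical helper: PPI OLS point estimate, unit weights assumed.
ppi_ols_sketch <- function(X_l, Y_l, f_l, X_u, f_u) {
  # OLS of the predictions on the covariates in the unlabeled data
  theta_f <- solve(crossprod(X_u), crossprod(X_u, f_u))
  # Rectifier: OLS of the prediction error (f - Y) in the labeled data
  rectifier <- solve(crossprod(X_l), crossprod(X_l, f_l - Y_l))
  # Correct the prediction-based fit by the estimated prediction bias
  list(est = theta_f - rectifier, rectifier_est = rectifier)
}

# Simulated data, for illustration only
set.seed(1)
n <- 200
N <- 2000
X_l <- cbind(1, rnorm(n))
X_u <- cbind(1, rnorm(N))
Y_l <- X_l %*% c(0.5, 1) + rnorm(n)
Y_u <- X_u %*% c(0.5, 1) + rnorm(N)      # unobserved in practice
f_l <- Y_l + 0.1 + rnorm(n, sd = 0.5)    # biased, noisy predictions
f_u <- Y_u + 0.1 + rnorm(N, sd = 0.5)

fit <- ppi_ols_sketch(X_l, Y_l, f_l, X_u, f_u)
fit$est
```

The rectifier absorbs the systematic prediction error estimated in the labeled data, so the corrected coefficients target the same quantity as an OLS fit on the true outcomes.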
Examples
dat <- simdat()
form <- Y - f ~ X1
X_l <- model.matrix(form, data = dat[dat$set == "labeled",])
Y_l <- dat[dat$set == "labeled", all.vars(form)[1]] |> matrix(ncol = 1)
f_l <- dat[dat$set == "labeled", all.vars(form)[2]] |> matrix(ncol = 1)
X_u <- model.matrix(form, data = dat[dat$set == "unlabeled",])
f_u <- dat[dat$set == "unlabeled", all.vars(form)[2]] |> matrix(ncol = 1)
ppi_ols(X_l, Y_l, f_l, X_u, f_u)
#> $est
#>                  [,1]
#> (Intercept) 0.5581467
#> X1          0.9968819
#>
#> $se
#> (Intercept)         X1
#>  0.09420872 0.09683602
#>
#> $rectifier_est
#>                   [,1]
#> (Intercept) 0.06308398
#> X1          0.13394343
#>
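The est and se components can be combined into confidence intervals. The sketch below assumes a normal approximation (a Wald interval), which is a common choice but is not something this page states. The numbers are copied from the example output above rather than recomputed.

```r
# Values copied from the example output above
est <- c("(Intercept)" = 0.5581467, X1 = 0.9968819)
se  <- c("(Intercept)" = 0.09420872, X1 = 0.09683602)

# 95% Wald interval; the normal approximation is an assumption
z <- qnorm(0.975)
ci <- cbind(lower = est - z * se, upper = est + z * se)
ci
```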