
Conditional Density Estimation with Heteroskedasticity
Source: R/density_learners.R, lnr_heteroskedastic_density.Rd
TODO: The following code has a bug / statistical issue.
Usage
lnr_heteroskedastic_density(
data,
formula,
mean_lnr,
var_lnr,
mean_lnr_args = NULL,
var_lnr_args = NULL,
density_args = NULL
)
Arguments
- data
A dataframe to train a learner / learners on.
- formula
A regression formula to use inside this learner.
- mean_lnr
A learner (function) trained on the data with the given formula and then used to predict conditional means for the provided newdata.
- var_lnr
A learner (function) trained on the squared errors of the mean_lnr on the given data and then used to predict the expected variance of the outcome's density, which is centered around the predicted conditional mean in the output.
- mean_lnr_args
Extra arguments passed to the mean_lnr.
- var_lnr_args
Extra arguments passed to the var_lnr.
- density_args
Extra arguments passed to the kernel density smoother stats::density, especially bw for specifying the smoothing bandwidth. See ?stats::density.
Value
A closure (function) that produces density estimates at the supplied newdata according to the fitted model.
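For illustration, here is a minimal usage sketch. It is hypothetical: lnr_lm is a placeholder learner constructor, mtcars with mpg ~ hp + wt is stand-in data, and the returned closure's calling convention (a newdata argument) is assumed rather than documented above.

# Hypothetical learners: substitute whatever learner functions your workflow provides.
dens_fit <- lnr_heteroskedastic_density(
  data = mtcars,
  formula = mpg ~ hp + wt,
  mean_lnr = lnr_lm,                 # placeholder mean learner
  var_lnr = lnr_lm,                  # placeholder variance learner (fit to squared errors)
  density_args = list(bw = "SJ")     # passed through to stats::density()
)

# Assumed interface: the returned closure takes newdata and returns density estimates.
dens_fit(newdata = head(mtcars))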
Details
I think there are bugs here: a basic sanity check is that, if we fix the conditioning set (X), integrating the conditional probability density over the outcome should yield 1.
In numerical tests, when the density is adjusted for the predicted variance, integrating conditional densities yields values exceeding 1 (sometimes by a lot). I am fairly sure this poses a problem for optimizing negative log-likelihood loss.
Those numerical tests are displayed in the `Density-Estimation` article.
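A rough sketch of that sanity check, under the same hypothetical setup as the usage sketch above (lnr_lm is a placeholder learner; the closure is assumed to accept a newdata data frame containing the outcome column and to return the estimated density at each row's outcome value):

# Fit on illustrative data (placeholder learners, as above).
dens_fit <- lnr_heteroskedastic_density(
  data = mtcars, formula = mpg ~ hp + wt,
  mean_lnr = lnr_lm, var_lnr = lnr_lm
)

# Hold the conditioning set X fixed at one observation and sweep the outcome over a grid.
y_grid <- seq(0, 60, length.out = 500)
newdata <- data.frame(mpg = y_grid, hp = mtcars$hp[1], wt = mtcars$wt[1])
f_hat <- dens_fit(newdata = newdata)   # assumed: density evaluated at each row's mpg

# Trapezoidal rule: a correctly normalized conditional density should integrate to ~1.
sum(diff(y_grid) * (head(f_hat, -1) + tail(f_hat, -1)) / 2)

One possible explanation, offered only as a conjecture: if the closure evaluates the density of the standardized residual z = (y - mean(x)) / sd(x) at each y but omits the 1 / sd(x) change-of-variables factor, then the integral over y equals sd(x) rather than 1, which would produce integrals well above 1 wherever the predicted variance is large.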