
Conditional Density Estimation in the {nadir}
Package
Source: R/density_learners.R
density_learners.Rd
The following learners are available for conditional density estimation:
lnr_lm_density
lnr_glm_density
lnr_homoskedastic_density
Details
There are a few important things to know about conditional density estimation in the nadir package.
Firstly, conditional density learners must produce prediction functions that predict _densities_ at the new outcome values given the new covariates.
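To make this contract concrete, here is a minimal sketch (illustrative only, not nadir's actual API; the names `predict_density` and `sigma_hat` are hypothetical) of a prediction function that, under a conditional-normality assumption, returns densities at new outcome values given new covariates:

```r
# Fit a mean model and record the residual standard deviation
fit <- lm(mpg ~ wt + hp, data = mtcars)
sigma_hat <- sigma(fit)

# A density-style prediction function: given new covariates and new
# outcome values, return the estimated conditional density at those outcomes
predict_density <- function(newdata, new_y) {
  mu <- predict(fit, newdata = newdata)
  dnorm(new_y, mean = mu, sd = sigma_hat)
}

dens <- predict_density(mtcars[1:3, ], mtcars$mpg[1:3])
```

Note that the function returns density values (nonnegative, not probabilities summing to one), one per row of `newdata`.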
Secondly, the implemented density estimators come in two flavors: those with a strong assumption (conditional normality) and those with much weaker assumptions. The strong assumption is encoded into learners like lnr_lm_density and lnr_glm_density, and says "after we model the predicted mean given covariates, we expect the remaining errors to be normally distributed." The more flexible learners produced by lnr_homoskedastic_density are similar in spirit, except that they fit a stats::density kernel smoother to the error distribution (after predicting the conditional mean).
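The weaker-assumption flavor can be sketched as follows (a minimal illustration of the technique, not nadir's internals): fit a mean model, smooth its residuals with stats::density, and evaluate that kernel estimate at the new residuals.

```r
# Fit the mean model, then kernel-smooth the residual distribution
fit <- lm(mpg ~ wt + hp, data = mtcars)
resid_dens <- density(residuals(fit))

# Evaluate the conditional density at new (covariate, outcome) pairs by
# interpolating the kernel estimate at the implied new residuals
predict_density <- function(newdata, new_y) {
  new_resid <- new_y - predict(fit, newdata = newdata)
  approx(resid_dens$x, resid_dens$y, xout = new_resid,
         yleft = 0, yright = 0)$y
}

dens <- predict_density(mtcars[1:3, ], mtcars$mpg[1:3])
```

Because the error distribution is estimated once and reused for all covariate values, this remains a homoskedastic model: only the conditional mean shifts with covariates.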
A related point worth calling attention to is that lnr_homoskedastic_density is a learner factory. That is to say, given a mean_lnr, lnr_homoskedastic_density produces a conditional density learner that uses that mean_lnr to model the conditional mean.
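The learner-factory pattern can be sketched in plain R as follows (the factory name `make_density_learner` and the use of a bare fitting function as the mean learner are illustrative assumptions, not nadir's API):

```r
# A learner factory: takes a mean-fitting function and returns a
# conditional density learner built around it
make_density_learner <- function(mean_fitter) {
  function(formula, data) {
    fit <- mean_fitter(formula, data)              # fit the conditional mean
    resid_dens <- density(residuals(fit))          # smooth the residuals
    # the fitted learner is itself a density prediction function
    function(newdata, new_y) {
      new_resid <- new_y - predict(fit, newdata = newdata)
      approx(resid_dens$x, resid_dens$y, xout = new_resid,
             yleft = 0, yright = 0)$y
    }
  }
}

# Plugging in lm as the mean learner yields an lm-based density learner
lm_density_learner <- make_density_learner(lm)
predict_density <- lm_density_learner(mpg ~ wt, mtcars)
dens <- predict_density(mtcars[1:3, ], mtcars$mpg[1:3])
```

The factory closes over the supplied mean learner, so swapping in a different mean fitter yields a different density learner without rewriting the density-estimation logic.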
Work is ongoing on implementing a lnr_heteroskedastic_density learner that allows the variance of the conditional density to be higher or lower depending on covariates.
Conditional density learners should be combined with the negative log loss function when using super_learner() or compare_learners().
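Negative log loss rewards learners that assign high density to the outcomes actually observed; a minimal sketch (illustrative helper name, not nadir's implementation):

```r
# Negative log loss for density predictions: the mean of -log(density)
# over held-out observations; lower is better
neg_log_loss <- function(predicted_densities) {
  mean(-log(predicted_densities))
}

# e.g., densities assigned to held-out outcomes by two candidate learners
loss_a <- neg_log_loss(c(0.30, 0.25, 0.40))
loss_b <- neg_log_loss(c(0.10, 0.05, 0.20))
# learner A assigns higher density to the observed outcomes, so its loss
# is smaller
```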
Refer to the 2003 Dudoit and van der Laan paper for a starting place on the appropriate loss functions to use for different types of outcomes: <https://biostats.bepress.com/ucbbiostat/paper130/>