The following learners are available for conditional density estimation:

  • lnr_lm_density

  • lnr_glm_density

  • lnr_homoskedastic_density

Details

There are a few important things to know about conditional density estimation in the nadir package.

Firstly, conditional density learners must produce prediction functions that predict _densities_ at the new outcome values given the new covariates.
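
To make this concrete, here is a hand-rolled sketch in base R of what such a prediction function does. This is illustrative only, not nadir's implementation; the lm fit, the mtcars data, and the predict_density name are all stand-ins:

    # Fit a conditional mean model and estimate the residual scale.
    fit <- lm(mpg ~ hp + wt, data = mtcars)
    sigma_hat <- sigma(fit)

    # The prediction function returns densities of the new outcome
    # values given the new covariates -- not predicted means.
    predict_density <- function(newdata) {
      dnorm(newdata$mpg, mean = predict(fit, newdata), sd = sigma_hat)
    }

    predict_density(mtcars[1:3, ])  # three density values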

Secondly, the implemented density estimators come in two flavors: those that make a strong assumption (conditional normality) and those with much weaker assumptions. The strong assumption is encoded into learners like lnr_lm_density and lnr_glm_density and says "after we model the predicted mean given covariates, we expect the remaining errors to be normally distributed." The more flexible learners produced by lnr_homoskedastic_density are similar in spirit, except that they fit a kernel density smoother (via stats::density) to the error distribution after predicting the conditional mean.
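
The distinction between the two flavors can be sketched as follows. Again, this is illustrative rather than the package's code; the names dens_normal and dens_kernel, and the mtcars-based mean model, are made up for this example:

    fit <- lm(mpg ~ hp + wt, data = mtcars)
    resid_hat <- residuals(fit)

    # Strong assumption (lnr_lm_density-style): errors are Gaussian
    # around the predicted mean.
    dens_normal <- function(y, mu) dnorm(y - mu, sd = sd(resid_hat))

    # Weaker assumption (lnr_homoskedastic_density-style): estimate the
    # error distribution with a kernel density smoother.
    kde <- stats::density(resid_hat)
    dens_kernel <- function(y, mu) {
      approx(kde$x, kde$y, xout = y - mu, yleft = 0, yright = 0)$y
    }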

One point worth calling attention to is that lnr_homoskedastic_density is a learner factory. That is, given a mean_lnr, lnr_homoskedastic_density produces a conditional density learner that uses that mean_lnr to model the conditional mean.
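
A usage sketch of the factory pattern, assuming lnr_lm is available in nadir as a mean learner (any mean learner could be passed as mean_lnr):

    library(nadir)

    # Produce a conditional density learner whose conditional mean is
    # modeled by lnr_lm and whose errors are smoothed with a kernel.
    lnr_lm_kde_density <- lnr_homoskedastic_density(mean_lnr = lnr_lm)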

Work is ongoing on a lnr_heteroskedastic_density learner that allows the variance of the conditional density to depend on covariates.

Conditional density learners should be combined with the negative log loss function when using super_learner() or compare_learners(). Refer to the 2003 Dudoit and van der Laan paper for a starting point on the appropriate loss functions to use for different types of outcomes: <https://biostats.bepress.com/ucbbiostat/paper130/>
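
For reference, negative log loss is the average of -log of the predicted densities at the observed outcomes, so assigning higher density to the truth yields lower loss. The helper below pins down the definition; it is a hypothetical function, not necessarily nadir's exported loss:

    # Illustrative definition of negative log loss over held-out points.
    negative_log_loss <- function(predicted_densities) {
      -mean(log(predicted_densities))
    }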

See also

learners