Skip to contents

This article contains some advice for writing and constructing new learners.

Simple Learner Examples

lnr_lm <- function(data, formula, ...) {
  model <- stats::lm(formula = formula, data = data, ...)

  predict_from_trained_lm <- function(newdata) {
    predict(model, newdata = newdata, type = 'response')
  }
  return(predict_from_trained_lm)
}

Weights

We recommend explicitly handling weights as an argument to learners so that it is a protected argument. Some of the internals of different algorithms may vary, using other names for weights instead, so we recommend doing this to standardize the weights argument across different learner algorithms. As a concrete example, the ranger::ranger() function takes case.weights as its argument rather than weights.

A typical learner that supports weights might look like:

lnr_supportsWeights <- function(data, formula, weights = NULL, ...) {

  # train the model 
  model <- model_fit(data = data, formula = formula, weights = weights, ...)
  
  return(function(newdata) {
    predict(model, newdata = newdata)
  })
}

However, some model fitting procedures do not like passing weights = NULL and so it may be necessary to be careful not to pass the default weights = NULL to the model fitting procedure. As an example of this, we refer the curious reader to the source of lnr_glm in https://github.com/ctesta01/nadir/blob/main/R/learners.R.

model_training_arguments <- list(data = data, formula = formula)

# add weights they aren’t missing if (! is.null(weights) & length(weights) == nrow(data)) { model_training_arguments$weights <- weights }

Attributes

It’s recommended that if you create learners, that you also give them a couple attributes for a couple of reasons:

  • If a learner has a sl_lnr_name attribute, then this can be automatically used in the outputs if a name for the learner is left unspecified.
  • If a learner has a sl_lnr_type attribute, it will be checked against the output_type argument to super_learner().

To set these attributes, when making a new learner, one should run something along the lines of

lnr_myNewLearner <- function(data, formula, ...) {
  model <- # fit your learner given data, formula, ...
    
  predictor_fn <- function(newdata) {
    predict(model, newdata = newdata)
  }
  return(predictor)
}
attr(lnr_myNewLearner, 'sl_lnr_name') <- 'newLearnerName'
attr(lnr_myNewLearner, 'sl_lnr_type') <- 'continuous' # or c('continuous', 'binary') and similar
# see ?nadir_supported_types