Takes a specification for a data generating process (dgp), a list of estimators, a config list, and a list of summary statistics to compute, and provides a get_results method for retrieving the simulation output.

Details

The core idea is that a statistical simulation study consists of three parts: a repeatable data generating process, some functions (estimators) to run on each generated data sample, and some summary statistics to compute from the simulation results (typically statistics that indicate how well the estimators perform). This is represented by the following pipeline:

  dgp
  estimators    ->  Simulation$new( ... ) -> sim$run() -> sim$get_results()
  config
  summary_fns
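
To make the pipeline concrete, here is a minimal base-R sketch of what one pass through it could look like. It does not use the package and is not its implementation. The helper `run_once` and the object `sketch_results` are hypothetical names, and the way `summary_fn` is called (with `i`, `est_results`, `data`) is an assumption based on the argument list documented for set_summary_stats().

```r
set.seed(1)

# Hypothetical stand-ins for the four inputs of the pipeline
dgp <- function(n) data.frame(x = rnorm(n))
estimators <- list(mean_estimator = function(data) mean(data$x))
config <- list(replications = 20, sample_size = 30)
summary_fn <- function(i, est_results, data) {
  data.frame(iteration = i, mean_est = est_results$mean_estimator)
}

# One replication: generate data, apply every estimator, summarise
run_once <- function(i) {
  data <- dgp(config$sample_size)
  est_results <- lapply(estimators, function(f) f(data))
  summary_fn(i, est_results, data)
}

# Stack one summary row per replication into a single data.frame
sketch_results <- do.call(rbind, lapply(seq_len(config$replications), run_once))
head(sketch_results)
```

Each replication contributes one row, so the stacked output has as many rows as `config$replications`.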

Public fields

dgp

A function that takes a single argument n for sample size and generates synthetic data.

estimators

A list of estimators, each of which can be called on the generated data.

config

A list containing at least the number of replications to perform, the sample_size to use, and whether or not to run in parallel.

summary_stats

A list of summary statistic functions that can be called on the estimates produced.

results

A data.frame of results from running the simulation

initialize

Method to initialize the simulation object (does nothing)

set_dgp

Method to set the data generating process

set_estimators

Method to set the estimators

set_config

Method to set the configuration

get_results

Method to retrieve results

set_summary_stats

Method to set summary statistics

run

Method to run the simulation

Methods


Method new()

Usage


Method set_dgp()

Usage

Simulation$set_dgp(dgp_func)

Arguments

dgp_func

A data generating process function (of one argument, n) that produces a simulated dataset of the given sample size.
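
As an illustration, a dgp with a known true parameter makes it possible to judge estimator performance later. The function `linear_dgp` below and its coefficients are hypothetical and chosen only for this sketch.

```r
# Hypothetical dgp: a linear model with intercept 1 and true slope 2
linear_dgp <- function(n) {
  x <- rnorm(n)
  data.frame(x = x, y = 1 + 2 * x + rnorm(n))
}

set.seed(42)
sample_data <- linear_dgp(25)  # one synthetic dataset with 25 rows
str(sample_data)
```

A function of this shape could be passed to set_dgp(), since it takes the sample size n as its only argument.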


Method set_estimators()

Usage

Simulation$set_estimators(estimator_list)

Arguments

estimator_list

A list of functions that can be evaluated on the data output from the data generating process self$dgp.
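
A small sketch of such a list, applied by hand to a toy dataset, may help. The estimator names `ols_slope` and `median_x` are hypothetical. Collecting the results in a named list mirrors how the Examples section accesses `est_results$mean_estimator`, but the `lapply()` call is an assumption about how the estimators are applied, not the package's code.

```r
# Hypothetical estimators, each a function of the generated data
estimator_list <- list(
  ols_slope = function(data) coef(lm(y ~ x, data = data))[["x"]],
  median_x  = function(data) median(data$x)
)

# Toy data where y is exactly 2 * x
toy_data <- data.frame(x = c(1, 2, 3, 4), y = c(2, 4, 6, 8))

# Apply every estimator; names of the list carry over to the results
est_results <- lapply(estimator_list, function(f) f(toy_data))
str(est_results)
```

Naming the list elements matters, because the names are what a summary function would use to pick out individual estimates.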


Method set_config()

Usage

Simulation$set_config(config_list)

Arguments

config_list

A list of configuration settings for the simulation. The following are used by Simulacron3::Simulation by default: replications (integer), sample_size, and parallel.
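
A configuration list with all three settings might look like the sketch below. Treating `parallel` as a logical flag is an assumption, since the type is not stated here, and the pre-flight check is a hypothetical convenience rather than something the package is known to do.

```r
config_list <- list(
  replications = 200L,   # integer number of replications
  sample_size  = 50,     # passed to the dgp as n
  parallel     = FALSE   # assumed to be a logical flag
)

# Hypothetical pre-flight check: confirm the documented settings are present
required_keys <- c("replications", "sample_size", "parallel")
missing_keys <- setdiff(required_keys, names(config_list))
if (length(missing_keys) > 0) {
  stop("Missing config entries: ", paste(missing_keys, collapse = ", "))
}
```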


Method set_summary_stats()

Usage

Simulation$set_summary_stats(summary_func)

Arguments

summary_func

The summary function to set for the simulation. summary_func should accept the arguments i, est_results, and data.
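
For illustration, a summary function with this argument list could return one data.frame row per replication. The sketch below assumes that `i` is the replication index, that `est_results` is a named list of estimates, and that the true mean of the hypothetical dgp is 0. None of these is confirmed by this page beyond the argument names.

```r
# Hypothetical summary function: one row per replication
summary_func <- function(i, est_results, data) {
  data.frame(
    iteration  = i,
    n          = nrow(data),
    mean_est   = est_results$mean_estimator,
    mean_error = est_results$mean_estimator - 0  # true mean assumed to be 0
  )
}

# Calling it by hand on made-up inputs to inspect the output shape
one_row <- summary_func(1, list(mean_estimator = 0.25), data.frame(x = c(0, 0.5)))
one_row
```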


Method run()

Usage

Simulation$run()


Method get_results()

Usage

Simulation$get_results()
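
Once results are retrieved, they can be condensed into performance measures such as bias and root mean squared error. The sketch below uses a made-up `results` data.frame with a `mean_est` column, as in the Examples section. The exact layout of the returned data.frame is an assumption, and `true_mean` is a hypothetical known parameter value.

```r
# Made-up stand-in for the data.frame returned by get_results()
results <- data.frame(mean_est = c(-0.1, 0.0, 0.1, 0.2))
true_mean <- 0  # hypothetical true parameter value

# Aggregate the per-replication estimates into performance measures
performance <- data.frame(
  bias         = mean(results$mean_est) - true_mean,
  empirical_sd = sd(results$mean_est),
  rmse         = sqrt(mean((results$mean_est - true_mean)^2))
)
performance
```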


Method clone()

The objects of this class are cloneable with this method.

Usage

Simulation$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

if (FALSE) { # \dontrun{
# Example Usage
# Define a data generating process
dgp <- function(n) data.frame(x = rnorm(n), y = rnorm(n))

# Define some estimators
estimators <- list(
  mean_estimator = function(data) mean(data$x),
  var_estimator = function(data) var(data$x)
)

# Define a summary statistics function
summary_func <- function(iter = NULL, est_results, data = NULL) {
  data.frame(
    mean_est = est_results$mean_estimator,
    var_est = est_results$var_estimator
  )
}

# Create a simulation object
sim <- Simulation$new()

# Set up the simulation
sim$set_dgp(dgp)
sim$set_estimators(estimators)
sim$set_config(list(replications = 500, sample_size = 50))
sim$set_summary_stats(summary_func)

# Run the simulation
sim$run()

# Retrieve results
results <- sim$get_results()
head(results)
} # }