---
title: A bigger simulation of bootstrap confidence interval variations
date: '2025-01-18'
categories:
- programming
- Rstats
- bootstrap
code-fold: show
code-tools: true
toc: true
bibliography: "references.bib"
csl: "../apa.csl"
---
In my [previous post](/posts/Bootstrap-CI-variations/), I demonstrated some brand-new additions to the [`simhelpers` package](https://meghapsimatrix.github.io/simhelpers/): utilities for calculating bootstrap confidence intervals. That post walked through validating the functions against the results of other packages and showing how they can be used to extrapolate confidence interval coverage rates using fewer bootstraps than one would want to use for a real data analysis.
In the course of illustrating the functions, I set up a small simulation study (with just one set of parameters) to show how the extrapolation technique works.
In this post, I'm going to expand these simulations across a bigger set of conditions in a multi-factor simulation.
This exercise is mostly an excuse to showcase some of the other useful features of `simhelpers`---especially the `bundle_sim()` function for creating simulation drivers and the `evaluate_by_row()` function for executing simulations across a grid of parameter values.
The goal of the simulation is to evaluate how the bootstrap CIs work for estimating the Pearson correlation from a bivariate $t$ data-generating process.
I will compare the performance of four different confidence intervals for the correlation coefficient: 1) the Fisher-z interval, which is derived assuming bivariate normality of the measurements; 2) a percentile bootstrap CI; 3) a studentized bootstrap CI; and 4) Efron's bias-corrected-and-accelerated bootstrap CI.
For the bootstrap intervals, I will use my implementation of the @boos2000MonteCarloEvaluation extrapolation technique (which [I described in much greater detail here](/posts/Simulating-bootstrap-CIs/)).
```{r setup}
library(tidyverse)
library(simhelpers)
```
# Data-generating process
As a more-or-less arbitrary data-generating process, I'll look at a bivariate central $t$ distribution with scale matrix $\boldsymbol\Sigma = \left[\begin{array}{cc} 1 \ \rho \\ \rho \ 1 \end{array}\right]$ and degrees of freedom $\nu$:
$$
\left(\begin{array}{c} X_1 \\ X_2 \end{array}\right) \sim t_{MV} \left( \left[\begin{array}{c} 0 \\ 0 \end{array}\right], \left[\begin{array}{cc} 1 \ \rho \\ \rho \ 1 \end{array}\right], \nu \right).
$$
The scale matrix of the multivariate $t$ is not exactly the same as its covariance, but $\rho$ nonetheless corresponds to the correlation between $X_1$ and $X_2$. For purposes of simulating data, I've fixed the means to zero and the diagonals of the scale matrix to 1 because they should not affect the distribution of the sample correlation.
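As a quick aside (my own sanity check, not part of the original analysis), simulating a large bivariate $t$ sample confirms that the correlation matches $\rho$ even though the covariance matrix is the scale matrix inflated by $\nu / (\nu - 2)$:

```{r check-t-correlation}
# Sanity check: for a bivariate t with scale matrix Sigma and df = nu,
# Cov(X) = nu / (nu - 2) * Sigma, so cor(X1, X2) is still rho.
library(mvtnorm)
set.seed(1)
rho <- 0.6
nu <- 8
Sigma <- rho + diag(1 - rho, nrow = 2)
X <- rmvt(n = 1e5, sigma = Sigma, df = nu)
cov(X) * (nu - 2) / nu   # approximately Sigma
cor(X[,1], X[,2])        # approximately rho
```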
Here's a data-generating function that returns randomly generated samples based on the bivariate $t$ model:
```{r DGP}
r_t_bi <- function(n, rho = 0, df = 8) {
  # 2x2 scale matrix with 1 on the diagonal and rho off the diagonal
  Sigma <- rho + diag(1 - rho, nrow = 2)
  mvtnorm::rmvt(n = n, sigma = Sigma, df = df)
}
dat <- r_t_bi(6, rho = 0.8, df = 10)
dat
```
# Estimation methods
Following along the same lines as my previous post, I will look at several different confidence intervals. First, I'll use an interval based on Fisher's z-transformation, which is both normalizing and variance-stabilizing (so that the standard error of the point estimator does not depend on the parameter) when the data come from a bivariate normal distribution. Let $r$ denote the usual sample correlation based on a sample of $n$ observations and let $z(r) = \frac{1}{2}\ln\left(\frac{1 + r}{1 - r}\right) = \tanh^{-1}(r)$ be the Fisher transformation function, with inverse $r(z) = \tanh(z) = \frac{e^{2z} - 1}{e^{2z} + 1}$. The $(1 - 2\alpha)$-level Fisher-z interval is given by
$$
\left[\tanh\left(z(r) - \frac{\Phi^{-1}(1 - \alpha)}{\sqrt{n - 3}}\right), \ \tanh\left(z(r) + \frac{\Phi^{-1}(1 - \alpha)}{\sqrt{n - 3}}\right) \right].
$$
This interval is based on a bivariate normality assumption that isn't actually consistent with the data-generating process, so I would not expect it to have the correct coverage unless the degrees of freedom are large enough that the bivariate $t$ approaches the bivariate normal.
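As a minimal standalone sketch (a hypothetical helper of my own, separate from the simulation functions used below), the interval above translates directly into a few lines of R:

```{r fisher-z-sketch}
# Direct translation of the Fisher-z interval formula (illustrative helper)
fisher_z_CI <- function(r, n, level = 0.95) {
  alpha <- (1 - level) / 2
  z <- atanh(r)                               # Fisher transformation
  half_width <- qnorm(1 - alpha) / sqrt(n - 3)
  tanh(z + c(-1, 1) * half_width)             # back-transform the endpoints
}
fisher_z_CI(r = 0.5, n = 30)
```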
In addition to this interval, I'll look at several different bootstrap intervals, including the usual percentile interval, a studentized interval, and a bias-corrected-and-accelerated interval.
The percentile interval is approximate, so it will not necessarily have exactly nominal coverage, but its coverage rate should approach the nominal level as the sample size gets bigger.
The BCa interval should have second-order accurate coverage [@efron1987better] and so one would expect its coverage rate to approach the nominal level more quickly than the percentile interval.
With all these intervals, I'll use the @boos2000MonteCarloEvaluation extrapolation technique to estimate the coverage level, for which I'll need to calculate and keep track of intervals based on $B = 49, 99, 199, 299$, and $399$ bootstrap replicates.
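Roughly speaking, the extrapolation works by treating a CI's coverage rate as approximately linear in $1/B$ and projecting out to a large target number of bootstraps. `simhelpers::extrapolate_coverage()` handles this properly; here is just a toy sketch of the idea using made-up coverage rates:

```{r extrapolation-toy}
# Toy illustration of the extrapolation idea (hypothetical coverage rates)
B_vals <- c(49, 99, 199, 299, 399)
cover_at_B <- c(0.905, 0.918, 0.926, 0.929, 0.931)  # made-up values for illustration
fit <- lm(cover_at_B ~ I(1 / B_vals))               # coverage treated as linear in 1/B
predict(fit, newdata = data.frame(B_vals = 1999))   # extrapolated coverage at B = 1999
```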
The BCa interval involves making small tweaks to the percentile interval based on an estimate of bias and an estimate of an acceleration constant that is a function of the empirical influence values.
I'm going to keep track of the bias correction estimate and the acceleration correction estimates, so that I can understand how noisy they are.
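For reference, the standard jackknife-based formulas for these two quantities (which correspond to the `w` and `a` values tracked by the function below) are
$$
\hat{z}_0 = \Phi^{-1}\left(\frac{1}{B} \sum_{b=1}^B I\left(r^*_b < r\right)\right), \qquad \hat{a} = \frac{\sum_{i=1}^n l_i^3}{6\left(\sum_{i=1}^n l_i^2\right)^{3/2}},
$$
where $r^*_1,\ldots,r^*_B$ are the bootstrap replicates of the correlation and $l_1,\ldots,l_n$ are the empirical influence values.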
Here's a function that calculates the point estimator $r$, the Fisher-z interval, the empirical influence values, and then the bootstrap CIs.
```{r cor-CI}
cor_CI <- function(
    dat,
    CI_type = c("percentile","student","BCa"),
    B_vals = c(49, 99, 199, 299, 399)
) {

  # point estimate and large-sample standard error
  r_est <- cor(dat[,1], dat[,2])
  N <- nrow(dat)
  SE_r <- sqrt((1 - r_est^2)^2 / (N - 1))

  # Fisher z CI
  z <- atanh(r_est)
  SE_z <- 1 / sqrt(N - 3)
  CI_z <- z + c(-1, 1) * qnorm(0.975) * SE_z

  # empirical influence values (jackknife), needed for the BCa interval
  if ("BCa" %in% CI_type) {
    jacks <- sapply(1:N, \(x) cor(dat[-x,1], dat[-x,2]))
    inf_vals <- r_est - jacks
  } else {
    inf_vals <- NULL
  }

  # bootstrap estimates (row 1) and standard errors (row 2)
  r_boot <- replicate(max(B_vals), {
    i <- sample(1:N, replace = TRUE, size = N)
    r <- cor(dat[i,1], dat[i,2])
    c(r, sqrt((1 - r^2)^2 / (N - 1)))
  })

  bs_CIs <- simhelpers::bootstrap_CIs(
    boot_est = r_boot[1,],
    boot_se = r_boot[2,],
    est = r_est,
    se = SE_r,
    influence = inf_vals,
    CI_type = CI_type,
    B_vals = B_vals,
    format = "wide-list"
  )

  tibble::tibble(
    r = r_est,
    SE = SE_r,
    lo = tanh(CI_z[1]),
    hi = tanh(CI_z[2]),
    w = qnorm(mean(r_boot[1,] < r_est)),              # bias-correction constant (bootstrap estimates only, not SEs)
    a = sum(inf_vals^3) / (6 * sum(inf_vals^2)^1.5),  # jackknife acceleration constant
    bs_CIs = bs_CIs
  )
}
res <- cor_CI(dat)
res
```
The `bundle_sim` function is a convenient way to combine the data-generating function and the estimation function into one.
It wraps the composition of both functions in a call to `replicate()`, so that I can repeat the steps of generating data and analyzing the data to my heart's content:
```{r bundle-sim-1}
sim_cor <- bundle_sim(
  f_generate = r_t_bi,
  f_analyze = cor_CI
)
```
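Conceptually, the bundled function behaves roughly like the following hand-rolled version (just a sketch of the idea; the actual `simhelpers` internals differ):

```{r sim-cor-manual}
#| eval: false
# Rough hand-rolled equivalent of what bundle_sim() produces (illustration only)
sim_cor_manual <- function(reps, n, rho = 0, df = 8, ...) {
  results <- replicate(reps, {
    dat <- r_t_bi(n = n, rho = rho, df = df)   # generate
    cor_CI(dat, ...)                           # analyze
  }, simplify = FALSE)
  dplyr::bind_rows(results)
}
```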
The result is one function, `sim_cor()`, with all of the arguments from the data-generating function and analysis function combined, plus an additional argument (called `reps`) to specify the number of times to repeat the whole thing:
```{r sim-cor-args}
args(sim_cor)
```
To illustrate how it works, I can call `sim_cor()` to generate five replications of the generate-and-analyze steps:
```{r sim-cor-demo-1}
five_reps <- sim_cor(5, n = 20, rho = 0.8, df = 8)
five_reps
```
# Performance measures
After generating many sets of results, I'll need to summarize across replications to evaluate the coverage rates of each set of confidence intervals. In my previous post, I did this with a bit of `dplyr` code.
Because I'm going to run the simulation across multiple conditions with different parameter values, it will be helpful to have a function so that I can re-use it later.
Here's such a function, which has basically the same content as the code chunk from my previous post:
```{r eval-performance}
eval_performance <- function(dat, rho = 0, B_target = 1999) {

  require(simhelpers, quietly = TRUE)

  dat |>
    dplyr::summarize(
      calc_absolute(estimates = r, true_param = rho, criteria = "bias"),
      calc_coverage(lower_bound = lo, upper_bound = hi, true_param = rho),
      extrapolate_coverage(
        CI_subsamples = bs_CIs, true_param = rho,
        B_target = B_target,
        nested = TRUE, format = "long"
      ),
      across(c(w, a), list(M = ~ mean(.x), SD = ~ sd(.x)))
    )
}
eval_performance(five_reps, rho = 0.8)
```
The function that I created earlier with `bundle_sim()` can be further enhanced by bundling in `eval_performance()` as a summary step.
I'll just revise my earlier `sim_cor()` to add it:
```{r bundle-sim-2}
sim_cor <- bundle_sim(
  f_generate = r_t_bi,
  f_analyze = cor_CI,
  f_summarize = eval_performance
)
```
The result is a "simulation driver" function, which takes model parameters as inputs, runs a full simulation, and returns a dataset with performance measures:
```{r sim-cor-demo-2}
sim_cor(
  reps = 10,
  n = 25,
  rho = 0.75,
  df = 8,
  CI_type = "BCa"
)
```
To examine the raw results without running the performance summaries, just set the `summarize` argument to `FALSE`:
```{r sim-cor-demo-3}
sim_cor(
  reps = 10,
  n = 25,
  rho = 0.75,
  df = 8,
  CI_type = "BCa",
  summarize = FALSE
)
```
# Parameters to examine
Now that I've got a simulation driver, I'm in position to run simulations across a wider set of conditions.
For multifactor simulations, I like to instantiate the full set of conditions in a dataset, which I then process to actually do the computation.
For these simulations, the model parameters are the true correlation, $\rho$, and the degrees of freedom, $\nu$;
the only design parameter is $n$, the sample size.
For $\rho$, I'll look at values between 0.0 and 0.8 just to cover most of the range of possibilities.
(I don't need to look at negative correlations because the statistics involved here are all symmetric with respect to the sign of $\rho$.)
For degrees of freedom, I'll look at a fairly small value of $\nu = 8$, an intermediate value of $\nu = 16$, and a larger value of $\nu = 32$, the last of which should be fairly close to a bivariate normal distribution.
For sake of complete overkill, I guess I'll also throw in $\nu = 48$, which should be almost identical to a bivariate normal distribution.
It's useful to evaluate such a condition because the Fisher-z interval is known to have very nearly exact coverage under bivariate normality, so we can validate the simulation set-up by verifying that the coverage levels are consistent with theory under that condition.
For the choice of sample sizes to examine, I'm interested in understanding how the coverage of the bootstrap intervals changes as sample size increases, so I will look at a fairly fine grid of values between a very small sample size of $n = 10$ and a more moderate sample size of $n = 100$.
Here's how I create a dataset of simulation conditions:
```{r param-grid}
params <- expand_grid(
  n = seq(10, 100, 10),
  rho = seq(0.0, 0.8, 0.2),
  df = c(8, 16, 32, 48)
)
params
```
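That works out to $10 \times 5 \times 4 = 200$ distinct conditions. With 4,000 replications per condition (as specified below), the full run involves 800,000 generate-and-analyze cycles:

```{r condition-count}
nrow(params)          # 10 sample sizes x 5 correlations x 4 df values = 200 conditions
nrow(params) * 4000   # total replications across the whole simulation
```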
# Execute
With all these components in place, it's time to start computing.
The task here is to run the `sim_cor()` simulation driver function on the set of parameters in each row of `params`. In the tidyverse idiom, one would normally do this with `purrr::pmap()` or one of its variants.
The `simhelpers` package includes a function `evaluate_by_row()` that accomplishes the same thing.
It's more or less a wrapper to `pmap()`, with a few little extra touches.
Instead of `purrr::pmap()`, it uses `furrr::future_pmap()` to enable parallel computing via the `future` package.
It also keeps track of total compute time and does a little bit of tidying up the simulation results after running everything.
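For intuition, a plain serial version using `purrr` would look roughly like the following sketch (which omits the parallelization, timing, and tidying that `evaluate_by_row()` adds):

```{r pmap-equivalent}
#| eval: false
# Serial sketch of what evaluate_by_row() accomplishes (illustration only)
res_list <- purrr::pmap(params, sim_cor, reps = 4000, B_target = 1999)
res_manual <- dplyr::bind_cols(params, dplyr::bind_rows(res_list))
```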
Here is how I execute the simulation with `evaluate_by_row()`:
```{r evaluate-by-row}
#| cache: true
# enable parallel computing
library(future)
library(furrr)
plan(multisession, workers = 8L)
set.seed(20250117)
res <- evaluate_by_row(
  params = params,
  sim_function = sim_cor,
  reps = 4000,
  CI_type = c("percentile", "student", "BCa"),
  B_vals = c(49, 99, 199, 299, 399),
  B_target = 1999,
  .options = furrr_options(seed = TRUE),
  .progress = TRUE
)
```
# Results
After a little further data-cleaning on the back end, we can see how the coverage rates of different intervals compare to each other across the full range of parameter values and sample sizes.
Here's a graph of the simulated coverage rates as a function of $n$:
```{r coverage-graph}
#| code-fold: true
#| lightbox: true
#| fig-width: 10
#| fig-height: 7
#| out-width: 100%
#| column: body-outset
Fisher_res <-
  res %>%
  select(n, rho, df, coverage, width) %>%
  mutate(CI_type = "Fisher-z")

boot_res <-
  res %>%
  select(n, rho, df, bootstraps, coverage = boot_coverage, width = boot_width) %>%
  unnest(c(bootstraps, coverage, width)) %>%
  filter(bootstraps == 1999) %>%
  select(-bootstraps)

CI_res <-
  bind_rows(Fisher_res, boot_res) %>%
  mutate(
    rho_lab = paste("rho ==", rho),
    df_lab = factor(df, levels = c(8,16,32,48), labels = paste("nu ==", c(8,16,32,48))),
    CI_type = factor(CI_type, levels = c("Fisher-z","percentile","student","BCa"))
  )

ggplot(CI_res) +
  aes(n, coverage, color = CI_type) +
  facet_grid(df_lab ~ rho_lab, labeller = "label_parsed") +
  scale_x_continuous(breaks = seq(20,100,20)) +
  scale_y_continuous(limits = c(0.87,1.0), breaks = seq(0.9,1.0,0.05), expand = expansion(0, 0)) +
  geom_hline(yintercept = 0.95, linetype = "dashed") +
  geom_point() + geom_line() +
  theme_minimal() +
  theme(legend.position = "top") +
  labs(x = "n", y = "Coverage rate", color = "")
```
A few observations here:
* For small degrees of freedom, the Fisher-z interval has below-nominal coverage and appears to degrade further as sample size increases. This makes sense because the interval is derived under a bivariate normal model that is not consistent with the actual data-generating process unless the degrees of freedom are large. With larger degrees of freedom $(\nu = 32)$, the Fisher-z interval has almost exactly nominal coverage.
* Across degrees of freedom and true correlations, the percentile interval has below-nominal coverage for the smaller sample sizes but appears to get steadily better as sample size increases.
* The studentized interval performs remarkably well, with coverage very close to nominal even at very small sample sizes.
* Rather curiously, the BCa interval pretty consistently has worse coverage than the (simpler) percentile interval, even for larger sample sizes. This is counter-intuitive to me because the BCa interval is supposed to have second-order accurate coverage [@efron1987better] whereas the percentile interval is only first-order accurate.
The upshot is that one would expect the coverage rate of the BCa interval to improve more quickly than that of the percentile interval, but that doesn't seem to be the case here.
For sake of completeness, here is a graph of the average width of each type of CI:
```{r width-graph}
#| code-fold: true
#| lightbox: true
#| fig-width: 10
#| fig-height: 7
#| out-width: 100%
#| column: body-outset
ggplot(CI_res) +
  aes(n, width, color = CI_type) +
  facet_grid(df_lab ~ rho_lab, labeller = "label_parsed") +
  scale_x_continuous(breaks = seq(20,100,20)) +
  scale_y_continuous(breaks = seq(0,0.8,0.2), expand = expansion(0, 0)) +
  expand_limits(y = 0) +
  coord_cartesian(ylim = c(0,1)) +
  geom_point() + geom_line() +
  theme_minimal() +
  theme(legend.position = "top") +
  labs(x = "n", y = "Average interval width", color = "")
```
At the smallest sample sizes, all of the intervals are quite wide and the studentized interval is extremely so (I've truncated the graph at a width of 1.0, so the studentized interval goes off the scale when $n = 10$). The other intervals are all pretty similar in width, and nearly identical once the degrees of freedom get large enough.
# Thoughts
I think the most intriguing thing about these simulation results is that the BCa interval generally seems to be _worse_ than a regular percentile interval---not the refinement that one might expect.
I'm not sure why that is.
It could have something to do with how the acceleration constant is calculated.
I've computed it with a jackknife approximation, but there are alternatives (such as using an infinitesimal jackknife, or just direct calculation if one is willing to impose a distributional assumption) which might work better.
There are also some other interval construction methods that I haven't bothered to look at here.
For instance, @hu2020Interval proposed two intervals based on non-parametric empirical likelihood methods.
They reported their own set of simulations to evaluate their proposed intervals against the Fisher-z interval, but they didn't look at bootstrap-based intervals.
Their work could be the basis for a range of extensions to the simulation reported here, such as:
* Adding their empirical likelihood intervals to the set of estimators examined here---how do they do compared to nonparametric bootstraps?
* Evaluating the Fisher-z and bootstrap intervals under the non-normal data-generating processes that @hu2020Interval examined. Does the performance of the bootstrap intervals change in meaningful ways under a bivariate exponential distribution?
The main point of this post isn't so much to get into the weeds of interval estimators for the correlation coefficient.
Rather, it was to demonstrate an approach to programming simulations of bootstrap methods and showcase some of the tools available in `simhelpers` that can help with this process.
The schematic I've followed for programming the simulations is discussed in much greater detail in [my book with Luke Miratrix](https://jepusto.github.io/Designing-Simulations-in-R/) on writing simulations.
```{r save}
#| include: false
saveRDS(res, "cor-bootstrap-CI.Rdata")
```
::: {.callout-note icon=false appearance="simple" title="Session Information" collapse=true}
```{r}
#| code-fold: false
sessioninfo::session_info()
```
:::