The multivariate delta method

Categories: delta method, distribution theory

Author: James E. Pustejovsky

Published: April 11, 2018

The delta method is surely one of the most useful techniques in classical statistical theory. It’s perhaps a bit odd to put it this way, but I would say that the delta method is something like the precursor to the bootstrap, in terms of its utility and broad range of applications: both are “first-line” tools for solving statistical problems. There are many good references on the delta method, ranging from the Wikipedia page to a short introduction in The American Statistician (Oehlert, 1992). Many statistical theory textbooks also include a longer or shorter discussion of the method (e.g., Stuart & Ord, 1996; Casella & Berger, 2002).

I use the delta method all the time in my work, especially to derive approximations to the sampling variance of some estimator (or covariance between two estimators). Here I’ll give one formulation of the multivariate delta method that I find particularly useful for this purpose. (This is nothing at all original. I’m only posting it on the off chance that others might find my crib notes helpful—and by “others” I mostly mean myself in six months…)

Multivariate delta method covariances

Suppose that we have a $p$-dimensional vector of statistics $\mathbf{T} = \left(T_1,...,T_p\right)$ that converges in probability to the parameter vector $\boldsymbol\theta = \left(\theta_1,...,\theta_p\right)$ and has asymptotic covariance matrix $\boldsymbol\Sigma / n$, i.e.,

$$\sqrt{n}\left(\mathbf{T} - \boldsymbol\theta\right) \stackrel{D}{\rightarrow} N\left(\mathbf{0}, \boldsymbol\Sigma\right).$$

Now consider two functions $f$ and $g$, both of which take vectors as inputs, return scalar quantities, and don’t have funky (discontinuous) derivatives. The asymptotic covariance between $f(\mathbf{T})$ and $g(\mathbf{T})$ is then approximately

$$\text{Cov}\left(f(\mathbf{T}), g(\mathbf{T})\right) \approx \frac{1}{n} \sum_{j=1}^p \sum_{k=1}^p \frac{\partial f}{\partial \theta_j} \frac{\partial g}{\partial \theta_k} \sigma_{jk},$$

where $\sigma_{jk}$ is the entry in row $j$ and column $k$ of the matrix $\boldsymbol\Sigma$. If the entries of $\mathbf{T}$ are asymptotically uncorrelated, then this simplifies to

$$\text{Cov}\left(f(\mathbf{T}), g(\mathbf{T})\right) \approx \frac{1}{n} \sum_{j=1}^p \frac{\partial f}{\partial \theta_j} \frac{\partial g}{\partial \theta_j} \sigma_{jj}.$$

If we are interested in the variance of a single statistic, then the above formulas simplify further to

$$\text{Var}\left(f(\mathbf{T})\right) \approx \frac{1}{n} \sum_{j=1}^p \sum_{k=1}^p \frac{\partial f}{\partial \theta_j} \frac{\partial f}{\partial \theta_k} \sigma_{jk}$$

or

$$\text{Var}\left(f(\mathbf{T})\right) \approx \frac{1}{n} \sum_{j=1}^p \left(\frac{\partial f}{\partial \theta_j}\right)^2 \sigma_{jj}$$

in the case where the entries of $\mathbf{T}$ are uncorrelated.

Finally, if we are dealing with a univariate transformation $f(\theta)$, then of course the above simplifies even further to

$$\text{Var}\left(f(T)\right) \approx \left(\frac{\partial f}{\partial \theta}\right)^2 \text{Var}(T).$$
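
To make all of this concrete: the double sums above are just quadratic forms in the gradient vectors, so the method is easy to code up. Here is a minimal Python sketch using finite-difference gradients; the function name `delta_cov` and all of the numerical values in the example are my own illustrative choices, not anything standard.

```python
import numpy as np

def delta_cov(f, g, theta, Sigma, n, h=1e-6):
    """Delta-method approximation to Cov(f(T), g(T)), where
    sqrt(n) * (T - theta) is asymptotically N(0, Sigma).
    Gradients are computed numerically by central differences."""
    theta = np.asarray(theta, dtype=float)
    p = len(theta)
    grad_f, grad_g = np.empty(p), np.empty(p)
    for j in range(p):
        e = np.zeros(p)
        e[j] = h
        grad_f[j] = (f(theta + e) - f(theta - e)) / (2 * h)
        grad_g[j] = (g(theta + e) - g(theta - e)) / (2 * h)
    # (1 / n) * sum_j sum_k (df / dtheta_j) (dg / dtheta_k) sigma_jk
    return grad_f @ Sigma @ grad_g / n

# Example: approximate variance of the ratio f(T) = T_1 / T_2
theta = np.array([2.0, 5.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
ratio = lambda t: t[0] / t[1]
print(delta_cov(ratio, ratio, theta, Sigma, n=100))
```

Passing the same function twice recovers the variance formulas; passing two different functions gives the covariance.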

Pearson’s r

These formulas are useful for all sorts of things. For example, they can be used to derive the sampling variance of Pearson’s correlation coefficient. Suppose we have a simple random sample of $n$ observations from a bivariate normal distribution with mean $\mathbf{0}$ and variance-covariance matrix $\boldsymbol\Phi = \begin{bmatrix} \phi_{xx} & \phi_{xy} \\ \phi_{xy} & \phi_{yy} \end{bmatrix}$. Pearson’s correlation is calculated as

$$r = \frac{s_{xy}}{\sqrt{s_{xx} s_{yy}}},$$

where $s_{xx}$ and $s_{yy}$ are sample variances and $s_{xy}$ is the sample covariance. These sample variances and covariances are unbiased estimates of $\phi_{xx}$, $\phi_{yy}$, and $\phi_{xy}$, respectively. So in terms of the above notation, we have $\mathbf{T} = \left(s_{xx}, s_{yy}, s_{xy}\right)$, $\boldsymbol\theta = \left(\phi_{xx}, \phi_{yy}, \phi_{xy}\right)$, and $\rho = \phi_{xy} / \sqrt{\phi_{xx} \phi_{yy}}$.

From a previous post, we can work out the variance-covariance matrix of $\mathbf{T}$:

$$\text{Var}\left(\sqrt{n - 1}\begin{bmatrix} s_{xx} \\ s_{yy} \\ s_{xy} \end{bmatrix}\right) = \boldsymbol\Sigma = \begin{bmatrix} 2\phi_{xx}^2 & 2\phi_{xy}^2 & 2\phi_{xy}\phi_{xx} \\ 2\phi_{xy}^2 & 2\phi_{yy}^2 & 2\phi_{xy}\phi_{yy} \\ 2\phi_{xy}\phi_{xx} & 2\phi_{xy}\phi_{yy} & \phi_{xy}^2 + \phi_{xx}\phi_{yy} \end{bmatrix}.$$
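
If you don’t want to take my word (or the previous post’s word) for this matrix, a quick Monte Carlo check is easy. Here is a sketch in Python; the sample size, number of replicates, and the entries of $\boldsymbol\Phi$ are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.default_rng(2018)
n, reps = 50, 20_000
Phi = np.array([[1.0, 0.5],
                [0.5, 2.0]])  # [[phi_xx, phi_xy], [phi_xy, phi_yy]]

# Draw `reps` samples of size n; compute T = (s_xx, s_yy, s_xy) for each
xy = rng.multivariate_normal([0.0, 0.0], Phi, size=(reps, n))
xc = xy - xy.mean(axis=1, keepdims=True)
s_xx = (xc[:, :, 0] ** 2).sum(axis=1) / (n - 1)
s_yy = (xc[:, :, 1] ** 2).sum(axis=1) / (n - 1)
s_xy = (xc[:, :, 0] * xc[:, :, 1]).sum(axis=1) / (n - 1)

# Empirical Var(sqrt(n - 1) * T), to compare against the Sigma given above
print((n - 1) * np.cov(np.stack([s_xx, s_yy, s_xy])))

pxx, pyy, pxy = Phi[0, 0], Phi[1, 1], Phi[0, 1]
print(np.array([[2 * pxx**2,    2 * pxy**2,        2 * pxy * pxx],
                [2 * pxy**2,    2 * pyy**2,        2 * pxy * pyy],
                [2 * pxy * pxx, 2 * pxy * pyy, pxy**2 + pxx * pyy]]))
```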

The last piece is to find the derivatives of $r$ with respect to the entries of $\boldsymbol\theta$:

$$\begin{aligned}
\frac{\partial r}{\partial \phi_{xy}} &= \phi_{xx}^{-1/2} \phi_{yy}^{-1/2} \\
\frac{\partial r}{\partial \phi_{xx}} &= -\frac{1}{2} \phi_{xy} \phi_{xx}^{-3/2} \phi_{yy}^{-1/2} \\
\frac{\partial r}{\partial \phi_{yy}} &= -\frac{1}{2} \phi_{xy} \phi_{xx}^{-1/2} \phi_{yy}^{-3/2}.
\end{aligned}$$

Putting the pieces together, we have

$$\begin{aligned}
(n - 1) \text{Var}(r) &\approx \sigma_{11} \left(\frac{\partial r}{\partial \phi_{xx}}\right)^2 + \sigma_{22} \left(\frac{\partial r}{\partial \phi_{yy}}\right)^2 + \sigma_{33} \left(\frac{\partial r}{\partial \phi_{xy}}\right)^2 \\
&\qquad + 2 \sigma_{12} \frac{\partial r}{\partial \phi_{xx}} \frac{\partial r}{\partial \phi_{yy}} + 2 \sigma_{13} \frac{\partial r}{\partial \phi_{xx}} \frac{\partial r}{\partial \phi_{xy}} + 2 \sigma_{23} \frac{\partial r}{\partial \phi_{yy}} \frac{\partial r}{\partial \phi_{xy}} \\
&= \frac{\phi_{xy}^2}{2 \phi_{xx} \phi_{yy}} + \frac{\phi_{xy}^2}{2 \phi_{xx} \phi_{yy}} + \frac{\phi_{xy}^2 + \phi_{xx} \phi_{yy}}{\phi_{xx} \phi_{yy}} + \frac{\phi_{xy}^4}{\phi_{xx}^2 \phi_{yy}^2} - \frac{2 \phi_{xy}^2}{\phi_{xx} \phi_{yy}} - \frac{2 \phi_{xy}^2}{\phi_{xx} \phi_{yy}} \\
&= 1 - \frac{2 \phi_{xy}^2}{\phi_{xx} \phi_{yy}} + \frac{\phi_{xy}^4}{\phi_{xx}^2 \phi_{yy}^2} \\
&= \left(1 - \rho^2\right)^2.
\end{aligned}$$
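
That algebra is tedious enough to be worth verifying symbolically. Here is a sketch using sympy; the variable names are mine.

```python
import sympy as sp

pxx, pyy, pxy = sp.symbols("phi_xx phi_yy phi_xy", positive=True)

r = pxy / sp.sqrt(pxx * pyy)  # rho, as a function of theta
grad = sp.Matrix([sp.diff(r, t) for t in (pxx, pyy, pxy)])

# Sigma for T = (s_xx, s_yy, s_xy), as given above
Sigma = sp.Matrix([
    [2 * pxx**2,    2 * pxy**2,        2 * pxy * pxx],
    [2 * pxy**2,    2 * pyy**2,        2 * pxy * pyy],
    [2 * pxy * pxx, 2 * pxy * pyy, pxy**2 + pxx * pyy],
])

var_r = sp.simplify((grad.T * Sigma * grad)[0, 0])  # (n - 1) * Var(r)
rho_sq = pxy**2 / (pxx * pyy)
print(sp.simplify(var_r - (1 - rho_sq)**2))  # prints 0
```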

Fisher’s z-transformation

Meta-analysts will be very familiar with Fisher’s $z$-transformation of $r$, given by $z(\rho) = \frac{1}{2} \log\left(\frac{1 + \rho}{1 - \rho}\right)$. Fisher’s $z$ is the variance-stabilizing (and also normalizing) transformation of $r$, meaning that $\text{Var}\left(z(r)\right)$ depends (approximately) only on the sample size, not on the degree of correlation $\rho$. We can see this using another application of the delta method:

$$\frac{\partial z}{\partial \rho} = \frac{1}{1 - \rho^2}.$$

Thus,

$$\text{Var}\left(z(r)\right) \approx \frac{1}{\left(1 - \rho^2\right)^2} \times \text{Var}(r) = \frac{1}{n - 1}.$$

The variance of $z$ is usually given as $1 / (n - 3)$, which is even closer to exact. Here we’ve obtained the variance of $z$ using two applications of the delta method. Because of the chain rule, we’d have ended up with the same result if we’d gone straight from the sample variances and covariances, using the multivariate delta method and the derivatives of $z$ with respect to $\boldsymbol\theta$.
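
A small simulation makes it easy to compare both approximations to the actual sampling variance of $z$. Here is a sketch; the sample size, correlation, and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps, rho = 30, 100_000, 0.6
Phi = np.array([[1.0, rho],
                [rho, 1.0]])

# Simulate Pearson's r across many samples, then apply Fisher's z
xy = rng.multivariate_normal([0.0, 0.0], Phi, size=(reps, n))
xc = xy - xy.mean(axis=1, keepdims=True)
r = (xc[:, :, 0] * xc[:, :, 1]).sum(axis=1) / np.sqrt(
    (xc[:, :, 0] ** 2).sum(axis=1) * (xc[:, :, 1] ** 2).sum(axis=1))
z = np.arctanh(r)  # same as 0.5 * log((1 + r) / (1 - r))

print(np.var(z), 1 / (n - 1), 1 / (n - 3))
```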

Covariances between correlations

These same techniques can be used to work out expressions for the covariances between correlations estimated on the same sample. For instance, suppose you’ve measured four variables, $W$, $X$, $Y$, and $Z$, on a simple random sample of $n$ observations. What is $\text{Cov}\left(r_{xy}, r_{xz}\right)$? What is $\text{Cov}\left(r_{wx}, r_{yz}\right)$? I’ll leave the derivations for you to work out. See Steiger (1980) for solutions.
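
If you do work through the derivations, a simulation along the following lines can be used to check your answers. This is only a sketch; the correlation matrix below is an arbitrary (positive definite) choice of mine.

```python
import numpy as np

rng = np.random.default_rng(1980)
n, reps = 40, 50_000

# An arbitrary 4 x 4 correlation matrix for (W, X, Y, Z)
P = np.array([[1.0, 0.3, 0.4, 0.2],
              [0.3, 1.0, 0.5, 0.3],
              [0.4, 0.5, 1.0, 0.4],
              [0.2, 0.3, 0.4, 1.0]])

data = rng.multivariate_normal(np.zeros(4), P, size=(reps, n))
dc = data - data.mean(axis=1, keepdims=True)
ss = np.sqrt((dc ** 2).sum(axis=1))  # shape (reps, 4)

def corr(a, b):
    """Sample correlation between variables a and b, per replicate."""
    return (dc[:, :, a] * dc[:, :, b]).sum(axis=1) / (ss[:, a] * ss[:, b])

r_xy, r_xz = corr(1, 2), corr(1, 3)
r_wx, r_yz = corr(0, 1), corr(2, 3)
print(np.cov(r_xy, r_xz)[0, 1])  # Cov(r_xy, r_xz): one variable shared
print(np.cov(r_wx, r_yz)[0, 1])  # Cov(r_wx, r_yz): no variables shared
```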
