
Zhu and Zhang (2022)'s test for the general linear hypothesis testing (GLHT) problem for high-dimensional data, assuming that the underlying covariance matrices are the same.

Usage

glht_zz2022(Y, G, n, p)

Arguments

Y

A list of \(k\) data matrices. The \(i\)th element represents the data matrix (\(p\times n_i\)) from the \(i\)th population with each column representing a \(p\)-dimensional observation.

G

A known full-rank coefficient matrix (\(q\times k\)) with \(\operatorname{rank}(\boldsymbol{G})<k\).

n

A vector of \(k\) sample sizes. The \(i\)th element represents the sample size of group \(i\), \(n_i\).

p

The dimension of data.

Value

A (list) object of S3 class htest containing the following elements:

p.value

the \(p\)-value of the test proposed by Zhu and Zhang (2022).

statistic

the test statistic proposed by Zhu and Zhang (2022).

beta0

the estimated parameter \(\beta_0\) in the approximate null distribution of Zhu and Zhang (2022)'s test.

beta1

the estimated parameter \(\beta_1\) in the approximate null distribution of Zhu and Zhang (2022)'s test.

df

estimated approximate degrees of freedom of Zhu and Zhang (2022)'s test.

Details

Suppose we have the following \(k\) independent high-dimensional samples: $$ \boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i} \;\text{ are i.i.d. with }\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma},\; i=1,\ldots,k. $$ It is of interest to test the following GLHT problem: $$H_0: \boldsymbol{G M}=\boldsymbol{0}, \quad \text { vs. } \quad H_1: \boldsymbol{G M} \neq \boldsymbol{0},$$ where \(\boldsymbol{M}=(\boldsymbol{\mu}_1,\ldots,\boldsymbol{\mu}_k)^\top\) is a \(k\times p\) matrix collecting the \(k\) mean vectors and \(\boldsymbol{G}:q\times k\) is a known full-rank coefficient matrix with \(\operatorname{rank}(\boldsymbol{G})<k\).
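For example, the one-way MANOVA null hypothesis \(\boldsymbol{\mu}_1=\cdots=\boldsymbol{\mu}_k\) is obtained as a special case by taking the contrast matrix $$\boldsymbol{G}=(\boldsymbol{I}_{k-1},-\boldsymbol{1}_{k-1}),$$ a \((k-1)\times k\) matrix of full row rank \(k-1<k\), since then \(\boldsymbol{G M}=\boldsymbol{0}\) holds if and only if \(\boldsymbol{\mu}_i=\boldsymbol{\mu}_k\) for \(i=1,\ldots,k-1\). This is the choice of \(\boldsymbol{G}\) used in the Examples section below.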

Zhu and Zhang (2022) proposed the following test statistic: $$ T_{ZZ}=\|\boldsymbol{C} \hat{\boldsymbol{\mu}}\|^2-q \operatorname{tr}(\hat{\boldsymbol{\Sigma}}), $$ where \(\boldsymbol{C}=[(\boldsymbol{G D G}^\top)^{-1/2}\boldsymbol{G}]\otimes\boldsymbol{I}_p\) with \(\boldsymbol{D}=\operatorname{diag}(n_1^{-1},\ldots,n_k^{-1})\), \(\hat{\boldsymbol{\mu}}=(\bar{\boldsymbol{y}}_1^\top,\ldots,\bar{\boldsymbol{y}}_k^\top)^\top\) with \(\bar{\boldsymbol{y}}_{i},i=1,\ldots,k,\) being the sample mean vectors, and \(\hat{\boldsymbol{\Sigma}}\) being the usual pooled sample covariance matrix of the \(k\) samples.
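To make the definition concrete, here is a minimal sketch (not the package's internal implementation) that assembles \(T_{ZZ}\) directly from its ingredients as written above; the helper name T_zz_sketch is hypothetical, and the commented call at the end assumes the objects Y, G, n and p constructed in the Examples section below.

# Illustrative sketch of the statistic's ingredients (not the package's code):
# sample means, pooled covariance, D = diag(1/n_i), and C = [(GDG')^{-1/2} G] (x) I_p.
T_zz_sketch <- function(Y, G, n, p) {
  k <- length(Y)
  q <- nrow(G)
  ybar <- sapply(Y, rowMeans)                      # p x k matrix of group sample means
  mu.hat <- as.vector(ybar)                        # (ybar_1', ..., ybar_k')' stacked as a kp-vector
  Sigma.hat <- Reduce("+", lapply(seq_len(k), function(i) {
    (n[i] - 1) * cov(t(Y[[i]]))
  })) / (sum(n) - k)                               # pooled sample covariance matrix
  D <- diag(1 / n)                                 # D = diag(1/n_1, ..., 1/n_k)
  H <- G %*% D %*% t(G)
  eH <- eigen(H, symmetric = TRUE)
  H.isqrt <- eH$vectors %*% diag(1 / sqrt(eH$values), q) %*% t(eH$vectors)  # (GDG')^{-1/2}
  C <- kronecker(H.isqrt %*% G, diag(p))
  sum((C %*% mu.hat)^2) - q * sum(diag(Sigma.hat))
}
# T_zz_sketch(Y, G, n, p)  # compare with the statistic component returned by glht_zz2022()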

They showed that under the null hypothesis, \(T_{ZZ}\) and a chi-squared-type mixture have the same normal or non-normal limiting distribution.

References

Zhu T, Zhang J (2022). “Linear hypothesis testing in high-dimensional one-way MANOVA: a new normal reference approach.” Computational Statistics, 37(1), 1--27. doi:10.1007/s00180-021-01110-6.

Examples

set.seed(1234)
k <- 3
p <- 50
n <- c(25, 30, 40)
rho <- 0.1
M <- matrix(rep(0, k * p), nrow = k, ncol = p)  # k mean vectors (all zero, so H0 holds)
# Construct Gamma so that Gamma %*% t(Gamma) has unit variances and common
# correlation rho (compound symmetry).
y <- (-2 * sqrt(1 - rho) + sqrt(4 * (1 - rho) + 4 * p * rho)) / (2 * p)
x <- y + sqrt((1 - rho))
Gamma <- matrix(rep(y, p * p), nrow = p)
diag(Gamma) <- rep(x, p)
# Generate the k samples, each with covariance Gamma %*% t(Gamma) and mean M[g, ].
Y <- list()
for (g in 1:k) {
  Z <- matrix(rnorm(n[g] * p, mean = 0, sd = 1), p, n[g])
  Y[[g]] <- Gamma %*% Z + t(t(M[g, ])) %*% (rep(1, n[g]))
}
G <- cbind(diag(k - 1), rep(-1, k - 1))  # contrast matrix for testing mu_1 = mu_2 = mu_3
glht_zz2022(Y, G, n, p)
#> 
#> 
#> 
#> data:  
#> statistic = 5.345, df = 18.8571, beta0 = -51.3143, beta1 = 2.7212,
#> p-value = 0.3382
#>
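The reported beta0, beta1 and df can be related to the p-value under the working assumption (not stated explicitly on this page) that the normal-reference approach approximates the null distribution of \(T_{ZZ}\) by that of \(\beta_0+\beta_1\chi^2_{d}\); the short check below is only a sketch of that relationship, reusing the objects from the example above.

# Consistency check, ASSUMING the null approximation T_ZZ ~ beta0 + beta1 * chi^2_df
# (this form is an assumption, not quoted from this page):
res <- glht_zz2022(Y, G, n, p)
pchisq((res$statistic - res$beta0) / res$beta1, df = res$df, lower.tail = FALSE)
# Should agree with res$p.value if the assumed approximation is the one used.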