Test proposed by Zhang et al. (2017) — glht

Zhang et al. (2017)'s test for the general linear hypothesis testing (GLHT) problem for high-dimensional data, assuming that the underlying covariance matrices are the same.

Usage

glht_zgz2017(Y, G, n, p)

Arguments

Y: A list of $k$ data matrices. The $i$th element represents the data matrix ($p\times n_i$) from the $i$th population with each column representing a $p$-dimensional observation.
G: A known full-rank coefficient matrix ($q\times k$) with $\operatorname{rank}(\boldsymbol{G})=q<k$.
n: A vector of $k$ sample sizes. The $i$th element represents the sample size of group $i$, $n_i$.
p: The dimension of data.

Value

A (list) object of S3 class htest containing the following elements:

statistic: the test statistic proposed by Zhang et al. (2017).
p.value: the $p$-value of the test proposed by Zhang et al. (2017).
beta: the parameters used in Zhang et al. (2017)'s test.
df: the estimated approximate degrees of freedom of Zhang et al. (2017)'s test.

Details

Suppose we have the following $k$ independent high-dimensional samples: $$ \boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i} \;\text{are i.i.d. with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma},\;i=1,\ldots,k. $$ It is of interest to test the following GLHT problem: $$H_0: \boldsymbol{G M}=\boldsymbol{0} \quad \text{vs.} \quad H_1: \boldsymbol{G M} \neq \boldsymbol{0},$$ where $\boldsymbol{M}=(\boldsymbol{\mu}_1,\ldots,\boldsymbol{\mu}_k)^\top$ is a $k\times p$ matrix collecting the $k$ mean vectors and $\boldsymbol{G}$ is a known $q\times k$ full-rank coefficient matrix with $\operatorname{rank}(\boldsymbol{G})=q<k$.

Zhang et al. (2017) proposed the following test statistic: $$ T_{ZGZ}=\|\boldsymbol{C}\hat{\boldsymbol{\mu}}\|^2, $$ where $\boldsymbol{C}=[(\boldsymbol{G D G}^\top)^{-1/2}\boldsymbol{G}]\otimes\boldsymbol{I}_p$, $\hat{\boldsymbol{\mu}}=(\bar{\boldsymbol{y}}_1^\top,\ldots,\bar{\boldsymbol{y}}_k^\top)^\top$ stacks the sample mean vectors $\bar{\boldsymbol{y}}_{i},\;i=1,\ldots,k$, and $\boldsymbol{D}=\operatorname{diag}(1/n_1,\ldots,1/n_k)$.
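The statistic above can be evaluated without forming the Kronecker product, because $\|\boldsymbol{C}\hat{\boldsymbol{\mu}}\|^2=\operatorname{tr}\{(\boldsymbol{G}\hat{\boldsymbol{M}})^\top(\boldsymbol{G D G}^\top)^{-1}\boldsymbol{G}\hat{\boldsymbol{M}}\}$, where $\hat{\boldsymbol{M}}$ is the $k\times p$ matrix of sample means. The following is a minimal base-R sketch of that identity. The helper `zgz_statistic` and the demo objects `Y_demo` and `G_demo` are hypothetical names introduced here for illustration; this is a direct transcription of the formula and not necessarily how `glht_zgz2017` computes it internally.

```r
# Hypothetical helper (not part of the package): evaluates T_ZGZ directly
# from the formula in Details, assuming Y is a list of p x n_i matrices.
zgz_statistic <- function(Y, G) {
  n <- vapply(Y, ncol, integer(1))             # group sizes n_1, ..., n_k
  p <- nrow(Y[[1]])                            # data dimension
  M_hat <- t(vapply(Y, rowMeans, numeric(p)))  # k x p matrix of sample means
  D <- diag(1 / n, nrow = length(n))           # D = diag(1/n_1, ..., 1/n_k)
  GM <- G %*% M_hat                            # q x p
  # ||C mu_hat||^2 = tr{ (G M_hat)' (G D G')^{-1} (G M_hat) }
  sum(GM * solve(G %*% D %*% t(G), GM))
}

# Small illustrative input: k = 3 groups, p = 4, sizes 5, 6 and 7.
set.seed(1)
Y_demo <- list(
  matrix(rnorm(4 * 5), 4, 5),
  matrix(rnorm(4 * 6), 4, 6),
  matrix(rnorm(4 * 7), 4, 7)
)
G_demo <- cbind(diag(2), rep(-1, 2))
zgz_statistic(Y_demo, G_demo)
```

Working with the $q\times p$ matrix $\boldsymbol{G}\hat{\boldsymbol{M}}$ avoids building the $qp\times kp$ matrix $\boldsymbol{C}$, which matters when $p$ is large.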

They showed that under the null hypothesis, $T_{ZGZ}$ and a chi-squared-type mixture have the same normal or non-normal limiting distribution.
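The returned `beta` and `df` suggest that the chi-squared-type mixture is approximated by a scaled chi-squared variable, $T_{ZGZ}\approx\beta\chi^2_{d}$ (a Welch-Satterthwaite-type two-moment match). Under that assumption, which is not stated explicitly on this page, the $p$-value would be the upper tail of $\chi^2_d$ evaluated at $T_{ZGZ}/\beta$. The sketch below shows that relationship; `scaled_chisq_pvalue` is a hypothetical helper, and the inputs are copied from the printed output in Examples.

```r
# Hypothetical helper: upper-tail p-value assuming T is approximately
# distributed as beta * chi-squared with df degrees of freedom.
scaled_chisq_pvalue <- function(statistic, beta, df) {
  pchisq(statistic / beta, df = df, lower.tail = FALSE)
}

# Values copied from the printed output in Examples.
scaled_chisq_pvalue(103.56, beta = 1.3914, df = 70.5439)
```

If the assumption holds, the result should land close to the reported `p.value`; a clear mismatch would indicate that the function uses a different approximation.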

References

Zhang J, Guo J, Zhou B (2017). “Linear hypothesis testing in high-dimensional one-way MANOVA.” Journal of Multivariate Analysis, 155, 200–216. doi:10.1016/j.jmva.2017.01.002.

Examples

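set.seed(1234)
k <- 3              # number of groups
p <- 50             # data dimension
n <- c(25, 30, 40)  # group sample sizes
rho <- 0.1          # common correlation between coordinates
M <- matrix(0, nrow = k, ncol = p)  # k x p matrix of group means (all zero, so H0 holds)
# Gamma is the symmetric square root of Sigma = (1 - rho) * I_p + rho * J_p:
# off-diagonal entries y, diagonal entries x.
y <- (-2 * sqrt(1 - rho) + sqrt(4 * (1 - rho) + 4 * p * rho)) / (2 * p)
x <- y + sqrt(1 - rho)
Gamma <- matrix(y, nrow = p, ncol = p)
diag(Gamma) <- x
Y <- vector("list", k)
for (g in seq_len(k)) {
  Z <- matrix(rnorm(n[g] * p, mean = 0, sd = 1), p, n[g])
  # p x n[g] data matrix: correlated noise plus the group mean in every column
  Y[[g]] <- Gamma %*% Z + M[g, ] %*% t(rep(1, n[g]))
}
# Contrast matrix testing mu_1 = mu_3 and mu_2 = mu_3, i.e. equality of all means
G <- cbind(diag(k - 1), rep(-1, k - 1))
glht_zgz2017(Y, G, n, p)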
#> 
#> 
#> 
#> data:  
#> statistic = 103.56, df = 70.5439, beta = 1.3914, p-value = 0.353
#>