Test proposed by Zhang and Zhu (2022)

Zhang and Zhu (2022)'s test for testing equality of two-sample high-dimensional mean vectors with assuming that two covariance matrices are the same.

Usage

ts_zz2022(y1, y2)

Arguments

y1: The data matrix ($p$ by $n_1$) from the first population. Each column represents a $p$-dimensional observation.
y2: The data matrix ($p$ by $n_2$) from the first population. Each column represents a $p$-dimensional observation.

Value

A (list) object of S3 class htest containing the following elements:

p.value: the p-value of the test proposed by Zhang and Zhu (2022).
statistic: the test statistic proposed by Zhang and Zhu (2022).
beta0: parameter used in Zhang and Zhu (2022)'s test
beta1: parameter used in Zhang and Zhu (2022)'s test
df: estimated approximate degrees of freedom of Zhang and Zhu (2022)'s test.

Details

Suppose we have two independent high-dimensional samples: $$ \boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma},i=1,2. $$ The primary object is to test $$H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.$$ Zhang et al.(2022) proposed the following test statistic: $$T_{ZZ} = \frac{n_1n_2}{n} \|\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2\|^2-\operatorname{tr}(\hat{\boldsymbol{\Sigma}}),$$ where $\bar{\boldsymbol{y}}_{i},i=1,2$ are the sample mean vectors and $\hat{\boldsymbol{\Sigma}}$ is the pooled sample covariance matrix. They showed that under the null hypothesis, $T_{ZZ}$ and a chi-squared-type mixture have the same normal or non-normal limiting distribution.

References

Zhang J, Zhu T (2022). “A revisit to Bai--Saranadasa's two-sample test.” Journal of Nonparametric Statistics, 34(1), 58--76. doi:10.1080/10485252.2021.2015768 .

Examples

set.seed(1234)
n1 <- 20
n2 <- 30
p <- 50
mu1 <- t(t(rep(0, p)))
mu2 <- mu1
rho <- 0.1
y <- (-2 * sqrt(1 - rho) + sqrt(4 * (1 - rho) + 4 * p * rho)) / (2 * p)
x <- y + sqrt((1 - rho))
Gamma <- matrix(rep(y, p * p), nrow = p)
diag(Gamma) <- rep(x, p)
Z1 <- matrix(rnorm(n1 * p, mean = 0, sd = 1), p, n1)
Z2 <- matrix(rnorm(n2 * p, mean = 0, sd = 1), p, n2)
y1 <- Gamma %*% Z1 + mu1 %*% (rep(1, n1))
y2 <- Gamma %*% Z2 + mu2 %*% (rep(1, n2))
ts_zz2022(y1, y2)
#> 
#> 
#> 
#> data:  
#> statistic = 0.46212, df = 9.3201, beta0 = -26.1767, beta1 = 2.8086,
#> p-value = 0.4236
#>