Schott's (2007) test for the one-way MANOVA problem with high-dimensional data, assuming that the underlying covariance matrices are equal.

Usage

ks_s2007(Y,n,p)

Arguments

Y

A list of \(k\) data matrices. The \(i\)th element represents the data matrix (\(p\times n_i\)) from the \(i\)th population with each column representing a \(p\)-dimensional observation.

n

A vector of \(k\) sample sizes. The \(i\)th element represents the sample size of group \(i\), \(n_i\).

p

The dimension of the data, \(p\).

Value

A (list) object of S3 class htest containing the following elements:

statistic

the test statistic proposed by Schott (2007).

p.value

the \(p\)-value of the test proposed by Schott (2007).

Details

Suppose we have the following \(k\) independent high-dimensional samples: $$ \boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i} \;\text{are i.i.d. with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma},\; i=1,\ldots,k. $$

It is of interest to test the following one-way MANOVA problem: $$H_0: \boldsymbol{\mu}_1=\cdots=\boldsymbol{\mu}_k \quad \text{vs.} \quad H_1: H_0 \;\text{is not true}.$$

Schott (2007) proposed the following test statistic: $$ T_{S}=[\operatorname{tr}(\boldsymbol{H})/h-\operatorname{tr}(\boldsymbol{E})/e]/\sqrt{N-1}, $$ where \(\boldsymbol{H}=\sum_{i=1}^kn_i(\bar{\boldsymbol{y}}_i-\bar{\boldsymbol{y}})(\bar{\boldsymbol{y}}_i-\bar{\boldsymbol{y}})^\top\), \(\boldsymbol{E}=\sum_{i=1}^k\sum_{j=1}^{n_i}(\boldsymbol{y}_{ij}-\bar{\boldsymbol{y}}_{i})(\boldsymbol{y}_{ij}-\bar{\boldsymbol{y}}_{i})^\top\), \(h=k-1\), and \(e=N-k\), with \(N=n_1+\cdots+n_k\). Here \(\bar{\boldsymbol{y}}_i\) is the sample mean of group \(i\) and \(\bar{\boldsymbol{y}}\) is the grand mean of all \(N\) observations.

Schott (2007) showed that, under the null hypothesis, \(T_{S}\) is asymptotically normally distributed.
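The definition of \(T_S\) can be evaluated directly from the data without forming the \(p\times p\) matrices, since \(\operatorname{tr}(\boldsymbol{H})\) and \(\operatorname{tr}(\boldsymbol{E})\) are sums of squared Euclidean distances. The following is a minimal sketch under that reading of the formula. The helper name `schott_ts_sketch` is hypothetical and is not part of the package, and the value it returns is the unstandardized \(T_S\) only, so it need not equal the statistic reported by `ks_s2007`.

```r
# Hypothetical helper (not part of the package): evaluates T_S from its definition.
# Y is a list of p x n_i matrices (p >= 2), n the vector of sample sizes.
schott_ts_sketch <- function(Y, n) {
  k <- length(Y)
  N <- sum(n)
  # Group means as a p x k matrix, and the grand mean as a length-p vector
  group_means <- sapply(Y, rowMeans)
  grand_mean <- as.vector(group_means %*% n) / N
  # tr(H) = sum_i n_i * ||ybar_i - ybar||^2
  tr_H <- sum(n * colSums((group_means - grand_mean)^2))
  # tr(E) = sum_i sum_j ||y_ij - ybar_i||^2
  tr_E <- sum(sapply(seq_len(k), function(i) sum((Y[[i]] - group_means[, i])^2)))
  h <- k - 1
  e <- N - k
  (tr_H / h - tr_E / e) / sqrt(N - 1)
}

# Toy data: two groups in dimension 2, with tr(H) = 8 and tr(E) = 8,
# so T_S = (8 / 1 - 8 / 2) / sqrt(3) = 4 / sqrt(3)
Y_toy <- list(matrix(c(0, 0, 2, 2), nrow = 2), matrix(c(2, 2, 4, 4), nrow = 2))
ts_toy <- schott_ts_sketch(Y_toy, c(2, 2))
ts_toy
```

Working with traces as sums of squares keeps the cost at \(O(Np)\), which matters when \(p\) is large relative to the sample sizes.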

References

Schott JR (2007). “Some high-dimensional tests for a one-way MANOVA.” Journal of Multivariate Analysis, 98(9), 1825–1839. doi:10.1016/j.jmva.2006.11.007.

Examples

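set.seed(1234)
k <- 3              # number of groups
p <- 50             # dimension of each observation
n <- c(25, 30, 40)  # group sample sizes
rho <- 0.1          # common correlation between coordinates
# Group mean vectors, one row per group; all zero, so H0 holds
M <- matrix(rep(0, k * p), nrow = k, ncol = p)
# Gamma has x on the diagonal and y elsewhere, chosen so that
# Gamma %*% Gamma = (1 - rho) * I + rho * J (unit variances, correlation rho)
y <- (-2 * sqrt(1 - rho) + sqrt(4 * (1 - rho) + 4 * p * rho)) / (2 * p)
x <- y + sqrt((1 - rho))
Gamma <- matrix(rep(y, p * p), nrow = p)
diag(Gamma) <- rep(x, p)
Y <- list()
for (g in 1:k) {
  # p x n[g] matrix of standard normal noise, one column per observation
  Z <- matrix(rnorm(n[g] * p, mean = 0, sd = 1), p, n[g])
  # Correlate the noise and add the group mean to every column
  Y[[g]] <- Gamma %*% Z + t(t(M[g, ])) %*% (rep(1, n[g]))
}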
ks_s2007(Y, n, p)
#> 
#> 
#> 
#> data:  
#> statistic = 0.31984, p-value = 0.3745
#>
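As a quick arithmetic check on the printed output, the p-value agrees with the upper-tail probability of a standard normal distribution evaluated at the printed statistic. This is only an observation about the two printed numbers, which suggests that the reported statistic is on a standardized scale. It is not a description of how `ks_s2007` computes the p-value internally.

```r
# Values copied from the printed output of the example
stat <- 0.31984
# Upper-tail standard normal probability at the printed statistic
p_upper <- pnorm(stat, lower.tail = FALSE)
p_upper  # close to the printed p-value 0.3745
```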