Yamada and Srivastava's (2012) test for the general linear hypothesis testing (GLHT) problem for high-dimensional data, assuming that the underlying covariance matrices are the same.

Usage

glht_ys2012(Y,X,C)

Arguments

Y

An \(n\times p\) response matrix obtained by independently observing a \(p\)-dimensional response variable for \(n\) subjects.

X

A known \(n\times k\) full-rank design matrix with \(\operatorname{rank}(\boldsymbol{X})=k<n\).

C

A known matrix of size \(q\times k\) with \(\operatorname{rank}(\boldsymbol{C})=q<k\).
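The rank conditions on X and C can be checked before calling the function. The following is a minimal sketch in base R, assuming a one-way layout with three groups; the group sizes are hypothetical and chosen only for illustration.

```r
# Hypothetical one-way layout: three groups of sizes 4, 5 and 6
n <- c(4, 5, 6)
k <- length(n)
q <- k - 1

# Column j of X indicates membership in group j
X <- cbind(rep(c(1, 0, 0), times = n),
           rep(c(0, 1, 0), times = n),
           rep(c(0, 0, 1), times = n))

# Contrast matrix comparing groups 1..q with group k
C <- cbind(diag(q), -rep(1, q))

# Conditions stated in the Arguments section
rank_X <- qr(X)$rank
rank_C <- qr(C)$rank
stopifnot(rank_X == k, k < nrow(X))   # rank(X) = k < n
stopifnot(rank_C == q, q < k)         # rank(C) = q < k
```

Here `qr()` is used only to obtain a numerical rank; any other rank computation would serve equally well.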

Value

A (list) object of S3 class htest containing the following elements:

statistic

the test statistic proposed by Yamada and Srivastava (2012).

p.value

the \(p\)-value of the test proposed by Yamada and Srivastava (2012).

Details

A high-dimensional linear regression model can be expressed as $$\boldsymbol{Y}=\boldsymbol{X\Theta}+\boldsymbol{\epsilon},$$ where \(\boldsymbol{\Theta}\) is a \(k\times p\) unknown parameter matrix and \(\boldsymbol{\epsilon}\) is an \(n\times p\) error matrix.

It is of interest to test the following GLHT problem $$H_0: \boldsymbol{C\Theta}=\boldsymbol{0}, \quad \text { vs. } H_1: \boldsymbol{C\Theta} \neq \boldsymbol{0}.$$

Yamada and Srivastava (2012) proposed the following test statistic: $$T_{YS}=\frac{(n-k)\operatorname{tr}(\boldsymbol{S}_h\boldsymbol{D}_{\boldsymbol{S}_e}^{-1})-(n-k)pq/(n-k-2)}{\sqrt{2q[\operatorname{tr}(\boldsymbol{R}^2)-p^2/(n-k)]c_{p,n}}},$$ where \(\boldsymbol{S}_h\) and \(\boldsymbol{S}_e\) are the variation matrices due to the hypothesis and error, respectively, \(\boldsymbol{D}_{\boldsymbol{S}_e}\) is the diagonal matrix formed from the diagonal elements of \(\boldsymbol{S}_e\), and \(\boldsymbol{R}\) is the sample correlation matrix. \(c_{p, n}\) is the adjustment coefficient proposed by Yamada and Srivastava (2012). They showed that, under the null hypothesis, \(T_{YS}\) is asymptotically normally distributed.
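To make the ingredients of \(T_{YS}\) concrete, the sketch below computes them in base R on a small simulated data set. It is not the package's internal code. It assumes the usual multivariate-regression definitions \(\boldsymbol{S}_h=(\boldsymbol{C}\hat{\boldsymbol{\Theta}})^\top[\boldsymbol{C}(\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{C}^\top]^{-1}\boldsymbol{C}\hat{\boldsymbol{\Theta}}\) and \(\boldsymbol{S}_e=(\boldsymbol{Y}-\boldsymbol{X}\hat{\boldsymbol{\Theta}})^\top(\boldsymbol{Y}-\boldsymbol{X}\hat{\boldsymbol{\Theta}})\), which are not spelled out on this page. The adjustment coefficient \(c_{p,n}\) is deliberately left out because its formula is not given here, so the sketch stops at the numerator and the bracketed term of the denominator.

```r
# Illustrative data (hypothetical sizes): 3 groups of 10 subjects, p = 8 responses
set.seed(1)
n <- 30; k <- 3; p <- 8; q <- k - 1
X <- kronecker(diag(k), matrix(1, 10, 1))   # n x k group-indicator design
Y <- matrix(rnorm(n * p), n, p)             # generated under H0 (Theta = 0)
C <- cbind(diag(q), -rep(1, q))             # q x k contrast matrix

XtX_inv   <- solve(crossprod(X))            # (X'X)^{-1}
Theta_hat <- XtX_inv %*% crossprod(X, Y)    # least-squares estimate, k x p
CT        <- C %*% Theta_hat                # q x p

# Assumed standard forms of the hypothesis and error variation matrices
S_h <- t(CT) %*% solve(C %*% XtX_inv %*% t(C)) %*% CT
S_e <- crossprod(Y - X %*% Theta_hat)

d <- diag(S_e)                              # diagonal of D_{S_e}
R <- S_e / sqrt(outer(d, d))                # sample correlation matrix

tr_ShDinv <- sum(diag(S_h) / d)             # tr(S_h D_{S_e}^{-1})
tr_R2     <- sum(R^2)                       # tr(R^2), valid because R is symmetric

numerator <- (n - k) * tr_ShDinv - (n - k) * p * q / (n - k - 2)
bracket   <- tr_R2 - p^2 / (n - k)          # term multiplied by 2 * q * c_{p,n}
```

Given a value for \(c_{p,n}\), the statistic would then be `numerator / sqrt(2 * q * bracket * c_pn)`, where `c_pn` is a placeholder name for that coefficient.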

References

Yamada T, Srivastava MS (2012). “A test for multivariate analysis of variance in high dimension.” Communications in Statistics - Theory and Methods, 41(13-14), 2602-2615. doi:10.1080/03610926.2011.581786.

Examples

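set.seed(1234)
k <- 3                 # number of groups
q <- k - 1             # number of contrasts tested
p <- 50                # dimension of the response
n <- c(25, 30, 40)     # group sizes
rho <- 0.01            # common correlation between responses
# Null case: all group mean vectors are zero
Theta <- matrix(rep(0, k * p), nrow = k)
# One-way layout design matrix: column j indicates membership in group j
X <- matrix(c(rep(1, n[1]), rep(0, sum(n)), rep(1, n[2]), rep(0, sum(n)), rep(1, n[3])),
            ncol = k, nrow = sum(n))
# Gamma has x on the diagonal and y elsewhere, chosen so that
# Gamma %*% Gamma has unit diagonal and off-diagonal entries rho
y <- (-2 * sqrt(1 - rho) + sqrt(4 * (1 - rho) + 4 * p * rho)) / (2 * p)
x <- y + sqrt(1 - rho)
Gamma <- matrix(rep(y, p * p), nrow = p)
diag(Gamma) <- rep(x, p)
# Standard normal noise, one column per subject
U <- matrix(ncol = sum(n), nrow = p)
for (i in 1:sum(n)) {
  U[, i] <- rnorm(p, 0, 1)
}
Y <- X %*% Theta + t(U) %*% Gamma
# Contrasts comparing groups 1..q with group k
C <- cbind(diag(q), -rep(1, q))
glht_ys2012(Y, X, C)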
#> 
#> 
#> 
#> data:  
#> statistic = 0.33789, cpn = 1.2185, p-value = 0.3677
#>
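The printed statistic and p-value are consistent with an upper-tail probability under the standard normal distribution, in line with the asymptotic normality stated in Details. The check below is a sketch based only on the rounded numbers shown in the output. That the package computes the p-value in exactly this way is an inference from those numbers, not something this page states.

```r
# Rounded statistic copied from the printed output above
statistic <- 0.33789

# Upper-tail standard normal probability
p_upper <- pnorm(statistic, lower.tail = FALSE)
p_upper   # roughly 0.3677, matching the printed p-value
```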