Li et al. (2025)'s test for the general linear hypothesis testing (GLHT) problem for high-dimensional data under heteroscedasticity.

Usage

LHNB2025.GLHTBF.NABT(Y, B, O, A, n, p)

Arguments

Y

A list of \(k\) data matrices. The \(i\)th element represents the data matrix (\(n_i \times p\)) from the \(i\)th population with each row representing a \(p\)-dimensional observation.

B

A vector of \(k\) coefficients \((B_1,\ldots,B_k)\) specifying the linear combination of group mean vectors.

O

A length-\(p\) vector used to form \(\Omega = \mathrm{diag}(O_1^2,\ldots,O_p^2)\).

A

A length-\(p\) vector used to form the weight matrix \(W = \Omega + A A^\top\).

n

A vector of \(k\) sample sizes. The \(i\)th element represents the sample size of group \(i\), \(n_i\).

p

The dimension \(p\) of the data, i.e. the number of columns of each data matrix in `Y`.
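As a rough illustration of how these arguments fit together, the sketch below builds `Y`, `n`, and `p` from a single stacked matrix and a grouping factor. The simulated matrix `X` and the labels `g` are hypothetical stand-ins and are not part of the package.

```r
# Hypothetical stacked data: 12 observations of dimension 5, three groups.
set.seed(1)
X <- matrix(rnorm(12 * 5), nrow = 12, ncol = 5)
g <- factor(rep(c("a", "b", "c"), times = c(3, 4, 5)))

# One (n_i x p) matrix per group, in the order of the factor levels.
Y <- lapply(levels(g), function(lev) X[g == lev, , drop = FALSE])

# Sample sizes and dimension derived from the data rather than typed by hand.
n <- vapply(Y, nrow, integer(1))
p <- ncol(Y[[1]])

# Basic consistency checks before calling the test.
stopifnot(all(vapply(Y, ncol, integer(1)) == p), sum(n) == nrow(X))
```

Deriving `n` and `p` from `Y` keeps the three arguments consistent with one another.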

Value

A list of class "NRtest" containing the results of the hypothesis test.

Details

Suppose we have \(k\) independent high-dimensional samples: $$\boldsymbol{Y}_{i1},\ldots,\boldsymbol{Y}_{in_i}\ \text{are i.i.d. with}\ \mathrm{E}(\boldsymbol{Y}_{i1})=\boldsymbol{\mu}_i,\ \mathrm{Cov}(\boldsymbol{Y}_{i1})=\boldsymbol{\Sigma}_i,\ i=1,\ldots,k,$$ where the covariance matrices \(\boldsymbol{\Sigma}_i\) are allowed to differ across groups.

It is of interest to test the \(k\)-sample linear hypothesis $$H_0:\ \sum_{i=1}^k B_i\boldsymbol{\mu}_i=\boldsymbol{0}\quad \text{vs.}\quad H_1:\ \sum_{i=1}^k B_i\boldsymbol{\mu}_i\neq\boldsymbol{0},$$ where \(B_1,\ldots,B_k\) are the known coefficients supplied in `B`.
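To make the hypothesis concrete, the following sketch computes the sample analogue of \(\sum_{i=1}^k B_i\boldsymbol{\mu}_i\) for simulated data. The data, the dimension, and the coefficient vector are hypothetical choices for illustration. This is not the test statistic itself, only the quantity whose population version is hypothesized to be zero.

```r
# Hypothetical data: k = 3 groups of dimension p = 4 with equal true means.
set.seed(2)
Y <- list(
  matrix(rnorm(6 * 4), nrow = 6),
  matrix(rnorm(8 * 4), nrow = 8),
  matrix(rnorm(5 * 4), nrow = 5)
)
B <- c(1, -0.5, -0.5)  # compares group 1 with the average of groups 2 and 3

# k x p matrix whose i-th row is the sample mean vector of group i.
means <- t(vapply(Y, colMeans, numeric(4)))

# Sample analogue of sum_i B_i * mu_i: a length-p vector.
lin_comb <- as.vector(crossprod(B, means))
lin_comb
```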

Li et al. (2025) proposed a random-integration-based U-statistic \(T_n\) (Eq. (5) in the paper), constructed using the weight matrix \(\boldsymbol{W}=\boldsymbol{\Omega}+\boldsymbol{A}\boldsymbol{A}^\top\) with \(\boldsymbol{\Omega}=\mathrm{diag}(O_1^2,\ldots,O_p^2)\). They showed that, under \(H_0\), the standardized statistic \(Z=T_n/\sqrt{\hat{\sigma}^2}\) is approximately \(N(0,1)\) distributed.
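For orientation, the standardization can be reproduced by hand from the two approximation parameters reported in the Examples output on this page (`Tn = -5.4898`, `sigma^2 = 11082.9278`). The upper-tail p-value used here is an assumption inferred from those printed numbers, not something this page states explicitly.

```r
# Values copied from the printed example output (rounded as printed).
Tn     <- -5.4898
sigma2 <- 11082.9278

# Standardized statistic Z = Tn / sqrt(sigma^2).
Z <- Tn / sqrt(sigma2)

# Assumed upper-tail p-value under the N(0, 1) approximation.
p_value <- pnorm(Z, lower.tail = FALSE)

round(c(Z = Z, p_value = p_value), 4)
```

Up to rounding of the inputs, this agrees with the printed `Z[RI] = -0.0521` and p-value of about 0.5208.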

A recommended default choice of tuning parameters is \(A_1=\cdots=A_p=\sqrt{5}\,p^{-3/8}\) and \(O_j=\sqrt{\epsilon\left(1+\frac{2j}{3p}\right)}\), \(j=1,\ldots,p\), where \(\epsilon\) is a positive constant.
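A minimal sketch of this default choice, following the formula above literally, together with the resulting weight matrix \(\boldsymbol{W}\). The values `p = 6` and `eps = 2` are illustrative assumptions only. The full \(p \times p\) matrix is formed here purely for inspection, which may be impractical for large \(p\).

```r
p   <- 6   # illustrative dimension (assumption)
eps <- 2   # illustrative value of epsilon (assumption)
j   <- seq_len(p)

# A_1 = ... = A_p = sqrt(5) * p^(-3/8)
A <- rep(sqrt(5) * p^(-3/8), p)

# O_j = sqrt(eps * (1 + 2j / (3p))), j = 1, ..., p
O <- sqrt(eps * (1 + 2 * j / (3 * p)))

# W = Omega + A A^T with Omega = diag(O_1^2, ..., O_p^2)
W <- diag(O^2) + tcrossprod(A)

isSymmetric(W)
```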

References

Li J, Hong S, Niu Z, Bai Z (2025). “Test for high-dimensional linear hypothesis of mean vectors via random integration.” Statistical Papers, 66(1), 8.

Examples

# \donttest{
library("HDNRA")
data("corneal")

# corneal: 150 x p, split into 4 groups (n_i x p)
group1 <- as.matrix(corneal[1:43,  ])      # normal
group2 <- as.matrix(corneal[44:57, ])      # unilateral suspect
group3 <- as.matrix(corneal[58:78, ])      # suspect map
group4 <- as.matrix(corneal[79:150,])      # clinical keratoconus

# This example tests groups 2-4 only (k = 3); group1 is not used below
Y <- list(group2, group3, group4)
n <- c(nrow(group2), nrow(group3), nrow(group4))
p <- ncol(group2)

# One linear combination (example): B = (4, -1.5, -2.5)
B <- c(4, -1.5, -2.5)

# Paper-style tuning parameters (example with eps = 2);
# here O_j = sqrt(eps) * (1 + 2j/(3p)), j = 1, ..., p
A <- rep(sqrt(5) * p^(-3/8), p)
O <- sqrt(2) * (1 + 2*(1:p)/(3*p))

LHNB2025.GLHTBF.NABT(Y, B, O, A, n, p)
#> 
#> Results of Hypothesis Test
#> --------------------------
#> 
#> Test name:                       Random integration test
#> 
#> Null Hypothesis:                 Linear combination of mean vectors is 0
#> 
#> Alternative Hypothesis:          Linear combination of mean vectors is not 0
#> 
#> Data:                            Y
#> 
#> Sample Sizes:                    n1 = 14
#>                                  n2 = 21
#>                                  n3 = 72
#> 
#> Sample Dimension:                2000
#> 
#> Test Statistic:                  Z[RI] = -0.0521
#> 
#> Approximation method to the      Normal approximation
#> null distribution of Z[RI]: 
#> 
#> Approximation parameter(s):      Tn      =    -5.4898
#>                                  sigma^2 = 11082.9278
#> 
#> P-value:                         0.5207941
#> 
# }