The R package HDNRA includes the latest methods based on the normal-reference approach for testing the equality of the mean vectors of high-dimensional samples with possibly different covariance matrices. HDNRA
also demonstrates how these tests are implemented, covering not only the two-sample problem but also the general linear hypothesis testing (GLHT) problem. The package provides user-friendly access to these tests. The tests for both problems are coded in C++ through Rcpp to keep execution time reasonable. Besides Rcpp, the package has no strict dependencies, which makes it a stable, self-contained toolbox that invites reuse.
HDNRA contains:
Two real data sets in HDNRA
Seven normal-reference tests for the two-sample problem
- ZGZC2020.TS.2cNRT
- ZZ2022.TS.3cNRT
- ZZZ2020.TS.2cNRT
- ZWZ2023.TSBF.2cNRT
- ZZ2022.TSBF.3cNRT
- ZZGZ2021.TSBF.2cNRT
- ZZZ2023.TSBF.2cNRT
Five normal-reference tests for the GLHT problem in HDNRA
Four existing tests for the two-sample problem in HDNRA
Five existing tests for the GLHT problem in HDNRA
Installation
You can install and load the most recent development version of HDNRA from GitHub with:
# Installing from GitHub requires that you first install either the devtools or the remotes package
install.packages("devtools")
# Or
install.packages("remotes")
# install the most recent development version from GitHub
devtools::install_github("nie23wp8738/HDNRA")
# Or
remotes::install_github("nie23wp8738/HDNRA")
# Load the installed package
library(HDNRA)
Usage
Load the package
library(HDNRA)
#> **------------------------------------------------------**
#> ** HHH HHH DDDDDDDD NNNN NN RRRRRR AAAA **
#> ** HHH HHH DD DD NNNNN NN RR RR AA AA **
#> ** HHHHHHHHH DD DD NN NN NN RRRRRR AAAAAAAA **
#> ** HHH HHH DD DD NN NN NN RR RR AA AA **
#> ** HHH HHH DDDDDDDD NN NNNN RR RR AA AA **
#> **
#> ** High-Dimensional Location Testing Toolbox
#> **
#> ** Version :2.0.1 (2024)
#> ** Authors :Pengfei Wang,Shuqi Luo,Tianming Zhu,Bu Zhou
#> ** Maintainer:Pengfei Wang (nie23.wp8738@e.ntu.edu.sg)
#> **
#> ** This package provides a comprehensive set of tools for
#> ** high-dimensional location testing, including classical
#> ** and state-of-the-art normal-reference approaches for
#> ** two-sample and general linear hypothesis testing (GLHT).
#> **
#> ** Please report any bugs or suggestions to the maintainer.
#> **------------------------------------------------------**
Example data
Package HDNRA comes with two real data sets:
# A COVID19 data set from NCBI with ID GSE152641 for the two-sample problem.
?COVID19
data(COVID19)
dim(COVID19)
#> [1] 87 20460
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
dim(group1)
#> [1] 24 20460
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
dim(group2)
#> [1] 62 20460
# A corneal data set acquired during a keratoconus study for the GLHT problem.
?corneal
data(corneal)
dim(corneal)
#> [1] 150 2000
group1 <- as.matrix(corneal[1:43, ]) ## normal group
dim(group1)
#> [1] 43 2000
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
dim(group2)
#> [1] 14 2000
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
dim(group3)
#> [1] 21 2000
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
dim(group4)
#> [1] 72 2000
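The four row ranges above can also be expressed as a single group label, which avoids retyping indices. The following is a minimal base-R sketch, not part of HDNRA: `X` is a placeholder matrix with the same shape as `corneal` (so the snippet runs without the package), and the names `sizes`, `labels`, `Y`, and `n` are chosen here for illustration only.

```r
# Placeholder with the same dimensions as corneal (150 x 2000); not the real data.
X <- matrix(0, nrow = 150, ncol = 2000)

# Group sizes taken from the row ranges 1:43, 44:57, 58:78, 79:150 used above.
sizes <- c(normal = 43, unilateral_suspect = 14,
           suspect_map = 21, clinical_keratoconus = 72)

# One label per row, in the same order as the rows of the data matrix.
labels <- factor(rep(names(sizes), times = sizes), levels = names(sizes))

# Split the rows into a list of per-group matrices and record the group sizes.
Y <- lapply(levels(labels), function(g) X[labels == g, , drop = FALSE])
names(Y) <- levels(labels)
n <- vapply(Y, nrow, integer(1))
print(n)
```

Replacing `X` with `as.matrix(corneal)` would give the same four groups as the explicit indexing above.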
Example for two-sample problem
A simple example of how to use one of the normal-reference tests ZWZ2023.TSBF.2cNRT using data set COVID19:
data("COVID19")
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) # healthy group1
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) # patients group2
# Each data matrix passed to ZWZ2023.TSBF.2cNRT here is n by p (one observation per row); transpose first if your data are stored as p by n
ZWZ2023.TSBF.2cNRT(group1, group2)
#>
#> Results of Hypothesis Test
#> --------------------------
#>
#> Test name: Zhu et al. (2023)'s test
#>
#> Null Hypothesis: Difference between two mean vectors is 0
#>
#> Alternative Hypothesis: Difference between two mean vectors is not 0
#>
#> Data: group1 and group2
#>
#> Sample Sizes: n1 = 24
#> n2 = 62
#>
#> Sample Dimension: 20460
#>
#> Test Statistic: T[ZWZ] = 4.1877
#>
#> Approximation method to the 2-c matched chi^2-approximation
#> null distribution of T[ZWZ]:
#>
#> Approximation parameter(s): df1 = 2.7324
#> df2 = 171.7596
#>
#> P-value: 0.008672887
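The reported p-value can be cross-checked by hand with base R. The sketch below rests on an assumption that the output itself does not state: that `df1` and `df2` are the degrees of freedom of an F reference distribution for `T[ZWZ]`. The numbers are copied from the printed output, and the variable names are illustrative.

```r
# Values copied from the printed output above.
t_zwz <- 4.1877
df1 <- 2.7324
df2 <- 171.7596

# Assumption: under the null hypothesis T[ZWZ] is approximated by F(df1, df2),
# so the p-value is the upper tail probability at the observed statistic.
p_check <- pf(t_zwz, df1 = df1, df2 = df2, lower.tail = FALSE)
print(p_check)
```

If the assumption holds, `p_check` should be close to the printed p-value, up to the rounding of the inputs.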
Example for GLHT problem
A simple example of how to use one of the normal-reference tests ZZG2022.GLHTBF.2cNRT using data set corneal:
data("corneal")
dim(corneal)
#> [1] 150 2000
group1 <- as.matrix(corneal[1:43, ]) ## normal group
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
p <- dim(corneal)[2]
Y <- list()
k <- 4
Y[[1]] <- group1
Y[[2]] <- group2
Y[[3]] <- group3
Y[[4]] <- group4
n <- c(nrow(Y[[1]]),nrow(Y[[2]]),nrow(Y[[3]]),nrow(Y[[4]]))
G <- cbind(diag(k-1),rep(-1,k-1))
ZZG2022.GLHTBF.2cNRT(Y,G,n,p)
#>
#> Results of Hypothesis Test
#> --------------------------
#>
#> Test name: Zhang et al. (2022)'s test
#>
#> Null Hypothesis: The general linear hypothesis is true
#>
#> Alternative Hypothesis: The general linear hypothesis is not true
#>
#> Data: Y
#>
#> Sample Sizes: n1 = 43
#> n2 = 14
#> n3 = 21
#> n4 = 72
#>
#> Sample Dimension: 2000
#>
#> Test Statistic: T[ZZG] = 159.7325
#>
#> Approximation method to the 2-c matched chi^2-approximation
#> null distribution of T[ZZG]:
#>
#> Approximation parameter(s): df = 6.1652
#> beta = 6.1464
#>
#> P-value: 0.0002577084
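The matrix `G` in the call above encodes which hypothesis is tested. The base-R sketch below, which does not use HDNRA, illustrates this with small hypothetical mean matrices (`M_equal`, `M_shift`): each row of `G %*% M` is the difference between one of the first `k - 1` group means and the last, so the product is zero exactly when all group means coincide. The second part cross-checks the p-value under the assumption, not stated in the output, that the 2-c matched chi^2-approximation treats `T[ZZG]` as `beta` times a chi-square variable with `df` degrees of freedom.

```r
k <- 4
G <- cbind(diag(k - 1), rep(-1, k - 1))  # same contrast matrix as above

# Hypothetical group means: k x 3 instead of k x 2000, for readability.
M_equal <- matrix(rep(c(1, 2, 3), each = k), nrow = k)  # all rows identical
M_shift <- M_equal
M_shift[2, ] <- M_shift[2, ] + 0.5                      # group 2 differs

contrast_equal <- G %*% M_equal  # all zeros: the hypothesis holds
contrast_shift <- G %*% M_shift  # row 2 is non-zero: the hypothesis fails
print(contrast_equal)
print(contrast_shift)

# Values copied from the printed output above.
t_zzg <- 159.7325
df_zzg <- 6.1652
beta_zzg <- 6.1464

# Assumption: under the null hypothesis T[ZZG] ~ beta * chi^2 with df degrees of freedom.
p_check <- pchisq(t_zzg / beta_zzg, df = df_zzg, lower.tail = FALSE)
print(p_check)
```

If the assumed form of the approximation is right, `p_check` should be close to the printed p-value, up to the rounding of the inputs.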
Code of Conduct
Please note that the HDNRA project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.