Title: | Testing One Hypothesis Multiple Times |
---|---|
Description: | Approximations of global p-values when testing hypothesis in presence of non-identifiable nuisance parameters. The method relies on the Euler characteristic heuristic and the expected Euler characteristic is efficiently computed by in Algeri and van Dyk (2018) <arXiv:1803.03858>. |
Authors: | Sara Algeri |
Maintainer: | Sara Algeri <[email protected]> |
License: | GPL-2 |
Version: | 1.4 |
Built: | 2025-03-02 03:18:20 UTC |
Source: | https://github.com/cran/TOHM |
Approximations of global p-values when testing hypothesis in presence of non-identifiable nuisance parameters. The method relies on the Euler characteristic heuristic and the expected Euler characteristic is efficiently computed by in Algeri and van Dyk (2018) <arXiv:1803.03858>.
The functions collected in TOHM
mainly focus on the implementation of the Likelihood Ratio Tests (see TOHM_LRT
). However, several functions (e.g.,EC_T
, global_p
) can be used to obtain global p-values for other test statistics and to compute the Euler characteristic using the graph algorithm described in Algeri and van Dyk (2018).
Sara Algeri Maintainer: Sara Algeri <[email protected]>
S. Algeri and D.A. van Dyk. Testing one hypothesis multiple times: The multidimensional case. arXiv:1803.03858, submitted to the Journal of Computational and Graphical Statistics, 2018.
#generating data of interest N<-100 x<-as.matrix(cbind(runif(N*2,172.5,217.5),runif(N*2,-2,58))) x2<-x[(x[,1]<=217.5)&(x[,1]>=172.5),] x_sel<-x2[(x2[,2]<=(28+sqrt(30^2-(x2[,1]-195)^2)))&(x2[,2]>=(28- sqrt(30^2-(x2[,1]-195)^2))),] data<-x_sel[sample(seq(1:(dim(x_sel)[1])),N),] #Specifying minus-log-likelihood kg<-function(theta){integrate(Vectorize(function(x) { exp(-0.5*((x-theta[1])/0.5)^2)*integrate(function(y) { exp(-0.5*((y-theta[2])/0.5)^2) }, 28-sqrt(30^2-(x-195)^2), 28+sqrt(30^2-(x-195)^2))$value}) , 172.5, 217.5)$value} mll<-function(eta,x,theta){ -sum(log((1-eta)/(pi*(30)^2)+eta*exp(-0.5*((x[,1]- theta[1])/0.5)^2- 0.5*((x[,2]-theta[2])/0.5)^2)/kg(theta)))} #Specifying search region theta1<-seq(172.5,217.5,by=15) theta2<-seq(-2,58,by=10) THETA<-as.matrix(expand.grid(theta1,theta2)) originalR<-dim(THETA)[1] rownames(THETA)<-1:(dim(THETA)[1]) THETA2<-THETA[(THETA[,1]<=217.5)&(THETA[,1]>=172.5),] THETA_sel<-THETA2[(THETA2[,2]<=(28+sqrt(30^2-(THETA2[,1]- 195)^2)))&(THETA2[,2]>=(28-sqrt(30^2-(THETA2[,1]-195)^2))),] #Generating toy EC ECs<-cbind(rpois(100,1.5),rpois(100,1)) TOHM_LRT(data,mll,null0=0,init=c(0.1),lowlim=c(0),uplim=c(1), THETA=THETA_sel,ck=c(1,8),type=c("Chi-bar^2"), k=NULL,k_vec=c(0,1),weights=c(0.5,0.5), ECdensities=NULL,ECs=ECs)
#generating data of interest N<-100 x<-as.matrix(cbind(runif(N*2,172.5,217.5),runif(N*2,-2,58))) x2<-x[(x[,1]<=217.5)&(x[,1]>=172.5),] x_sel<-x2[(x2[,2]<=(28+sqrt(30^2-(x2[,1]-195)^2)))&(x2[,2]>=(28- sqrt(30^2-(x2[,1]-195)^2))),] data<-x_sel[sample(seq(1:(dim(x_sel)[1])),N),] #Specifying minus-log-likelihood kg<-function(theta){integrate(Vectorize(function(x) { exp(-0.5*((x-theta[1])/0.5)^2)*integrate(function(y) { exp(-0.5*((y-theta[2])/0.5)^2) }, 28-sqrt(30^2-(x-195)^2), 28+sqrt(30^2-(x-195)^2))$value}) , 172.5, 217.5)$value} mll<-function(eta,x,theta){ -sum(log((1-eta)/(pi*(30)^2)+eta*exp(-0.5*((x[,1]- theta[1])/0.5)^2- 0.5*((x[,2]-theta[2])/0.5)^2)/kg(theta)))} #Specifying search region theta1<-seq(172.5,217.5,by=15) theta2<-seq(-2,58,by=10) THETA<-as.matrix(expand.grid(theta1,theta2)) originalR<-dim(THETA)[1] rownames(THETA)<-1:(dim(THETA)[1]) THETA2<-THETA[(THETA[,1]<=217.5)&(THETA[,1]>=172.5),] THETA_sel<-THETA2[(THETA2[,2]<=(28+sqrt(30^2-(THETA2[,1]- 195)^2)))&(THETA2[,2]>=(28-sqrt(30^2-(THETA2[,1]-195)^2))),] #Generating toy EC ECs<-cbind(rpois(100,1.5),rpois(100,1)) TOHM_LRT(data,mll,null0=0,init=c(0.1),lowlim=c(0),uplim=c(1), THETA=THETA_sel,ck=c(1,8),type=c("Chi-bar^2"), k=NULL,k_vec=c(0,1),weights=c(0.5,0.5), ECdensities=NULL,ECs=ECs)
Computes the Euler characteristic (EC) density of a given order for Chi-squared random fields.
chi2_ECden(c, k, j)
chi2_ECden(c, k, j)
c |
Value on which the EC density is evaluated. |
k |
Degrees of freedom of the Chi-square random field. |
j |
Order of the EC density to be implemented. |
Returns the value of the EC density of order j
evaluated at c
for a Chi-square random field with k
degrees of freedom.
Sara Algeri
R.J. Adler and J.E. Taylor. Random fields and geometry. Springer Science and Business Media, 2009.
c<-1 k<-1 j<-2 chi2_ECden(c,k,j)
c<-1 k<-1 j<-2 chi2_ECden(c,k,j)
It computes the Euler characteristic (EC) of the generalized Likelihood Ratio test (LRT) field above specified thresholds over a given search area.
EC_LRT(ck, x, mll, null0, init, lowlim, uplim, THETA)
EC_LRT(ck, x, mll, null0, init, lowlim, uplim, THETA)
ck |
Vector of thresholds defining the excursions sets with respect to which the ECs are computed. |
x |
A vector or matrix collecting the data on which the LRT is computed. |
mll |
A function specifying the negative (profile) log-likelihood. See details. |
null0 |
A scalar or vector collecting the values of the free parameters under the null hypothesis. See details. |
init |
Vector of initial values for the MLE. |
lowlim |
Vector of lower bounds for the MLE. |
uplim |
Vector of upper bounds for the MLE. |
THETA |
A vector or matrix of grid values for the nuisance parameter with respect to which the search is performed. |
mll
takes as first argument the vector of the parameters for which the MLE is generated. Other arguments of mll
are the data vector or matrix (x
) and a scalar or vector corresponding to the fixed value for the nuisance parameter with respect to which the profilying is computed (theta
, see gLRT
). If the latter is a vector it must be of same length of the rows in THETA
.
If the null model has nuisance parameters, null0
takes as arguments the values of the parameters being tested under the null hypothesis, followed by the estimates of the nuisance parameters obtained assuming that the null hypothesis is true.
Returns a vector of EC values with respect to the thresholds specified in ck
.
Sara Algeri
S. Algeri and D.A. van Dyk. Testing one hypothesis multiple times: The multidimensional case. arXiv:1803.03858, submitted to the Journal of Computational and Graphical Statistics, 2018.
#generating data of interest N<-100 x<-as.matrix(cbind(runif(N*2,172.5,217.5),runif(N*2,-2,58))) x2<-x[(x[,1]<=217.5)&(x[,1]>=172.5),] x_sel<-x2[(x2[,2]<=(28+sqrt(30^2-(x2[,1]-195)^2)))&(x2[,2]>=(28- sqrt(30^2-(x2[,1]-195)^2))),] data<-x_sel[sample(seq(1:(dim(x_sel)[1])),N),] #Specifying minus-log-likelihood kg<-function(theta){integrate(Vectorize(function(x) { exp(-0.5*((x-theta[1])/0.5)^2)*integrate(function(y) { exp(-0.5*((y-theta[2])/0.5)^2) }, 28-sqrt(30^2-(x-195)^2), 28+sqrt(30^2-(x-195)^2))$value}) , 172.5, 217.5)$value} mll<-function(eta,x,theta){ -sum(log((1-eta)/(pi*(30)^2)+eta*exp(-0.5*((x[,1]- theta[1])/0.5)^2- 0.5*((x[,2]-theta[2])/0.5)^2)/kg(theta)))} #Specifying search region theta1<-seq(172.5,217.5,by=15) theta2<-seq(-2,58,by=10) THETA<-as.matrix(expand.grid(theta1,theta2)) originalR<-dim(THETA)[1] rownames(THETA)<-1:(dim(THETA)[1]) THETA2<-THETA[(THETA[,1]<=217.5)&(THETA[,1]>=172.5),] THETA_sel<-THETA2[(THETA2[,2]<=(28+sqrt(30^2-(THETA2[,1]- 195)^2)))&(THETA2[,2]>=(28-sqrt(30^2-(THETA2[,1]-195)^2))),] EC_LRT(ck=c(1,8),x=data,mll=mll,null0=0,init=c(0.1), lowlim=c(0),uplim=c(1), THETA_sel)
#generating data of interest N<-100 x<-as.matrix(cbind(runif(N*2,172.5,217.5),runif(N*2,-2,58))) x2<-x[(x[,1]<=217.5)&(x[,1]>=172.5),] x_sel<-x2[(x2[,2]<=(28+sqrt(30^2-(x2[,1]-195)^2)))&(x2[,2]>=(28- sqrt(30^2-(x2[,1]-195)^2))),] data<-x_sel[sample(seq(1:(dim(x_sel)[1])),N),] #Specifying minus-log-likelihood kg<-function(theta){integrate(Vectorize(function(x) { exp(-0.5*((x-theta[1])/0.5)^2)*integrate(function(y) { exp(-0.5*((y-theta[2])/0.5)^2) }, 28-sqrt(30^2-(x-195)^2), 28+sqrt(30^2-(x-195)^2))$value}) , 172.5, 217.5)$value} mll<-function(eta,x,theta){ -sum(log((1-eta)/(pi*(30)^2)+eta*exp(-0.5*((x[,1]- theta[1])/0.5)^2- 0.5*((x[,2]-theta[2])/0.5)^2)/kg(theta)))} #Specifying search region theta1<-seq(172.5,217.5,by=15) theta2<-seq(-2,58,by=10) THETA<-as.matrix(expand.grid(theta1,theta2)) originalR<-dim(THETA)[1] rownames(THETA)<-1:(dim(THETA)[1]) THETA2<-THETA[(THETA[,1]<=217.5)&(THETA[,1]>=172.5),] THETA_sel<-THETA2[(THETA2[,2]<=(28+sqrt(30^2-(THETA2[,1]- 195)^2)))&(THETA2[,2]>=(28-sqrt(30^2-(THETA2[,1]-195)^2))),] EC_LRT(ck=c(1,8),x=data,mll=mll,null0=0,init=c(0.1), lowlim=c(0),uplim=c(1), THETA_sel)
It computes the Euler characteristic (EC) of a given field above specified thresholds over a specified search area.
EC_T(ck, Ts, THETA)
EC_T(ck, Ts, THETA)
ck |
Vector of thresholds defining the excursions sets with respect to which the ECs are computed. |
Ts |
Vector of values of the field for each grid point in |
THETA |
A vector or matrix of grid values for the nuisance parameter with respect to which the search is performed. |
Returns a vector of EC values with respect to the thresholds specified in ck
.
Sara Algeri
S. Algeri and D.A. van Dyk. Testing one hypothesis multiple times: The multidimensional case. arXiv:1803.03858, submitted to the Journal of Computational and Graphical Statistics, 2018.
EC_T(ck=c(3,4),Ts=rnorm(10), THETA=cbind(1:10,21:30))
EC_T(ck=c(3,4),Ts=rnorm(10), THETA=cbind(1:10,21:30))
Compute the Euler characteristic (EC) densities for Gaussian, Chi-square and Chi-bar-square random fields up to a given order.
ECden_vec(c, D, type = c("Gaussian", "Chi^2", "Chi-bar^2"), k = NULL, k_vec = NULL, weights = NULL)
ECden_vec(c, D, type = c("Gaussian", "Chi^2", "Chi-bar^2"), k = NULL, k_vec = NULL, weights = NULL)
c |
Value on which the EC densities are evaluated. |
D |
Maximum order of the EC density to be computed. |
type |
Type of random field. The possible options are |
k |
If |
k_vec |
If |
weights |
If |
If type="Chi-bar^2"
the degrees of freedom of the Chi-square random fields involved in the mixture, as well as the respective weights, have to be spefcified in the arguments k_vec
and weights
.
Returns the values of the EC densities of order zero up to the dimension of the search area considered and evaluated at c
.
Sara Algeri
R.J. Adler and J.E. Taylor. Random fields and geometry. Springer Science and Business Media, 2009.
J.E. Taylor and K.J.Worsley. Detecting sparse cone alternatives for gaussian random fields, with an application to fmri. Statistica Sinica, 2013.
ECden_vec(12,2,"Chi-bar^2",k_vec=c(0,1),weights=c(0.5,0.5))
ECden_vec(12,2,"Chi-bar^2",k_vec=c(0,1),weights=c(0.5,0.5))
It computes the maximum of the generalized Likelihood Ratio Test (LRT) evaluated over a grid of values.
find_max(x, mll, null0, init, lowlim, uplim, THETA)
find_max(x, mll, null0, init, lowlim, uplim, THETA)
x |
A vector or matrix collecting the data on which the LRT field is computed. |
mll |
A function specifying the negative (profile) log-likelihood. See details. |
null0 |
A vector or scalar of the free parameters under the null hypothesis. See details. |
init |
A vector or scalar of initial values for the MLE. |
lowlim |
A vector or scalar of lower bounds for the MLE. |
uplim |
A vector or scalar of upper bounds for the MLE. |
THETA |
A vector or matrix of grid values of the nuisance parameter with respect to which the search is performed. |
mll
takes as first argument the vector of the parameters for which the MLE is generated. Other arguments of mll
are the data vector or matrix (x
) and a scalar or vector corresponding to the fixed value for the nuisance parameter with respect to which the profilying is computed (theta
, see gLRT
). If the latter is a vector it must be of same length of the rows in THETA
.
If the null model has nuisance parameters, null0
takes as arguments the values of the parameters being tested under the null hypothesis, followed by the estimates of the nuisance parameters obtained assuming that the null hypothesis is true.
max_gLRT |
Maximum observed of the LRT field. |
theta_max |
Value of |
Sara Algeri
S. Algeri and D.A. van Dyk. Testing one hypothesis multiple times: The multidimensional case. arXiv:1803.03858, submitted to the Journal of Computational and Graphical Statistics, 2018.
#generating data of interest N<-100 x<-as.matrix(cbind(runif(N*2,172.5,217.5),runif(N*2,-2,58))) x2<-x[(x[,1]<=217.5)&(x[,1]>=172.5),] x_sel<-x2[(x2[,2]<=(28+sqrt(30^2-(x2[,1]-195)^2)))&(x2[,2]>=(28- sqrt(30^2-(x2[,1]-195)^2))),] data<-x_sel[sample(seq(1:(dim(x_sel)[1])),N),] #Specifying minus-log-likelihood kg<-function(theta){integrate(Vectorize(function(x) { exp(-0.5*((x-theta[1])/0.5)^2)*integrate(function(y) { exp(-0.5*((y-theta[2])/0.5)^2) }, 28-sqrt(30^2-(x-195)^2), 28+sqrt(30^2-(x-195)^2))$value}) , 172.5, 217.5)$value} mll<-function(eta,x,theta){ -sum(log((1-eta)/(pi*(30)^2)+eta*exp(-0.5*((x[,1]- theta[1])/0.5)^2- 0.5*((x[,2]-theta[2])/0.5)^2)/kg(theta)))} #Specifying search region theta1<-seq(172.5,217.5,by=15) theta2<-seq(-2,58,by=10) THETA<-as.matrix(expand.grid(theta1,theta2)) originalR<-dim(THETA)[1] rownames(THETA)<-1:(dim(THETA)[1]) THETA2<-THETA[(THETA[,1]<=217.5)&(THETA[,1]>=172.5),] THETA_sel<-THETA2[(THETA2[,2]<=(28+sqrt(30^2-(THETA2[,1]- 195)^2)))&(THETA2[,2]>=(28-sqrt(30^2-(THETA2[,1]-195)^2))),] find_max(x=data,mll=mll,null0=0,init=c(0.1), lowlim=c(0),uplim=c(1), THETA=THETA_sel)
#generating data of interest N<-100 x<-as.matrix(cbind(runif(N*2,172.5,217.5),runif(N*2,-2,58))) x2<-x[(x[,1]<=217.5)&(x[,1]>=172.5),] x_sel<-x2[(x2[,2]<=(28+sqrt(30^2-(x2[,1]-195)^2)))&(x2[,2]>=(28- sqrt(30^2-(x2[,1]-195)^2))),] data<-x_sel[sample(seq(1:(dim(x_sel)[1])),N),] #Specifying minus-log-likelihood kg<-function(theta){integrate(Vectorize(function(x) { exp(-0.5*((x-theta[1])/0.5)^2)*integrate(function(y) { exp(-0.5*((y-theta[2])/0.5)^2) }, 28-sqrt(30^2-(x-195)^2), 28+sqrt(30^2-(x-195)^2))$value}) , 172.5, 217.5)$value} mll<-function(eta,x,theta){ -sum(log((1-eta)/(pi*(30)^2)+eta*exp(-0.5*((x[,1]- theta[1])/0.5)^2- 0.5*((x[,2]-theta[2])/0.5)^2)/kg(theta)))} #Specifying search region theta1<-seq(172.5,217.5,by=15) theta2<-seq(-2,58,by=10) THETA<-as.matrix(expand.grid(theta1,theta2)) originalR<-dim(THETA)[1] rownames(THETA)<-1:(dim(THETA)[1]) THETA2<-THETA[(THETA[,1]<=217.5)&(THETA[,1]>=172.5),] THETA_sel<-THETA2[(THETA2[,2]<=(28+sqrt(30^2-(THETA2[,1]- 195)^2)))&(THETA2[,2]>=(28-sqrt(30^2-(THETA2[,1]-195)^2))),] find_max(x=data,mll=mll,null0=0,init=c(0.1), lowlim=c(0),uplim=c(1), THETA=THETA_sel)
Computes the Euler characteristic (EC) density of a given order for Gaussian random fields.
Gauss_ECden(c, j)
Gauss_ECden(c, j)
c |
Value on which the EC density is evaluated. |
j |
Order of the EC density to be implemented. |
Returns the value of the EC density of order j
evaluated at c
for a Gaussian random field.
Sara Algeri
R.J. Adler and J.E. Taylor. Random fields and geometry. Springer Science and Business Media, 2009.
c<-1 j<-2 Gauss_ECden(c,4)
c<-1 j<-2 Gauss_ECden(c,4)
It computes the global p-value for a given value of the test statistic.
global_p(c, ck, type = c("Gaussian", "Chi^2", "Chi-bar^2"), k = NULL, k_vec = NULL, weights = NULL, ECdensities = NULL, ECs)
global_p(c, ck, type = c("Gaussian", "Chi^2", "Chi-bar^2"), k = NULL, k_vec = NULL, weights = NULL, ECdensities = NULL, ECs)
c |
Observed value of the test statistic. |
ck |
Vector of thresholds defining the excursions sets with respect to which the ECs are computed. |
type |
Type of random field. The possible options are |
k |
If |
k_vec |
If |
weights |
If |
ECdensities |
See datails. |
ECs |
A vector or matrix containing the Euler characteristics (ECs) computed over a Monte Carlo simulation of the random field under the null model. Each colum correspond to the ECs obtained for each of the thresholds in |
If type="Chi-bar^2"
the degrees of freedom of the Chi-square random fields involved in the mixture, as well as the respective weights, have to be spefcified in the arguments k_vec
and weights
.
If the distribution of the random field is not available in type
, the user can specify in ECdensities
a function taking c
as argument and returning
the vector of the desired EC densities to be evaluated at c
. Notice that the length of the vector returned by the function specified in ECdensities
must corresponds to one plus the dimension of the search area (since the first value should correspond to the EC density of order zero (see ECden_vec
)).
global_p |
Global p-value. |
MCerror |
Monte Carlo error associated to the global p-vaue. |
Sara Algeri
S. Algeri and D.A. van Dyk. Testing one hypothesis multiple times: The multidimensional case. arXiv:1803.03858, submitted to the Journal of Computational and Graphical Statistics, 2018.
ck<-c(1,2) ECs<-cbind(rpois(100,1.5),rpois(100,1)) global_p(c=12,ck=ck,type="Gaussian",ECs=ECs)
ck<-c(1,2) ECs<-cbind(rpois(100,1.5),rpois(100,1)) global_p(c=12,ck=ck,type="Gaussian",ECs=ECs)
Compute the generalized Likelihood Ratio Test (LRT) for a specified value of the nuisance parameter.
gLRT(theta, mll, x, init, lowlim, uplim, null0)
gLRT(theta, mll, x, init, lowlim, uplim, null0)
theta |
A vector or scalar of the value of the nuisance parameter with respect to which the LRT is computed. |
mll |
A function specifying the negative (profile) log-likelihood. See details. |
x |
A vector or matrix collecting the data. |
init |
A vector or scalar of initial values for the MLE. |
lowlim |
A vector or scalar of lower bounds for the MLE. |
uplim |
A vector or scalar of upper bounds for the MLE. |
null0 |
A vector or scalar of the free parameters under the null hypothesis. See details. |
mll
takes as first argument the vector of the parameters for which the MLE is generated. Other arguments of mll
are the data vector or matrix (x
) and a scalar or vector corresponding to the fixed value for the nuisance parameter with respect to which the profilying is computed (theta
, see gLRT
). If the latter is a vector it must be of same length of the rows in THETA
.
If the null model has nuisance parameters, null0
takes as arguments the values of the parameters being tested under the null hypothesis, followed by the estimates of the nuisance parameters obtained assuming that the null hypothesis is true.
The value of the generalized LRT for a specified value of theta
.
Sara Algeri
S. Algeri and D.A. van Dyk. Testing one hypothesis multiple times: The multidimensional case. arXiv:1803.03858, submitted to the Journal of Computational and Graphical Statistics, 2018.
A.C. Davison. Statistical models, volume 11. Cambridge University Press, 2003.
#generating data of interest N<-100 x<-as.matrix(cbind(runif(N*2,172.5,217.5),runif(N*2,-2,58))) x2<-x[(x[,1]<=217.5)&(x[,1]>=172.5),] x_sel<-x2[(x2[,2]<=(28+sqrt(30^2-(x2[,1]-195)^2)))&(x2[,2]>=(28- sqrt(30^2-(x2[,1]-195)^2))),] data<-x_sel[sample(seq(1:(dim(x_sel)[1])),N),] #Specifying minus-log-likelihood kg<-function(theta){integrate(Vectorize(function(x) { exp(-0.5*((x-theta[1])/0.5)^2)*integrate(function(y) { exp(-0.5*((y-theta[2])/0.5)^2) }, 28-sqrt(30^2-(x-195)^2), 28+sqrt(30^2-(x-195)^2))$value}) , 172.5, 217.5)$value} mll<-function(eta,x,theta){ -sum(log((1-eta)/(pi*(30)^2)+eta*exp(-0.5*((x[,1]- theta[1])/0.5)^2- 0.5*((x[,2]-theta[2])/0.5)^2)/kg(theta)))} gLRT(theta=c(200,30),mll=mll,init=0.1,lowlim=0,uplim=1,null0=0,x=data)
#generating data of interest N<-100 x<-as.matrix(cbind(runif(N*2,172.5,217.5),runif(N*2,-2,58))) x2<-x[(x[,1]<=217.5)&(x[,1]>=172.5),] x_sel<-x2[(x2[,2]<=(28+sqrt(30^2-(x2[,1]-195)^2)))&(x2[,2]>=(28- sqrt(30^2-(x2[,1]-195)^2))),] data<-x_sel[sample(seq(1:(dim(x_sel)[1])),N),] #Specifying minus-log-likelihood kg<-function(theta){integrate(Vectorize(function(x) { exp(-0.5*((x-theta[1])/0.5)^2)*integrate(function(y) { exp(-0.5*((y-theta[2])/0.5)^2) }, 28-sqrt(30^2-(x-195)^2), 28+sqrt(30^2-(x-195)^2))$value}) , 172.5, 217.5)$value} mll<-function(eta,x,theta){ -sum(log((1-eta)/(pi*(30)^2)+eta*exp(-0.5*((x[,1]- theta[1])/0.5)^2- 0.5*((x[,2]-theta[2])/0.5)^2)/kg(theta)))} gLRT(theta=c(200,30),mll=mll,init=0.1,lowlim=0,uplim=1,null0=0,x=data)
It implements the procedure described in Algeri and van Dyk (2018) to perform tests of hypothesis under non-regular conditions, and which can be formulated as test of hypothesis where a nuisance parameter is present only under the alternative.
TOHM_LRT(x, mll, null0, init, lowlim, uplim, THETA, ck, type = c("Chi^2", "Chi-bar^2"), k = NULL, k_vec = NULL, weights = NULL, ECdensities = NULL, ECs = NULL)
TOHM_LRT(x, mll, null0, init, lowlim, uplim, THETA, ck, type = c("Chi^2", "Chi-bar^2"), k = NULL, k_vec = NULL, weights = NULL, ECdensities = NULL, ECs = NULL)
x |
A vector or matrix collecting the data. |
mll |
A function specifying the negative (profile) log-likelihood. See details. |
null0 |
A vector or scalar of the free parameters under the null hypothesis. See details. |
init |
A vector or scalar of initial values for the MLE. |
lowlim |
A vector or scalar of lower bounds for the MLE. |
uplim |
A vector or scalar of upper bounds for the MLE. |
THETA |
A vector or matrix of grid values of the nuisance parameter with respect to which the search is performed. |
ck |
Vector of thresholds defining the excursions sets with respect to which the ECs are computed. |
type |
Type of random field. The possible options are |
k |
If |
k_vec |
If |
weights |
If |
ECdensities |
See datails. |
ECs |
A vector or matrix containing the Euler characteristics (ECs) computed over a Monte Carlo simulation of the random field under the null model. Each colum correspond to the ECs obtained for each of the thresholds in |
mll
takes as first argument the vector of the parameters for which the MLE is generated. Other arguments of mll
are the data vector or matrix (x
) and a scalar or vector corresponding to the fixed value for the nuisance parameter with respect to which the profilying is computed (theta
, see gLRT
). If the latter is a vector it must be of same length of the rows in THETA
.
If the null model has nuisance parameters, null0
takes as arguments the values of the parameters being tested under the null hypothesis, followed by the estimates of the nuisance parameters obtained assuming that the null hypothesis is true.
If type="Chi-bar^2"
the degrees of freedom of the Chi-square random fields involved in the mixture, as well as the respective weights, have to be spefcified in the arguments k_vec
and weights
.
If the distribution of the random field is not available in type
, the user can specify in ECdensities
a function taking c
as argument and returning
the vector of the desired EC densities to be evaluated at c
. Notice that the length of the vector returned by the function specified in ECdensities
must corresponds to one plus the dimension of the search area (since the first value should correspond to the EC density of order zero (see ECden_vec
)).
max_gLRT |
Maximum observed of the LRT field. |
theta_max |
Value of |
global_p |
Global p-value. |
MCerror |
Monte Carlo error associated to the global p-vaue. |
Sara Algeri
S. Algeri and D.A. van Dyk. Testing one hypothesis multiple times: The multidimensional case. arXiv:1803.03858, submitted to the Journal of Computational and Graphical Statistics, 2018.
#generating data of interest N<-100 x<-as.matrix(cbind(runif(N*2,172.5,217.5),runif(N*2,-2,58))) x2<-x[(x[,1]<=217.5)&(x[,1]>=172.5),] x_sel<-x2[(x2[,2]<=(28+sqrt(30^2-(x2[,1]-195)^2)))&(x2[,2]>=(28- sqrt(30^2-(x2[,1]-195)^2))),] data<-x_sel[sample(seq(1:(dim(x_sel)[1])),N),] #Specifying minus-log-likelihood kg<-function(theta){integrate(Vectorize(function(x) { exp(-0.5*((x-theta[1])/0.5)^2)*integrate(function(y) { exp(-0.5*((y-theta[2])/0.5)^2) }, 28-sqrt(30^2-(x-195)^2), 28+sqrt(30^2-(x-195)^2))$value}) , 172.5, 217.5)$value} mll<-function(eta,x,theta){ -sum(log((1-eta)/(pi*(30)^2)+eta*exp(-0.5*((x[,1]- theta[1])/0.5)^2- 0.5*((x[,2]-theta[2])/0.5)^2)/kg(theta)))} #Specifying search region theta1<-seq(172.5,217.5,by=15) theta2<-seq(-2,58,by=10) THETA<-as.matrix(expand.grid(theta1,theta2)) originalR<-dim(THETA)[1] rownames(THETA)<-1:(dim(THETA)[1]) THETA2<-THETA[(THETA[,1]<=217.5)&(THETA[,1]>=172.5),] THETA_sel<-THETA2[(THETA2[,2]<=(28+sqrt(30^2-(THETA2[,1]- 195)^2)))&(THETA2[,2]>=(28-sqrt(30^2-(THETA2[,1]-195)^2))),] #Generating toy EC ECs<-cbind(rpois(100,1.5),rpois(100,1)) TOHM_LRT(data,mll,null0=0,init=c(0.1),lowlim=c(0),uplim=c(1), THETA=THETA_sel,ck=c(1,8),type=c("Chi-bar^2"), k=NULL,k_vec=c(0,1),weights=c(0.5,0.5), ECdensities=NULL,ECs=ECs)
#generating data of interest N<-100 x<-as.matrix(cbind(runif(N*2,172.5,217.5),runif(N*2,-2,58))) x2<-x[(x[,1]<=217.5)&(x[,1]>=172.5),] x_sel<-x2[(x2[,2]<=(28+sqrt(30^2-(x2[,1]-195)^2)))&(x2[,2]>=(28- sqrt(30^2-(x2[,1]-195)^2))),] data<-x_sel[sample(seq(1:(dim(x_sel)[1])),N),] #Specifying minus-log-likelihood kg<-function(theta){integrate(Vectorize(function(x) { exp(-0.5*((x-theta[1])/0.5)^2)*integrate(function(y) { exp(-0.5*((y-theta[2])/0.5)^2) }, 28-sqrt(30^2-(x-195)^2), 28+sqrt(30^2-(x-195)^2))$value}) , 172.5, 217.5)$value} mll<-function(eta,x,theta){ -sum(log((1-eta)/(pi*(30)^2)+eta*exp(-0.5*((x[,1]- theta[1])/0.5)^2- 0.5*((x[,2]-theta[2])/0.5)^2)/kg(theta)))} #Specifying search region theta1<-seq(172.5,217.5,by=15) theta2<-seq(-2,58,by=10) THETA<-as.matrix(expand.grid(theta1,theta2)) originalR<-dim(THETA)[1] rownames(THETA)<-1:(dim(THETA)[1]) THETA2<-THETA[(THETA[,1]<=217.5)&(THETA[,1]>=172.5),] THETA_sel<-THETA2[(THETA2[,2]<=(28+sqrt(30^2-(THETA2[,1]- 195)^2)))&(THETA2[,2]>=(28-sqrt(30^2-(THETA2[,1]-195)^2))),] #Generating toy EC ECs<-cbind(rpois(100,1.5),rpois(100,1)) TOHM_LRT(data,mll,null0=0,init=c(0.1),lowlim=c(0),uplim=c(1), THETA=THETA_sel,ck=c(1,8),type=c("Chi-bar^2"), k=NULL,k_vec=c(0,1),weights=c(0.5,0.5), ECdensities=NULL,ECs=ECs)