# Statistics
This section is not intended as a textbook on statistics. Rather, it demonstrates statistical approaches that can be used, including regression and sample size estimation, with R code provided.
## Univariable analyses
### Parametric tests
The t-test is the workhorse for comparing whether 2 samples have the same mean. Performing a t-test in R requires data in 2 columns: one containing the variable for comparison and one labelling the group. There are different forms of t-test depending on whether the two samples are paired or unpaired. In general, the statistic takes the form $t=\frac{\bar{x}_1 - \bar{x}_2}{SE}$, where $SE$ is the standard error of the difference in means. It is recommended to check the distribution of the data using a histogram. For this exercise, we will use the simulated data from ECR trials. The grouping variable is the trial assignment.
```{r 03-Statistics-1}
#comparison of early neurological recovery (ENI) by trial (T)
dtTrial<-read.csv("./Data-Use/dtTrial_simulated.csv")
t.test(dtTrial$ENI~dtTrial$T)
```
### Non-parametric tests
Chi-squared and Fisher exact tests can be performed after using the _table_ function to set up the count data as a 2 x 2 contingency table (confusion matrix). The Chi-squared statistic takes on a familiar form: $\chi^2=\sum\frac{(observed-expected)^2}{expected}$. In this example we will use the data above.
```{r 03-Statistics-1-1}
table(dtTrial$HT,dtTrial$T)
chisq.test(dtTrial$HT,dtTrial$T)
```
The Wilcoxon rank sum test is performed with continuous data organised in the same way as for the t-test. There are several approaches to performing the Wilcoxon rank sum test; the base `wilcox.test` is shown below, while the _coin_ package (via its `wilcox_test` function) handles ties exactly.
```{r 03-Statistics-2}
library(coin)
wilcox.test(ENI~T, data=dtTrial)
```
## Regression
There are many different forms of regression. A key assumption is that the predictors are independent of each other. This issue is expanded on later in the section on collinearity. Special methods are required when the predictors are collinear.
### Brief review of matrix
A vector is an ordered array in one dimension. A matrix is an ordered array in 2 dimensions. A tensor is an ordered array in 3 (or more) dimensions.
A matrix in which the columns are linearly related is said to be rank deficient. The rank of a matrix is the number of linearly independent columns of that matrix. Given that row rank and column rank are equivalent, the rank deficiency of a matrix is the difference between the lesser of the number of rows and columns, and the rank of the matrix.
In a matrix of rank 1, every column is a scalar multiple of a single column.
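The rank of a matrix can be checked in base R via the QR decomposition; a minimal sketch (the matrix here is illustrative only):

```{r 03-Statistics-rank-sketch}
#the third column is the sum of the first two, so the matrix is
#rank deficient: rank 2 rather than 3
A <- cbind(c(1, 2, 3), c(4, 5, 6), c(5, 7, 9))
qr(A)$rank
#rank deficiency = min(nrow, ncol) - rank
min(dim(A)) - qr(A)$rank
```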
### Linear (least square) regression
Least squares regression uses the geometric properties of Euclidean space to identify the line of best fit. The sum of squared errors is $SS_{res}=\sum(observed-expected)^2$. The $R^2$ is a measure of the fit of the model and is given by $1-\frac{SS_{res}}{SS_{total}}$. A low $R^2$ indicates a poorly fitted model and a high $R^2$ indicates an excellent fit. The assumption here is that the outcome variable is a continuous variable.
```{r 03-Statistics-3, warning=F}
library(ggplot2)
load("./Data-Use/world_stroke.Rda")
ggplot(world_sfdf, aes(x=LifeExpectancy,y=MeanLifetimeRisk))+
  geom_smooth(method="lm", aes(group=Income, linetype=Income))+geom_point()+xlab("Life Expectancy")
```
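As a check on the formula above, $R^2$ can be recomputed from the sums of squares by hand; a minimal sketch using the built-in `cars` data:

```{r 03-Statistics-r2-sketch}
#R^2 from sums of squares: 1 - SSres/SStot
fit <- lm(dist ~ speed, data = cars)
ss_res <- sum(residuals(fit)^2)
ss_tot <- sum((cars$dist - mean(cars$dist))^2)
1 - ss_res/ss_tot #matches summary(fit)$r.squared
```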
### Logistic regression
For outcomes that are binary in nature, such as yes or no, least squares regression is not appropriate. There is no closed-form solution for this analysis, and a numerical maximum likelihood approach is needed. When examining the results of logistic regression one is often enchanted by a large odds ratio. It is important to look at the metrics of model calibration (discussed below). A clue to a poorly calibrated model is a wide confidence interval for the odds ratio.
```{r 03-Statistics-4, warning=F}
#glm
data("BreastCancer",package = "mlbench")
#remove id column and column with NA to feed into iml later
BreastCancer2<-lapply(BreastCancer[,-c(1,7)], as.numeric)
BreastCancer2<-as.data.frame(BreastCancer2)
#recode the outcome to 0/1 so that a binomial (logistic) glm can be fitted
BreastCancer2$Class<-as.numeric(BreastCancer$Class=="malignant")
DCa<-glm(Class~., data=BreastCancer2, family=binomial)
summary(DCa)
```
### Discrimination and Calibration
A high $R^2$ suggests that the linear regression model is well calibrated. This metric is often not displayed but should be sought when interpreting the data.
The area under the receiver operating characteristic curve (AUC) is used to assess how well a model discriminates between those who have the disease of interest and those who do not. An AUC of 0.5 is classified as no better than chance; 0.8 to 0.89 provides good discrimination, and 0.9 to 1.0 provides outstanding discrimination. This rule of thumb reflects the language authors use to describe the AUC in the literature. Discrimination is not synonymous with calibration: it is possible to have a model with high discrimination but poor calibration [@pmid1738016]. The AUC is similar to Harrell's c-index, but the interpretation of a difference in AUC or c-index between models is not straightforward; a difference of 0.1 in AUC does not translate simply into a number of subjects ranked correctly. The c-index was originally described for survival analysis [@pmid7069920]. Harrell described the c-index (concordance index) as estimating the probability that, of two randomly chosen patients, the patient with the higher prognostic score will outlive the patient with the lower prognostic score. As such, the c-index should be interpreted as the number of concordant pairs relative to the total number of comparable pairs. It has been proposed that the AUC and c-index are not appropriate for survival analysis as they do not account for the dynamic nature of the data [@LONGATO2020103496]. The integrated Graf score has been proposed to compare the estimated event-free survival probabilities with the actual outcomes [@pmid10474158].
Calibration of a logistic regression model is assessed using the Hosmer–Lemeshow goodness-of-fit test and the Nagelkerke generalized $R^2$. A model is well calibrated when the Hosmer–Lemeshow goodness-of-fit test shows no difference between observed and expected outcomes, i.e. a P value approaching 1. A high generalized $R^2$ value suggests a well-calibrated regression model. Running the regression through the _rms_ or _PredictABEL_ library provides these results. A pseudo-$R^2$ can also be obtained manually in base R by fitting an intercept-only model and again with covariates: McFadden's version is given by $1-\frac{\ln L_1}{\ln L_0}$, where $L_0$ and $L_1$ are the likelihoods of the intercept-only and full models.
```{r 03-Statistics-4-1, warning=F}
#lrm on logistic regression analysis for Breast Cancer
library(rms)
DCa_rms<-lrm(Class~., data=BreastCancer2)
DCa_rms
```
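The manual pseudo-$R^2$ calculation described above can be sketched with simulated data (the variables here are illustrative only):

```{r 03-Statistics-mcfadden-sketch}
#McFadden's pseudo-R2: compare the log-likelihood of the full model
#against that of an intercept-only model
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(0.5 + 1.2 * x))
full <- glm(y ~ x, family = binomial)
null <- glm(y ~ 1, family = binomial)
1 - as.numeric(logLik(full)/logLik(null))
```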
#### Measuring Improvement in Regression Models
The net reclassification improvement (NRI) and integrated discrimination improvement (IDI) have been proposed as more sensitive metrics of improvement in model discrimination. The NRI can be considered a percentage reclassification across the risk categories, and the IDI is the mean difference in predicted probabilities between 2 models (constructed from cases with and without disease). The NRI and IDI scores are expressed as fractions and can be converted to percentages by multiplying by 100. The continuous NRI and IDI can be computed using _PredictABEL_ [@pmid28579970][@pmid26796056].
#### Shapley value
We can use ideas from game theory relating to the fair distribution of profit in coalition games; the coalition (co-operative) game in this case can be interpreted as the contribution of the covariates to the model. The Shapley value regression method calculates the marginal contribution of each covariate as the average, over all permutations, of the value of the coalition containing the covariate of interest minus that of the coalition without it. The advantage of this approach is that it can handle multicollinearity (relatedness) among the covariates.
Feature importance, by contrast, is used to assess the impact of each feature on the model's predictions.
```{r 03-Statistics-5, warning=F}
#this section takes the output from logistic regression above.
library(iml)
X = BreastCancer2[which(names(BreastCancer2) != "Class")]
predictor = Predictor$new(DCa, data = X, y = BreastCancer2$Class)
imp = FeatureImp$new(predictor, loss = "mae")
plot(imp)
```
From the logistic regression above, cell thickness and chromatin have the highest coefficients and lowest p values. This matches the feature importance. By contrast, the Shapley values show that cell shape and marginal adhesion make the largest impact on the model when the contribution is averaged over all coalitions of covariates.
```{r 03-Statistics-5-1}
#explain with game theory
shapley <- Shapley$new(predictor, x.interest = X[1, ])
shapley$plot()
```
#### ICE
The individual conditional expectation (ICE) curve is the plot of the expectation of the predicted value for each observation across the unique values of the feature.
```{r 03-Statistics-5-2}
#feature is the covariate of interest
par(mfrow=c(1,2))
eff_thick <- FeatureEffect$new(predictor,
feature = "Cl.thickness",
method = "ice",
center.at = 0)
plot(eff_thick)
```
### Interaction
Interactions are plotted here as a lollipop chart. The _ggalt_ library can be used to create this type of plot with _geom_lollipop_. The strength of interaction is measured using Friedman's H-statistic, which ranges from 0 to 1, with 0 indicating no interaction and values near 1 indicating that the prediction is driven largely by interaction. In the case of the Breast Cancer data, the interaction strength is low.
When describing interaction terms it is recommended that the results be expressed as β coefficients rather than as odds ratios.
```{r 03-Statistics-5-3}
#plot interactions
interact <- Interaction$new(predictor)
plot(interact)
```
## Confounder
A confounder is a covariate that is a cause of both exposure and outcome and as such confounds the analysis. A mediator exists on the causal pathway from exposure to outcome. A common misconception is that multiple regression adjusts for imbalance in covariates in clinical trials. This issue was observed in the pivotal NINDS alteplase trial; the results of that trial have since been accepted following re-analysis using covariate adjustment [@pmid15345796].
There are several methods for covariate adjustment in randomised trials: direct adjustment, standardisation and inverse-probability-of-treatment weighting.
### Confounder weighted model
The question here is whether one should perform a confounder-adjusted analysis or propensity matching.
### Propensity matching
Propensity matching is an important technique for adjusting for imbalance in covariates between 2 arms. There are concerns about misuse of this technique, such as differences between placebo arms across multiple sclerosis trials [@doi10.1001/jamaneurol.2020.0678]. It has been proposed that this technique should be used only if all the confounders are measurable. This condition may not be satisfied if the data were accrued in different periods, on different continents, etc.
### E-values
The E-value [@pmid28693043] has been proposed as a measure of the potential effect of unmeasured confounders in observational studies. The E-value measures the minimum strength of association an unmeasured confounder would need to have with both treatment and outcome to explain away the observed treatment–outcome association, conditional on the measured covariates. A large E-value is desirable. The E-value is available in the _EValue_ library [@pmid29912013].
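A minimal sketch of the calculation, assuming the _EValue_ package is installed; the risk ratio and confidence interval here are hypothetical:

```{r 03-Statistics-evalue-sketch}
#E-value for a hypothetical observed risk ratio of 0.80 (95% CI 0.70-0.91)
library(EValue)
evalues.RR(est = 0.80, lo = 0.70, hi = 0.91)
```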
## Causal inference
Causation and association are often misconstrued to be the same. However, the finding of correlation (association) does not necessarily imply causation. Causal inference evaluates the response of an effect variable when the cause of that variable changes. There are several approaches to causal inference analysis. It can be performed using frequentist methods such as confounder-weighted models, or Bayesian methods such as [(Bayesian additive regression trees)][Bayesian trees method].
## Special types of regression
### Ordinal regression
Ordinal regression is appropriate when the outcome variable is in the form of ordered categorical values. For example, the Rankin scale of disability is bounded between the values of 0 and 6. This type of analysis uses the proportional odds model, and the requirements of this model are stringent. When examining results of ordinal regression, check that the authors report the Brant test, which assesses the parallel regression (proportional odds) assumption. Ordinal regression is performed using the _polr_ function in the _MASS_ library. The Brant test is available in the _brant_ library.
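A minimal sketch using the `housing` data shipped with _MASS_, assuming the _brant_ package is installed:

```{r 03-Statistics-polr-sketch}
#proportional odds model for an ordered outcome
#(housing satisfaction: Low < Medium < High)
library(MASS)
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
            Hess = TRUE)
summary(fit)
#Brant test of the parallel regression assumption
brant::brant(fit)
```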
### Survival analysis
Survival analysis is useful when dealing with time-to-event data. Time-to-event data can be subject to left, interval or right censoring. Left censoring exists when the event may have already occurred before observation starts, e.g. purchase of phones. Right censoring exists when the event has not yet happened by the end of follow-up, e.g. a cancer trial. Interval censoring exists when the event is known only to have occurred within an interval, e.g. a purchase made at some point between two observation dates.
The Cox model assesses the hazard of outcome between two groups. The assumption of this model is that the hazards of the arms are proportional [@pmid32167523]. The proportional hazards assumption can be tested using the weighted Schoenfeld residuals [@10.1093/biomet/81.3.515]. Non-parametric models are available for when the proportional hazards assumption does not hold.
In the next chapter on machine learning, illustrations of [random survival forest with the _rfsrc_ library][Random survival forest with rfsrc] and the _ranger_ library are provided. In the section on [clinical trials][Interpreting clinical trials] we illustrate how the results can be converted to numbers needed to treat. The median survival, corresponding to a survival probability of 0.50, can be determined here. Metrics for assessing survival models were described [above][Discrimination and Calibration].
```{r 03-Statistics-6, warning=F}
library(survival)
library(survminer)
#data from survival package on NCCTG lung cancer trial
#https://stat.ethz.ch/R-manual/R-devel/library/survival/html/lung.html
data(cancer, package="survival")
#time is in days
#status: censored=1, dead=2
#sex:Male=1 Female=2
survfit(Surv(time, status) ~ 1, data = cancer)
```
```{r 03-Statistics-6-1, warning=F}
sfit<- coxph(Surv(time, status) ~ age+sex+ph.ecog+ph.karno+wt.loss, data = cancer)
summary(sfit)
```
Test the proportional hazards assumption using weighted residuals [@10.1093/biomet/81.3.515]. The finding below shows that the covariate ph.karno violates the proportional hazards assumption.
```{r 03-Statistics-6-2, warning=F}
cox.zph(sfit)
```
Plot fit of survival model
```{r 03-Statistics-6-3}
ggcoxdiagnostics(sfit, type = "deviance",
ox.scale = "linear.predictions")
```
Forest plot of outcome from survival analysis
```{r 03-Statistics-6-4, warning=F}
ggforest(sfit)
```
An alternative way to display the output from Cox regression is to use the _forestmodel_ library.
```{r 03-Statistics-6-5, warning=F}
forestmodel::forest_model(sfit)
```
The pseudo-$R^2$ for the Cox regression model proposed by Royston can be evaluated.
```{r 03-Statistics-6-6, warning=F}
royston(sfit)
```
### Quantile regression
Least squares regression is appropriate when the data are homoscedastic, i.e. the error term remains constant as the data vary around the fitted line. Quantile regression is appropriate when the distribution of the data is non-normal and it is more appropriate to model the conditional median (or other quantiles) of the dependent variable. There are several libraries for this task, including _quantreg_ and Bayesian libraries. In the example below, the lifetime risk of stroke is regressed against life expectancy using least squares and quantile regression.
```{r 03-Statistics-7, warning=F}
library(quantreg)
library(ggplot2)
load("./Data-Use/world_stroke.Rda")
#quantile regression
rqfit <- rq( MeanLifetimeRisk~ LifeExpectancy, data = world_sfdf)
rqfit_sum<-summary(rqfit)
#least square regression
lsfit<-lm(MeanLifetimeRisk~LifeExpectancy,data=world_sfdf)
lsfit_sum<-summary(lsfit)
#plot
ggplot(world_sfdf, aes(x=LifeExpectancy,y=MeanLifetimeRisk))+
#add fitted line for least square
geom_abline(intercept =lsfit_sum$coefficients[1], slope=lsfit_sum$coefficients[2],color="red")+
#add fitted line for quantile regression
geom_point()+xlab("Life Expectancy")+
geom_abline(intercept =rqfit_sum$coefficients[1], slope=rqfit_sum$coefficients[2],color="blue")
#annotate least square and quantile at position x, y
#annotate("text",x=60, y=27, label=paste0("least square =", round(lsfit_sum$coefficients[1],2) ,"+", round(lsfit_sum$coefficients[2],2),"x ","Life Expectancy"),color="red")+ annotate("text",x=75, y=12,label=paste0("quantile =",round(rqfit_sum$coefficients[1],2), " + ", round(rqfit_sum$coefficients[2],2)," x ","Life Expectancy"),color="blue")
```
### Non-negative regression
In certain situations, it is necessary to constrain the analysis so that the regression coefficients are non-negative. For example, when regressing brain regions against infarct volume, there is no reason to believe that a negative coefficient attributable to a brain region is possible [@pmid16504541]. Non-negative regression can be performed in R using _nnls_.
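A minimal sketch with simulated data, assuming the _nnls_ package is installed; the design matrix and true coefficients here are illustrative only:

```{r 03-Statistics-nnls-sketch}
#non-negative least squares: estimated coefficients are constrained >= 0
library(nnls)
set.seed(2)
A <- matrix(runif(300), ncol = 3)
b <- as.vector(A %*% c(2, 0, 1) + rnorm(100, sd = 0.1))
fit <- nnls(A, b)
fit$x #no negative estimates
```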
### Poisson regression
Poisson regression is used when dealing with counts of events over time or distance, such as the number of new admissions, or new cases of hepatitis or TIA over time. An assumption of the Poisson distribution is that the mean $\lambda$ and variance $\lambda$ are the same.
A generalisation of Poisson regression is negative binomial regression. This latter method is used when the variance is greater than the mean of the data, i.e. over-dispersed data. Negative binomial regression can be applied to the number of 'failure' events over time. Here 'failure' has a loose definition and can be stroke recurrence after TIA or cirrhosis after hepatitis C infection.
Zero-inflated data occur when there is an abundance of zeroes in the data (true and excess zeroes).
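A minimal sketch with simulated counts (the variables are illustrative only); for over-dispersed counts, `glm.nb` from _MASS_ fits the negative binomial alternative:

```{r 03-Statistics-poisson-sketch}
#Poisson regression on simulated count data
set.seed(3)
x <- rnorm(300)
y <- rpois(300, exp(0.2 + 0.5 * x))
summary(glm(y ~ x, family = poisson))
#for over-dispersed counts, use MASS::glm.nb(y ~ x)
```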
### Conditional logistic regression
Conditional logistic regression should be used when the aim is to compare paired observations from the same patient. Examples include left and right arms or left and right carotid arteries. This method is available as _clogit_ in the _survival_ library.
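A minimal sketch using the matched case–control `infert` data shipped with base R, where observations are matched within strata:

```{r 03-Statistics-clogit-sketch}
#conditional logistic regression on matched strata
library(survival)
fit <- clogit(case ~ spontaneous + induced + strata(stratum), data = infert)
summary(fit)
```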
### Multinomial modelling
Multinomial modelling is used when the outcome categories are not ordered. This situation can occur when the analysis involves a choice outcome (choices of fruit: apple, orange or pear). In this case, the log odds of each categorical outcome are modelled as a linear combination of the predictor variables. The _nnet_ library has functions for performing this analysis.
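A minimal sketch using the built-in `iris` data, where the three species play the role of unordered choices:

```{r 03-Statistics-multinom-sketch}
#multinomial logistic regression with nnet
library(nnet)
fit <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris,
                trace = FALSE)
#one row of coefficients per non-reference outcome category
coef(fit)
```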
## Sample size estimation
Clinicians are often frustrated by sample size and power estimation for a study, grant or clinical trial. This aspect is scrutinised by ethics committees and in the peer review process for journals. Luckily, R provides several packages for sample size and power estimation, such as the _pwr_ library. Cohen wrote the reference textbook on this subject [@COHEN1977179].
### Proportion
```{r 03-Statistics-8, warning=F}
library(pwr)
#t-test: d is the effect size
#d = (mean group1 - mean group2)/pooled SD
pwr.t.test(n=300,d=0.2,sig.level=.05,alternative="greater")
```
We provided an example below for generating power of clinical trial. Examples
are taken from a paper on sample size estimation for phase II trials [@pmid16931782].
```{r 03-Statistics-9}
library(pwr)
#h is the effect size (Cohen: 0.2 small, 0.5 medium, 0.8 large)
#sample size
pwr.2p.test(h=0.5,n=50,sig.level=0.05,alternative="two.sided")
#small effect size
pwr.2p.test(h=0.1,n=50,sig.level=0.05,alternative="two.sided")
```
The output of the sample size calculation can be put into a table or plot.
```{r 03-Statistics-10}
library(pwr)
#pwr.2p.test(h=0.3,n=80,sig.level=0.05,alternative="two.sided")
h <- seq(.1,.5,.1) #effect sizes from 0.1 to 0.5 by 0.1
nh <- length(h) #5
p <- seq(.3,.9,.1) #power from 0.3 to 0.9 by 0.1
np <- length(p) #7
# create an empty 5 x 7 array
samplesize <- array(numeric(nh*np), dim=c(nh,np))
for (i in 1:np){
for (j in 1:nh){
result <- pwr.2p.test(n = NULL, h = h[j],
#result <- pwr.r.test(n = NULL, h = h[j],
sig.level = .05, power = p[i],
alternative = "two.sided")
samplesize[j,i] <- ceiling(result$n)
}
}
samplesize
#graph
xrange <- range(h)
yrange <- round(range(samplesize))
colors <- rainbow(length(p))
plot(xrange, yrange, type="n",
xlab="Effect size (h)",
ylab="Sample Size (n)" )
# add power curves
for (i in 1:np){
lines(h, samplesize[,i], type="l", lwd=2, col=colors[i])
}
# add annotation (grid lines, title, legend)
abline(v=0, h=seq(0,yrange[2],50), lty=2, col="grey89")
abline(h=0, v=seq(xrange[1],xrange[2],.02), lty=2,
col="grey89")
title("Sample Size Estimation\n Difference in Proportion")
legend("topright", title="Power", as.character(p),
fill=colors)
```
#### Non-inferiority
Non-inferiority trials may offer information in a way that a traditional superiority design does not. The design may focus on other aspects of the treatment such as cost and lower toxicity [@pmid26080342]. Examples of non-inferiority trial designs include antibiotics versus surgery for appendicitis [@pmid26080338]. There are concerns with the reporting of non-inferiority trials: justification for the margin was provided in only 27.6% of reports [@pmid25781447]. The following describes a trial design where the expected outcome rate is _p1_ in the treatment arm and _p2_ in the control arm, the ratio of subjects in the treatment to the control arm is _k_, and the difference in outcome is _delta_. A negative _margin_ denotes a non-inferiority (rather than superiority) design.
```{r 03-Statistics-11}
library(TrialSize)
TwoSampleProportion.NIS(alpha=.05,
beta=.8,
p1=.6,
p2=.7,
k=1,
delta = .1,
margin=-.2
)
```
### Logistic regression
```{r 03-Statistics-12}
library(powerMediation)
library(ggplot2)
#continuous predictor
#p1=event rate
#exp(0.405) =1.5
powerLogisticCon(n=200, p1=0.265, OR=exp(0.014), alpha=0.05)
# power across a range of event rates (p1); event rates must lie in (0,1)
a=seq(0.05,0.4,0.05)
df_power<-data.frame(`ASPECTS`= a,
"Power"=powerLogisticCon(n=100, p1=a, OR=exp(.695), alpha=0.05)
)
ggplot(data=df_power, aes(x=ASPECTS, y=Power))+geom_point()
```
An alternative library to perform sample size for logistic regression is _WebPower_ library.
```{r 03-Statistics-12-1}
library(WebPower)
wp.logistic(p0=0.007, #Prob (Y=1|X=0)
p1=0.012, #Prob (Y=1|X=1)
alpha=0.05,
power=0.80,
alternative="two.sided",
family="normal")
```
### Survival studies
Sample size estimation for survival studies can be performed using _powerSurvEpi_ or _gsDesign_.
```{r 03-Statistics-13}
library(powerSurvEpi)
#sample size
ssizeEpi.default(power = 0.80,
theta = 2,
p = 0.408 ,
psi = 0.15,
rho2 = 0.344^2,
alpha = 0.05)
#power
powerEpi.default(n = 2691,
theta = 2,
p = 0.408,
psi = 0.250,
rho2 = 0.344^2,
alpha = 0.05)
#Amarenco NEJM 2020; k = ratio of sample sizes between arms
ssizeCT.default(power = 0.8, k = .8, pE = 0.085,
pC = 0.109,
RR = 0.78, alpha = 0.05)
```
### Multiple regression
The power for general linear model can be calculated using the _pwr.f2.test_ function.
```{r 03-Statistics-14}
library(pwr)
#u=degrees of freedom for numerator (number of predictors)
#v=degrees of freedom for denominator (n - u - 1)
#f2=effect size (Cohen: 0.02 small, 0.15 medium, 0.35 large)
#example: 3 predictors, n=100, medium effect size
pwr.f2.test(u=3, v=96, f2=0.15, sig.level=0.05)
```
## Randomised clinical trials
A common misconception is that multiple regression adjusts for imbalance in covariates in clinical trials. This issue was observed in the pivotal NINDS alteplase trial; the results of that trial have since been accepted following re-analysis using covariate adjustment [@pmid15345796]. There are several methods for covariate adjustment in randomised trials: direct adjustment, standardisation and inverse-probability-of-treatment weighting.
### Covariate adjustment in trials
Specifically, covariate adjustment refers to adjustment for covariates available at the time of randomisation, i.e. prespecified variables, and not variables arising after randomisation, such as pneumonia after stroke in stroke trials. The advantage of covariate adjustment is that it results in narrower confidence intervals and can increase the power of the trial by up to 7% [@10.1186/1745-6215-15-139]. The gain in power is highest when prognostic variables are used, but power can decrease when non-prognostic variables are used [@10.1186/1745-6215-15-139].
### Subgroup analysis
Subgroup analyses can be misleading, especially if not specified prior to trial analysis [@doi:10.1056/NEJMsr077003]. Furthermore, increasing the number of subgroup analyses increases the chance of a false positive result (multiplicity). It is important to differentiate between prespecified and posthoc analyses, as posthoc analyses may be biased by examination of the data.
### p value for interaction
The p value for interaction describes whether a baseline variable modifies the treatment effect on outcome in a clinical trial [@doi:10.1056/NEJMsr077003]. In a hypothetical trial, a significant p value for the interaction between sex and treatment on the primary outcome indicates heterogeneity of treatment effect between males and females.
```{r 03-Statistics-14-1}
#Publish provides the subgroupAnalysis function for subgroup analyses
library(Publish)
library(survival)
```
### Interpreting risk reduction in clinical trials
A key issue in the interpretation of clinical trials arises when only the relative risk reduction or relative hazard ratio is provided. This issue affects the clinical interpretation of the trial finding and its application in practice. An example is the ACAS asymptomatic carotid artery trial, which is often quoted as showing a 50% risk reduction. In fact, there was a 2% annual risk of ipsilateral stroke in the medical arm and a 1% risk in the surgical arm. The absolute risk reduction (ARR) was therefore 1% per year; however, the 50% relative risk reduction is often quoted when explaining the result to patients.
### NNT from ARR
In this case, the number needed to treat (NNT) is given by $\frac{1}{ARR}$ or $\frac{1}{0.01}=100$, i.e. 100 patients need to be treated for one year to prevent one additional stroke. The recommendation is that the 95% confidence interval for the NNT should be provided.
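The calculation, and a confidence interval obtained by inverting the limits of the ARR interval, can be sketched as follows (the ARR confidence limits here are hypothetical):

```{r 03-Statistics-nnt-arr-sketch}
#NNT from the absolute risk reduction
ARR <- 0.01
1/ARR #NNT = 100
#95% CI for the NNT: invert the (hypothetical) ARR confidence limits
arr_ci <- c(0.004, 0.016)
sort(1/arr_ci) #62.5 to 250
```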
### NNT from odds ratio
Calculation of the NNT from an odds ratio requires knowledge of the outcome rate in the placebo group. The NNT is given by $\frac{1}{ACR-\frac{OR \times ACR}{1-ACR+OR \times ACR}}$, where ACR represents the assumed control risk. The NNT can be calculated with the _nnt_ function in the _meta_ library.
```{r 03-Statistics-14-2}
library(meta)
#p.c = baseline risk
nnt(0.73, p.c = 0.3, sm = "OR")
```
### NNT from risk ratio
```{r 03-Statistics-14-3}
#data from EXTEND-IV trial in NEJM 2019
#outcome 35.4% in tpa and 29.5% in control
nnt(1.44, p.c = 0.295, sm = "RR")
```
### NNT from hazard ratio
Calculation of the NNT from a hazard ratio (HR) requires knowledge of the outcome rate in the placebo group. Writing $p=1-q$, where $q$ is the ratio of events to the number recruited in the placebo group, the NNT is given by $\frac{1}{p^{HR}-p}$ [@pmid32404155]. We illustrate this using data from the meta-analysis of aspirin use in stroke in Lancet 2016. There were 45 events among 16053 patients in the control group and the HR was 0.44. The NNT from this formula, $\frac{1}{0.9971968^{0.44}-0.9971968}$, is 637.
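The arithmetic above can be reproduced directly:

```{r 03-Statistics-nnt-hr-sketch}
#NNT from a hazard ratio, using the control-arm event rate
q  <- 45/16053 #event rate in the control group
p  <- 1 - q    #event-free proportion
HR <- 0.44
1/(p^HR - p)   #~637
```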
### NNT from meta-analysis
There are concerns with using the NNT derived from the results of a meta-analysis, as the findings are amalgamations of trials with different settings [@pmid12614141] [@pmid10356018]. The caution applies when baseline risks or absolute risk differences vary across trials.
## Diagnostic test
### Sensitivity, specificity
The sensitivity of a diagnostic test is the true positive rate and the specificity is the true negative rate. An example of a 2 x 2 table is provided here. As an exercise, consider a paper about a diagnostic test for peripheral vertigo reporting 100% sensitivity and 94.4% specificity. There are 114 patients; 72 patients without stroke have vertigo and positive test findings. Among patients with stroke, 7 of 42 have positive test findings. The sensitivity is $\frac{TP}{TP+FN}$ and the specificity is $\frac{TN}{TN+FP}$.
```{r 03-Statistics-14-3-1}
# Peripheral Vertigo
# Disease Positive Disease Negative
#$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
# HIT Test# # $
# Positive# True Positive # False Positive $
# # 72 # 7 $
#$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
# HIT Test# # $
# Negative# False Negative # True Negative $
# # 0 # 35 $
#$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
# Peripheral Vertigo
#Sensitivity=TP/(TP+FN)=100%
#Specificity=TN/(TN+FP)=83%
```
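The calculation from the 2 x 2 counts above can be done directly:

```{r 03-Statistics-14-3-1-1}
#sensitivity and specificity from the 2 x 2 counts above
TP <- 72; FP <- 7; FN <- 0; TN <- 35
sens <- TP/(TP + FN)
spec <- TN/(TN + FP)
c(sensitivity = sens, specificity = round(spec, 3))
```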
### AUC
The area under the receiver operating characteristic (ROC) curve, or AUC, is a measure of the accuracy of the test. The ROC curve should be used when there are multiple thresholds; it should not be used when the test has only one threshold. Some investigators urge caution regarding the validity of using the ROC curve with single-threshold diagnostic tests [@pmid33250548]. An AUC of 0.5 is classified as no better than chance; 0.6–0.69 provides poor discrimination; 0.7–0.79 acceptable (fair) discrimination; 0.8–0.89 good (excellent) discrimination; and 0.9–1.0 outstanding discrimination.
### Likelihood ratio
The positive likelihood ratio (PLR) is the ratio of sensitivity to the false positive rate (FPR); the negative likelihood ratio (NLR) is the ratio of 1-sensitivity to specificity. The PLR indicates how much more likely a positive spot sign (test) is in a patient with ICH (target disorder) than in a patient without ICH. Using the recommendation by Jaeschke et al, a high PLR (>5) and a low NLR (<0.2) indicate that the test results would make moderate changes in the likelihood of hematoma growth from baseline risk. A PLR of >10 or an NLR of <0.1 would confer very large changes from baseline risk.
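As an illustration, the likelihood ratios can be computed from a sensitivity of 0.623 and specificity of 0.927 (the values used later in this chapter for the likelihood ratio graph):

```{r 03-Statistics-14-3-lr}
#likelihood ratios from sensitivity and specificity
sens <- 0.623; spec <- 0.927
PLR <- sens/(1 - spec) #sensitivity over false positive rate
NLR <- (1 - sens)/spec
round(c(PLR = PLR, NLR = NLR), 2)
```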
#### Fagan's nomogram
Fagan's nomogram can be conceptualised as a sliding ruler that matches the disease
prevalence and likelihood ratios to evaluate the impact on the post-test probability [@pmid1143310]. At the current disease prevalence of 23.4% and a PLR of 4.85, the post-test probability remains low at 0.60.
```{r 03-Statistics-14-3-2, eval=F}
library(tidyverse)
source("https://raw.githubusercontent.com/achekroud/nomogrammer/master/nomogrammer.r")
p<-nomogrammer(Prevalence = .234, Plr = 4.85, Nlr = 0.49)
p+ggtitle("Fagan's nomogram for Spot Sign and ICH growth")
#to save the file
#ggsave(p,file="Fagan_SpotSign.png",width=5.99,height=3.99,units="in")
```
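The post-test probability underlying the nomogram follows from Bayes' theorem in odds form; using the prevalence and PLR from the code above:

```{r 03-Statistics-14-3-fagan}
#post-test probability from prevalence and positive likelihood ratio
prev <- 0.234
PLR <- 4.85
pre.odds <- prev/(1 - prev)
post.odds <- pre.odds * PLR
post.prob <- post.odds/(1 + post.odds)
round(post.prob, 2)
```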
#### Likelihood ratio graph
Likelihood ratio graph is a tool for comparing diagnostic tests [@pmid10700737].
```{r 03-Statistics-14-3-3}
#plot likelihood ratio graph
LR_graph<-function (Read,sheet,Sensitivity, Specificity){
Read1<-readxl::read_xlsx(Read, sheet = sheet)
#binary data
#A=True pos %B=False positive %C=False negative %D=True negative
A=Read1$TP
B=Read1$FP
C=Read1$FN
D=Read1$TN
TPR=A/(A+C)
FPR=1-(D/(D+B))
TPR_DiagnosticTest=Sensitivity
FPR_DiagnosticTest=1-Specificity
# set plot
X=seq(0,1,by=.1)
Y=seq(0,1,by=.1)
plot(X,Y,main="Likelihood Ratio graph", xlab="1-Specificity",ylab="Sensitivity",cex=.25)
  #pch describes the point shape; the value 8 corresponds to *
  points(FPR_DiagnosticTest,TPR_DiagnosticTest,pch=8,col="blue",cex=2)
  #the value 1 corresponds to an open circle o
  points(FPR,TPR,pch=1,col="red",cex=2) #add point
#abline(coef = c(0,1)) #add diagonal line
df1<-data.frame(c1=c(0,TPR_DiagnosticTest),c2=c(0,FPR_DiagnosticTest))
reg1<-lm(c1~c2,data=df1)
df2<-data.frame(c1=c(TPR_DiagnosticTest,1),c2=c(FPR_DiagnosticTest,1))
reg2<-lm(c1~c2,data=df2)
abline(reg1)
abline(reg2)
text(x=FPR_DiagnosticTest,y=TPR_DiagnosticTest+.3,label="Superior",cex=.7)
text(x=FPR_DiagnosticTest+.2,y=TPR_DiagnosticTest+.2,label="Absence",cex=.7)
text(x=.0125,y=TPR_DiagnosticTest-.1,label="Presence",cex=.7)
text(x=FPR_DiagnosticTest+.1,y=TPR_DiagnosticTest,label="Inferior",cex=.7)
text(x=.7,y=.2,label="Reference = Content Expert",cex=.7)
text(x=.7,y=.15, label="Diagnostic Test software", cex=.7)
}
```
Running the function defined above:
```{r 03-Statistics-14-3-3-1}
#Sensitivity=0.623
#Specificity=0.927
LR_graph("./Data-Use/Diagnostic_test_summary.xlsx",1,.623,.927)
```
## Metaanalysis
During journal club, junior doctors are often taught about the importance of metaanalysis. It is worth knowing how to perform a metaanalysis in order to critique a study. This is an important issue as the junior doctor is supervised by someone who is a content expert but not necessarily a methods expert. Metaanalysis can be performed for clinical trials, cohort studies or diagnostic studies. As an example, it is not well known outside of statistics journals that bivariate analysis is the preferred method to evaluate diagnostic studies [@pmid16168343]. By contrast, the majority of metaanalyses of diagnostic studies use the univariate method of Moses and Littenberg [@pmid8210827]. This issue is expanded below.
### Quality of study
A metaanalysis requires evaluation of the quality of the individual studies. This can be done with the QUADAS-2 tool, available at
https://annals.org/aim/fullarticle/474994/quadas-2-revised-tool-quality-assessment-diagnostic-accuracy-studies.
### PRISMA
The PRISMA statement is useful for documenting the search strategy and the papers removed and retained in the metaanalysis. An example of generating the flow diagram in R is provided below. The example given here is from a paper on the use of the spot sign to predict enlargement of intracerebral hemorrhage [@pmid31272327].
```{r 03-Statistics-14-3-4}
library(PRISMAstatement)
#example from Spot sign paper. Stroke 2019
prisma(found = 193,
found_other = 27,
no_dupes = 141,
screened = 141,
screen_exclusions = 3,
full_text = 138,
full_text_exclusions = 112,
qualitative = 26,
quantitative = 26,
width = 800, height = 800)
#https://rich-iannone.github.io/DiagrammeR/graphviz_and_mermaid.html#attributes
library(DiagrammeR)
grViz("
digraph boxes_and_circles {
# a 'graph' statement
graph [overlap = true, fontsize = 10]
# several 'node' statements
node [shape = box,
fontname = Helvetica]
Stroke
node [shape = oval,
fixedsize = false,
color=red,
width = 0.9]
Hypertension; 'No Hypertension'
node [shape= circle,
fontcolor=red,
color=blue,
fixedsize=false]
Hypokalemia; 'No Hypokalemia'
# several 'edge' statements
edge [arrowhead=diamond]
Stroke->{Hypertension, 'No Hypertension'}
Hypertension->{Hypokalemia, 'No Hypokalemia'}
}
")
## alternative
grViz("digraph flowchart {
# node definitions with substituted label text
node [fontname = Helvetica, shape = rectangle]
tab1 [label = '@@1']
tab2 [label = '@@2']
tab3 [label = '@@3']
tab4 [label = '@@4']
tab5 [label = '@@5']
# edge definitions with the node IDs
tab1 -> tab3
tab1-> tab2
#tab2->tab3
tab2 -> tab4
tab2-> tab5;
}
[1]: 'Stroke n=19'
[2]: 'Hypertension n=10'
[3]: 'No Hypertension n=9'
[4]: 'Hypokalemia n=?'
[5]: 'No Hypokalemia n=?'
")
```
### Conversion of median to mean
One issue with performing metaanalysis is that one paper may report the mean and another the median age. The formula for the mean is given by $\frac{a+2m+b}{4}$ where _m_ is the median, _a_ is the lower and _b_ is the upper end of the range [@pmid25524443]. The variance is given by $S^2=\frac{1}{12}\left(\frac{(a-2m+b)^2}{4}+(b-a)^2\right)$. This formula requires examination of the data, such as the figures, to obtain the upper and lower range. These methods are incorporated into the _meta_ library in the _metamean_ function [@doi10.1136/ebmental-2019-300117]. More recently, investigators have suggested also considering the skewness of the data from the 5-number summary [@pmid37161735]. The argument _method.mean_ in the _metamean_ function specifies the method for estimating the mean. Here we chose the method of Luo [@doi:10.1177/0962280216669183] for illustration. See the help page by typing a question mark before _metamean_ for more options.
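The range-based formula can be sketched directly in base R, taking _a_ as the minimum and _b_ as the maximum (the values below are illustrative, not from any trial):

```{r 03-Statistics-14-4-wan}
#mean and SD estimated from median and range (Wan et al method)
a <- 1; m <- 5; b <- 12 #min, median, max (illustrative values)
est.mean <- (a + 2*m + b)/4
est.var <- (1/12)*((a - 2*m + b)^2/4 + (b - a)^2)
c(mean = est.mean, sd = round(sqrt(est.var), 2))
```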
```{r 03-Statistics-14-4}
library(meta)
#NIHSS data from ANGEL large core trial in NEJM 2023
metamean(q1=4,q3=20,median=16,n=230, method.mean = "Luo")
```
Here the same data is used for the McGrath method [@doi:10.1177/0962280219889080].
```{r 03-Statistics-14-5}
metamean(q1=4,q3=20,median=16,n=230, method.mean = "QE-McGrath")
```
The conversion from median to mean in _metafor_ is performed using the _conv.fivenum_ function [@JSSv036i03]. There is an update on the _metafor_ page as well as discussion of alternative approaches. By default this function uses the methods of Luo [@doi:10.1177/0962280216669183], Wan [@pmid25524443] and Shi [@pmid37161735].
```{r 03-Statistics-14-6}
library(metafor)
# example data frame
EstMean <- data.frame(Paper=c(1:4,NA), min=c(1,2,NA,2,NA), q1=c(NA,NA,4,4,NA),
median=c(5,6,6,6,NA), q3=c(NA,NA,10,10,NA),
max=c(12,14,NA,14,NA),
mean=c(NA,NA,NA,NA,7.0), sd=c(NA,NA,NA,NA,4.2),
n=c(30,30,30,30,30))
EstMean
EstMean <- conv.fivenum(min=min, q1=q1, median=median, q3=q3, max=max, n=n, data=EstMean)
EstMean
```
The _estmeansd_ library uses the quantile estimation method in the _qe.mean.sd_ function when the data available are the median and quartiles [@doi:10.1177/0962280219889080] [@pmid36412105]. The approach uses simulation to estimate the parameters.
```{r 03-Statistics-14-7}
library(estmeansd)
#data from ANGEL large core trial in NEJM 2023
res_qe <- qe.mean.sd(q1.val = 13, med.val = 16, q3.val = 20, n = 230)
res_qe
```
The standard error of the mean can be estimated using the _get_SE_ function.
```{r 03-Statistics-14-8}
get_SE(res_qe)
```
The _estmeansd_ library uses the Box-Cox method for estimating the mean and SD when the data available are the median, minimum and maximum values.
```{r 03-Statistics-14-9}
library(estmeansd)
res_bc <- bc.mean.sd(min.val = 13, med.val = 16, max.val = 42, n = 230)
res_bc
```
Again, the standard error of the mean can be estimated using the _get_SE_ function.
```{r 03-Statistics-14-10}
get_SE(res_bc)
```
### Inconsistency I2
The inconsistency $I^2$ index is derived from Cochran's $Q$, the weighted sum of squared deviations of the study effects from the overall effect, with weights based on study size: $I^2 = \frac{Q-df}{Q} \times 100\%$. A value <25% is classified as low
and greater than 75% as high heterogeneity. This test can be performed using the _metafor_ package [@JSSv036i03]. A high $I^2$ suggests a need to proceed to meta-regression to understand the source of heterogeneity. In the meta-regression, the fixed components are the covariates being tested for their effect on heterogeneity; the random effect components are the sensitivity and FPR. The $I^2$ value for the TIA clinic study is 34.41%, so meta-regression is not needed for that study. By contrast, $I^2$ is much higher for the spot sign study, necessitating meta-regression.
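The computation behind $I^2$ can be sketched in base R with inverse-variance weights; the effect sizes and variances below are illustrative, not taken from either study:

```{r 03-Statistics-15-i2}
#I2 from Cochran's Q with inverse-variance (fixed-effect) weights
yi <- c(0.2, 0.5, -0.1, 0.6) #illustrative effect sizes
vi <- c(0.02, 0.05, 0.03, 0.04) #their variances
wi <- 1/vi
Q <- sum(wi*(yi - sum(wi*yi)/sum(wi))^2)
df <- length(yi) - 1
I2 <- max(0, (Q - df)/Q)*100
round(I2, 1)
```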
### Metaanalysis of proportion
This is an example of metaanalysis of stroke recurrence following management in a rapid TIA clinic. There are a variety of methods for calculating the 95% confidence interval of a binomial proportion. The mean of the binomial distribution is given by $p$ and the variance by $\frac{p \times (1-p)}{n}$. The term $z$ is the $1-\frac{\alpha}{2}$ quantile of the normal distribution. The standard Wald interval is $p\pm z\times \sqrt{\frac{p \times(1-p)}{n}}$. The Freeman-Tukey double arcsine transformation transforms the data towards a normal distribution; this approach is useful when occurrence of the event is rare. The exact or Clopper-Pearson method, based on the cumulative properties of the binomial distribution, is the most conservative of the methods for calculating a confidence interval for a proportion. The Wilson method has similarities to the Wald method but includes an extra term $z^2/n$. Investigators such as Agresti have argued that approximate methods are better than the exact method [@doi:10.1080/00031305.1998.10480550], and Brown and colleagues proposed the use of the Wilson method [@brown2001].
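As a sketch, the Wald and Wilson intervals can be computed directly for one arm of the TIA data (7 recurrences among 468 patients in the Melbourne cohort):

```{r 03-Statistics-16-ci}
#Wald and Wilson 95% confidence intervals for a proportion
x <- 7; n <- 468
p <- x/n
z <- qnorm(0.975)
wald <- p + c(-1, 1)*z*sqrt(p*(1 - p)/n)
#Wilson interval: shifted centre plus widened half-width
wilson <- (p + z^2/(2*n) + c(-1, 1)*z*sqrt(p*(1 - p)/n + z^2/(4*n^2)))/(1 + z^2/n)
round(rbind(wald, wilson), 4)
```

Unlike the Wald interval, the Wilson lower bound cannot fall below zero, which matters for rare events like these.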
```{r 03-Statistics-16, warning=F}
library(metafor) #open software metafor
#create data frame dat
#xi is numerator
#ni is denominator
dat <- data.frame(model=c("melbourne","paris","oxford","stanford","ottawa","new zealand"),
xi=c(7,7,6,2,31,2),
ni=c(468,296, 281,223,982,172))
#calculate new variable pi base on ratio xi/ni
dat$pi <- with(dat, xi/ni)
#Freeman-Tukey double arcsine transformation
dat <- escalc(measure="PFT", xi=xi, ni=ni, data=dat, add=0)
res <- rma(yi, vi, method="REML", data=dat, slab=paste(model))
#create forest plot with labels
metafor::forest(res, transf=transf.ipft.hm, targs=list(ni=dat$ni), xlim=c(-1,1.4),refline=res$beta[1],
cex=.8, ilab=cbind(dat$xi, dat$ni),
#position of data on x-axis
ilab.xpos=c(-.6,-.4),digits=3)
#par function combine multiple plots into one
op <- par(cex=.75, font=2)
#position of column names on x-axis
text(-1.0, 7.5, "model ",pos=4)
text(c(-.55,-.2), 7.5, c("recurrence", " total subjects"))
text(1.4,7.5, "Proportion [95% CI]", pos=2)
par(op)
```
Exact 95% confidence intervals for the proportions are computed below using the TIA
data above. This solution was provided on _stack overflow_ and uses the _binom.test_ function.
```{r 03-Statistics-16-1, warning=F}
#exact confidence interval
sapply(split(dat, dat$model), function(x) binom.test(x$xi, x$ni)$conf.int)
```
The data need to be transposed using the _t_ function. This generates one column
for the lower and another for the upper CI. Note that _sapply_ returns a matrix or array
and the column names have to be assigned.
```{r 03-Statistics-16-2, warning=F}
#the data
tmp <- t(sapply(split(dat, dat$model), function(x) binom.test(x$xi, x$ni)$conf.int))
tmp
```
The exact confidence intervals are now put back into the _dat_ data frame.
```{r Statistics-16-3, warning=F}
dat$ci.lb <- tmp[,1] #adding column to data frame dat
dat$ci.ub <- tmp[,2] #adding column to data frame dat
dat <- escalc(measure="PFT", xi=xi, ni=ni, data=dat, add=0)
res <- rma.glmm(measure="PLO", xi=xi, ni=ni, data=dat)
#insert the exact confidence interval
with(dat, metafor::forest(yi, ci.lb=ci.lb, ci.ub=ci.ub,
ylim=c(-1.5,8.5),
xlim=c(-1.1,1),
refline=predict(res, transf=transf.ilogit)$pred,
cex=.8,
ilab=cbind(dat$xi, dat$ni),
ilab.xpos=c(-.6,-.2),digits=3))
op <- par(cex=.75, font=2)
addpoly(res, row=-1, transf=transf.ilogit)
abline(h=0)
#position of column names on x-axis
text(-.9, 7.5, "Model", pos=2)
text(c(-.55,-.2), 7.5, c("recurrence", " total subjects"))
text( 1, 7.5, "Proportion [95% CI]", pos=2)
```
### Metaanalysis of continuous data
Metaanalysis of continuous outcome data can be performed using the standardised mean difference or the ratio of means. It is available in the _meta_ library via the _metacont_ function [@doi10.1136/ebmental-2019-300117].
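The standardised mean difference that _metacont_ pools can be sketched for a single study in base R; the group sizes, means and SDs below are hypothetical:

```{r 03-Statistics-smd}
#standardised mean difference (Cohen's d) for one hypothetical study
n1 <- 50; m1 <- 12.4; s1 <- 3.2 #treatment group (illustrative)
n2 <- 48; m2 <- 10.1; s2 <- 3.5 #control group (illustrative)
s.pooled <- sqrt(((n1 - 1)*s1^2 + (n2 - 1)*s2^2)/(n1 + n2 - 2))
smd <- (m1 - m2)/s.pooled
round(smd, 2)
```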
### Bivariate Metaanalysis
The univariate method of Moses-Shapiro-Littenberg combines sensitivity and specificity into a single measure of accuracy, the diagnostic odds ratio [@pmid8210827]. This approach has been criticized for losing information on the sensitivity and specificity of the test. Like the univariate method, the bivariate method employs a random effect to take into account the within-study correlation [@pmid16168343]. Additionally, the bivariate method accounts
for the between-study correlation in sensitivity and specificity. Bivariate analysis is performed using the _mada_ package. A [Bayesian library][Bayesian Metaanalysis] for bivariate analysis, _meta4diag_, is illustrated later.
The example below is taken from a metaanalysis of spot sign as predictor
expansion of intracerebral hemorrhage [@pmid31272327]. The data for this
analysis is available in the Data-Use sub-folder.
```{r 03-Statistics-17, warning=F}
library(mada)
#spot sign data
Data<-read.csv("./Data-Use/ss150718.csv")
#remove duplicates using subset
#another way is to use filter from dplyr
Dat<-subset(Data, Data$retain=="yes")
(ss<-reitsma(Dat))
summary(ss)
AUC(reitsma(data = Dat))
sumss<-SummaryPts(ss,n.iter = 10^3) #bivariate pooled LR
summary(sumss)
```
### Metaanalysis of clinical trials
Fixed effect (Peto or Mantel-Haenszel) approaches assume that the population is the same for all studies and thus each study is the source of error. Random
effect (DerSimonian-Laird) approaches assume an additional source of error between studies. The random effect approach results in a more conservative estimate of the effect size confidence interval. A criticism of the DerSimonian-Laird approach is that it is prone to type I error, especially when the number of studies is small (n<20) or there is moderate heterogeneity. It is estimated that 25% of the significant findings with the DerSimonian-Laird method may be non-significant with the Hartung-Knapp method (BMC Medical Research Methodology 2014, 14:25). The Hartung-Knapp method (Stat Med 2001;20:3875-89) estimates the between-studies variance and treats it as fixed. It employs quantiles of the t-distribution rather than the normal distribution. The Hartung-Knapp method is available in the _meta_ and _metafor_ packages.
The data are from JAMA Cardiology on Associations of Omega-3 Fatty Acid Supplement Use With Cardiovascular Disease Risks: Meta-analysis of 10 Trials Involving 77917 Individuals [@pmid29387889]. Subsequently, a meta-analysis [@pmid31567003] reported that, contrary to the earlier meta-analysis, omega-3 lowers the risk of cardiovascular disease with a dose-related effect. The analysis using the DerSimonian-Laird and Hartung-Knapp methods is illustrated below.
```{r 03-Statistics-17-1, warning=F}
library(tidyverse)
library(metafor)
#Omega 3 data
Year=c(2010,2014,2010,2007,2010,2010,2013,2008,2012,1999)
Trials=c("DOIT","AREDS-2","SU.FOL.OM3","JELIS","Alpha Omega","OMEGA","R&P","GISSI-HF","ORIGIN","GISSI-P")
Treatment=c(29,213,216,262,332,534,733,783,1276,1552)
Treatment.per=c(10.3,9.9,17.2,2.8,13.8,27.7,11.7,22.4,20.3,27.4)
Control=c(35,208,211,324,331,541,745,831,1295,1550)
Control.per=c(12.5,10.1,16.9,3.5,13.6,28.6, 11.9,23.9,20.7,27.3)
#combine into data frame
rct<-data.frame(Year,Trials,Treatment,Treatment.per,Control,Control.per) %>%
mutate(Treatment.number=round(Treatment*100/Treatment.per,0),
Control.number=round(Control*100/Control.per,0), group=ifelse(Year>2010,"wide","narrow")) %>%
rename(ai=Treatment,n1i=Treatment.number,ci=Control,
n2i=Control.number,study=Trials)