Dynamic implementation platform for ALS clinical trials

A work by Ruben P.A. van Eijk

Funded by ALS Foundation Netherlands

TRICALS Eligibility - Risk Profile Calculator


Reference values GLI-2012


The forced vital capacity (FVC) is a key measure for monitoring disease progression in patients with ALS and is often used for medical decision-making (e.g. initiation of volume recruiting techniques or NIV). Importantly, study participants are commonly selected based on their FVC. Correct standardization of the FVC is, therefore, essential for both clinicians and researchers. The global lung function 2012 equations (GLI-2012) are the most recent reference values and may improve the value of FVC measurements for ALS patients.[1] Nevertheless, due to their non-linear spline functions, the GLI-2012 equations are difficult to calculate by hand and to implement in daily practice.[2]

This tool calculates the predicted FVC based on age, race, gender and height using the GLI-2012 reference values. 

 

Calculations are based on the pred_GLI function from the rspiro package (Lytras T, version 0.1, 2017).  
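
As an illustration only, a minimal sketch of such a calculation with rspiro is given below. The argument names, the numeric coding of gender and ethnicity, and the unit of height (metres) are assumptions based on the package documentation and are not necessarily the exact code behind this tool.

# Minimal sketch [assumed rspiro interface; not necessarily the code behind this tool]:
library (rspiro)

# Predicted FVC (litres) for a 60-year-old Caucasian male of 1.80 m
# gender: 1 = male, 2 = female; ethnicity: 1 = Caucasian [assumed coding]
pred_GLI (age = 60, height = 1.8, gender = 1, ethnicity = 1, param = "FVC")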

References
[1] van Eijk RPA, Bakers JNE, Nikolakopoulos S, Eijkemans MJC, van den Berg LH. Implications of spirometric reference values for amyotrophic lateral sclerosis (2019, Link).
[2] Quanjer PH, Stanojevic S, Cole TJ, et al. Multi-ethnic reference values for spirometry for the 3-95-yr age range: the global lung function 2012 equations (2012, Link).

Imbalance Risk Calculator


Randomization balances, on average, both the observed and unobserved prognostic factors. Nevertheless, in any individual trial there is a risk that important prognostic factors become imbalanced due to chance. The probability of an imbalance depends on the sample size, the prevalence of the factor and the size of the imbalance.

This tool calculates the probability of observing an imbalance given the sample size of the trial and prevalence of the prognostic factor. 

 

Formula underlying the tool: \[ P(d) = 2\sum_{j = 0}^{N}(1 - B(p,N,D - 1 + j))f(p,N,j) \] where \(d\) is the size of the imbalance (e.g. ≥10%), \(p\) the prevalence, \(N\) the group size, \(D = dN\), \(f(p, N, j)\) the probability mass function of a binomial random variable with parameters \(N\) and \(p\), and \(B(p,N,k)\) the corresponding cumulative distribution function:

\[B(p,N,k) = \sum_{j=0}^{k} f (p, N, j) \]
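
A minimal R sketch of this calculation is given below; ImbalanceProb () is a hypothetical helper written for illustration (not necessarily the exact code behind this tool), with dbinom () and pbinom () providing \(f\) and \(B\), respectively.

ImbalanceProb <- function (d, p, N){
  # Computes the probability of an imbalance of size >= d between two arms of size N
  # [illustrative sketch of the formula above]
  #
  # Args:
  #   d = size of the imbalance as a proportion (e.g. 0.10 for >= 10%)
  #   p = prevalence of the prognostic factor
  #   N = group size
  # Function:
  D <- d * N
  j <- 0:N
  2 * sum ((1 - pbinom (D - 1 + j, size = N, prob = p)) * dbinom (j, size = N, prob = p))
}

# Example: probability of an imbalance of at least 10% with 50 patients per arm and a 30% prevalence
ImbalanceProb (d = 0.10, p = 0.30, N = 50)
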
References
[1] van Eijk RPA, Eijkemans MJC, Nikolakopoulos S, et al. Pharmacogenetic interactions in ALS clinical trials: a step closer towards a cure? (2019, link).
[2] Cui L, Hung HM, Wang SJ, Tsong Y. Issues related to subgroup analysis in clinical trials (2002, link).

Time-to-Event module

This document contains the source documentation for the TRICALS Time-to-Event module. All underlying R code used in the survival models, sample size calculations and simulations is provided below.

Reference: van Eijk RPA, Nikolakopoulos S, Roes KCB, et al. Critical design considerations for time-to-event endpoints in amyotrophic lateral sclerosis clinical trials (2019, link).


Parametric survival

The primary aim of the module is to provide accurate sample size calculations for Time-to-Event endpoints in ALS clinical trials. For any sample size calculation, investigators are required to make assumptions about the underlying distribution (here: the process that generates the survival data). These assumptions are well known to be arbitrary and at high risk of misspecification. The aim of this module is to help investigators define plausible settings. The module is based on the log-rank test; its primary assumption is proportional hazards (PH). Therefore, the module's sample size calculations are based on parametric distributions that fulfill the PH assumption: the exponential and Weibull distributions.


Exponential distribution

The exponential distribution is commonly used to calculate the required sample size for clinical trials with a Time-to-Event endpoint. In fact, all previous clinical trials in ALS were based on the exponential distribution. The exponential survival distribution assumes a constant death rate at each point in time. Or, in other words, the instantaneous risk of dying is the same at, for example, month 1 as it is at month 18. The probability of surviving until a certain time point under the exponential distribution is given by:

\[ S_{Exponential}(t) = e^{-\lambda t} \]

Or in R code:

ExpSurv <- function (lambda, t){
  # Computes the probability of survival [Exponential distribution]
  #
  # Args:
  #   lambda = hazard rate or instantaneous death risk
  #   t = time
  # Function:
  exp (-lambda*t)
}

For convenience (as we will see in other parts of this document), it is sometimes necessary to calculate the death rate (lambda) itself. After some rearrangement, the above function translates to the function ExpRate (), which calculates lambda \((\lambda)\) from the survival probability S and time t. Lambda can be interpreted as the number of deaths occurring per time interval. To exemplify, a rate of 0.03/month means that we expect 3 deaths per 100 person-months of follow-up time.

ExpRate <- function (S, t){
  # Computes lambda or death rate [Exponential distribution]
  #
  # Args:
  #   S = probability of survival at time t
  #   t = time
  # Function:
  -log (S) / t
}
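
For example, a survival probability of 70% at 12 months corresponds to a rate of roughly 0.03 deaths per person-month, and feeding that rate back into ExpSurv () recovers the original probability:

# Example: rate corresponding to 70% survival at 12 months, and the reverse check
ExpRate (S = 0.7, t = 12)                             # approx. 0.0297
ExpSurv (lambda = ExpRate (S = 0.7, t = 12), t = 12)  # returns 0.7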

Similarly, time t can be calculated with the function ExpRandom (). This function is also used to simulate survival times following an exponential distribution. By sampling the survival probability from a Uniform [0,1] distribution, t can be generated by making use of the probability integral transformation. More information about the inverse transform method of generating random variables can be found in Rizzo ML. Statistical Computing with R (2007, p. 49).

ExpRandom <- function (S, lambda){
  # Computes time t given S and lambda [Exponential distribution]
  #
  # Args:
  #   S = probability of survival at time t
  #   lambda = hazard rate or instantaneous death risk
  # Function:
  -log (S) / lambda
}

# Example simulation:
S <- runif (5, min = 0, max =1) # Sample from a Uniform [0,1] distribution 
ExpRandom (S, lambda = 0.03)

Weibull distribution

As we showed previously (van Eijk RPA, 2018), the constant death rate (i.e. exponential) assumption is severely violated in ALS clinical trials. This violation is due to the fact that only a subset of patients fulfills the rigorous eligibility criteria for clinical trials. By excluding patients who are at immediate risk of dying (e.g. severe disease state or low vital capacity), the number of deaths is relatively low at the start of the trial as compared to later time points. In other words, the risk of dying is low initially, but increases gradually over time, which leads to an acceleration in death rates. For this reason, TRICALS-Reactive uses the Weibull survival distribution as the default distribution. The probability of surviving until a certain time point under a Weibull distribution is given by:

\[ S_{Weibull}(t) = e^{-\lambda t^p} \]

Or in R code:

WeiSurv <- function (lambda, t, p){
  # Computes the probability of survival [Weibull distribution]
  #
  # Args:
  #   lambda = hazard rate or instantaneous death risk
  #   t = time
  #   p = shape parameter
  # Function:
  exp (-lambda*t^p)
}

Similar to the exponential distribution, after some rearrangement lambda \((\lambda)\) can be obtained with the function WeiRate ().

WeiRate <- function (S, t, p){
  # Computes lambda or death rate [Weibull distribution]
  #
  # Args:
  #   S = probability of survival at time t
  #   t = time
  #   p = shape parameter
  # Function:
  ( (- log (S)) / (t^p) )
}

And time t can be obtained from the function WeiRandom ():

WeiRandom <- function (S, lambda, p){
  # Computes time t given S, p and lambda [Weibull distribution]
  #
  # Args:
  #   S = probability of survival at time t
  #   lambda = hazard rate or instantaneous death risk
  #   p = shape parameter
  # Function:
  (-log (S)/lambda)^(1/p)
}
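
Analogous to the exponential example above, the three Weibull functions are each other's inverses; a quick check with the scenario used throughout this document (70% survival at 12 months, shape 2):

# Example: Weibull rate for 70% survival at 12 months with shape p = 2
rateW <- WeiRate (S = 0.7, t = 12, p = 2)     # approx. 0.0025
WeiSurv (lambda = rateW, t = 12, p = 2)       # returns 0.7
WeiRandom (S = 0.7, lambda = rateW, p = 2)    # returns 12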

Simulation of Time-to-Event data

Simulation is a powerful tool to investigate the effect of design parameters on empirical power and sample size. The TRICALS time-to-event module has a simulation tool that enables investigators to validate the calculated sample sizes. The required code of the data-generating mechanism is given in the function SurvGen (). Note: the simulation function requires the functions ExpRandom () & WeiRandom ().

SurvGen <- function (rateE, rateW, HR, n, f, p = 2, a){
  # Generates a dataframe with survival data for two treatment arms
  #
  # Limitations:
  #   1:1 randomization 
  #   uniform accrual
  #
  # Args:
  #   rateE = hazard rate for exponential distribution
  #           --> this can be calculated with the function ExpRate ()
  #   rateW = hazard rate for Weibull distribution
  #           --> this can be calculated with the function WeiRate ()
  #   p = shape parameter Weibull [default for ALS is set to 2]
  #   HR = hazard ratio (placebo vs. treatment)
  #   n = group size per arm
  #   f = minimal follow-up duration
  #   a = accrual period [can be set to 0]
  #  
  # # # Step 1. Generate survival time [exponential & Weibull]:
  S <- runif (2*n)
  Sexp <- ExpRandom (S = S, lambda = rep (c (rateE, rateE*HR), each = n))
  Swei <- WeiRandom (S = S, p = p, lambda = rep (c (rateW, rateW*HR), each = n))
  
  # # # Step 2.
  # Accrual time per individual:
  A <- runif (2*n) * a

  # Time variables for censoring:
  M <- f + a # = total trial duration
  Mi <- M - A # = individual time in trial
  
  # # # Step 3.
  # Status variable [dead (1) or alive (0)]:
  STATUSexp <- as.numeric (Sexp < Mi) # Exponential
  STATUSwei <- as.numeric (Swei < Mi) # Weibull
  Sexp[Sexp > Mi] <- Mi[Sexp > Mi] # Censoring exponential
  Swei[Swei > Mi] <- Mi[Swei > Mi] # Censoring Weibull
  
  # # # Step 4. Save data in dataframe
  D <- data.frame (Sexp = Sexp,
                   Swei = Swei,
                   STATUSexp = STATUSexp,
                   STATUSwei = STATUSwei,
                   INCL = A,
                   TRT = rep (c (0,1), each = n))
  
  # Additional variables:
  ## Trial time (i.e. chronological order of events):
  D$Texp <- D$INCL + D$Sexp
  D$Twei <- D$INCL + D$Swei
  ## Trial time without accrual a in follow-up:
  D$OLDexp <- D$INCL + ifelse (D$Sexp>f, f, D$Sexp)
  D$OLDwei <- D$INCL + ifelse (D$Swei>f, f, D$Swei)
  ## Survival data if accrual is not included in follow-up [censored after follow-up f]:
  D$STATUS18exp <- ifelse (D$STATUSexp == 0, 0, ifelse (D$Sexp > f, 0, 1))
  D$STATUS18wei <- ifelse (D$STATUSwei == 0, 0, ifelse (D$Swei > f, 0, 1))
  D$S18wei <- ifelse (D$Swei > f, f, D$Swei)
  D$S18exp <- ifelse (D$Sexp > f, f, D$Sexp)
  
  return (D)
  
}

Estimating survival curves

In the R code below we exemplify how R can be used to estimate the underlying Weibull distribution in survival data. From the fitted model we extract the death rate and the Weibull shape parameter, which could, subsequently, be used in simulations or sample size calculations.

Scenario: A large hypothetical clinical trial (N = 2000) with a beneficial treatment effect (hazard ratio of 0.5). The survival probability in the placebo group is 70% at 12 months. The minimum follow-up period is 18 months. All patients were included within a 12-month period. First, we will use SurvGen () to simulate survival data based on this scenario and, subsequently, use survreg () to estimate the underlying Weibull distribution (the estimates should be approximately the same as the parameters used in the simulation).

## Access the parametric models in R [function: survreg]
library (survival)

## Calculate death rate for given scenario [shape is set to 2]:
rateW <- WeiRate (S = 0.7, t = 12, p = 2)
rateE <- ExpRate (S = 0.7, t = 12)

## Simulate trial data:
set.seed (1)
D <- SurvGen (rateE = rateE, rateW = rateW,
              HR = 0.5, n = 1000, f = 18, p = 2, a = 12)

## Figure of simulated data:
plot (survfit (Surv (Swei, STATUSwei) ~ TRT, data = D),
      xlab = "Time since randomization (months)", 
      ylab = "Probability of survival",
      col = 1:2)
legend ("topright", legend = c ("Active", "Placebo"), col = 2:1, lty = 1)

## Model fitting:
m <- survreg (Surv (Swei, STATUSwei) ~ TRT, data = D, dist = "weibull")
summary (m)
# # # NOTE: survreg uses an alternative parameterization

## Translation of survreg parameters:
1/m$scale # Weibull shape parameter [approx. 2]
exp (-as.numeric (coef (m)["(Intercept)"]))^(1/m$scale) # Lambda [approx. 0.0025]
exp (-as.numeric (coef (m)["TRT"]))^(1/m$scale) # Hazard ratio [approx. 0.5]

Sample size calculation

Sample size calculation for time-to-event endpoints can be subdivided into two parts: (1) calculating the number of events (e.g. number of deaths) necessary to detect a given treatment effect, and (2) calculating the probability of an event during the observation period. The required sample size is, subsequently, given by:

\[N = \frac {Number \:of \:events}{Probability \:of \:event} \]
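
To exemplify with the numbers used later in this document: if 88 events are required and the probability of observing an event during the trial is approximately 0.63, the total sample size becomes 88 / 0.63 ≈ 140 patients, which is rounded up to an even number (142) to maintain 1:1 randomization.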


Number of events

The number of events is the driving force in survival models. There have been multiple suggestions on how to calculate the required number of events to detect a given treatment effect (i.e. hazard ratio); the formulas by Freedman (1982) and Schoenfeld (1981) are the most commonly used.

 
Number of events by Freedman (1982):

\[Events_{Freedman} = \frac{(1 + HR)^2}{(1-HR)^2} (z_{1-\alpha/2} + z_{1-\beta} )^2\]

The Freedman (1982) formula in R code:

EvFreedman <- function (HR, alpha, power){
  # Computes the number of events according to Freedman 1982
  #
  # Args:
  #   HR = hazard ratio
  #   alpha = type 1 error
  #   power = 1 - type 2 error
  # Function:
  ( ((1 + HR)^2) / ((1 - HR)^2) ) * (qnorm (1 - (alpha/2)) + qnorm (power))^2
}

 
Number of events by Schoenfeld (1981):

\[Events_{Schoenfeld} = \frac{4 \:(z_{1-\alpha/2} + z_{1-\beta} )^2}{log^2(HR)} \]

The Schoenfeld (1981) formula in R code:

EvSchoenfeld <- function (HR, alpha, power){
  # Computes the number of events according to Schoenfeld 1981
  #
  # Args:
  #   HR = hazard ratio
  #   alpha = type 1 error
  #   power = 1 - type 2 error
  # Function:
  (4* (qnorm (1 - (alpha/2)) + qnorm (power))^2) / (log (HR)^2)
}

Note: The number of events calculated by Schoenfeld is lower than the number calculated by Freedman (i.e. Freedman is more conservative). To exemplify, when the type 1 and type 2 errors are set to 5% and 10%, respectively, Freedman requires 95 events, whereas Schoenfeld requires 88 events to detect a hazard ratio of 0.5. The relative difference between the two formulas increases as the treatment effect increases.
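
These numbers can be verified directly with the two functions defined above:

# Verify the example above: HR = 0.5, two-sided alpha of 5%, power of 90%
ceiling (EvFreedman (HR = 0.5, alpha = 0.05, power = 0.9))   # 95 events
ceiling (EvSchoenfeld (HR = 0.5, alpha = 0.05, power = 0.9)) # 88 events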


Probability of event

The second, most challenging part of the sample size calculation concerns the estimation of the event probability. The probability of observing an event during a given time interval depends, among other things, on the hazard rate and observation time. Below we exemplify the calculation for the simple case of uniform accrual, 1:1 randomization and no dropout. Later, we will incorporate the Weibull distribution in the gsDesign package. The nSurv () function from the gsDesign package allows piece-wise constant accrual, dropout, hazards and unequal group sizes.

Simply put, the probability of an event for a uniform accrual period a and minimum follow-up f can be calculated by determining the area under the survival curve (see Collett, 2003):

\[P_{Event} = 1 - \frac{1}{a} \int_f^{a+f} {S(u) du} \]

The PeventExp () function computes the exact probability of an event for the exponential distribution (see also: Abel et al. Some issues of sample size calculation for time-to-event endpoints using the Freedman and Schoenfeld formulas, 2015) and is based on the following equation:

\[P_{Event \:exponential} = 1 - e^{-\lambda (a+f)} \frac{e^{\lambda a} -1}{\lambda a} \]

In R code:

PeventExp <- function (rateE, f, a, HR){
  # Computes the average probability control + treatment [Exponential dist]
  #
  # Limitations:
  #   uniform accrual
  #   1:1 randomization [equal treatment arms]
  #   No drop-out distribution
  #
  # Args:
  #   rateE = hazard rate for exponential distribution
  #           --> this can be calculated with the function ExpRate ()
  #   HR = hazard ratio (placebo vs. treatment)
  #   f = minimal follow-up duration
  #   a = accrual period [can be set to 0]
  #
  # Probability of event = 1-S(t)
  
   if (a == 0){
    # if a is 0, then the formula simplifies to the survival probability at time f:
     
    Sc <- ExpSurv (lambda = rateE, t = f)
    St <- ExpSurv (lambda = rateE*HR, t = f)
    1 - ((Sc + St)/2)
    
  } else {
    
    # i.e. "mean" survival probability over this time period [calculated area under curve]
    Sc <- ((exp (-rateE*(f+a)) * ((exp (rateE*a) - 1) / (rateE*a))))
    St <- ((exp (-HR*rateE*(f+a)) * ((exp (HR*rateE*a) - 1) / (HR*rateE*a)))) 
    1 - ((Sc + St)/2)
  }

}

For the Weibull distribution we use the integrate () function in R to obtain the estimated probability:

PeventWei <- function (rateW, f, a, HR, p){
  # Computes the average probability control + treatment [Weibull dist]
  #
  # Limitations:
  #   uniform accrual
  #   1:1 randomization [equal treatment arms]
  #   No drop-out distribution
  #
  # Args:
  #   rateW = hazard rate for Weibull distribution
  #           --> this can be calculated with the function WeiRate ()
  #   p = shape parameter Weibull [default for ALS is set to 2]
  #   HR = hazard ratio (placebo vs. treatment)
  #   f = minimal follow-up duration
  #   a = accrual period [can be set to 0]
  #   
  # Probability of event = 1-S(t)
  
   if (a == 0){
    # if a is 0, then the formula simplifies to the survival probability at time f:
    
    Sc <- WeiSurv (lambda = rateW, p = p, t = f)
    St <- WeiSurv (lambda = rateW*HR, p = p, t = f)
    1 - ((Sc + St)/2)
    
  } else {
    
    # i.e. "mean" survival probability over this time period [calculated area under curve]
    Sc <- integrate (WeiSurv, lower = f, upper = f+a, lambda = rateW, p = p)$value/a
    St <- integrate (WeiSurv, lower = f, upper = f+a, lambda = rateW*HR, p = p)$value/a
    1 - ((Sc + St)/2)
  }

}

Example sample size

Below we will provide an example of the sample size calculation under both the exponential and Weibull distribution. The calculated sample size will be validated by means of simulation.

We consider the following scenario: in previous data we found a survival probability of 70% at 12 months (the survival time followed a Weibull distribution with shape 2). The new trial will have a minimum follow-up duration of 18 months. We expect to enroll all patients within a 12-month period. We want 90% power to detect a hazard ratio of 0.5; the type 1 error (alpha) is set to 5%.

#########
# Step 0: set parameters:
#########
rateW <- WeiRate (S = 0.7, t = 12, p = 2)
rateE <- ExpRate (S = 0.7, t = 12)
f <- 18; a <- 12
HR <- 0.5
alpha <- 0.05; power <- 0.9


#########
# Step 1: Calculate number of necessary events:
#########
Ev1 <- ceiling (EvFreedman (HR = HR, alpha = alpha, power = power))
Ev2 <- ceiling (EvSchoenfeld (HR = HR, alpha = alpha, power = power))
# # Note: Freedman 95 events, Schoenfeld 88 events


#########
# Step 2: Calculate event probabilities:
#########
### Exponential:
Pexp <- PeventExp (rateE = rateE, f = f, a = a, HR = HR)
Pexp_noAccrual <- PeventExp (rateE = rateE, f = f, a = 0, HR = HR)

### Weibull:
Pwei <- PeventWei (rateW = rateW, f = f, a = a, HR = HR, p = 2)
Pwei_noAccrual <- PeventWei (rateW = rateW, f = f, a = 0, HR = HR, p = 2)

# # Note: large differences in event probabilities when incorporating accrual (more observation time) and between exponential and Weibull


#########
# Step 3: Calculate sample size [rounded, total sample size]:
#########
ceiling ((Ev2/Pwei)/2)*2
ceiling ((Ev2/Pexp)/2)*2
# # Note: Weibull requires 142 patients, whereas the exponential requires 220


#########
# Step 4: which calculation was right? [exemplified for Schoenfeld + Accrual]
#########
R <- replicate (1000, {
  
  # Generate survival data:
  D <- SurvGen (rateE = rateE, rateW = rateW, 
              HR = HR, n = ceiling ((Ev2/Pwei)/2), 
              f = f, p = 2, a = a)
  
  # Log-rank test in R:
  pchisq (survdiff (Surv (Swei, STATUSwei) ~ TRT, data = D)$chisq, df = 1, lower.tail = F)
  
})
sum (R <= 0.05)/1000 # Empirical power (~90%); increase replications to improve accuracy

gsDesign & Weibull

In future work the TRICALS Time-to-Event module will be expanded with group-sequential trial designs. To accommodate these future extensions, the sample size calculations are provided by the nSurv () function from the gsDesign package (Anderson K 2017, version 2). Unfortunately, the nSurv () function does not accommodate the Weibull distribution. To circumvent this limitation and to provide accurate calculations for ALS clinical trials, we translate the Weibull distribution to piece-wise constant hazard rates, which are supported by nSurv (). At monthly intervals, we estimate from the Weibull distribution the average hazard rate h over each time period. The Weibull hazard h at time t is given by:

\[ h{(t)} = \lambda p t^{p-1} \]

We exemplify this translational process in R:

# nSurv can be found in the gsDesign package:
library (gsDesign)
library (msm) # necessary for the piece-wise constant exponential distribution (ppexp)

# Parameters for sample size calculation [same as scenario above]:
f <- 18; a <- 12
HR <- 0.5
alpha <- 0.05; power <- 0.9
p <- 2

# Compute Weibull hazard rate:
rateW <- WeiRate (S = 0.7, t = 12, p = p)

#########
# Graphical model & fit
#########
t <- seq (0, f+a, by = 0.1)
plot (t, WeiSurv (lambda = rateW, p = p, t = t), type = "l",
      main = "Weibull model", ylab = "Probability of survival", xlab = "Time in months")

# Estimate hazard & survival
FU <- 1 # time interval to calculate hazard
t1 <- seq (0, (f+a)-FU, by = FU)
t2 <- seq (FU/2, f+a, by = FU)

## Cumulative hazard Weibull [Exact = WeiSurv () function]:
H <- rateW * t2^p 
points (t2, exp (-H), col = 2)

## Hazard rate Weibull:
h <- rateW * p * (t2)^(p-1)

## Estimated probability using ppexp [approximated with piece-wise constant]:
points (t2, 1 - ppexp (t2, rate = h, t = t1), pch = 8)
# # NOTE: Change FU to 6 months to see that the approximated survival fits worse


#########
# Sample size calculation
#########

# # # Exact: Schoenfeld
Ev <- EvSchoenfeld (HR = HR, alpha = alpha, power = power)
PrEvent <- PeventWei (rateW = rateW, f = f, a = a, HR = HR, p = p)
Ev/PrEvent # Sample size: 139.51

# # # Use hazard h in nSurv and specify monthly intervals:
nSurv (lambdaC = h, S = rep (FU, (length (h)-1)),
                  hr = HR, hr0 = 1,
                  R = a,
                  T = f + a,
                  minfup = f, 
                  alpha = alpha/2, beta = 1-power)$n
# Sample size: 138.75

In summary, we provide the exact hazards obtained from the Weibull distribution and specify these as average piece-wise constant hazards over monthly time intervals in nSurv () (using the S & lambdaC arguments). The resulting sample size is close to the exact sample size calculation. Part of the discrepancy comes from a slightly different number of events (Schoenfeld: 87.47 vs nSurv: 87.01). The probability of an event is actually quite close to the exact calculation: PrEvent 0.6270 vs. nSurv (87.01/138.75) 0.6271.


Simulation of ALSFRS-R subscales

This document contains the source documentation for simulating multivariate ALSFRS-R data, including the application of two alternative strategies to determine treatment benefit. All underlying R code and parameter estimates used in the reference paper are provided below.

Reference: van Eijk RPA, de Jongh AD, Nikolakopoulos S, et al. An old friend who has overstayed their welcome: the ALSFRS-R total score as primary endpoint for ALS clinical trials (2020, under review).


Multivariate ALSFRS-R simulation model

The ALSFRS-R can be divided into three or four subscales, depending on whether fine and gross motor function are modelled separately. In this work, we opted for the four-subscale parameterization; modelling and simulation approaches are, however, identical when using three subscales.

We used the mvglmer () function to estimate a multivariate mixed model under a Bayesian approach using JAGS (JMbayes package). In order to simplify the simulation, we opted to model each subscale with a simple linear mixed effects model with a linear fixed effect of time and random effects for the intercept and time per individual. For each individual subscale, the outcome score for the \(i^{th}\) individual is given by:

\[ Score_{ij} = \beta_{0i} + \beta_{1i}Time_{j} + \varepsilon_{ij} \] \[ \beta_{0i} = \beta_{0} + u_{0i} \] \[ \beta_{1i} = \beta_{1} + u_{1i} \]

Where \(\beta_{0}\) and \(\beta_{1}\) are the fixed effects (i.e. fixed intercept and effect of time), \(u_{0i}\) and \(u_{1i}\) the random effects for the \(i^{th}\) individual (i.e. random intercept and effect of time) and \(\varepsilon_{ij}\) the error term (i.e. residual variance). More complicated sub-models with non-linear time trends can be implemented, but are unlikely to significantly alter the results as stated in the reference work.

In order to account for the dependencies between subscales, the four linear mixed subscale models were jointly estimated using a multivariate mixed effects model. In R code, the Bayesian Multivariate Mixed Model was parameterized as follows:

# Multivariate Mixed Model parameterization [Bayesian]
## BU = bulbar; FI = fine motor; GR = gross motor; RE = respiratory
m <- mvglmer (
  list (BU ~ 1 + TIME + (1 + TIME|ID),    # Bulbar domain (item 1-3), ID is subject ID, TIME in months
        FI ~ 1 + TIME + (1 + TIME|ID),    # Fine motor domain (item 4-6)
        GR ~ 1 + TIME + (1 + TIME|ID),    # Gross motor domain (item 7-9)
        RE ~ 1 + TIME + (1 + TIME|ID)),   # Respiratory domain (item 10-12)
  data = dataset, # dataset can be downloaded from the PRO-ACT website
  families = list (gaussian, gaussian, gaussian, gaussian)) # We assume normal distribution

Alternatively, the multivariate model can be based on a frequentist framework using the lme () function from the nlme package:

# The data format is transformed to long format: 
## 1. All subscale scores are stacked together in a new variable (OUTCOME)
## 2. SUBSCALE is a factor with 4 levels (i.e. "BU", "FI", "GR", "RE")

## Using the varIdent () function we allow for varying residual variances among subscales 

# Multivariate Mixed Model parameterization [Frequentist]
m <- lme (
  OUTCOME ~ 0 + SUBSCALE + SUBSCALE:TIME, 
  random = ~ 0 + SUBSCALE + SUBSCALE:TIME | ID,
  weights = varIdent (form = ~ 1|SUBSCALE),
  data = dataset.Stacked)

Using either the Bayesian or the frequentist approach, we can extract the variance-covariance matrix from the model. In our case, both models provided nearly identical results (not shown). The estimated variance-covariance matrix of the multivariate mixed model was subsequently used to simulate correlated longitudinal data for the subscales. A minimal sketch of extracting the matrix from the frequentist fit is shown below, followed by the complete simulation code; trial data are generated using the function mvGen ().
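
As a sketch only (assuming the lme model m and the stacked dataset from the previous block; not necessarily the exact code used in the reference work), the random-effects variance-covariance matrix could be extracted as follows:

## Sketch: extract the random-effects variance-covariance matrix from the frequentist fit
library (nlme)
SIGMA_hat <- getVarCov (m, type = "random.effects") # 8 x 8 matrix: intercept & slope per subscale
round (cov2cor (as.matrix (SIGMA_hat)), 2)          # corresponding correlation matrix
m$sigma                                             # residual SD of the reference subscale [varIdent gives the relative factors]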

# NOTE: Estimates are based on the Bayesian multivariate mixed model output [mvglmer ()]

# Fixed effects
MEANS <- c (10.28, -0.22,   # Bulbar intercept & slope (B0 & B1)
             8.45, -0.34,   # Fine motor intercept & slope (B0 & B1)
             7.92, -0.31,   # Gross motor intercept & slope (B0 & B1)
            11.45, -0.19)   # Respiratory intercept & slope (B0 & B1)
# The slope can be interpreted as the mean monthly rate of decline. 
# As can be seen, mean rates of decline vary among subscales and are highest for fine motor function.

# Variance-covariance matrix
SIGMA <- c ( 4.9891,  0.1423, -0.4796,  0.0421, -0.9112,  0.0551,  0.7257,  0.0462,
             0.1423,  0.0600,  0.0755,  0.0323,  0.0292,  0.0317,  0.0096,  0.0322,
            -0.4796,  0.0755,  7.8793,  0.0042,  4.1490,  0.0522,  0.5057,  0.1096,
             0.0421,  0.0323,  0.0042,  0.0731,  0.0708,  0.0514, -0.0064,  0.0336,
            -0.9112,  0.0292,  4.1490,  0.0708,  8.2227, -0.0272,  0.5622,  0.1634,
             0.0551,  0.0317,  0.0522,  0.0514, -0.0272,  0.0583, -0.0083,  0.0303,
             0.7257,  0.0096,  0.5057, -0.0064,  0.5622, -0.0083,  1.2621,  0.0159,
             0.0462,  0.0322,  0.1096,  0.0336,  0.1634,  0.0303,  0.0159,  0.0640)
SIGMA <- matrix (SIGMA, nrow = 8)
rownames (SIGMA) <- colnames (SIGMA) <- c ("B0_BU", "B1_BU",
                                           "B0_FI", "B1_FI",
                                           "B0_GR", "B1_GR",
                                           "B0_RE", "B1_RE")
# BU = bulbar; FI = fine motor; GR = gross motor; RE = respiratory

# Correlations between subscales can be obtained using:
round (cov2cor(SIGMA),2)
# For example, the correlation between the rate of decline in fine and gross motor is 0.79
# The correlation indicates that patients who decline faster in fine motor function also tend to decline faster in gross motor function

# Residual variances per subscale:
RESID <- c (0.7717,  # Bulbar
            0.8861,  # Fine Motor
            0.8267,  # Gross Motor
            0.9156)  # Respiratory

## Data Generating Mechanism [based on multivariate normal distribution]
library (mvtnorm) # Required for rmvnorm

mvGen <- function (n, trt = c (0,0,0,0), 
                   MEANS, SIGMA, RESID){
  
  # ARGUMENTS:
  ## n = sample size per group (1:1 randomization)
  ## trt = treatment effect for each subscale:
    ##    [1] = bulbar
    ##    [2] = fine motor
    ##    [3] = gross motor
    ##    [4] = respiratory
  ## MEANS = fixed effects
  ## SIGMA = variance-covariance matrix
  ## RESID = residual variance
  
  # AIM:
  ## Simulate longitudinal data for a 1:1 randomized trial for 2xn subjects.
  ## Simulation provides 12 month data, with monthly follow-up
  
  ## Sample individual intercept and slopes for PLACEBO arm patients
  ## Note that we actually sample (B0 + U0i) and (B1 + U1i)
  ## If interested, subtracting B0 and B1 from the simulated data will provide U0i and U1i
  A <- as.data.frame (rmvnorm (n = n, mean = MEANS, 
                               sigma = SIGMA))
  names (A) <- c ("B0_BU", "B1_BU",
                  "B0_FI", "B1_FI",
                  "B0_GR", "B1_GR",
                  "B0_RE", "B1_RE")
  A$TRT <- 0
  
  ## Sample individual intercept and slopes for TREATED arm patients
  ## NOTE: we add only a treatment effect to B1, not to B0 (indicated by the "0").
  A1 <- as.data.frame (rmvnorm (n = n, mean = MEANS + c (0, trt[1],
                                                         0, trt[2],
                                                         0, trt[3],
                                                         0, trt[4]), 
                                sigma = SIGMA))
  names (A1) <- c ("B0_BU", "B1_BU",
                   "B0_FI", "B1_FI",
                   "B0_GR", "B1_GR",
                   "B0_RE", "B1_RE")
  A1$TRT <- 1
  A <- rbind (A, A1); rm (A1)
  A$ID <- 1:nrow (A)
  ## "A" contains now n placebo patients, and n treated patients. TRT indicates treatment status.
  ## NOTE: this is not yet longitudinal data, only the individual slopes and intercepts.
  
  ## Generating longitudinal dataset:
  ## NOTE: This is for a dataset with 12 months follow-up and monthly measurements
  ALS <- data.frame (ID = rep (unique (A$ID), each = length (seq (0, 12, by = 1))),
                     TIME = rep (seq (0, 12, by = 1), nrow (A)),
                     VISIT = rep (seq (0, 12, by = 1), nrow (A)))
  
  # Create some random visit time per monthly interval [we opted for SD = 2.5 days]
  ALS[ALS$TIME > 0, ]$TIME <- rnorm (nrow (ALS[ALS$TIME > 0, ]), 
                                     mean = ALS[ALS$TIME > 0, ]$TIME,
                                     sd = 0.08)
  
  # Add the generated individual intercepts and slopes to longitudinal dataframe
  ALS <- merge (ALS, A, all.x = T)
  
  # Calculate sub-score at time t 
  ALS$BU <- ALS$B0_BU + (ALS$B1_BU * ALS$TIME) + rnorm (nrow (ALS), mean = 0, sd = RESID[1])
  ALS$FI <- ALS$B0_FI + (ALS$B1_FI * ALS$TIME) + rnorm (nrow (ALS), mean = 0, sd = RESID[2])
  ALS$GR <- ALS$B0_GR + (ALS$B1_GR * ALS$TIME) + rnorm (nrow (ALS), mean = 0, sd = RESID[3])
  ALS$RE <- ALS$B0_RE + (ALS$B1_RE * ALS$TIME) + rnorm (nrow (ALS), mean = 0, sd = RESID[4])
  
  # Calculate total score at time t:
  ALS$TO <- ALS$BU + ALS$FI + ALS$GR + ALS$RE + rnorm (nrow (ALS), mean = 0, sd = 0.98)
  # NOTE I: we have added an additional residual variance term to the total score
  # This was done to harmonize the observed variance; the effect on power is negligible
  
  # NOTE II: Total scores may fall outside the 0-48 range
  # This is a consequence of assuming a multivariate normal distribution. 
  # Trimming the total score to the 0-48 range resulted in worse simulation accuracy.
  
  # Return dataset
  return (ALS)
  
}

Comparison of analytical strategies

The second part of this tutorial illustrates the different analytical strategies used in the reference work. First, we will use mvGen () to generate a longitudinal dataset with a clear bulbar effect (a 40% reduction in the rate of decline), using the sample size from the reference work (124 patients per arm, see Step 1 below). Subsequently, we will use linear mixed effects models to estimate the treatment benefit for each strategy.

library (lme4)

##########
# Step 1 # Generate longitudinal data with 40% bulbar slope reduction
##########

# Sample size per group [based on reference work]:
n <- 124

# Treatment effect bulbar:
## mvGen adds treatment effects on the absolute scale
## We need to transform the 40% reduction to the absolute mean difference in slopes
## The bulbar subscale declines by 0.22 points per month, meaning that the TREATED group declines by:
## (1-0.4) * 0.22 = 0.132 points per month. The absolute treatment effect is, therefore:
## 0.22 - 0.132 = 0.088 (or 40% * 0.22) points per month
trt <- c (0.088, # bulbar effect on absolute scale
          0,
          0,
          0) # no effect on the other domains

## Note "trt" is defined in a fixed sequence, namely:
##    [1] = bulbar
##    [2] = fine motor
##    [3] = gross motor
##    [4] = respiratory

set.seed (11) # for replication
D <- mvGen (n = n, trt = trt,
            MEANS, SIGMA, RESID)

# Linear mixed model (common intercept, only effect of treatment on time)
m <- lmer (BU ~ 1 + TIME + TIME:TRT + (1 + TIME|ID), data = D)
summary (m)
## Simulated effect: 0.1267 points per month, % reduction: 0.1267/0.22604 = +56.1%
# Similar models can be made for other scales, resulting in the following effect size:
# Total: +6.2%; Bulbar: +56.1%; Fine: -1.1%; Gross: -19.1%; Respiratory: -6.6%
# NOTE: Random variation leads to a variety of effects. 
# Here, there is a positive bulbar effect and a large negative effect on gross motor functioning. 

##########
# Step 2 # Classical analysis
##########

# Model total score only:
m <- lmer (TO ~ 1 + TIME + TRT:TIME + (1 + TIME|ID), data = D, REML = F)
summary (m)

# Obtain test statistic & one-sided p-value:
coef <- summary (m)$coefficients["TIME:TRT", "Estimate"]
se <- summary (m)$coefficients["TIME:TRT", "Std. Error"]
df <- nrow (D) - length (unique (D$ID)) - 2 # 3-1 coefficients
stat <- coef/se
1 - pt (stat, df = df) # one-sided p-value

### CONCLUSION: ineffective treatment (p = 0.28)


##########
# Step 3 # Alternative testing procedure without non-inferiority
##########

# Function to evaluate mixed models:
evalLMER <- function (data, subscale){
  
  # Define models:
  form <- as.formula (paste (subscale, "~ 1 + TIME + TRT:TIME + (1 + TIME|ID)"))
  
  m <- lmer (form, data = data, REML = F,
              lmerControl (calc.derivs = F,
                           optCtrl = list (method = "nlminb",
                                           starttests = F,
                                           kkt = F)))
  
  # Extract p-value (pval), absolute treatment effect (coef) and its standard error (se)
  coef <- summary (m)$coefficients["TIME:TRT", "Estimate"]
  se <- summary (m)$coefficients["TIME:TRT", "Std. Error"]
  df <- nrow (data) - length (unique (data$ID)) - 2 
  stat <- coef/se
  pval <- 1 - pt (stat, df = df) # one-sided p-value

  return (data.frame (pval = pval, coef = coef, se = se, df = df))
}

# Fit model & extract estimates per subscale:
bu <- evalLMER (D, "BU")
fi <- evalLMER (D, "FI")
gr <- evalLMER (D, "GR")
re <- evalLMER (D, "RE")

# Adjust p-value according to Hommel (1988):
pval_adj <- p.adjust (c (bu$pval, 
                         fi$pval,
                         gr$pval,
                         re$pval),
                      method = "hommel") 

# Hypothesis testing:
sum (pval_adj < 0.05) > 0 # Is one of the adjusted p-values smaller than 0.05?

### CONCLUSION: effective treatment, p-value bulbar smaller than 0.05


##########
# Step 4 # Alternative testing procedure with non-inferiority
##########

# First, P-values are identical as in Step 3:
pval_adj <- p.adjust (c (bu$pval, 
                         fi$pval,
                         gr$pval,
                         re$pval),
                      method = "hommel")
# Significant bulbar effect

# Second, we need to obtain the one-sided p-value for the non-inferiority bound:
NI <- c (0.4350, 0.32240, 0.3175, 0.5471) # relative boundaries used in reference work
# Advised to optimize based on sample size.

# Transform relative to absolute values [ = slope * reduction]:
NI <- MEANS[c (2,4,6,8)]*NI

# Calculate the test statistics [mu - mu0 / se]:
stat <- (c (bu$coef, fi$coef, gr$coef, re$coef) - NI) / c (bu$se, fi$se, gr$se, re$se)
pval_NI <- pt (stat, df = c (bu$df, fi$df, gr$df, re$df), lower.tail = F) 

# Hypothesis testing:
(sum (pval_adj < 0.05) > 0) & # Is ANY subscale effect significantly larger than 0 AND
  (sum (pval_NI < 0.05) == 4) # Are ALL effects significantly larger than the NI bound?

### CONCLUSION: ineffective treatment
# --> in this case the p-value for gross motor exceeds 0.05 for the NI margin, so an effect worse
# than the NI bound cannot be ruled out for gross motor. Thus, despite the significant bulbar
# effect, this trial is marked as futile.