Introduction

In this workflow we will show you an implementation of a parameter estimation from the Mendes 2009 paper ¹ on COPASI use cases.

Click here to go to the interactive implementation of this document as a Shiny website which is hosted on the free service of shinyapps.io!

You can see a time course of the species of the model below. The points are measurements of two of the species in the model.

#> # A COPASI model reference:
#> Model name: "Kholodenko2000 - Ultrasensitivity and negative feedback bring oscillations in MAPK cascade"
#> Number of compartments: 1
#> Number of species: 8
#> Number of reactions: 10

As you can see, the model does not describe the data perfectly. We will try and get a better result by changing the parameters of our model.

Parameter estimation

We want to find the parameters of our model that describe the data best. We call this Parameter Estimation. We have 22 parameters in our model, so finding the right combination of parameters by chance and just trying out is nearly impossible. Even only using a limited subset of these parameters (as we will do) will not work that way. We will use algorithms for that.

Parameter estimation is an important topic when handling a model. In CoRC you have to

Build or load a model
Define an experiment (with data to be fitted)
Define parameters that will be fitted
Run Parameter Estimation

We will go through these steps individually and explain what needs to be done as well as show visually how parameter estimation improves your fit.

Setup:

First, we have to load the required packages. Please make sure, you have CoRC as well as ggplot2 installed before calling the library function.

library(CoRC)
library(ggplot2)

1. Load a model

As stated above, to make a parameter estimation, you have to have a model to work on. If you want to know how to build you own model, you can click here.

In this workflow, we will instead load an SBML-model. This model is from Kholodenko (2000)²

loadSBML("https://www.ebi.ac.uk/biomodels/model/download/BIOMD0000000010.2?filename=BIOMD0000000010_url.xml")
#> # A COPASI model reference:
#> Model name: "Kholodenko2000 - Ultrasensitivity and negative feedback bring oscillations in MAPK cascade"
#> Number of compartments: 1
#> Number of species: 8
#> Number of reactions: 10
# loadSBML("https://raw.githubusercontent.com/jpahle/DynamiCoRC/main/models/BIOMD0000000010_url.xml")

We can inspect the species of the model like this:

getSpecies()
#> # A tibble: 8 × 13
#>   key           name    compartment type   unit  initial_concent… initial_number
#>   <chr>         <chr>   <chr>       <chr>  <chr>            <dbl>          <dbl>
#> 1 Mos{uVol}     Mos     uVol        react… nmol…               90        5.42e16
#> 2 Mos-P{uVol}   Mos-P   uVol        react… nmol…               10        6.02e15
#> 3 Mek1{uVol}    Mek1    uVol        react… nmol…              280        1.69e17
#> 4 Mek1-P{uVol}  Mek1-P  uVol        react… nmol…               10        6.02e15
#> 5 Mek1-PP{uVol} Mek1-PP uVol        react… nmol…               10        6.02e15
#> 6 Erk2{uVol}    Erk2    uVol        react… nmol…              280        1.69e17
#> 7 Erk2-P{uVol}  Erk2-P  uVol        react… nmol…               10        6.02e15
#> 8 Erk2-PP{uVol} Erk2-PP uVol        react… nmol…               10        6.02e15
#> # … with 6 more variables: concentration <dbl>, number <dbl>, rate <dbl>,
#> #   number_rate <dbl>, initial_expression <chr>, expression <chr>

This works in a similar way for reactions (getReactions()) and parameters (getParameters())

If you have COPASI installed, you can also have a look at the model there by starting the GUI with this model loaded already:

openCopasi()

2. Define an experiment

Defining an experiment in CoRC means telling the program which data to fit and what the data actually describes.

So we first need data. It is always a good idea to take a look at your data before working with it. This way you can make sure nothing unexpected is happening.

data <- read_tsv("data/MAPKdata1.txt", col_types = cols())

data
#> # A tibble: 10 × 3
#>     Time `Mos-P` `Erk2-P`
#>    <dbl>   <dbl>    <dbl>
#>  1    50   65.7     13.2 
#>  2   100   98.3     20.5 
#>  3   150   91.9     43.1 
#>  4   200   88.5     66.4 
#>  5   300   80.4      2.03
#>  6   400   49.2      0.6 
#>  7   500   37.8      2.03
#>  8   600   24.8      1.45
#>  9   800    8.56     4.79
#> 10  1000    3.8     17

Then you have to define the experiment for COPASI. You need the data, as well as the type and mappings for the species. You can choose a weight method for your data (which prevents parameters getting fitted more closely just because they have higher values).

Your data columns in your data file can be of type “time”, “dependent” and “independent”, and if you want to exclude a column you can choose “ignore”.

The mapping argument in the function maps the data columns with the species in your model. In our case, the provided data is time course data, and our values are “transient concentrations”. They are denoted like this: {[Species]}. You can find these notation by using the function getSpeciesReferences().

Time needs to be mapped with NA.

Allowed weight methods are "mean", "mean_square", "sd", and "value_scaling".

fit_experiments <- defineExperiments(
  data = data,
  type = c("time", "dependent", "dependent"),
  mapping = c(NA, "{[Mos-P]}", "{[Erk2-P]}"),
  weight_method = "mean_square"
)

3. Define parameters

We now have to define parameters that will be fitted.

First, let us take a look at all the parameters in the model:

getParameters()
#> # A tibble: 22 × 5
#>    key                              name  reaction                 value mapping
#>    <chr>                            <chr> <chr>                    <dbl> <chr>  
#>  1 (MAPKKK activation).V1           V1    MAPKKK activation        2.5   <NA>   
#>  2 (MAPKKK activation).Ki           Ki    MAPKKK activation        9     <NA>   
#>  3 (MAPKKK activation).n            n     MAPKKK activation        1     <NA>   
#>  4 (MAPKKK activation).K1           K1    MAPKKK activation       10     <NA>   
#>  5 (MAPKKK inactivation).V2         V2    MAPKKK inactivation      0.25  <NA>   
#>  6 (MAPKKK inactivation).KK2        KK2   MAPKKK inactivation      8     <NA>   
#>  7 (phosphorylation of MAPKK).k3    k3    phosphorylation of MAP…  0.025 <NA>   
#>  8 (phosphorylation of MAPKK).KK3   KK3   phosphorylation of MAP… 15     <NA>   
#>  9 (phosphorylation of MAPKK-P).k4  k4    phosphorylation of MAP…  0.025 <NA>   
#> 10 (phosphorylation of MAPKK-P).KK4 KK4   phosphorylation of MAP… 15     <NA>   
#> # … with 12 more rows

Now, we only want to fit the reaction rates (parameters with V). To make our fit parameters we need to make a list of lists with the attributes of the different parameter-estimation settings. To define a parameter for parameter estimation, we use the defineParameterEstimationParameter() function.

fit_parameters <- list(
  defineParameterEstimationParameter(
    ref = "{(MAPKKK activation).V1}",
    start_value = getParameters("(MAPKKK activation).V1")$value,
    lower_bound = getParameters("(MAPKKK activation).V1")$value * 0.1,
    upper_bound = getParameters("(MAPKKK activation).V1")$value * 1.9
    ),
  defineParameterEstimationParameter(
    ref = "{(MAPKKK inactivation).V2}",
    start_value = getParameters("(MAPKKK inactivation).V2")$value,
    lower_bound = getParameters("(MAPKKK inactivation).V2")$value * 0.1,
    upper_bound = getParameters("(MAPKKK inactivation).V2")$value * 1.9
    ),
  defineParameterEstimationParameter(
    ref = "{(dephosphorylation of MAPKK-PP).V5}",
    start_value = getParameters("(dephosphorylation of MAPKK-PP).V5")$value,
    lower_bound = getParameters("(dephosphorylation of MAPKK-PP).V5")$value * 0.1,
    upper_bound = getParameters("(dephosphorylation of MAPKK-PP).V5")$value * 1.9 
    ),
  defineParameterEstimationParameter(
    ref = "{(dephosphorylation of MAPKK-P).V6}",
    start_value = getParameters("(dephosphorylation of MAPKK-P).V6")$value,
    lower_bound = getParameters("(dephosphorylation of MAPKK-P).V6")$value * 0.1,
    upper_bound = getParameters("(dephosphorylation of MAPKK-P).V6")$value * 1.9
    ),
  defineParameterEstimationParameter(
    ref = "{(dephosphorylation of MAPK-PP).V9}",
    start_value = getParameters("(dephosphorylation of MAPK-PP).V9")$value,
    lower_bound = getParameters("(dephosphorylation of MAPK-PP).V9")$value * 0.1,
    upper_bound = getParameters("(dephosphorylation of MAPK-PP).V9")$value * 1.9
    ),
  defineParameterEstimationParameter(
    ref = "{(dephosphorylation of MAPK-P).V10}",
    start_value = getParameters("(dephosphorylation of MAPK-P).V10")$value,
    lower_bound = getParameters("(dephosphorylation of MAPK-P).V10")$value * 0.1,
    upper_bound = getParameters("(dephosphorylation of MAPK-P).V10")$value * 1.9
  )
)

4. Run parameter estimation

To show how well our parameter estimation works, we want to print the model before and after parameter estimation. To do this, we have to run two time course evaluations, one with the parameters now, and one with the parameters after the parameter estimation.

before <- runTimeCourse(duration = 1000, dt = 1)$result

After doing this, we will now actually run the parameter estimation. We need the parameters, experiments with our data that will be fitted, and specify the method. We want to use the Levenberg Marquardt method but other methods are available as well. You can find them using the function getValidReactionFunctions() with your function as an argument.

Also, we specify that we want to update our model. This means, that all estimated parameters will be updated with the parameters of the best estimation. To compare the fit to the previous parameters we need to make sure we keep the previous fit. We already did that with our time course in the last chunk.

result <-
  runParameterEstimation(
    parameters  = fit_parameters,
    experiments = fit_experiments,
    method = list(
      method = "LevenbergMarquardt",
      log_verbosity = 2
    ),
    update_model = TRUE
  )

You can have a nicely readable version of the result by using the function str(result). For space-reasons we will only take a look at the fitted parameters, but feel free to take a look at anything you find interesting.

result$parameters
#> # A tibble: 6 × 8
#>   parameter              lower_bound start_value value upper_bound std_deviation
#>   <chr>                        <dbl>       <dbl> <dbl>       <dbl>         <dbl>
#> 1 (MAPKKK activation).V1       0.25         2.5  2.39        4.75        0.137  
#> 2 (MAPKKK inactivation)…       0.025        0.25 0.243       0.475       0.00819
#> 3 (dephosphorylation of…       0.075        0.75 0.721       1.42        0.0999 
#> 4 (dephosphorylation of…       0.075        0.75 1.42        1.42        0.481  
#> 5 (dephosphorylation of…       0.05         0.5  0.793       0.95        0.0826 
#> 6 (dephosphorylation of…       0.05         0.5  0.804       0.95        0.105  
#> # … with 2 more variables: coeff_of_variation <dbl>, gradient <dbl>

5. Visualize results

Now we have estimated and updated the parameters of our current model. To compare our old model parameters to our new, we run another time course.

after <- runTimeCourse(duration = 1000, dt = 1)$result

We will use ggplot for visualizing our results. If you have never worked with ggplot, this way to define a plot will look unusual to you.

We first want to plot our experimental data, as well as two time courses (before and after) for Erk2-P and Mos-P.

Erk2P <- 
  ggplot(mapping = aes(x = Time, y = `Erk2-P`)) +
  geom_point(data = data, aes(color = "experimental")) +
  geom_line (data = before, aes(color = "before")) +
  geom_line (data = after, aes(color = "after"))

MosP <-
  ggplot(mapping = aes(x = Time, y = `Mos-P`)) +
  geom_point(data = data, aes(color = "experimental")) +
  geom_line (data = before, aes(color = "before")) +
  geom_line (data = after, aes(color = "after"))

At the end, our result for Erk2-P

Erk2P

as well as our result for Mos-P

MosP

show, how well the parameter estimation was able to fit our model to the data.

At the end, we can unload our model, to free some memory.

unloadModel()

References

Mendes P., Hoops S., Sahle S., Gauges R., Dada J., Kummer U. (2009) Computational Modeling of Biochemical Networks Using COPASI. In: Maly I. (eds) Systems Biology. Methods in Molecular Biology (Methods and Protocols), vol 500. Humana Press. https://doi.org/10.1007/978-1-59745-525-1_2 ↩︎
Kholodenko BN. Negative feedback and ultrasensitivity can bring about oscillations in the mitogen-activated protein kinase cascades. Eur J Biochem. 2000 Mar;267(6):1583-8. doi: 10.1046/j.1432-1327.2000.01197.x. PMID: 10712587.↩︎

Parameter Estimation

Johanna Daas, Jonas Förster and Jürgen Pahle