Blog Archive

Saturday, August 4, 2012

Fitting a Differential Equation System to Data


Here are some helpful links:
Stack overflow discussion on how to use Vode to Integrate stiff ODEs 
SciPy page on the integration class and options
SciPy page on the integration class and ode options
Code for simple Euler Method
SciPy Cookbook: Coupled Spring Mass System 
SciPy Cookbook: Zombie Apocalypse ODEINT
SciPy Cookbook: Lotka Volterra Tutorial
SciPy Central: Integrating and Initial Value Problem (single ODE)
Basic Model of Virus Infection using ODEs
Modeling with ordinary differential equations (ODEs)
Simple examples of solving a system of ODEs

Create a System of ODE's


To run a fit, your system has to be written as a definition. I modified the code from the zombie invasion system ( link above ) to demonstrate how it should be written
# zombie apocalypse modeling
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from scipy import integrate


#=======================================================
def eq(par,initial_cond,start_t,end_t,incr):
     #-time-grid-----------------------------------
     t  = np.linspace(start_t, end_t,incr)
     #differential-eq-system----------------------
     def funct(y,t):
        Si=y[0]
        Zi=y[1]
        Ri=y[2]
        P,d,B,G,A=par
        # the model equations (see Munz et al. 2009)
        f0 = P - B*Si*Zi - d*Si
        f1 = B*Si*Zi + G*Ri - A*Si*Zi
        f2 = d*Si + A*Si*Zi - G*Ri
        return [f0, f1, f2]
     #integrate------------------------------------
     ds = integrate.odeint(funct,initial_cond,t)
     return (ds[:,0],ds[:,1],ds[:,2],t)
#=======================================================
    
#parameters   
P = 0       # birth rate
d = 0.0001  # natural death percent (per day)
B = 0.0095  # transmission percent  (per day)
G = 0.0001  # resurect percent (per day)
A = 0.0001  # destroy perecent (per day)
rates=(P,d,B,G,A)
# initial conditions
S0 = 500.               # initial population
Z0 = 0                  # initial zombie population
R0 = 0                  # initial death population
y0 = [S0, Z0, R0]      # initial condition vector

F0,F1,F2,T=eq(rates,y0,0,5.,1000)

plt.figure()
plt.plot(T,F0,'-b',T,F1,'-r')
plt.legend(('Survivors', 'Zombies'),'upper center')
plt.title('Zombie Apocalypse')


Display Model with Data

 

Well we have a model system for how the Zombie Apocalypse works. However it could be that our parameter guesses weren't very good. If we have data from the zombie apocalypse we can adjust check our parameter estimates by seeing how well our model fits the data.

So in this case, let's say we have zombie population data for several timepoints...

# zombie apocalypse modeling
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from scipy import integrate

#The Model
#=======================================================
def eq(par,initial_cond,start_t,end_t,incr):
     #-time-grid-----------------------------------
     t  = np.linspace(start_t, end_t,incr)
     #differential-eq-system----------------------
     def funct(y,t):
        Si=y[0]
        Zi=y[1]
        Ri=y[2]
        P,d,B,G,A=par
        # the model equations (see Munz et al. 2009)
        f0 = P - B*Si*Zi - d*Si
        f1 = B*Si*Zi + G*Ri - A*Si*Zi
        f2 = d*Si + A*Si*Zi - G*Ri
        return [f0, f1, f2]
     #integrate------------------------------------
     ds = integrate.odeint(funct,initial_cond,t)
     return (ds[:,0],ds[:,1],ds[:,2],t)
#=======================================================


#The Data
#======================================================
Td=[0.5,1,1.5,2,2.5,3,3.5,4,4.5,5]
Zd=[0,2,2,5,2,10,15,50,250,400]
#parameters   
P = 0       # birth rate
d = 0.0001  # natural death percent (per day)
B = 0.0095  # transmission percent  (per day)
G = 0.0001  # resurect percent (per day)
A = 0.0001  # destroy perecent (per day)
rates=(P,d,B,G,A)
# initial conditions
S0 = 500.               # initial population
Z0 = 0                  # initial zombie population
R0 = 0                  # initial death population
y0 = [S0, Z0, R0]      # initial condition vector

F0,F1,F2,T=eq(rates,y0,0,5.,1000)

plt.figure()
plt.plot(T,F0,'-b',T,F1,'-r',Td,Zd,'go')
plt.legend(('Survivors', 'Zombies','Zombie Data'),
           'upper center')
plt.xlabel('days')
plt.ylabel('population')
plt.title('Zombie Apocalypse')


 

Import Model System into a Display Script

 

Sometimes you will want to try out several model systems on the same data set.
Instead of crowding them all into one script, it makes sense to just import them. The script we used above was called zombiewithdata.py (For fewer headaches, I suggest keeping your model system scripts like zombiewithdata.py in the same folder as your display script.)

So here goes an example...

#Zombie Display
# zombie apocalypse modeling
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from scipy import integrate
#=====================================================
#Notice we must import the Model Definition
from zombiewithdata import eq
#=====================================================

#The Data
#======================================================
Td=[0.5,1,1.5,2,2.5,3,3.5,4,4.5,5]
Zd=[0,2,2,5,2,10,15,50,250,400]
#parameters   
P = 0       # birth rate
d = 0.0001  # natural death percent (per day)
B = 0.0095  # transmission percent  (per day)
G = 0.0001  # resurect percent (per day)
A = 0.0001  # destroy perecent (per day)
rates=(P,d,B,G,A)
# initial conditions
S0 = 500.               # initial population
Z0 = 0                  # initial zombie population
R0 = 0                  # initial death population
y0 = [S0, Z0, R0]      # initial condition vector

F0,F1,F2,T=eq(rates,y0,0,5.,1000)

plt.figure()
plt.plot(T,F0,'-b',T,F1,'-r',Td,Zd,'go')
plt.legend(('Survivors', 'Zombies','Zombie Data'),
           'upper center')
plt.xlabel('days')
plt.ylabel('population')
plt.title('Zombie Apocalypse')






















 

Scoring How Well Your Model Fits The Data

 

The green circles represent the 'real' zombie population on those days on the x-axis. The red circles are what the model is predicting the zombie population should be on those same days. Clearly we need to adjust our model a bit.



First however we need to 'score' how badly off the fit is, so the program will know if its guesses are getting better or worse. See this link on fitting if you have never done it before: fitting a line.



#Zombie Display
# zombie apocalypse modeling
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from scipy import integrate
#=====================================================
#Notice we must import the Model Definition
from zombiewithdata import eq
#=====================================================

#1.Get Data
#====================================================
Td=np.array([0.5,1,1.5,2,2.2,3,3.5,4,4.5,5])#time
Zd=np.array([0,2,2,5,2,10,15,50,250,400])#zombie pop
#====================================================

#2.Set up Info for Model System
#===================================================
# model parameters
#----------------------------------------------------
P = 0       # birth rate
d = 0.0001  # natural death percent (per day)
B = 0.0095  # transmission percent  (per day)
G = 0.0001  # resurect percent (per day)
A = 0.0001  # destroy perecent (per day)
rates=(P,d,B,G,A)

# model initial conditions
#---------------------------------------------------
S0 = 500.               # initial population
Z0 = 0                  # initial zombie population
R0 = 0                  # initial death population
y0 = [S0, Z0, R0]      # initial condition vector

# model steps
#---------------------------------------------------
start_time=0.0
end_time=5.0
intervals=1000
mt=np.linspace(start_time,end_time,intervals)

# model index to compare to data
#----------------------------------------------------
findindex=lambda x:np.where(mt>=x)[0][0]
mindex=map(findindex,Td)
#=======================================================



#3.Score Fit of System
#=========================================================
def score(parms):
    #a.Get Solution to system
    F0,F1,F2,T=eq(parms,y0,start_time,end_time,intervals)
    #b.Pick of Model Points to Compare
    Zm=F1[mindex]
    #c.Score Difference between model and data points
    ss=lambda data,model:((data-model)**2).sum()
    return ss(Zd,Zm)

fit_score=score(rates)
#========================================================



#4.Generate Solution to System
#=======================================================
F0,F1,F2,T=eq(rates,y0,start_time,end_time,intervals)
Zm=F1[mindex]
Tm=T[mindex]
#======================================================


#5. Plot Solution to System
#=========================================================
plt.figure()
plt.plot(T,F1,'b-',Tm,Zm,'ro',Td,Zd,'go')
plt.legend(('Zombies','Model Points','Data Points'),
           'upper center')
plt.xlabel('days')
plt.ylabel('population')
title='Zombie Apocalypse  Fit Score: '+str(fit_score)
plt.title(title)
#=========================================================








Finding the Parameters that help the Model Fit the Data

Import fmin or some other optimizer from scipy tools. If you place the scoring function into the optimizer it should help find parameters that give a low score.
You can see that the parameters from the optimizer will help the model fit the data better.





#Zombie Display
# zombie apocalypse modeling
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from scipy import integrate
from scipy.optimize import fmin
#=====================================================
#Notice we must import the Model Definition
from zombiewithdata import eq
#=====================================================

#1.Get Data
#====================================================
Td=np.array([0.5,1,1.5,2,2.2,3,3.5,4,4.5,5])#time
Zd=np.array([0,2,2,5,2,10,15,50,250,400])#zombie pop
#====================================================

#2.Set up Info for Model System
#===================================================
# model parameters
#----------------------------------------------------
P = 0       # birth rate
d = 0.0001  # natural death percent (per day)
B = 0.0095  # transmission percent  (per day)
G = 0.0001  # resurect percent (per day)
A = 0.0001  # destroy perecent (per day)
rates=(P,d,B,G,A)

# model initial conditions
#---------------------------------------------------
S0 = 500.               # initial population
Z0 = 0                  # initial zombie population
R0 = 0                  # initial death population
y0 = [S0, Z0, R0]      # initial condition vector

# model steps
#---------------------------------------------------
start_time=0.0
end_time=5.0
intervals=1000
mt=np.linspace(start_time,end_time,intervals)

# model index to compare to data
#----------------------------------------------------
findindex=lambda x:np.where(mt>=x)[0][0]
mindex=map(findindex,Td)
#=======================================================



#3.Score Fit of System
#=========================================================
def score(parms):
    #a.Get Solution to system
    F0,F1,F2,T=eq(parms,y0,start_time,end_time,intervals)
    #b.Pick of Model Points to Compare
    Zm=F1[mindex]
    #c.Score Difference between model and data points
    ss=lambda data,model:((data-model)**2).sum()
    return ss(Zd,Zm)
#========================================================


#4.Optimize Fit
#=======================================================
fit_score=score(rates)
answ=fmin(score,(rates),full_output=1,maxiter=1000000)
bestrates=answ[0]
bestscore=answ[1]
P,d,B,G,A=answ[0]
newrates=(P,d,B,G,A)
#=======================================================

#5.Generate Solution to System
#=======================================================
F0,F1,F2,T=eq(newrates,y0,start_time,end_time,intervals)
Zm=F1[mindex]
Tm=T[mindex]
#======================================================




#6. Plot Solution to System
#=========================================================
plt.figure()
plt.plot(T,F1,'b-',Tm,Zm,'ro',Td,Zd,'go')
plt.legend(('Zombies','Model Points','Data Points'),
           'upper center')
plt.xlabel('days')
plt.ylabel('population')
title='Zombie Apocalypse  Fit Score: '+str(bestscore)
plt.title(title)
#=========================================================









>>> rates
(0, 0.0001, 0.0095, 0.0001, 0.0001)
>>> newrates
(-0.013625689329636721, 0.00037563615881181336, 0.010603541350136299, 7.5296199499937124e-05, 0.0014565207568917173)
>>> 


6 comments:

  1. When trying to adapt this to the Lotka-Volterra Model I am receiving an error index 2 is out of bound for axis 0 with size 2 and it's appearing on the line: fit_score=score(rates). I have made a ton of adaptations to the code to make it work with the LV model, but I can't quite get it to graph and fit to data. Help would be huge!

    ReplyDelete
  2. Thanks for your code, however, when I run it, there was a problem.

    Zm=F1[mindex]

    IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

    ReplyDelete
    Replies

    1. mindex is a map. You cannot slice a map.
      mindex=list(map(findindex, Td))

      Delete
  3. Thanks for your code. I had some real help with it.
    In the instance that I am stuck in is :

    fit_score=score(rates)
    answ=fmin(score,(rates),full_output=1,maxiter=1000000)

    can I restrict or constraint values of "rates" some how. I dont want it to be totally optimized, I want to fit the data in such manner so that it finds the minimum value of (score) within that range

    ReplyDelete
    Replies
    1. you can use 'minimize' instead of 'fmin' and set a bounding box to constrain the values of the rates

      Delete
  4. Ultimately, a problem that I'm passionate about. I have looked for data of this caliber for the previous various hours. Your site is greatly appreciated. https://python.engineering/dictionary-counter-python-find-winner-election/

    ReplyDelete