# Quick parallel implementation of local sensitivity analysis procedure for agent-based tumor growth model

In the last couple of posts I’ve shown how to implement an agent-based model of cancer stem cell driven tumor growth, both in MATLAB and C++. Having those implementations, we can go one step further and analyze the tumor growth characteristics predicted by the model. We will start with local sensitivity analysis, i.e. we will try to answer the question of how perturbing the parameter values impacts the growth dynamics. Typically this is done by perturbing one of the parameters by a fixed amount and analyzing the response of the model to that change. In our case the response will be the percentage change in average tumor size after 3 months of growth. Sounds fairly simple, but…

We have 5 different parameters in the model: proliferation capacity, probability of division, probability of spontaneous death, probability of symmetric division, and probability of migration. Moreover, let us assume that we would like to investigate 3 different values of the perturbation magnitude (5%, 10% and 20%). In order to analyze the change in the average size we need a decent estimator of it, i.e. we need a sufficient number of simulated stochastic trajectories – let us assume that 100 simulations are enough to obtain a good estimate of the “true” average. So in order to perform the sensitivity analysis we will need 100 + 3*5*100 = 1600 simulations (remember the growth for nominal parameter values). Even if a single simulation takes only 30 seconds, we will wait more than 13 hours for the result on a single CPU – that is a lot!

After looking at the above numbers we can make a straightforward decision right now – we will use C++ instead of MATLAB, because the model implementation in C++ is several times faster. However, 1) we will need to write a lot of code in order to perform the sensitivity analysis, and 2) using multiple CPUs is not as straightforward as in MATLAB. Is there a better way to proceed?

A few weeks ago I’ve shown here how to wrap your C++ code so it can be used from within MATLAB as a function without losing the C++ performance. Why not use it to make our lives easier – keeping the sensitivity analysis code short while easily harnessing the power of multiple CPUs?

We will start the coding (in MATLAB) with setting all of the simulation parameters.

nSim = 100; %number of simulations to perform for a given set of parameters
tmax = 30*3; %number of days to simulate
N = 1000; %size of the simulation domain

%nominal parameter values [rhomax, pdiv, alpha, ps, pmig]
nominal = [10, 1/24, 0.05, 0.3,10/24];

perturb = [0.05 0.1 0.2]; %perturbation magnitudes, as fractions of the nominal value


Now we just need to construct loops that will iterate through all possible perturbations and simulations. If we don’t use the Parallel Toolbox, i.e. don’t use multiple CPUs, it doesn’t really matter how we do that – performance will be similar. Otherwise, the implementation is not that straightforward, even though from MATLAB's documentation it seems that it is enough to change for to parfor. The most important thing is how we divide the work between the CPUs. The simplest idea is to spread the considered perturbation values among the CPUs – that would use 15 CPUs in our setting (5 parameters times 3 magnitudes). However, I’ve got a machine with 24 CPUs, so that would be a waste of resources – bad idea. Another idea would be to use a parfor loop to spread the 100 simulations for a given perturbation value over all 24 CPUs and go through the perturbation values in a simple outer loop – now we are using all available CPUs. But are we doing that efficiently? No. The problem is that the CPUs need to be synchronized before proceeding to the next iteration of the outer loop, so some CPUs will be idle while waiting for the others to finish the parfor loop. To make the code even more efficient we will use a single parfor loop and throw all 1600 simulations at the 24 CPUs at the same time. Let us first prepare the output variable.

HTCG = zeros(1,nSim + length(nominal)*length(perturb)*nSim);


Before writing the final piece of the code we need to solve one more issue. Namely, in the C++ implementation we used srand(time(NULL)) to initialize the seed of the random number generator. That is perfectly fine when we use a single CPU – each simulation starts at a different time, so we don’t need to worry about the uniqueness of the seed. The problem arises when we use multiple CPUs – all simulations launched in parallel will start with exactly the same seed. One way to solve that is to pass the current loop iteration number (i) to C++ and use srand(time(NULL)+i) – that is what I have done. After solving that issue we can write the final piece of the code.

parfor i = 1:length(HTCG)
    %%PREPARING PARAMETERS VALUES
    params = nominal; %setting parameters to nominal values
    if i>nSim %simulation is for perturbed parameters
        %translating linear index to considered parameter and perturbation value
        j = ceil((i-nSim)/(nSim*length(perturb)));
        k = ceil((mod((i-nSim)-1,nSim*length(perturb))+1)/nSim);
        %updating parameters
        params(j) = params(j)*(1+perturb(k));
        if j == 1 %if proliferation capacity parameter we need to round it
            params(j) = round(params(j));
        end
    end

    %%RUNNING SIMULATION AND SAVING OUTPUT
    [~, cells] = CA(params,[tmax*24,N,i]);
    HTCG(i) = length(cells)/3;
    clear mex %important! without that the internal
    %variables in the CA function won't be cleared and the next simulation
    %won't begin with a single initial cell
end


Then in the command line we start the parallel pool with 24 workers (CPUs) by typing the parpool(24) command, and run the code. The screenshot below shows nicely how all of the 24 CPUs are being used – no resources wasted!

We can then add a few additional lines of code to plot the results.


nom = mean(HTCG(1:nSim)); %average for the nominal parameter values
%calculating averages for the perturbed parameter sets
av = squeeze(mean(reshape(HTCG(nSim+1:end),nSim,length(perturb),length(nominal))));

%plotting results of sensitivity analysis
set(0,'DefaultAxesFontSize',18)
figure(1)
clf
bar(perturb*100,(av-nom)/nom*100)
legend({'\rho_{max}', 'p_{div}', '\alpha', 'p_s', 'p_{mig}'})
xlabel('% perturbation')
ylabel('% change')



And “voila!” – the resulting figure shows that the perturbation in the proliferation capacity has the highest impact on the tumor growth dynamics.

# Wrapping C++ code of agent-based tumor growth model into MATLAB

Last week I posted a C++ implementation of a basic cancer stem cell driven tumor growth model. About 100 lines of code allowed us to perform the simulation, but there was nothing about exporting the results, doing more complicated analysis, visualization, or performing data fitting. Of course each of those tasks can be performed by writing another large piece of code.

Two weeks ago I posted MATLAB code for a very similar model; it was quite fast, but still a lot slower than the C++ implementation. On the other hand, it was straightforward to visualize the tumor, and it is not much of a hassle to perform more complicated analysis or data fitting using the variety of MATLAB's built-in functions.

Each of those two approaches has its own advantages: C++ is fast, MATLAB has a lot of built-in tools.

Today I will show how to take the best out of both approaches, i.e. I’ll show how to easily wrap our C++ code into MATLAB. The whole C++ based simulation will be invoked from MATLAB as a function and the results will be visible straight from MATLAB.

First we need to include the necessary headers.

#include "mex.h"
#include <matrix.h>


Then we need to modify our original definition of the global variables (see the previous post) so they can be adjusted based on the input parameters (for example, the lattice has to be dynamically allocated).

int N;
bool* lattice;
vector<cell> cells;

char pmax;
double pDiv, alpha, ps, pmig;
int* indcNeigh;


Then the only thing that we need to do is to delete the old main() function and use mexFunction() instead. In mexFunction() we first need to read the parameters passed from MATLAB (the input arguments of the function). Then we need to allocate memory for the lattice and set the other auxiliary variables. Finally, after performing the simulation, we need to write the variables to the output, so that MATLAB can see the results.

void mexFunction(
    int nlhs,             //number of output arguments
    mxArray *plhs[],      //output arrays
    int nrhs,             //number of input arguments
    const mxArray *prhs[] //input arrays
)
{
    double *par = mxGetPr(prhs[0]);      //model parameters, first input variable
    double *settings = mxGetPr(prhs[1]); //simulation settings, second input variable

    int Nsteps = (int)settings[0]; //number of simulation steps to perform
    N = (int)settings[1];          //size of the lattice
    pmax = (char)par[0]; //divisions before proliferative capacity exhaustion
    pDiv = par[1];       //division probability
    alpha = par[2];      //spontaneous death probability
    ps = par[3];         //probability of symmetric division
    pmig = par[4];       //probability of migration

    /*CREATING LATTICE AND AUXILIARY VARIABLES*/
    //creating lattice with a given size
    lattice = new bool [N*N];
    fill_n(lattice, N*N, false);

    indcNeigh = new int [8]; //cell neighborhood
    //assigning values
    indcNeigh[0] = -N-1; indcNeigh[1] = -N; indcNeigh[2] = -N+1;
    indcNeigh[3] = -1;   indcNeigh[4] = 1;  indcNeigh[5] = N-1;
    indcNeigh[6] = N;    indcNeigh[7] = N+1;

    /*OLD PIECE OF MAIN() FUNCTION*/
    srand(time(NULL)); //initialize random number generator
    initialize();      //initialize CA
    simulate(Nsteps);  //simulate CA

    /*WRITING TO OUTPUT VARIABLES*/
    //1) writing the lattice as a logical matrix
    plhs[0] = mxCreateLogicalMatrix(N,N); //lattice
    mxLogical *ptr = mxGetLogicals(plhs[0]);
    for(int i=0; i<N*N; i++)
        ptr[i] = lattice[i];

    //2) writing the cells vector as a plain vector
    plhs[1] = mxCreateDoubleMatrix(1,cells.size()*3,mxREAL);
    double* ptr2 = mxGetPr(plhs[1]);
    for(size_t i=0; i<cells.size(); i++) {
        ptr2[3*i] = cells.at(i).place;
        ptr2[3*i+1] = cells.at(i).p;
        ptr2[3*i+2] = cells.at(i).is_stem;
    }
}


After making all of the changes we can compile our .cpp file using the MATLAB command “mex CA.cpp”. Remember, all other pieces of the C++ code associated with the model itself remain the same – there is no need to change anything.

From now on we can perform our simulation in MATLAB environment by invoking the following command

[lattice, cells] = CA([pmax, pdiv, alpha, ps, pmig], [nSteps, N]);


and then use all of MATLAB's benefits, especially its data fitting features. At the same time all of the simulations will be performed at C++ speed! Imagine using the Parallel Toolbox in MATLAB to harness multiple CPUs with only a few lines of code.

# Setting complex domains for agent based models using bitmaps

In the previous posts (CA in MATLAB and C++) I’ve shown how to implement a cancer stem cell driven tumor growth model. In both codes I set the boundary sites of the lattice to true without adding them to the cells vector, so the boundary was treated as occupied, but not as viable cells. This way of coding the system makes it easy to implement more complex domains. Today, I’ll show how to read complex domains from a bitmap (.bmp) file and use them in simulations with the previously posted codes. The basic idea is that the image will be loaded into the program and the pixels with values “close” to white will be treated as free sites.

First we need to prepare the image. The posted codes assume that the image is a square (height = width). We can draw anything we want, remembering that white pixels will be interpreted as free sites. We will write C++ code that can properly read 24-bit bitmap images without embedded color space information. In order to export an image in that format we can use the GIMP program with the following settings during the export.

I’ve prepared two exemplary images that I will use further in the simulations. The first one is adapted from the paper by Enderling et al. and the second is a generated text using PowerPoint.

First, the basic C++ code. We could use additional libraries to load images, but we want to keep the code as platform independent as possible. We assume that the lattice array and its size are already defined as global variables (see previous code).

#include <math.h>
#include <stdio.h>

void domainFromImage(char* fileName) {

    FILE* f = fopen(fileName, "rb"); //open to read as binary
    fseek(f, 54, SEEK_SET); //skip the 54-byte BMP header

    //in BMP format the size of each row is rounded up to a multiple of 4 bytes
    int Nb = (int)(ceil(3.*(double)N/4.)*4); //true size of the row in bytes, including padding

    unsigned char* img = new unsigned char[Nb]; //buffer for one row of pixels

    for(int i = 0; i < N; i++) { //for each row
        fread(img, 1, Nb, f); //read one row (pixels stored as BGR triplets, rows bottom-up)
        for (int j = 0; j < N; j++)
            //set to free only those sites with intensity close to white (>250 in all three channels)
            lattice[i*N+j] = !(img[3*j]>250 && img[3*j+1]>250 && img[3*j+2]>250);
    }
    delete [] img;
    fclose(f); //close file

    //filling boundary
    for (int i=0; i<N; i++) {lattice[i]=true;};//left
    for (int i=0; i<N*N; i+=N) {lattice[i]=true;};//top
    for (int i=N-1; i<N*N; i+=N) {lattice[i]=true;};//bottom
    for (int i=N*(N-1); i<N*N; i++) {lattice[i]=true;};//right
}


The same task in MATLAB is way easier to implement…

function [L, N] = domainFromImage( filename )
L = imread(filename); %loading the image into an N x N x 3 array
L = ~(L(:,:,1)>250 & L(:,:,2)>250 & L(:,:,3)>250); %free sites are those close to white
N = size(L,1); %size of the domain (assuming square image)

%filling border
L([1:N 1:N:N*N N:N:N*N N*(N-1)+1:N*N]) = true; %boundary
end


What is most important, the MATLAB code is not so dependent on the format of the image – it can be easily modified to handle other image types (the imread function is very flexible).

Below is the exemplary result of simulation using image representing the tissue.

Tumor growth can be also simulated for the domain generated using PowerPoint text.

# Cancer stem cell driven tumor growth model in C++

Because of the feedback that I’ve received after publishing the first three posts, I’ve decided to change the tool of interest from MATLAB to C++. Today, I’ll show how to quickly implement a cancer stem cell driven tumor growth model in C++. It is almost the same model as the one implemented in my previous post (I’ll explain why it is “almost” the same at the end of this post).

Quick guide to the model: a cell, either a cancer stem cell (CSC) or a non-stem cancer cell (CC), occupies a single grid point on a two-dimensional square lattice. CSCs have unlimited proliferation potential and at each division produce either another CSC (symmetric division) or a CC (asymmetric division). CCs erode their proliferation potential at each division and die when it reaches 0. Moreover, at each proliferation attempt a CC may undergo spontaneous death and be removed from the system.

First we start with including the necessary headers and setting the namespace. Apart from the standard functions and datatypes, we will use the vector datatype to store the cells and a shuffling function from the algorithm library.

#include <vector>
#include <algorithm>
#include <stdlib.h>
#include <time.h>

using namespace std;


Now we can define a cell. It will be described by three variables: an index to its place on the lattice (int), the remaining proliferation capacity (char), and a boolean variable defining stemness.

struct cell {//defining cell
int place;
char p;
bool is_stem;
};


The next step is to define the lattice size, the lattice itself, the vector that will contain all viable cells, and an auxiliary variable defining the cell neighborhood.

static const int N = 2000; //lattice size
bool lattice[N*N] = {false}; //empty lattice
vector<cell> cells; //vector containing all cells present in the system

static const int indcNeigh[] = {-N-1, -N, -N+1, -1, 1, N-1, N, N+1};//neighborhood


The parameters of the model are defined as follows.

char pmax=10; //proliferation capacity
double pDiv=1./24.; //division probability
double alpha=0.05; //spontaneous death probability
double ps=0.05; //probability of symmetric division
double pmig=10./24.; //probability of migration


Having made all of those initial definitions we can finally start coding the main program.
Let us start with writing the function that will initialize the whole simulation, i.e. fill the lattice boundary and place the initial stem cell.

void initialize() {
    for (int i=0; i<N; i++) {lattice[i]=true;};//filling left
    for (int i=0; i<N*N; i=i+N) {lattice[i]=true;};//filling top
    for (int i=N-1; i<N*N; i=i+N) {lattice[i]=true;};//filling bottom
    for (int i=N*(N-1); i<N*N; i++) {lattice[i]=true;};//filling right

    lattice[N/2*N+N/2] = true; //initial cell in the middle
    cell initialCell = {N/2*N+N/2, pmax, true};//initial cell definition
    cells.push_back(initialCell);
}


As in the previous post, we set all the boundary values of the lattice to true in order to make sure that we will not address any site outside the lattice. Typically one would use an if statement when addressing a lattice site to make sure that the index is within the domain. Here, because we have set the boundaries to true without adding those sites to the cells vector, we don’t need to do that – the boundary is treated as occupied sites, but not as viable cells.

The second auxiliary function that we will use returns the index to the randomly selected empty space around a given spot.

int returnEmptyPlace(int indx) {
    int neigh[8], nF = 0;
    for(int j=0; j<8; j++) {//searching through the neighborhood
        if (!lattice[indx+indcNeigh[j]]) {//if free spot
            neigh[nF] = indx+indcNeigh[j]; //save the index
            nF++; //increase the number of found free spots
        }
    }
    if (nF) {//selecting a free spot at random
        return neigh[rand() % nF];
    } else {//no free spot
        return 0;
    }
}


If there is no free spot the function returns 0.
Finally we can code the part in which all the magic happens – the main simulation procedure. It is hard to dissect the whole procedure into parts, so I did my best to explain everything in the comments.

void simulate(int nSteps) {

    vector<cell> cellsTmp;
    int newSite;
    cell currCell, newCell;

    for (int i=0; i<nSteps; i++) {
        random_shuffle(cells.begin(), cells.end()); //shuffling cells
        while (!cells.empty()) {
            currCell = cells.back(); //pick the cell
            cells.pop_back();
            newSite = returnEmptyPlace(currCell.place);

            if (newSite) {//if there is a new spot
                newCell = currCell;
                newCell.place = newSite;
                if ((double)rand()/(double)RAND_MAX < pDiv) {//division attempt
                    if (currCell.is_stem) {
                        lattice[newSite] = true;
                        cellsTmp.push_back(currCell);
                        if ((double)rand()/(double)RAND_MAX > ps) {//asymmetric division
                            newCell.is_stem = false;
                        }
                        cellsTmp.push_back(newCell);
                    } else if (currCell.p > 0 && (double)rand()/(double)RAND_MAX > alpha) {
                        currCell.p--;
                        newCell.p--;
                        lattice[newSite] = true;
                        cellsTmp.push_back(currCell);
                        cellsTmp.push_back(newCell);
                    } else {//spontaneous death or capacity exhausted
                        lattice[currCell.place] = false;
                    }
                } else if ((double)rand()/(double)RAND_MAX < pmig) {//migration
                    lattice[currCell.place] = false;
                    lattice[newSite] = true;
                    cellsTmp.push_back(newCell);
                } else {//doing nothing
                    cellsTmp.push_back(currCell);
                }
            } else {//no free spot
                cellsTmp.push_back(currCell);
            }
        }
        cells.swap(cellsTmp);
    }
}


Now we wrap everything in the main function, compile and run.

int main() {
    srand(time(NULL)); //initialize random number generator
    initialize(); //initialize CA
    simulate(24*30*6); //simulate 6 months of growth
    return 0;
}


Everything in about 100 lines of code.
What about the speed? It took about 40 seconds to simulate the tumor presented below, which consists of about 460,000 cells (on a 2000×2000 lattice).

Why is it “almost” the same model as the one presented in the previous post? The answer is in the details. In the above C++ implementation the lattice is updated after every single cell division/migration/death, i.e. because of the random shuffling a new free spot can be created for a cell that at the beginning of the iteration was quiescent (had no free spot), so it can successfully proliferate within the same iteration. In the MATLAB implementation that kind of behavior was avoided.

# Visualization of 3D tumor using isosurfaces and simple blur

Today, I’ll show how to simply visualize 3D tumor using MATLAB’s isosurface function combined with a simple blurring technique.

First, we will modify the code from the previous post in order to simulate a tumor in 3D. There are only a few changes: redefinition of the domain and the neighborhood; generating permutations on the fly instead of storing them in the Pms variable; and the way in which the initial cell is placed.

N = 200; %cubic domain dimension
nSteps = 30*24; %number of simulation steps

pprol = 1/24; %probability of proliferating
pmig = 10/24; %probability of migrating
pdeath = 1/100; %probability of dying
pmax = 10; %proliferation capacity
ps = 3/10; %probability of symmetric division

L = false(N,N,N); %domain definition
L(1,:,:) = true; L(end,:,:) = true; %filling boundary
L(:,1,:) = true; L(:,end,:) = true; %filling boundary
L(:,:,1) = true; L(:,:,end) = true; %filling boundary

L(N*N*round(N/2)+N*round(N/2)+round(N/2)) = true;
cells = int32(N*N*round(N/2)+N*round(N/2)+round(N/2));
cellsIsStem = true;
cellsPmax = uint8(pmax);

aux = int32([[-N-1 -N -N+1 -1 1 N-1 N N+1] ...
[-N-1 -N -N+1 0 -1 1 N-1 N N+1]-N*N ...
[-N-1 -N -N+1 0 -1 1 N-1 N N+1]+N*N])'; %indices to the neighborhood

for i = 1:nSteps
    sh = randperm(length(cells));
    cells = cells(sh);
    cellsIsStem = cellsIsStem(sh);
    cellsPmax = cellsPmax(sh);

    Pms = cell2mat(arrayfun(@(x)randperm(26),(1:length(cells))','UniformOutput',0))';
    S = bsxfun(@plus,cells,aux(Pms));
    S(L(S)) = 0; %setting occupied spots to 0
    indxF = find(any(S)); %selecting cells with at least one free spot
    nC = length(indxF); %number of cells with a free spot

    P = rand(1,nC)<pprol; %proliferation
    Ps = P & rand(1,nC)<ps & cellsIsStem(indxF); %symmetric division
    De = P & (cellsPmax(indxF) == 0); %proliferation capacity exhaustion
    D = P & (rand(1,nC)<pdeath) & ~cellsIsStem(indxF); %death at proliferation attempt
    M = ~P & (rand(1,nC)<pmig); %go when no grow

    del = D | De; %cells to delete
    act = find((P | M) & ~del); %indices to the cells that will do something
    for ii = act %only for those that will do anything
        ngh = S(:,indxF(ii));
        ngh(ngh==0) = [];
        indO = find(~L(ngh),1,'first'); %selecting free spot
        if ~isempty(indO) %if there is still a free spot
            L(ngh(indO)) = true;
            if P(ii) %proliferation
                cells = [cells int32(ngh(indO))];
                if Ps(ii) %symmetric division
                    cellsIsStem = [cellsIsStem true];
                    cellsPmax = [cellsPmax cellsPmax(indxF(ii))];
                else
                    cellsIsStem = [cellsIsStem false];
                    cellsPmax = [cellsPmax cellsPmax(indxF(ii))-1];
                    if ~cellsIsStem(indxF(ii))
                        cellsPmax(indxF(ii)) = cellsPmax(indxF(ii))-1;
                    end
                end
            else %migration
                L(cells(indxF(ii))) = false;
                cells(indxF(ii)) = int32(ngh(indO));
            end
        end
    end

    if ~isempty(del) %updating death
        L(cells(indxF(del))) = false;
        cells(indxF(del)) = [];
        cellsIsStem(indxF(del)) = [];
        cellsPmax(indxF(del)) = [];
    end
end


Using the above code I simulated two tumors with different growth characteristics, both with pmig = 5/24: one with ps = 3/10 and the other with ps = 5/100. Now we need to visualize them.

Having the whole tumor stored in the L variable, we can use the isosurface function straight away and plot the tumor as it is. The only thing that we need to do is to remove the cells from the boundary of L.

function visualize( N, L)

%clearing cells from the boundary
L(1,:,:) = false; L(end,:,:) = false;
L(:,1,:) = false; L(:,end,:) = false;
L(:,:,1) = false; L(:,:,end) = false;

%calculating isosurfaces and plotting
p = patch(isosurface(1:N,1:N,1:N,L,0.25));
isonormals(1:N,1:N,1:N,L,p)
set(p,'FaceColor','red','EdgeColor','none');

xlim([1 N]);
ylim([1 N]);
zlim([1 N]);
view([90 0]);
camlight
lighting gouraud
end


The results of the above function are plotted below.

As you can see, the visualization doesn’t really allow us to see the 3D structure of the tumors, as individually separated cells are plotted and there is no clear 3D surface. We will fix that by using a simple blurring technique. The basic idea is to replace each site value by the average value of its neighborhood (several times in a loop). That will remove separated cells and make the 3D surface smoother. We add an input variable blLevels to the function in order to define how many iterations of that procedure should be performed.

function visualize( N, L, cells, blLevels )

%clearing cells from the boundary
L(1,:,:) = false; L(end,:,:) = false;
L(:,1,:) = false; L(:,end,:) = false;
L(:,:,1) = false; L(:,:,end) = false;

if blLevels %if perform basic blur

%auxiliary variable with indices to the cell neighborhood
aux = int32([[-N-1 -N -N+1 -1 1 N-1 N N+1] ...
[-N-1 -N -N+1 0 -1 1 N-1 N N+1]-N*N ...
[-N-1 -N -N+1 0 -1 1 N-1 N N+1]+N*N])';

%creating indices to cells and their neighborhoods
S = [cells unique(reshape(bsxfun(@plus,cells,aux),1,[]))];
%creating indices to the neighborhoods of the indices in S
S2 = bsxfun(@plus,S,aux);
%making sure that the indices are still within the lattice
S2(S2<1) = []; S2(S2>N*N*N) = [];

%changing lattice from logical variable to float
L = single(L);
for i = 1:blLevels %for number of blurs
L(S) = mean(L(S2)); %taking the average of neighborhood
end
end

%calculating isosurfaces and plotting
p = patch(isosurface(1:N,1:N,1:N,L,0.25));
isonormals(1:N,1:N,1:N,L,p)
set(p,'FaceColor','red','EdgeColor','none');

xlim([1 N]);
ylim([1 N]);
zlim([1 N]);
view([90 0]);
camlight
lighting gouraud
end



If we apply blLevels = 1 we get the following, nicer plot of the tumor.

For blLevels = 2 we get the following plots.

Finally, when applying three iterations of blur (blLevels = 3) we get 3D plots nicely showing the tumor shape.

What is also great, it takes only about 3 seconds to plot a tumor with more than a quarter of a million cells (the left one).