Cancer stem cell driven tumor growth model in less than 70 lines of code

Today I’ve decided to show how to efficiently code a cancer stem cell driven tumor growth model in less then 70 lines of code in MATLAB.

Quick guide to the model: A cell, either cancer stem cell (CSC) or non-stem cancer cell (CC), occupies a single grid point on a two-dimensional square lattice. CSCs have unlimited proliferation potential and at each division they produce either another CSC (symmetric division) or a CC (asymmetric division). CCs are eroding their proliferation potential at each division and die when its value reaches 0. Moreover, at each proliferation attempt, CCs may undergo spontaneous death and then be removed from the system.

We start with defining initial settings of a domain and simulation timespan.

N = 1000; %square domain dimension
nSteps = 6*30*24; %number of simulation steps

Next we define the values of all parameters used in the simulation.

pprol = 1/24; %probability of proliferation
pmig = 10/24; %probability of migrating
pdeath = 1/100; %probability of death
pmax = 10; %initial proliferation capacity of CC
ps = 3/10; %probability of symmetric division

Now we can intialize domain and place initial CSC in its center. Our computational domain is represented by N x N Boolean matrix (a true value indicates the lattice point is occupied). An additional integer vector cells is maintained to store indices for all viable cells present in the system (to avoid costly search for cells on the lattice).

L = false(N,N);
L([1:N 1:N:N*N N:N:N*N N*(N-1):N*N]) = true; %boundary
L(N*round(N/2)+round(N/2)) = true;
cells = int32(N*round(N/2)+round(N/2));
cellsIsStem = true;
cellsPmax = uint8(pmax);

To reduce the amount of used memory we utilized int32 and unit8 types as they occupy less memory than double (double is default type in MATLAB). We also set all the boundary values of L to true in order to make sure that we will not address any site out of the lattice. Typically one would use an if statement when addressing the lattice site to make sure that index to a site is within the domain. Here, because we have set the boundaries to true without adding those sites to cells vector, we don’t need to do that – boundary will be treated as an occupied site, but not as a viable cell.

We can now define auxiliary variables that will be utilized further.

aux = int32([-N-1 -N -N+1 -1 1 N-1 N N+1])'; %indices to heighborhood
Pms = perms(uint8(1:8))'; %permutations
nP = size(Pms,2); %number of permutations

Variable aux simply defines cell’s neighborhood and Pms is a variable in which we store all possible permutations of variable aux.

Now we can start the main loop of the program and randomly shuffle cells at its beginning.

for i = 1:nSteps

sh = randperm(length(cells));
cells = cells(sh);
cellsIsStem = cellsIsStem(sh);
cellsPmax = cellsPmax(sh);

Now few lines in which we first create indices to neighborhoods of all of the cells already in random order (variable S), then we flag by 0 all of the spots that are occupied and finally select only indices to those cells that have at least one free spot in the neighborhood (variable indxF).

S = bsxfun(@plus,cells,aux(Pms(:,randi(nP,1,length(cells)))));
S(L(S)) = 0; %setting occupied spots to 0
indxF = find(any(S)); %selecting cells with at least one free spot
nC = length(indxF); %number of cells with free spot

Now we can decide about the faith of the cells that can perform action (are not completely surrounded by other cells).

    P = rand(1,nC)<pprol; %proliferation
    Ps = P & rand(1,nC)<ps & cellsIsStem(indxF); %symmetric division
    De = P & (cellsPmax(indxF) == 0); %proliferation capacity exhaution
    D = P & (rand(1,nC)<pdeath) & ~cellsIsStem(indxF); %death at proliferation attempt
    M = ~P & (rand(1,nC)<pmig); %go when no grow

In the additional loop we perform update of the system.

    del = D | De; %cells to delete
    act = find((P | M) & ~del); %indices to the cells that will perform action
    for ii = act %only for those that will do anything
        ngh = S(:,indxF(ii)); %cells neighborhood
        ngh(ngh==0) = []; %erasing places that were previously occupied
        indO = find(~L(ngh),1,'first'); %selecting free spot
        if ~isempty(indO) %if there is still a free spot
            L(ngh(indO)) = true; %updating occupancy
            if P(ii) %proliferation
                cells = [cells uint32(ngh(indO))]; %adding new cell
                if Ps(ii) %symmetric division
                   cellsIsStem = [cellsIsStem true];
                   cellsPmax = [cellsPmax cellsPmax(indxF(ii))];  
                   cellsIsStem = [cellsIsStem false];
                   cellsPmax = [cellsPmax cellsPmax(indxF(ii))-1];
                   if ~cellsIsStem(indxF(ii))
                    cellsPmax(indxF(ii)) = cellsPmax(indxF(ii))-1;
            else %migration
                L(cells(indxF(ii))) = false; %freeing spot
                cells(indxF(ii)) = uint32(ngh(indO));

At the end we need to erase cells that died.

    if ~isempty(del) %updating death
        L(cells(indxF(del))) = false;
        cells(indxF(del)) = [];
        cellsIsStem(indxF(del)) = [];
        cellsPmax(indxF(del)) = [];

This is it. Whole CA is coded and we are ready to run simulations!

In order to visualize simulations results we can implement short function.

function visualize( N, cells, cellsIsStem, cellsPmax, pmax )

    M = ones(N,N,3); %matrix for image
    color = hot(3*pmax); %colormap
    %drawing CCs
    M(cells(~cellsIsStem)) = color(cellsPmax(~cellsIsStem)+1,1);
    M(cells(~cellsIsStem)+N*N) = color(cellsPmax(~cellsIsStem)+1,2);
    M(cells(~cellsIsStem)+2*N*N) = color(cellsPmax(~cellsIsStem)+1,3);
    %drawing CSCs, we want to draw them as 3x3 points
    aux = int32([-N-1 -N -N+1 -1 1 N-1 N N+1])';
    CSCs = cells(cellsIsStem);
    plusSurr = bsxfun(@plus,CSCs,aux);
    M(plusSurr) = color(2*pmax,1);
    M(plusSurr+N*N) = color(2*pmax,2);
    M(plusSurr+2*N*N) = color(2*pmax,3);

Below is the visualization of the tumor simulated using the above code. It is consisted of 369,504 cells in total (8563 CSCs). The whole simulation took about 12 minutes on my laptop, what taking into account the lattice size and number of cells is a reasonable time.



Loop-free search for proliferative cells in MATLAB

When thinking about implementation of agent-based model only few of us consider MATLAB as a coding environment, even though it has plenty of built-in visualization and analysis tools. I get that. It’s a high level language that is not very efficient with for loops, so a basic loop-based code will execute much faster when coded in C++ (and others). However, there is a very cool feature of MATLAB – vectorization, that allows to skip plenty of loops and significantly speed-up the computations. In this post I’ll show how to utilize this feature to make a loop free search for a cells that have a free spot in the neighborhood.

Our computational domain is represented by N x N Boolean matrix L (a true value indicates the lattice point is occupied). An additional integer vector cells is maintained to store indices for all viable cells present in the system (to avoid costly search for cells on the lattice). Our task is to write a code that returns indices to those cells that have at least one free spot in the 8 spot neighborhood.

A basic loop-based code in MATLAB that will do the task is the following:

cellsInd = false(size(cells));
    for j = 1:length(cells)
        loop = true; 
        for ii = -1:1:1
            if ~loop
            for kk = -N:N:N
                if ~L(cells(j)+ii+kk)
                   loop = false; 
                   cellsInd(j) = true;
    cellsInd = find(cellsInd);

The code is similar to the one that would be written in C++. For 2000×2000 lattice with 800,000 cells evaluation of 200 iterations of the above code take about 83 seconds. Not very impressive… Same task for a code written in C++ takes about 3 seconds (when using vectors from STL library). Can we do something about that?

As I wrote, vectorization is a very cool MATLAB feature. When using it, coding the whole task takes only one two lines:

aux = [-N-1; -N; -N+1; -1; 1; N-1; N; N+1]';
cellsInd = find(~all(L(bsxfun(@plus,cells,aux))));

Moreover, the evaluation time is reduced from about 83 seconds to about 9.5 seconds. Impressive speed-up and elegant code in two lines.

Still, both codes are outperformed by C++, but the difference when using vectorized code is bearable.