A DECADE IN THE LIFE OF ONCOLOGY MARKET FORECASTING
By Nic Talbot-Watt
The field of oncology is changing at an unprecedented pace. Development of new, targeted therapeutics has
revolutionised patient life expectancy, outcomes and epidemiology. The market is now extremely dynamic and
a number of new factors must be considered when valuing and forecasting products and patient populations
in the oncology arena.
In the beginning…
One of the first major projects I can recall from my early days post-university was being asked to forecast the
entire oncology pipeline for a major Pharma company. I had been in the role of forecaster for about three
years at the time, but had never previously been asked to forecast any asset in oncology. Even today, being
asked to forecast oncology casts a shadow and strikes dread into the analyst concerned. It has always
been considered a 'challenge' (partly due to lack of data, partly due to terminology and classification –
oncology is not an easy therapy area to grasp).
A baptism of fire by any measure: I had around one month to forecast more than ten assets. The limited time frame was, in
itself, a challenge – how do you produce ten-plus forecast models in just a month? However, coming from an
oncology-naïve state (with the exception of what I had covered at university), I first had to come up to speed
with the nuances of oncology markets:
 differences in the use of cancer epidemiology for forecasting (incidence, versus the prevalence basis used
in nearly all other Pharma markets)
 the different treatment settings in which these potential future products could be used
Once I had a grasp on all of that, I might be in a position to think about how I would structure a forecast model
for something in oncology.
INCIDENCE VS. PREVALENCE
Let's start with the epidemiology, since most good forecasts take the patient as their foundation.
Whereas most forecast models I had come across or built to that point were based on prevalence, in
oncology the forecaster needs to be concerned with incidence (or active cases of disease – which ten years ago
meant mostly incidence). Thankfully, most epidemiology data for oncology are presented as incident cases (with 5-
and 10-year prevalence figures also stated to provide a snapshot of survival).
Crude data were available for me to be able to estimate rates of recurrence, and since nothing much was
changing to affect patient survival, those could be applied with relative certainty to the incident figures. Time
to recurrence was not something we tended to include as the populations were pretty ‘static’.
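The mechanics described above can be sketched in a few lines. This is an illustrative toy, not client data: the case numbers, recurrence rate and function name are all invented. Eligible patients each year are the newly incident cases plus a fixed share of the prior year's cases returning with recurrent disease.

```python
# Illustrative sketch: driving an oncology patient forecast from incident
# cases plus a crude recurrence rate (all numbers are invented).
def patient_forecast(incident_cases, recurrence_rate):
    """For each forecast year, eligible patients = newly diagnosed
    (incident) cases plus a fixed share of the prior year's cases
    returning with recurrent disease."""
    forecast = []
    prior = 0
    for inc in incident_cases:
        eligible = inc + prior * recurrence_rate
        forecast.append(round(eligible))
        prior = inc
    return forecast

years = patient_forecast([10_000, 10_200, 10_400], recurrence_rate=0.30)
print(years)  # year-by-year eligible patient pool
```

With a static population, applying a flat recurrence rate like this was defensible; the trouble, as described below, starts when survival itself begins to change.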
TREATMENT OPTIONS VS. LINE OF THERAPY
So, starting with incident cases was fine and didn't present too many problems. But it was highly suspected
that these oncology assets would most likely be reserved (in the first instance) for 'salvage' or 2nd/3rd line cases
– we couldn't simply use the straight incidence data. This introduced the problem of estimating how many of
those incident cases would still be alive for a 2nd line, 3rd line or 'salvage' treatment. It also raised the issue of
definition – was salvage 3rd line or 4th line? Did the definition change from cancer to cancer? Did oncologists
agree on, and even use, the term 'salvage'?
Added to that was the concept that patients could not receive the same treatment in more than one line of
therapy – if a product launched in 3rd line, but progressed to 2nd and then to 1st line, how should we handle
that shift?
But the bigger problem was estimating survival of these patients over time, and how that should be built into a
forecast.
Patients in downstream lines of treatment would be highly sensitive to any changes upstream (to take a
deliberately simplistic example: say a new treatment launches which puts 100% of treated patients into remission – what would
happen to those in the next line of treatment downstream?).
Of course, there are methods in Excel whereby we can use OFFSET functions and other wizardry to refer back
to patient populations in earlier years, but since survival was not an exact 12 months or a whole number of
years, it did not exactly add up in an annual forecast – which patient number should one draw from to apply
relapse rates when patients took (e.g.) 2.5 years from point of diagnosis to the point they should be counted for
relapse? Annual periodicity was simply not granular enough for oncology forecasting.
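The periodicity problem can be made concrete with a toy monthly series (all figures invented): with a 30-month lag from diagnosis to relapse, a monthly model can look back exactly 30 periods, something an annual model can only approximate by rounding to year 2 or year 3.

```python
# Sketch of why monthly granularity matters: if median time to relapse is
# 2.5 years (30 months), an annual model must round to year 2 or year 3,
# while a monthly model can look back exactly 30 periods.
def relapses_by_month(diagnosed_per_month, relapse_rate, lag_months=30):
    relapses = []
    for m in range(len(diagnosed_per_month)):
        # patients diagnosed exactly lag_months ago become eligible to relapse
        source = diagnosed_per_month[m - lag_months] if m >= lag_months else 0
        relapses.append(source * relapse_rate)
    return relapses

monthly = [800] * 48  # flat diagnosis rate, four years of months
r = relapses_by_month(monthly, relapse_rate=0.5)
print(r[29], r[30])  # nothing before the 30-month lag, 400.0 per month after
```

The same logic forced into annual buckets either pulls the relapse forward by six months or pushes it back by six, which is exactly the 'add up' problem described above.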
DATA VS. TECHNOLOGY
Providing a 25-year forecast by month was, at that time, beyond the limits of Excel 97 and
2003 – the maximum number of columns available was 256, insufficient for 25 x 12 = 300 months
(Excel has had far greater capacity since the 2007 release). It was probably possible to
re-orient the spreadsheet to work in rows, but this would have made for a very unmanageable forecast
spreadsheet, not only to build but also for the end user to, well, use.
At that time, data availability and sources of patient data - such as syndicated patient diary or record studies -
were either non-existent or far too early in their infancy to be widely available at reasonable cost. I certainly
wasn’t aware of syndicated patient record studies until a few years later when I had joined the client-side of
the industry.
Let’s face it, the data side of the industry has undergone something of a (r)evolution in the last decade. We
take it for granted that there are patient diary studies that can be bought as datasets, available online at the
click of a few buttons. Even published journals are mostly available online in PDF format (back then, although
PubMed existed, it took a day in a library to source the actual articles: find the physical copy of the printed
journal, photocopy the article and return to the office armed with the (hopefully) relevant pieces of
information to complete an epidemiology or forecast model).
So back in the day we were faced with:
1. Fairly good incidence data
2. IMS or equivalent unit volume audit data
3. Very little in the way of published, useable clinical data (when I say useable, I mean useable for
forecasting)
And that was really about it. Oncology (along with analgesia) was one of those areas where it was nigh on
impossible to reconcile drug audit data with patient numbers (the holy grail of forecasting – top-down and
bottom-up, getting the two to meet in the middle – the so-called 'hybrid' methodology, called a hybrid
because it combined patient-based (bottom-up) AND volume-based (top-down) views to provide a more complete
picture of a market).
CLINICAL TREATMENT ALGORITHMS
Dosing regimens were complex and varied, with few standardised drug options (while protocols might have
been standard, oncologists' selection of a protocol was not). Plus the drugs of the time (mostly
chemotherapies) were used so widely, across so many cancer types and treatment settings, that it was
impossible to isolate the share of a drug used in a specific cancer at a specific line of treatment
(incidentally, a problem we now face again with trans-disease targets – e.g. the wide range of drugs
used to treat autoimmune (AI) diseases or that span both AI and oncology).
So at the turn of the century (we are only talking about 2000 here, but still), what the forecaster could do for
an oncology market was fairly limited from a data perspective. We were still flying blind, assuming that all
patients diagnosed (which we largely knew - these were the incident reported cases) were treated. We could
add a layer of sophistication by splitting the patients by stage at diagnosis and some idea of recurrence, but as
far as anything else went, that was probably about it.
Of course, market research was always an option to help plug any gaps (and there were many), so it was
possible to get estimates of how many patients that started 1st line would still be alive to start 2nd line, etc.
But in one month, we didn't really have time to conduct market research. (These days it's mostly done
online, with far shorter lead times – back in 2000, it would take significantly longer to collect the data. I recall
the first time I commissioned an online market research study: it was only small and run in the UK, but was
considered a 'new thing' in 2002. Considering the pace of change in the last decade, those two years marked a big
move in data collection and technological advancement.)
What were we left with? Some patient level data, but not a lot else. We even had to estimate treatment rates!
The next challenge was to get to grips with the pipeline assets and their likely targets (some of these
assets were seriously early stage, not to mention targeting some very odd things that I had only read about as
cutting-edge science at university).
Of course, the programme directors for each of these assets had made some bold claims with respect to the
efficacy of their particular project. In theory, to produce a good forecast for any asset, you should aim to
model how the market works, both now and in the future. As a forecaster, it is YOUR JOB to
predict the future.
Call me naïve, but at the time I ambitiously wanted to believe the claims or at least understand the impact on
the market should the product be able to deliver what the in-house clinical team were promising (you should
be open to any possibility as a forecaster). If these drugs did change patient survival, that could have
enormous consequences for a patient-based forecast! For example, a patient goes from living one year to living two
years – if that patient is on medication for the entire time, this would DOUBLE the forecast. Huge implications.
I was faced with the issue of how to forecast that particular event in the model. There were no precedents
that I could use as a blueprint and little (if any) data for modelling such an event. I sought counsel from
those older and wiser than I on how to incorporate this into my forecast…
The universal message that came back was not to include it. The consensus from the wider team was that it
was too complex to build into a spreadsheet model given the data and time available. Plus, there was only an outside
chance that these drugs would actually make it to market, let alone be so revolutionary as to change
patient survival. I was to assume that the drugs would not have any significant impact on patient survival and
stick with a static model driven by the incident patient population.
All in all, I suppose I agreed with the approach, given the time constraints and lack of data. But it ignited a
thought, a burning question that I kept coming back to over the years. Like a cold case, I would revisit the
unsatisfactory model that I had built many years ago, have a prod and see if I thought anything different ‘this
time’ - had my subconscious worked away at a solution yet? Generally, the answer was ‘not yet’. But with each
visit, my perspective on the problem was slightly different.
THE FIRST DYNAMIC MODEL
Over the years, my perspective on the healthcare industry changed. Some of it was down to working directly in
the industry (what we call ‘client-side’), some of this was down to working in diagnostic companies (you need a
different perspective to be able to estimate the market for a diagnostic – in the realm of traditional Pharma
company forecasting, patients tend to already have a diagnosis – in fact, it’s often one of the standard
assumptions that we include in any forecast model – how many of the prevalent or incident patients have
been diagnosed?).
And much of my change in perspective has come from being asked to do some very challenging forecasting
projects across a variety of different therapeutic areas.
In turn this has led me to think about "what we know". By that I mean, more specifically, what 'we'
(as the collective human consciousness) know about a given disease topic – and, more importantly, what we
know about states of health and the causes of those states of health.
You could view the healthcare system (including research) as a continuum – with prevention of death on the
one extreme (Florence Nightingale et al) through to prevention of any sort of illness on the other (the closest
we have to that today are vaccines). Everything in-between is a sliding scale based on how much we know
about the causal determinants of a ‘disease’ and what we can do about them.
Where your disease sits on this knowledge continuum, and whether that knowledge state is likely to
change through the period you are forecasting, will determine how successful you are likely to be when trying
to predict potential sales of a new product.
It is sheer, unbridled arrogance to believe that we know everything there is to know – and, surprisingly, people
are very poor at estimating what they don't know. Faced with the simple question "How
much do you think is known about 'disease X'?", many people (especially those in Pharma companies – not to point fingers)
will suggest that 'nearly everything' is known about 'disease X'.
This is rarely the case, as evidenced by the fact that we are mainly still treating the symptoms of disease X
rather than curing or preventing disease X (of course, prevention would be the end game – much easier than
cure?).
The first time I was commissioned to build a proper dynamic forecast model was around eight years ago –
nearly seven years after my first oncology modelling assignment. This time, the therapy area in question was
HIV – another area I had never really tackled in any particular depth. The company in question knew an
inordinate amount about the market, and thus I was able to interrogate them for information that may be
pertinent to building them a predictive market model.
The first issue, as with oncology, was to get to grips with the market terminology, classes of drug and
particular nuances of the HIV marketplace. To say that I had never touched HIV is slightly misleading – during
my first client-side role I had worked in an analysis team – one member of whom covered HIV. While I was not
directly involved in the therapy area I had supported the analyst to build models and was aware of some of the
issues with the market.
The essence of the company's requirement was a model that allowed them to analyse a new product launch in
detail: how they would grow the new product by gaining patients from those newly initiating
treatment and those switching from other HIV medications. In addition, the medication they were seeking to
model and forecast bridged three classes of drug in the market. The challenges posed by the HIV market were
substantial from a forecasting point of view.
The familiar issue of how to reconcile patients with drug volume (audit data) raised its head, as did issues of
polypharmacy (dealing with > 100% of patients – never a "good thing" when working with spreadsheet
mathematics). Not to mention that this drug would fundamentally change the amount of polypharmacy in the
market (since it combined three classes of drug in one). Now not only was a reliable estimate of the patient
universe required, but also an understanding of how the market definition was going to change – alongside how
to capture and model that in Excel while still allowing future events to impact the structure.
There was a lot to consider.
Still, my client and I persisted, and nearly a year later, we had something that we were broadly happy with: a
dynamic model allowing a new product launch to ‘build’ its share from two different patient populations (each
modelled separately to allow changes to diagnosis rates and thresholds for initiating treatment).
We had to model the market almost independently of the products within it – dealing with the high level
patient populations and their dynamics before delving into the products.
This new way of forecasting highlighted a few of the missing pieces (but not all) that the industry had been
blind to (I can only assume so, since no one had really addressed doing it this
way previously). For instance, the rationale for forecasting by treatment regimen rather than by class or
individual product. The traditional method for forecasting markets such as this would have been to divvy up
the patients by main class of agent, then by product, in a very hierarchical fashion (in fact, we audited many of
the client's in-house local country market models to try to generate a way of forecasting the market which
made sense and captured the drivers and dynamics required – many of them had treatment rates in excess of
100%, or some adjustment factor to inflate or deflate according to concomitant medication use). But this was
most likely a product of the way the available data allowed the market to be deconstructed for
forecasting. It was not the way the market had to be forecast.
Forecasting at the class & product level was fine, but in a market with a high degree of mix & match drug
“cocktails”, estimating the overall treated population from the drugs consumed was not really possible
without a lot of assumptions in between.
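To illustrate the difference, here is a minimal sketch of a regimen-based structure. The regimen names, compositions and patient counts are invented, not drawn from the HIV market in question: patients are allocated to 'cocktails', and per-drug exposure falls out of the regimen composition, so the treated population always reconciles even though drug-level counts overlap.

```python
# Hedged sketch of forecasting by regimen rather than by product: patients
# are allocated to named "cocktails", and per-drug volumes fall out of the
# regimen composition (regimen names and patient shares are invented).
regimens = {
    "A+B":   {"drug_a": 1, "drug_b": 1},
    "A+C":   {"drug_a": 1, "drug_c": 1},
    "B+C+D": {"drug_b": 1, "drug_c": 1, "drug_d": 1},
}
patients_on_regimen = {"A+B": 5_000, "A+C": 3_000, "B+C+D": 2_000}

# Derive per-drug patient exposure from the regimen allocation.
drug_patients = {}
for reg, drugs in regimens.items():
    for drug in drugs:
        drug_patients[drug] = drug_patients.get(drug, 0) + patients_on_regimen[reg]

# Summing drug-level counts double counts people (polypharmacy), but the
# regimen totals reconcile exactly with the treated population.
print(sum(patients_on_regimen.values()))  # 10000 treated patients
print(drug_patients)  # per-drug exposure, which sums to more than 10000
```

Going the other way, from drug volumes back to patients, is where the "lot of assumptions" creeps in, which is why starting from the patient and the regimen is the cleaner direction.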
What made this approach of forecasting by regimen possible was the availability of patient record
data. For a number of years, large-scale patient record studies had been gaining momentum in the industry. It
is unclear what drove this (push, based on what the technology made possible, or pull, based on
demand from clients – probably a little of both), but for once it was possible to triangulate from patients to
prescription volumes and validate the outputs of a forecast built in this way with some degree of certainty,
and without a lot of unfounded assumptions.
The other contributing factor here was the consistency of approach from the physician community. For many
years, prescribing habits in many therapy areas have been diverse; in fact, you could almost tell the degree of
clinical effectiveness of the available medications from the consistency of prescribing habits of the attending
physicians. The more scatter-gun and broad the use of medications, the less likely it was that an
effective therapy was on the market, and vice versa. This had definitely been the case in oncology and was certainly
the case in HIV, although there had been a progressive move toward prescribing combination
tablets over multi-tablet 'cocktails'.
Overall, the data were becoming available, the therapies were better (and constantly improving), guidelines
were in place and overall there was just more data and information available on which to build and test a new
theoretical approach to forecasting a complex market than ever before.
This model was undoubtedly a step-change away from the way the industry (on the whole) had previously
forecast. It was a start, but not the whole journey by any means.
A SHIFT IN PERSPECTIVE FROM STATIC TO DYNAMIC
There were still technical issues which could cause bothersome spreadsheet errors (we had implemented a
sophisticated method by which new products could gain share from competitor products, but this
method was still built along old-school traditional lines and, as such, had the propensity to create negative
share values for products with little share).
On a wider scale, however, it caused arguments and debate – the new method produced unforeseen outcomes.
These outcomes were entirely logical but not straightforward. For instance, there was a group of
patients deemed to reside in a category called 'seen for care' or 'watchful waiting'. These patients
were, to all intents and purposes, healthy – their viral load was not currently an issue and CD4 counts were in a
healthy range, but the individual had a positive diagnosis of HIV. They were infected. At some point their
CD4 count would fall to a level that suggested initiation of anti-retroviral treatment, but until that time they
would be monitored regularly for any change in blood count or viral load. Increasing the rate of initiation on to
treatment for those in this 'watchful waiting' pool over time eventually led to a decrease in patients starting
treatment. This was because the 'latent' pool was only being filled at a constant rate each year (say the
pool was 10,000 in size, fed by 2,000 newly diagnosed but otherwise healthy HIV patients per year).
Once the accumulated bolus of patients had been drawn down, the pool was reduced
to the size of the newly incident/diagnosed population. This was an 'unforeseen' impact of the dynamic model,
albeit entirely rational and logical if one thinks through where the patients are and how they flow
through the model. However, it was not something anyone had come across previously and, naturally, it was
challenged.
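The pool dynamic described above is easy to reproduce in a toy simulation using the same illustrative numbers (a 10,000-patient latent pool fed by 2,000 new diagnoses per year; the 50% initiation rate is my own assumption for the sketch): a higher initiation rate produces an early bulge of treatment starts, then a steady decline toward the inflow rate.

```python
# Toy reproduction of the 'watchful waiting' draw-down: a latent pool of
# diagnosed-but-untreated patients, fed at a constant rate; raising the
# initiation rate eventually *reduces* annual treatment starts, because
# the stored bolus of patients gets drawn down. Numbers are illustrative.
def simulate(pool, inflow, initiation_rate, years):
    starts = []
    for _ in range(years):
        pool += inflow                     # newly diagnosed, healthy patients
        starting = pool * initiation_rate  # share initiating treatment this year
        pool -= starting
        starts.append(round(starting))
    return starts

# Starts fall year on year as the bolus is consumed, converging toward inflow.
print(simulate(pool=10_000, inflow=2_000, initiation_rate=0.5, years=5))
```

The counterintuitive result that 'more initiation leads to fewer starters' is exactly what the dynamic model surfaced, and exactly what a static model could never show.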
The second great debate was around what share of the market the new product could get and by when.
FORCED TO RECONSIDER WHAT’S POSSIBLE…
Since we had deconstructed the market into those newly initiating treatment and those switching from
another medication, there were only two ways in which a new product could gain patients.
It was decided that the newly initiating patients were fairly constant and that a new product launch would not
influence the rate of initiation (fair enough, you can’t magic new patients out of the ether just because a new
product decides to launch). So this led the focus to be on the other source of patients for a new product: those
already on something else.
Based on market research, it was decided that there was a certain rate at which patients were switching on
and off HIV medications – mainly driven by inability to withstand the side-effects (tolerability) or failure of
the medication to control viral load (treatment failure). According to the market research, there were only
about as many patients switching as there were starting treatment for the first time (in fact, estimating the
size of the switch pool remains an issue in this market). However, both pools of patients combined were
not substantial enough for the new product to reach the ambitious market share which had been set for its
first year (for example: the product needs 30% market share, but patients new to the market or switching
amount to only around 10% of the market altogether – even if the new product captured every one of those
patients, it would fail to reach 30%; it would need at least three full years on the market).
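The arithmetic in that parenthesis can be written down as a crude upper bound (the function name and figures are illustrative, not the client's actual numbers): if only 10% of patients are accessible each year, cumulative share is capped at roughly 10% per year even at 100% capture.

```python
# Back-of-envelope version of the share ceiling described above: if only
# 10% of the market is "in play" each year (new starts plus switchers),
# even 100% capture needs three years to reach a 30% share target.
def max_share(accessible_per_year, years, capture=1.0):
    # Crude upper bound: ignores patients later leaving the product,
    # which would make the real ceiling lower still.
    return min(1.0, round(accessible_per_year * capture * years, 2))

for yr in (1, 2, 3):
    print(yr, max_share(0.10, yr))  # at best 0.1, 0.2, 0.3
```

It was this simple ceiling, colliding with a 30% first-year target, that forced the question of whether the market dynamics themselves would change.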
If the share target was to be reached, something would have to change in the market dynamics. Or, if the
market dynamics were not going to change, the first year share target would not be achievable.
Everyone was adamant that the patient pools would not change, but that they HAD to be able to produce that
30% share from the model within a year of launch.
The structure of the model would not allow these two things to both be true.
It turned out that the first year share was achievable, and that the market dynamics shifted in response to the
new product launch.
A hitherto unforeseen shift happened in the market – one which the model could adequately deal with in its
structure, but which the market research had not adequately captured. This was the propensity to switch
medication, driven by the presence of a new, more conveniently dosed medication in the market. The received
wisdom was that this propensity was a static feature of the market, not likely to change.
The model included this propensity to switch medications as a driver – a variable that could be changed
– and so the market could adjust to reflect what actually happened. More importantly, patients were not
randomly assigned to the new product; careful thought had to be given to how and where patients for the
new product came from.
It's easy to be wise after the event. Eight years on, the company are using the same dynamics and way of
looking at, modelling and tracking their market – but at the time it was a challenge to establish a new
perspective, and to get some of the analysts involved to think more broadly and challenge what they knew or had
accepted as fact about their market.
In some ways the industry has not changed all that much – the majority of people have a hard time challenging
and overturning what they know or accept to be fact. It does not occur to most people that this is the case or
needs to happen.
BACK TO ONCOLOGY – NEW TOOLS FOR A NEW ERA OF HEALTHCARE
Some time later (a few years further on) we were tasked with forecasting a new product acquisition in
oncology. The product in question was to be used as a 2nd line treatment, but in a group of patients that had
received a specific class of agent at 1st line. The 1st line treatment setting was also set to evolve rapidly
with the launch of a number of promising new assets (ahead of the one we were to be concerned with) which
could have a serious downstream impact on our asset, and as such HAD to be incorporated into the forecast.
Leaning heavily on what I had learnt from the HIV model, I started to construct something. This ‘something’
was a bit rough and raw, sort of worked, but wasn’t as robust as I wanted.
We started out with the incident patients, took a % of those patients and allocated them to the 1st line class of
agents we needed to consider (market research came through and was used to estimate what that % was, plus
how it might change over time with the launch of new products and classes).
What we then needed was an estimate of how long a patient would remain on their 1st line treatment
(something we didn't include in the HIV model, as it was unclear how to model that dynamic and also
impossible to calibrate – patient record studies had not been around long enough, plus we would have needed
to start our forecast model back in the 1970s to allow enough patients to 'flow' through the model to
correctly forecast a new product launch).
Oncology being a more acute condition than HIV, with (in the case of the cancer we were dealing with) life
expectancy of a few years rather than a few decades, including treatment duration as a variable parameter in
the forecast made sense.
I again revisited the unsatisfactory model I had built years earlier – how to estimate the impact of truly
efficacious therapies in oncology?
Well, now I had my answer (or part of it). There were two dynamics which could be built into a spreadsheet
model to cope with this:
1. Duration of treatment – for advanced oncology patients who are ‘treated until progression’, we just
had to estimate the overall length of treatment. For products already in the market, we had plenty of
data to analyse to tell us (on average) how long patients were treated with previous regimens. For
new products entering the market, we could use clinical trial data to estimate how much longer
patients would be treated on the new therapy.
2. What happens to the patient once they stop their 1st line treatment. For any given patient, the
possible options could only be: stopped (taken out of the model as discontinued or dead), relapsed
(medication no longer efficacious, patient moves to the next line of treatment) or remission
(and from remission, of course, the patient can relapse, die or stay in remission).
By controlling the length of time a patient stays in any given state, the flow and availability of patients
downstream is affected accordingly. Allocating more or fewer patients to a longer treatment
regimen in an earlier setting produces impacts that 'ripple' through the patient flow – a higher % of patients in
an earlier line has the immediate impact of reducing patients in the next line, but in the long term, if survival
were increased, it would actually increase the number of patients in the next downstream line of treatment
(assuming relapse rates remain relatively consistent).
The other element that enabled this approach was the expansion of Excel.
Despite the HIV model being built during 2007–2008, the version of Excel being used by the client was
2003. By the time we were building the oncology flow model, most clients had moved to either Excel 2007 or
Excel 2010. Microsoft radically changed the Office suite between the 2003 and 2007 versions: Excel could now
handle far more columns and rows than before (16,384 columns and over a million rows, versus 256 and 65,536) and had added functionality.
The tools and data were in place for better oncology forecasting.
Now, not only could we model the patients in the way that reflected the market, but we could be more
granular, more specific.
We could model patients at the MONTHLY level, rather than at the ANNUAL level, which we had been forced
to do previously thanks to (mainly) the tools and (partly) the data.
Today, forecasting oncology markets is far more satisfactory and even enjoyable (I don’t get out much).
Basing the forecast on the clinical management of the patient and how that is likely to change over time
means that we can be more predictive, more accurate in how we forecast oncology and how we expect
markets to evolve.
Building any forecast in a more dynamic way allows us to get closer to how therapies are being used, how
physicians and patients relate to medications and management of diseases. It also allows us to incorporate the
effects of new market dynamics in a healthcare environment which is on the cusp of radical change.
Demonstrating the impact of truly novel treatments on patient outcomes has never been more possible.