Randomized Controlled Trials in Evaluating Socially Complex Interventions: A Square Peg in a Round Hole?
This lecture will discuss a number of challenges and problems in Randomized Controlled Trials (RCTs), in particular when they are used to evaluate interventions that aim (i) to alter complex human behaviour, (ii) in marginalized and stigmatized populations, and (iii) by means of socially complex interventions. Using examples from the literature and his own research, Dr. Grund will provide a transdisciplinary perspective on the utility of the RCT model in evaluating interventions aimed at, for example, people who use drugs or homeless people, two very complex “Real World” problems in the Czech Republic and elsewhere.
He argues that the arena of services for people who use drugs (PUD), the homeless and other marginalised populations is rife with poorly understood contingencies. Consequently, the complexity of the research environment becomes a function of i, ii and iii above, but with enigmatic mathematical operators. Strategies for addressing this complexity through accompanying process evaluation and qualitative research will be discussed.
4. Well, addictology is not narcology, the Darth Vader version of addiction treatment, still
practised in Russia. Strictly speaking, “addictology” would translate into “Addiction Science,”
but in the Czech Republic it represents a trans-disciplinary, issue-driven framework for the
entire approach to potentially addictive substances and behaviours that merges, for example,
biomedical, psychological and social scientific perspectives, as well as aspects of criminology
and law enforcement.
GPS ČR 2008 4
6. The concept of addictive behaviour centres around compulsive substance use, but extends
the scope of inquiry to a broader area of human activities that may have compulsive (thus
addictive) propensities, such as pathological gambling, gaming or internet use, and that result in
harm being incurred by individuals or (segments of) society.
But not all drug use is compulsive. How to account for recreational use of intoxicants?
Addictive behaviours do not exist in a vacuum, but are subject to a range of dynamically
interlocking influences, for example, from the interaction between drug and brain chemistry
to the relations between substance use patterns and drug control regimes. I made this one
bold, because the same applies to RCTs and other research; what we do does not occur in
a vacuum!
Traditionally, we see different perspectives on drug use and addiction dealing with different
aspects of the use of mood-altering substances, and there are equally many disciplines
involved. [click]
7. I highlighted the term setting in the previous slide. [click]
Traditionally, we see different scientific disciplines around
different aspects of drug use and addiction: explain drug, set
& setting + [click] traditional disciplines.
Important need for theory and models that capture the
dynamically interlocking influences that affect both the phenomenon of
drug use itself and its outcomes – e.g. interaction between drug and
brain chemistry; relations between substance use patterns, social norms,
economic conditions or drug control regimes
[click] There are, of course, various examples of cross-disciplinary
collaboration, such as social work and psychopharmacology.
Typical of the early Dutch drug policy is that it reflects the thinking that
these different aspects of the use of drugs are intertwined.
11. As you can see, there are different levels of scientific collaboration and interaction
[click] Transdisciplinary collaboration crosses…
13. So, with this in mind, we are going to take a critical look at the use of Randomized Controlled
Trials outside the realm of clinical medicine.
14. Following the lead of evidence-based medicine, practice based on effectiveness
research has become the new gold standard of contemporary public policy.
Studies of this sort are increasingly demanded to evaluate services provided by
mental health, social services and criminal justice systems.
With the diffusion of effectiveness research from medication and surgical interventions
to socially complex service interventions, it has been implicitly assumed that
the design of the randomized controlled trial, the sine qua non of effectiveness
research, is independent of the service intervention itself.
15. Evidence hierarchies are integral to evidence-based medicine.
Evidence hierarchies reflect the relative authority of various types of biomedical research. [click]
There is no single, universally accepted hierarchy of evidence, but there is broad consensus on
the relative strength of the principal types of research, or epidemiological studies.
[click] Randomized controlled trials (RCTs) rank above observational studies, while expert opinion
and anecdotal experience are ranked at the bottom.
Certain evidence hierarchies place systematic reviews and meta-analyses above RCTs, since these
often combine data from multiple RCTs, and possibly from other study types as well.
RE: 2. RCTs:
2.1 Randomised controlled trials with definitive results (confidence intervals that do not
overlap the threshold for a clinically significant effect)
2.2 Randomised controlled trials with non-definitive results (a point estimate that
suggests a clinically significant effect but with confidence intervals overlapping the
threshold for this effect)
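The distinction between 2.1 and 2.2 can be sketched in a few lines of Python. This is only an illustration, not part of any published hierarchy: the function name `classify_result`, the assumption that a larger effect is better, and the example numbers are all hypothetical.

```python
def classify_result(ci_low, ci_high, threshold):
    """Compare a confidence interval with a clinically significant threshold.

    Assumption for this sketch: larger effect estimates are better, and
    `threshold` is the smallest effect regarded as clinically significant.
    """
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    if ci_low > threshold:
        # The whole interval lies above the threshold.
        return "definitive: clinically significant effect"
    if ci_high < threshold:
        # The whole interval lies below the threshold.
        return "definitive: no clinically significant effect"
    # The interval contains the threshold, so the data cannot settle it.
    return "non-definitive: interval overlaps the threshold"


# Hypothetical numbers: threshold of 0.10 on some effect scale.
print(classify_result(0.12, 0.30, 0.10))
print(classify_result(0.02, 0.30, 0.10))
```

In the second call the point estimate could still sit above the threshold, which is exactly the situation described under 2.2.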
The use of evidence hierarchies has been criticized as allowing RCTs too much authority. Not all
research questions can be answered through RCTs, either because of practical issues or because
of ethical issues. Moreover, even when evidence is available from high-quality RCTs, evidence
from other study types may still be relevant.
18. Standardized intervention protocols
The first key assumption underpinning effectiveness research is that (dose)
interventions, both experimental and control, can be defined precisely or
standardized and monitored specifically for adherence.
This involves:
describing who is doing what and when (uniform implementation)
measured accurately and reliably
protocols need to be defined in ways that capture the structure of the delivery
mechanism and the process of the interactions among staff, as well as between the
staff and patients.
all factors other than the dose intervention must be identical between the two
interventions to rule out the possibility that unknown and idiosyncratic factors
contribute to the effects measured.
Critical to validity, replicability and generalizability
There are two critical control factors: study samples and trial environment, which are
assumed to be equivalent in a neutral therapeutic environment.
19. Assumption 2: Study Sample Equivalence
To ensure that the measured effects are the result of the dose interventions, all
intervention arms must be (equally) representative of a broader group of people that
might gain from the diffusion of the intervention.
Population Definition
Defining the target population for intervention is complex.
What is the intervention expected to impact on (symptom or behaviour)?
Are there internal or external factors that may mitigate or militate the symptom or
intervention pathways? Are these factors representative of the population targeted?
Random Assignment
The goal of random assignment is to create equivalent groups. It ensures random
distribution of systematic or unmeasured differences within the sample among the
interventions.
Random assignment does not guarantee that sample groups will be balanced or
equivalent. With small sample sizes, it may result in unequal assignment of cases (e.g. in
severity of cases).
Sample size, and with it statistical power, needs to be high enough to average out any chance asymmetries.
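To make the point about chance asymmetries concrete, here is a minimal simulation sketch. It is not taken from any of the trials discussed; the function name `mean_severity_gap`, the standard-normal "severity" score and the sample sizes are assumptions chosen purely for illustration.

```python
import random
import statistics


def mean_severity_gap(n_participants, n_trials=500, seed=1):
    """Average absolute gap in mean baseline severity between two trial arms.

    Each simulated trial draws a hypothetical severity score per participant,
    shuffles the participants and splits them into two equally sized arms.
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        severity = [rng.gauss(0.0, 1.0) for _ in range(n_participants)]
        rng.shuffle(severity)  # random assignment to the two arms
        half = n_participants // 2
        arm_a, arm_b = severity[:half], severity[half:]
        gaps.append(abs(statistics.fmean(arm_a) - statistics.fmean(arm_b)))
    return statistics.fmean(gaps)


small = mean_severity_gap(20)
large = mean_severity_gap(400)
print(f"average severity gap, n=20:  {small:.3f}")
print(f"average severity gap, n=400: {large:.3f}")
```

Randomization is applied identically in both cases, yet the small trials end up with visibly less balanced arms on average.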
20. Assumption 3: Trial Environment Equivalence and Neutrality
The trial environment is expected to be unaffected by factors such as financing,
supportive assistance, inter-agency behavior and community dynamics, or, if it is
affected, it is assumed that the effect is equivalent between interventions. Keeping
the environments ‘clean’ and balanced between the groups ensures that only the
interventions, as specified in the protocols, are producing the relative differences
between the outcomes.
21. In contrast to surgery or medications, “Socially Complex Interventions” are services
that are characterized by complex, diverse and non-standardized staffing
arrangements; ambiguous protocols; hard-to-define study samples and unevenly
motivated subjects and dependence on broader social environments.
[Click] This paper by Nancy Wolff “explores the difficulties of ensuring precise
protocols, equivalent groups (meaningfully resembling the target population) and
neutral and equivalent trial environments under real world conditions and the
implications of not achieving standardization and equivalence.” In this lecture I will
make extensive use of Wolff’s work.
22. Lifestyle: e.g. obesity; norm deviation: e.g. delinquency
Combination of Txs: In drug treatment, for example, the standard in Europe is
increasingly a combination of pharmacotherapy, counseling and/or psychotherapy and
social reintegration (HOUSING!).
New: Brain training, Gaming (Pills-Talk-Training)
Evaluating such complex interventions is thus equally complex.
What about context? Well, context is everything…
The arena of services for PUD, the homeless and other marginalised populations is
rife with poorly understood contingencies. Consequently, the complexity of the
research environment becomes a function of i, ii and iii above, often with enigmatic
mathematical operators.
23. Simply by comparing the ideal clinical intervention with socially complex services on
what are called “key inputs,” we start to get an idea of the discrepancy between the
ideal RCT situation and its assumptions and the reality of SCIs.
Most commonly important inputs:
staffing arrangements – e.g. staff motivation, team composition, interactional
processes and collaboration
These skills are not easily standardized or professionalized through a certification or
licensing process.
protocol specificity – the more complex the intervention, the more ambiguity
subject involvement – e.g. motivation to participate in research
Environmental boundaries range along a continuum from hard to soft and determine
the susceptibility of the trial to outside influences.
27. [click] Based on a UCONN-Yale study of a small-scale Peer Driven Intervention (PDI)
among PWID with HIV, in support of ARV adherence.
[click] slide text
“Staffing”: EXAMPLE:
Deviation from protocol by staff
Unequal motivation, available staff time and performance among the coaches
implementing the intervention, combined with limited funding, which was solved only by
raising additional funding for a coach/coordinator (part of the research team). Still:
more sites means greater variety in implementation
Changes to an important part of the intervention due to practical /
logistical issues: meeting of large PDI meeting…
Also protocol changes that improved design after process evaluation
of pilot study.
28. Target groups of SCSs have multiple, co-occurring problems that are difficult to define
uniquely or with precision, which invites professional discretion.
For example: in the PROZE RCT one PDI “coach” deviated from the intervention
protocol by changing the reward system, giving rewards to participants as encouragement,
not for accomplishing goals, in an attempt to encourage compliance.
Definitional ambiguity can create tensions between researchers and service agencies,
as the clinical definitions set by researchers can differ from those of the agencies
where the study is implemented, which are guided by e.g. mental health laws,
eligibility criteria and funding.
Then there are two parts to the selection of study participants:
Proper definition of inclusion and exclusion criteria;
voluntary participation
In theory, inclusion and exclusion screens shape the characteristics of the study
sample to the characteristics of the target population. But, if the level of diagnostic or
problem uncertainty within the targeted population is high, screening instruments
may not have the required level of specificity for distinguishing true from false cases.
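A small worked example shows why limited specificity matters for sample definition. It applies the standard positive predictive value formula; the function name and the prevalence, sensitivity and specificity figures are hypothetical and not drawn from any of the studies mentioned.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of screen-positive people who are true cases (Bayes' rule)."""
    for name, value in (("prevalence", prevalence),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    if true_positives + false_positives == 0.0:
        raise ValueError("no one screens positive with these inputs")
    return true_positives / (true_positives + false_positives)


# Hypothetical screen: 20% true prevalence among those screened, 90% sensitivity.
for specificity in (0.70, 0.95):
    ppv = positive_predictive_value(0.20, 0.90, specificity)
    print(f"specificity {specificity:.2f}: PPV = {ppv:.2f}")
```

With the lower specificity in this illustration, fewer than half of the people who pass the inclusion screen are true cases, so the study sample drifts away from the intended target population before randomization even starts.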
Likewise, who participates in the study?
How are characteristics such as illness severity, functional impairment and other
related problems distributed in the sample? Ideally, these would replicate the
distributional properties of the target population, but the actual distribution can get
distorted by self-selection. Those who agree to participate in trials may be
systematically different from the target population.
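The distortion caused by self-selection can be sketched with a short simulation. Everything in it is an assumption made for illustration: the function name, the standard-normal severity score, and above all the consent model, in which more severe cases are less likely to volunteer.

```python
import math
import random
import statistics


def simulate_self_selection(n_population=5000, seed=7):
    """Return mean severity in the target population and among volunteers."""
    rng = random.Random(seed)
    population = [rng.gauss(0.0, 1.0) for _ in range(n_population)]
    volunteers = []
    for severity in population:
        # Hypothetical consent model: probability of agreeing to take part
        # falls as severity rises (logistic curve, 50% at average severity).
        consent_probability = 1.0 / (1.0 + math.exp(severity))
        if rng.random() < consent_probability:
            volunteers.append(severity)
    return statistics.fmean(population), statistics.fmean(volunteers)


population_mean, volunteer_mean = simulate_self_selection()
print(f"mean severity, target population: {population_mean:.2f}")
print(f"mean severity, volunteers:        {volunteer_mean:.2f}")
```

Under this assumed consent model the volunteers are, on average, less severe than the population they are meant to represent, and no amount of randomization within the sample can repair that.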
Insight into illness; disagreement between the client’s and the professional’s
assessment; unwillingness to participate. These are particularly relevant to the use of
drugs, which is a heavily debated topic.
30. The trial environment is expected to be unaffected by factors such as financing, supportive assistance, inter-agency behavior
and community dynamics, or, if it is affected, it is assumed that the effect is equivalent between interventions.
environmental boundaries –
Environmental boundaries range along a continuum from hard to soft.
Hard boundary settings: confounding external effects can be controlled
Examples of hard boundary settings are structured and relatively isolated environments, such as hospital clinics or therapeutic
communities. In contrast, soft boundaries are typical of ambulant interventions that operate in and interact with a wider
community-based setting (each directly or indirectly influencing the other), e.g. opioid substitution programs or assertive
outreach programs. Thus, in SCIs the divide between the intervention setting and the social environment may often be very
porous, in particular in interventions targeting vulnerable groups in the community.
Implementation is influenced by local conditions. Although the design of an intervention may be theoretically driven, the way it
is eventually implemented depends greatly on local conditions.
Unlike medical interventions, SCSs for problem drug users and other marginalized populations draw on resources from, and the
cooperation of, a wide variety of agencies and funding streams. Such interventions are shaped by many outside influences,
including financial, social and organizational factors, the history of inter-agency relationships and the willingness to work together: a web of
contingencies.
This can influence results in two ways:
1. Perfect trial conditions do not resemble those of (subsequent) implementation under less favorable conditions;
2. In contrast, problems in the research consortium, e.g. defaulting on earlier agreements (e.g. RE: funding, staffing, time
allocation), may influence the trial outcomes.
“Even if a neutral and equivalent environment could be created at the beginning of an experiment, there is no guarantee that it
would endure. Social environments are both complex and dynamic.”
Dose interventions are shaped by practical issues and personality factors.
Responsible, competent, trustworthy and pleasant staff who are professionally well connected is perhaps the most vital part of
an intervention.
Example PROZE: Staff from Social Care Orgs were much more enthusiastic about and involved in the intervention than those of
mental health org. (We had coaches taking a vacation day, without giving notice, when the PROZE meeting was scheduled…)
33. Therefore, in the last part of my presentation I will discuss Process evaluation and
Qualitative sub-studies.
A process evaluation is a systematic methodology for studying the performance of an
intervention or research project, based on clear guidelines and criteria. A process
evaluation aims to (i) provide a comprehensive picture of the implementation and
practice of an intervention; support the optimization of (ii) the intervention and (iii) the
organization of the research; and (iv) help in the interpretation of intervention
effects.
An example is the PREFFI (Molleman et al, 2003). Preffi stands for Prevention Effect
Management Instrument. PREFFI is a quality tool that supports a systematic approach
and aims to increase effectiveness, consisting of eight clusters.
34. The 8 clusters of the Preffi describe essential factors that contribute to optimal
implementation of a research project.
A problem analysis provides insight into the view of all those involved in the issue
under study.
The second cluster focusses on factors underlying these problematic issues.
The third cluster concentrates on the way the target population of the intervention
perceives the intervention in terms of motivation and expectations.
Cluster four examines the intended targets of the intervention.
The following two clusters focus on intervention development and implementation of
the research project (such as evaluation of supporting research materials).
Cluster seven evaluates the quality of the intervention, promoting
both effect evaluation and process evaluation.
Finally, the eighth cluster describes preconditions and feasibility. The preconditions
(such as time, money, available expertise, involvement of cooperation partners)
determine partly the feasibility of a project. These clusters serve as a guiding principle
for the systematic construction of key questions in interviews and topic lists of
observations and in document analysis.
36. Triangulation of multiple data sources increases data reliability and validity
(Green & Thorogood, 2004; Peters, 2003).
In the Dutch RCT that I drew some examples from before, we were able to diagnose
most of the problems that I discussed through the use of the PREFFI and careful
monitoring of the research process, as well as regular meetings and debriefings with
the involved staff of the implementing organisations.
Nonetheless, given the mismatch of the three core assumptions of the RCT model
with the reality of SCIs, I am no longer sure that the RCT model – surely in its
standard format – is the ideal choice in evaluating SCIs.
37. According to Wolff, SCIs violate the core assumptions of the simple RCT model. This
raises important questions about the validity, reliability and generalizability of SCI
trial results. Such services and their environments are simply at odds with the
assumptions of the RCT design in myriad ways. Even with perfect randomisation, SCIs
produce ‘noise’ between the dose and the effect and between the sample and the
population, and both pose considerable threats to the validity, reliability and
generalizability of RCT findings.
Without major design innovations, effectiveness research of SCIs using the simple
RCT model is unlikely to yield valid, reliable and generalizable findings.
Improved RCT designs must reflect and control the complexity of interventions in the
“Real World.” As a result, these will become more complex in design, intending to
increase their sensitivity to issues of selection bias, unmeasured variables, the role of
exogenous influences and endogeneity (potentially attractive but wrongful paths of
causation). Wolff gives 10 recommendations for improvement, which, if you are
considering conducting a trial of a complex intervention, I urge you to check, as well
as related literature.
In the last part of this lecture I have focused on the value of conducting structured
process evaluation and additional qualitative sub-studies as part of a trial. A structured
process analysis helps in understanding the intervention and research process as actually
implemented, as opposed to what is formally described in the research proposal and
implementation protocols; it is useful in organizing the study and optimising the
intervention; and it helps in interpreting trial results. Given this value, one would
think that process evaluation would be broadly applied in RCT designs for SCIs.
Unfortunately, the absence of process analysis or qualitative sub-studies in RCTs is still
the norm, certainly in studies of people who use drugs, the homeless and other
marginalized populations.
39. Scientific evaluations of services only make sense when designed to meet the needs
and challenges of the services under scrutiny. The traditional RCT design of simple
clinical trials is not suitable for socially complex interventions. There is, indeed, a lot of
woodwork to do.
Unfortunately, as Wolff writes, alternative options to the RCT are unlikely to perform
much better and she thinks that with the right modifications our best hope still rests
with the RCT design. But every modification is likely to make studies more time-
consuming and expensive.
I agree with most of Wolff’s recommendations, but, in the last part of my lecture, I
offered a few options towards what we may call “harm reduction of the application of
the RCT model to evaluating SCIs,” so that we, at least, better understand what may
(potentially) confound our RCT results. I also think that some of the friction between
the RCT model’s core assumptions and the complexity of SCIs and the real world in
which they operate may never be solved. I tend to disagree with Wolff, thinking that
there certainly is a place for less rigidly controlled studies.
Let me give you an example, putting 2 questions before you.
40. [click] I will give you one clue: the guy on the left is using a one cc syringe…
41. I do not have to explain the rationale of syringe exchange to you, I assume. But, why
not just give PWID Low Dead Space syringes?
44. Finally, I feel that the Department of Adiktologie at Charles University offers
important opportunities to further the development of methodologies that both
meet the needs of the customers of research – in the end that is the taxpayer; thus
the community – and those of science. The department’s transdisciplinary orientation
provides the “intellectual environment” for discussion, exchange and collaboration
between various scientific traditions, policy makers and community-based
stakeholders in public health and social justice. Such transdisciplinary teams should
aim towards developing shared theory and innovative methodologies, which truly
reflect the complexity of contemporary society, as well as evidence-based AND
attainable recommendations, respectful of human rights that benefit both the
primary target groups of services and society as a whole.
Let me try to end this lecture by formulating my main recommendation for you in a
more presidential way [click]: “My fellow researchers, ask not how you can fit your
research question to the RCT model, ask how the RCT model fits the research
question!”
[click] (Freely adapted from JF Kennedy – [click] who learned his most famous quote from his
old high school headmaster* ;)
*: http://www.dailymail.co.uk/news/article-2056020/JFK-stole-ask-country-speech-old-headmaster.html
48. Not always in research, but always curious: Macedonia anecdote? … Fieldwork in just
about every country I worked in…
Observed home cooking of injectable drugs in Russia, Ukraine, Moldova, Kazakhstan.
50. Here again a picture (not mine) of frontloading; this time in Georgia and this time of a
drug that has gained a dark notoriety in the past years, Krokodil.
[click] A homemade drug associated with extreme local and systemic harms.
[click] Last year