If we can melt the ice-caps then surely we can understand our own fallibility, says one UK cardiologist.
The strongest known force in the universe is the ability of the human mind to deceive itself. If we know something to be true then we find the evidence to prove it.
There is no better example of this than renal denervation (RDN). There was near-universal agreement among hypertension experts and interventional cardiologists that RDN, with its blood pressure drops of 20 to 30 mm Hg in multiple unblinded trials, represented a “major breakthrough” that could “cure” the unmanageable problem of resistant hypertension. Instead, in 2014 they ran into a brick wall of reality, the SYMPLICITY HTN-3 trial. This was the first true test of the new technology, as it was the only blinded and randomized controlled trial of the procedure.
Now, two of the best-known RDN investigators, Deepak Bhatt and Henry Krum, have collaborated with a group in the UK who were the first to predict the failure of SYMPLICITY HTN-3. In their paper, published in Circulation: Cardiovascular Quality and Outcomes, the authors analyze the extensive published literature on RDN to determine the probable cause of the dramatic mismatch between the great initial expectations and the dismal result in SYMPLICITY HTN-3.
“Never before have so many trials been carried out in such a short period of time with such a wide variety of designs, and on a therapy whose true effect size is so small, so as to make it this easy to quantify the different forms of bias,” said Darrel Francis (Imperial College London), the senior author of the paper. “This may be a once-in-a-lifetime opportunity to study how fallible we are, despite having split the atom, travelled to the moon, and melted the ice-caps.”
The paper concludes that “adding a control group without blinding is far less helpful in resisting bias than commonly assumed.” The largest sources of bias come from the knowledge, among both patients and physicians, of the treatment assignment. The authors say that device trials in the future “should not report effect sizes except by comparison with blinded placebo (sham) procedure, to avoid waste of research resources.”
In an email interview, Deepak Bhatt (Brigham & Women’s Hospital), who was co-principal investigator of SYMPLICITY HTN-3, discussed the need for sham controls in device trials. He said that sham controls are valuable but not necessary for every device trial:
It depends on the disease state, in my opinion. That is, I don’t think a sham control is necessary for a trial of primary PCI versus lysis in STEMI patients where mortality is the endpoint. But once subjectivity is introduced into either the endpoint or the ascertainment of the endpoint, then I would say some form of sham control is a good idea. Hypertension is an area where (as we describe in the paper) sham controls seem to be quite important. Another example might be PFO closure for relief of migraines, which have a subjective component, and for that reason some type of sham control procedure would be advisable. Now the degree of sham should balance risks to the patient. For example, in SYMPLICITY HTN-3 we had a sham control. That consisted of a renal angiogram (which the patient needed anyway prior to enrollment in the study), but we did not actually take the catheter down into the renal artery and risk dissecting it in the control arm – so there are levels of sham control that balance good science with patient safety. Device trials are trickier than drug trials, where a double-blind is almost always possible (though sometimes not done for reasons of cost or logistics).
Meta-Analysis of Renal Denervation Trials
In their meta-analysis, the authors used the differences in design among 140 nonrandomized trials, 6 randomized open-label trials, and 2 randomized blinded studies to identify and quantify the sources of bias that led to the wild overestimation of the effect of RDN. The authors found that regression to the mean, which is often thought to play a major role in the overestimation of effect, actually played only a small role in the RDN story. More important, according to the authors, was the tendency of physicians and patients to alter their behavior based on their knowledge of treatment assignment. Physicians tended to remeasure blood pressure when it did not accord with their expectations. “After an efficacious intervention, but seeing no fall,” the doctors “would remeasure rather than document a seemingly incorrect value,” the authors write. For patients, they write, “undergoing a new intervention causes patients to increase adherence to already prescribed antihypertensive therapy.” In other words, “this might also be termed the nondenervation effect of the denervation procedure, which includes the placebo effect.”
In his comments, Francis said that “renal denervation has been the world’s largest study of the typical magnitude of unintentional bias by specialists.” Estimating a total development cost of approximately $1 billion, Francis said that “we owe it to our patients to learn that lesson, since ultimately they have paid for it, to the tune of $10,000 per doctor based on a worldwide cohort of 100,000 cardiologists.”
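The per-doctor figure follows from simple division. A minimal sketch of that arithmetic, using only the two rough numbers Francis quotes (the variable names are illustrative and do not come from the paper):

```python
# Both inputs are the approximate figures quoted by Francis, not audited totals.
total_development_cost_usd = 1_000_000_000  # "approximately $1 billion"
cardiologists_worldwide = 100_000           # "a worldwide cohort of 100,000 cardiologists"

# Spread the estimated development cost evenly across the cohort.
cost_per_cardiologist_usd = total_development_cost_usd / cardiologists_worldwide

print(f"${cost_per_cardiologist_usd:,.0f} per cardiologist")  # $10,000 per cardiologist
```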
David Kandzari (Piedmont Heart Institute), the other co-principal investigator of SYMPLICITY HTN-3, was not a co-author of the paper. He expressed some small but important differences with the authors:
For present renal denervation trials, there is already agreement among trialists and regulators that blinded, sham-controlled trials that rely on ABPM represent the model for existing studies. But the reasons may not entirely coincide with those posed in this analysis. First, the exaggerated difference between office SBP and ABPM may not represent a ‘bias’ or ‘check once more phenomenon’ inasmuch as the simple inaccuracy associated with a single moment assessment of SBP. Whereas office SBP represents a single measure in daytime, and may be highly varied relative to timing of taking medications, ABPM includes (lower) nighttime blood pressure, averages a 24 hour period, and is a result to which both patient and investigator are blinded. Daytime ABPM and office SBP often do move in parallel, but the relationship between 24 hr ABPM and office SBP is more complex.
Second, there is an inherent assumption that the risk of the control group is both negligible and predictable. This is not the case, as evidenced by the SPRINT and SYMPLICITY HTN-3 trials, respectively.
Third, one of the confounders not readily addressed by the authors is the impact of home BP monitoring. When patients take their blood pressure, and it is high, they take their medications. When their blood pressure is low, their behavior is opposite. (And, the increasing number of medications is predictive of not taking medications.) Because of this, there is an opportunity for values to regress to the mean and mitigate the impact of a potentially efficacious treatment.
Finally, a blinded, sham controlled trial might be best suited for current RDN trials, in part also because the potential benefit is not readily apparent. What if instead there were a therapy with an immediate, predictable reduction in BP, and in addition to BP reduction, the result of the treatment in some way resulted in the patient’s awareness of treatment (eg, patients were advised of a commonly anticipated side effect)? As we may see with other developing technologies, this might really challenge sham and blinding models, and may call for us to revise how we design hypertension trials.
Finally, hypertension experts Franz Messerli (Mt. Sinai Icahn School of Medicine) and Sripal Bangalore (New York University) appear to agree with the authors about the necessity for sham procedures:
When the findings of SYMPLICITY 3 were published in 2014 Dr. Gottlieb wrote a rather inflammatory comment in the Wall Street Journal under the heading “The FDA Wants You for Sham Surgery”. Among other items he stated: “There are better ways to test medical devices than by having patients be placebos who get fake operations.” Yet ironically SYMPLICITY 3 was exactly the trial that brought the widespread practice of renal denervation to a screeching halt because there was no difference between active treatment and the sham group.
As of now there is still uncertainty whether renal denervation may ultimately turn out to be a beneficial antihypertensive treatment, not only lowering the surrogate endpoint, i.e. blood pressure, but also reducing the risk of heart attack, stroke and death. There is no uncertainty, however, that without a sham controlled study we would have continued to expose thousands of patients to the potential harm of an invasive, possibly ineffective treatment.
Clearly ethical considerations should be weighed very carefully when considering a sham controlled trial. However, consideration should also be given to the many patients who are possibly harmed by monetarily incentivized unproven treatment strategies. The current study is an attempt to quantify biases in non-sham controlled trials and the authors estimate that the biases can total ~19 mm Hg, a number good enough for drug/devices to be approved and be a blockbuster.