A More Useful Way of Evaluating New FDA Biosimilar Applications

A Conversation With Gillian Woollett, MA, DPhil, Senior Vice President, Avalere Health

In the conclusion of a two-part conversation with one of the real go-to experts in the biosimilar field and US regulatory process, we talk with Dr. Woollett about the need for a new approach by the FDA when evaluating biosimilar applications.

Biosimilars Review and Report: Gillian, you and your colleagues have been urging the FDA to move away from its “totality of evidence” approach in its decision making on new biosimilar applications. You’ve dubbed your approach “confirmation of sufficient likeness” (CSL).

Gillian Woollett, MA, DPhil: We believe that it’s much closer to how we should be evaluating biosimilars going forward and that it capitalizes on the experience gained to date in the highly regulated markets. We have to be more efficient in the interests of patients and their access to critical biologics, including biosimilars. And our approach means no reduction in the quality, safety, or efficacy of the biologics approved.


BR&R: What are the main differences between your new paradigm and the FDA’s totality of evidence approach?

Woollett: There is a semantic difference, but it’s also conceptual. And to be fair to the FDA, it’s what they’ve actually been doing in some cases already.

The current approach presupposes residual uncertainty in a manner that we don’t apply to comparability in support of manufacturing changes for currently approved biologics. This leads to the presumption that what you don’t know for a biosimilar makes it different from the reference product. We tried to turn it around, which is to say that you’re manufacturing a drug that you expect to be essentially the same. By so doing, you are actually building up your confidence, hence confirmation, of sufficient likeness. As you complete each step in biosimilar development, you’re confirming that the target product profile is actually what you expect it to be and the drug performs the way you expect it to do. The same molecule inevitably behaves the same way. That is the premise of comparability that is now being applied, as a regulatory scientific matter, to enable biosimilars. After all, the only difference is a different sponsor. Bottom line: The science is the same, so let’s use it better in regulatory practice.


Woollett: In our paper, we evaluated what actually happened in Europe, the US, Australia, and Canada. We reviewed the outcomes of all those biosimilar applications, finding a very clear pattern: if a biosimilar candidate’s pharmacokinetic (PK) data match the reference product, the product will be approved. This leads to the next question: Haven’t we already confirmed at that point that the drug is sufficiently similar? What do we learn from subsequent clinical studies?

While we reached a conclusion that may be controversial, it is scientifically rock-solid, and no one has challenged our reasoning. Given that any additional clinical studies are less sensitive than the PK analysis, you actually know what the outcome will be. Therefore, additional clinical studies lack scientific validity, and this means, inevitably, that they also lack ethical validity. They are telling you nothing new and as such cannot comply with the Declaration of Helsinki.

Now while some may not be keen on this conclusion, no one has repudiated it scientifically. Efforts to date appear to dismiss the value of the PK study, even as European regulators have independently reached the same conclusion we have using a different data set.

BR&R: Well, the FDA itself has started moving away from the confirmatory studies, the large phase II and phase III evaluations.

Woollett: It’s a mixed bag. I think it was at the FDA Advisory Committee meeting for the first adalimumab biosimilar that the FDA said the sponsor had done an extra study that was not requested. The study involved a second indication, and the sponsor did the study of their own volition, for marketing purposes. Sadly, that is seen as necessary, because of the lack of understanding by physicians, but it is a very cynical response.

Part of the reason we came out with this paper is to say that expecting these additional studies has massive consequences. Quite apart from being unethical (which should be reason enough not to do them), they are expensive, they take a lot of time, they are barriers to entry for the smaller companies that can’t afford investments of this magnitude, and they don’t result in better biosimilars. These large clinical studies are essentially meaningless, and they come at a price for competition and patient access.

BR&R: On a gut level, it seems reasonable to apply this approach to drug categories that have already been reviewed by the FDA. Can you easily use this CSL approach in biosimilars for new reference drug classes, like ustekinumab or aflibercept?

Woollett: There’s no need to rely on guts. Let’s stick to scientific fundamentals: you do a study because it is going to tell you something. If you’ve established that the PK studies have demonstrated a good match with the reference product, what should that next study test? On what scale? Are we reverting to a full phase III? We always used to call them confirmatory studies, not phase III studies. What is it confirming, if you don’t have a statistically meaningful sample size to demonstrate outcome differences? At the moment these additional studies fall between two stools, and that simply won’t do.

Another interesting consideration is the situation in which a great deal of real-world evidence (RWE) from Europe or other highly regulated markets exists, with good documentation of adverse events matching those of the reference. This should also be considered confirmatory, because it is actually a postmarketing study of real-world use, demonstrating that the product is behaving in the expected manner.

That can come with risks for the originator, too. For example, the trastuzumab issue that has been in the news: The structure of the originator (Herceptin®) has varied sufficiently through glycosylation changes to apparently alter its immunological properties. Biosimilars that were produced to be comparable to the original version of Herceptin did not match later versions of the reference in PK. The glycosylation changes also led to clinical differences. This has been documented in the peer-reviewed literature.1-4

BR&R: That version of the originator trastuzumab was not even “biosimilar” to the original reference product?

Woollett: Apparently, the original sponsor has now changed back to a version that more closely matches the earlier version of their own product. There were seemingly clinical consequences in terms of reduced efficacy, so the reason for the biosimilar’s failure in PK testing was that it was more effective than the altered reference; that is, the biosimilar was actually as effective as the original Herceptin. My understanding is that the reference is now back to where it was, and that the PKs of the biosimilars match.

When its sponsor wanted to produce a more concentrated version of Humira® that does not contain citrate, the FDA noted that this new version failed a PK comparison with the original version. It was approved. It didn’t fail by very much, according to the FDA’s review, but it is still arguably a lesser standard than we expect of biosimilars. I don’t know if it’s clinically meaningful for adalimumab, but studies showed the changes to be clinically meaningful for trastuzumab. Hence, I’m reaching the point of saying, “We’ve seriously got to consider the use of PK in comparability.” And that may be even more heretical than our ideas on biosimilars—but we must be consistent, and we must use sound science as the basis for all our regulatory decisions. My drumbeat is consistency, consistency, consistency.

And that then brings us back to CSL: you should never do any study, let alone a clinical study, if the result doesn’t tell you something. But in all cases, with all biologics, you had better do the studies that tell you most—and that may indeed be PK. Indeed the European regulators have termed it the “gatekeeper assay.” That might just be right even more broadly.


BR&R: From an economic perspective, changing to the CSL approach could save prospective biosimilar manufacturers millions of dollars.

Woollett: That is likely true, but the science demands it, as do requirements for ethical clinical studies.

It’s not just the money; it’s also the time. Saving both may make more biologics, and potentially even more cost-effective ones, available to patients sooner.

BR&R: Has anyone analyzed the potential time savings in bringing a biosimilar product to market using the CSL approach?

Woollett: We didn’t do that, but the numbers could become really interesting if we combined it with the use of a global reference. At that point, we are beginning to approach the concept of a global dossier. This is what we expect today for originator biologics, as we should. But the same should surely apply to biosimilars of those originators. Again, our real theme here is consistency.

Meanwhile we do support the regulators continuing to be agnostic as to the business model of any sponsor. We are also suggesting that they should be efficient with every product; reassuring or “feel good” studies really won’t do. Indeed, I believe the biggest opportunity for the regulatory efficiency we advocate is with originator products, worldwide of course.


BR&R: Well, that’s a perfect segue to a discussion about global comparators. We alluded earlier to the two versions of the reference product licensed separately in the EU and US. These two molecules were approved using essentially the same investigational data. To an extent, the EU and US originator versions are expected to be “biosimilar” to each other. This is not proven unless “bridging studies” are performed, comparing these reference licensed versions head to head. Yet no one argues that an originator agent in the US is associated with different outcomes than the one used in the EU.

It seems to be pointless today to require the bridging studies between the two. How do you position and select the global comparator?

Woollett: There is never an expectation of clinical differences within a single BLA. As such, anything that the regulators have accepted within a single application is still contributing to one set of “goalposts.” The comparability failures, which are fortunately extremely rare and include Eprex® (epoetin alfa) in the early 2000s, the trastuzumab issue, and the different adalimumab versions we just discussed, are not an indictment of the power of analytics plus PK. They are just examples where the approach wasn’t followed for the originator products and where we may need to be more careful in the future.

Just like with Rituxan® in Schiestl’s 2011 paper, a change in glycosylation for Herceptin® led to a change in antibody-dependent cellular cytotoxicity (ADCC).1–4 This is an omission in the use of analytics to establish high similarity, not evidence against the use of analytics as the primary basis for comparability and biosimilarity.

The single biggest message I have on all of this is that biosimilars cannot succeed until we are clearer about how much the originators have varied. It is the originators that have broadened the “goalposts” that the biosimilars must target. And that is, in the vast majority of cases, perfectly OK.

Many physicians and their patients appear to think the originator is cast in stone. But all biologics are essentially “biosimilars” to themselves as a scientific matter. Some stakeholders are not comfortable with that message being out there, and yet it is a good one for the continued use of comparability. That is what was meant in Schiestl’s brilliant title: “Acceptable Quality Variation in Approved Biologics.” This variation is acceptable and is carefully overseen by regulators—and has been since 1996 when FDA led the world with the concept.

BR&R: One more practical aspect on the use of a global comparator: Would it be a hell of a lot easier for prospective manufacturers to get samples of the product?

Woollett: In all likelihood yes, and it would make it more affordable, because the US versions are usually the most expensive. But that is not the biggest reason. By far the most important reason is that it is scientifically and ethically right to minimize unnecessary clinical studies, and to be efficient in the development of all medicines, always. Science-based regulators have to want that too, as do all other stakeholders. Let’s get there together.


1. Pivot X, Bondarenko I, Nowecki Z, et al. Phase III, randomized, double-blind study comparing the efficacy, safety, and immunogenicity of SB3 (trastuzumab biosimilar) and reference trastuzumab in patients with neoadjuvant therapy for human epidermal growth factor receptor 2–positive early breast cancer. J Clin Oncol. 2018;36:968-974.
2. Lee JH, Paek K, Moon JH, Ham S, Song J, Kim S. Biological characterization of SB3, a trastuzumab biosimilar, and the influence of changes in reference product characteristics on the similarity assessment. BioDrugs. [published online June 12, 2019].
3. Pivot X, Pegram MD, Cortes J, et al. Evaluation of survival by ADCC status: Subgroup analysis of SB3 (Trastuzumab Biosimilar) and reference trastuzumab in patients with HER2-positive early breast cancer at three-year follow-up. Presented at American Society of Clinical Oncology Annual Meeting 2019; May 31-June 4, 2019; Chicago, IL. Abstract 580.
4. Pivot X, Pegram M, Cortes J, et al. Three-year follow-up from a phase 3 study of SB3 (a trastuzumab biosimilar) versus reference trastuzumab in the neoadjuvant setting for human epidermal growth factor receptor 2–positive breast cancer. Eur J Cancer. 2019;120:1-9.