Using Real World Data, an interview with Elise Bordet – RWD and Analytics Lead

Every month, Resolving Pharma interviews the stakeholders who shape the health and pharmaceutical industries of tomorrow. In this first interview, Elise Bordet honors us with her participation. Many thanks for your time and your insights!

“Data access and analytics capabilities will become an increasingly important competitive advantage for pharmaceutical companies.”

[Resolving Pharma] To begin with, could you introduce yourself and talk about your background? Why did you choose to work at the intersection of Data and Pharma?

[Elise Bordet] I am an agronomist by training. I did a PhD in Immunology-Virology, then an MBA, before joining my current company. I am passionate about very technical, cutting-edge topics and the implementation of new research approaches. I was very impressed by a conference on Artificial Intelligence and the notion of a 4th industrial revolution, and I didn’t want to miss out on this subject.

I was very attached to fundamental research in the public sector, but I still wanted to form my own opinion about the pharmaceutical industry, and I am not disappointed at all. I think that it is a great place to contribute to research and the common good.

I love ever-changing topics, where everything moves on a daily basis and you always have to challenge yourself to stay up to date with the latest innovations. Pharma, Data and AI subjects are heaven for me!

Can you tell us what Real World Data is and how the pharmaceutical industry uses it?

Real World Data is defined as data that is not collected in a randomized clinical trial. Therefore, it is a huge topic. It ranges from data collected in registries to larger databases such as medico-administrative databases.

This data allows the pharmaceutical industry to create drugs that are better adapted to the reality of Health systems. It also allows the creation of new research approaches, to support “drug repurposing” approaches for example.

How do Real World Evidence-based approaches differ from traditional pharmaceutical industry approaches? What are their added values?

Actually, these approaches have existed for a long time, particularly in Pharmacovigilance (the famous Phase IV). However, the amount of data available, its quality, and our computing and analysis capacities have all been turned upside down. All these changes allow us to answer new research questions, questions that remained unanswered because we did not have the capacity to look at what was happening in reality. The second subject is the major contribution of Artificial Intelligence: scientifically, we will be able to go much further.

In your opinion, how is the pharmaceutical industry going to balance the use of Real World Evidence with more traditionally generated clinical and pre-clinical data in the future?

Real World Data will play an increasingly important role. Each type of data has its advantages and disadvantages. It is not a question of pitting one type of data against another. Quite the contrary: the most interesting thing is to be able to bring all these data together and extract as much information as possible from them.

What impact could this type of data have on the drug value chain and the partnerships that the pharmaceutical industry needs to put in place?

Data access and analysis capabilities will become an increasingly important competitive advantage for pharmaceutical companies. The Data strategy of companies is one of the essential pillars. I imagine that in the future we will look not only at the value of a company’s portfolio, but also at the value and the impact of the analytics that can be performed by the company. Data is going to weigh so heavily on projects’ probability of success that it is difficult to imagine not taking it into account in the metrics of economic valuation.

You recently gave a presentation on digital twin technology. Can you explain what it is?

Digital twin is a very elegant concept that can be summarized as follows: with each development, we generate new data that we can rely on for the next projects. This data should allow us to model most levels of biological organization: molecular, cellular, tissue, and then the scale of organs or even whole organisms. This modeling will avoid regenerating knowledge that already exists and will notably allow us to accelerate pre-clinical and clinical development and, why not, to model the first Phase I results very precisely.

How do you see the pharmaceutical industry in 30 years’ time?

Wow! Everything is going to be different! First of all, I think that, as in all industries, technology will have enabled a profound transformation of all decision making, what we call “data-driven decision making”. Science will have made incredible progress, calculation and prediction capacities will have been multiplied, there will be new approaches in Artificial Intelligence that we do not know today. We will have made immense progress in the interoperability of the various health databases that are fragmented today. It is a good exercise to try projecting ourselves in 30 years’ time. We won’t remember how we did things before, that’s the principle of technological revolutions; we’ve already forgotten how we lived without cell phones and the Internet! We will no longer see ourselves without Data and AI at the center of our decisions and projects. From a more organizational point of view, data sharing will have facilitated public and private scientific collaborations and the implementation of projects that will accelerate research, such as the Health Data Hub in France or the European Health Data Space that will be launched by the European Union.

Do you have any advice for someone who wants to work in Data Science in the Healthcare sector?

We scientists learned through doubt and are still haunted by it. Just because you have expertise in one field (clinical trials, laboratory research, etc.) does not mean that you cannot acquire other skills in Data Science or Artificial Intelligence, for example. Versatile profiles are and will be the most sought after. So my advice is: don’t panic!

If you can, start training yourself quickly: the Internet puts the best courses on programming, Data Science and many other advanced subjects just a click away. Take advantage of it!

Go ahead and start tomorrow!



Dismantling the scientific publishers’ cartel to free innovation ?

“Text mining of academic papers is close to impossible right now.”

Max Häussler – Bioinformatics researcher, UCSC

Faced with the explosion of published scientific articles and the exponential increase in computing capacities, the way we will read the scientific literature in the future will probably have nothing to do with the tedious, slow, and repetitive current reading work and will undoubtedly involve more and more the use of intelligent text-mining techniques. By increasing tenfold our analytical capacities, these techniques make it possible – and will make it even easier in the future – to unleash creativity and bring about scientific innovation faster and cheaper. For the time being, however, this bright outlook faces a major obstacle: the scientific publishing cartel – one of the world’s most lucrative industries, which is determined to not jeopardize its enormous profits.

Text-mining and its necessity :

Text-mining is a technology that aims to obtain key and previously unknown information very quickly from a very large quantity of text – in this case the biomedical literature. This technology is multi-disciplinary in nature, using machine learning, linguistic and statistical techniques.

The purpose of this article is not to constitute a technical study of text-mining, but it is nevertheless necessary, for the full understanding of the potential of this technology, to describe its main steps :

  • Selection and collection of texts to be analyzed : This first step consists of using search algorithms to automatically download abstracts of interest from scientific article databases (such as PubMed, for example, which alone references 12,000,000 scientific articles). A search of the grey literature can also be conducted to be as exhaustive as possible.
  • Preparation of the texts to be analyzed : The objective of this step is to put the texts to be analyzed in a predictable and analyzable form according to the task to be accomplished. There is a whole set of techniques to carry out this step which will make it possible to remove the “noise” of the text and to “tokenize” the words inside the sentences.
  • Analysis of data from the texts : The analysis of the data will largely depend on the preparation of the text. Different statistical and data science techniques can be used: support vector machines, hidden Markov models or, for example, neural networks.
  • Data visualization : The issue of data visualization is probably more important than one might think. Depending on the chosen option: tables or 3D models, for example, the information and meta-information to which the user of the model has access will be more or less relevant and explanatory.
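The four steps above can be sketched on a toy scale. The example below is only an illustration under stated assumptions: the three "abstracts" are invented sentences rather than real PubMed records, the stop-word list and the two entity dictionaries are hypothetical, and a simple co-occurrence count stands in for the statistical models mentioned above (support vector machines, hidden Markov models, neural networks).

```python
import re
from collections import Counter
from itertools import product

# Step 1 (collection): a toy corpus standing in for downloaded abstracts.
# These sentences are invented for illustration only.
ABSTRACTS = [
    "BRCA1 mutations are associated with breast cancer.",
    "Loss of BRCA1 increases the risk of ovarian cancer.",
    "TP53 is frequently mutated in breast cancer.",
]

# Hypothetical stop-word list and entity dictionaries.
STOPWORDS = {"are", "with", "of", "the", "is", "in", "a", "an"}
PROTEINS = {"brca1", "tp53"}
DISEASES = {"breast cancer", "ovarian cancer"}


def tokenize(text):
    """Step 2 (preparation): lowercase, tokenize, and remove stop-word noise."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def cooccurrences(abstracts):
    """Step 3 (analysis): count protein/disease pairs found in the same abstract."""
    counts = Counter()
    for text in abstracts:
        tokens = tokenize(text)
        joined = " ".join(tokens)
        proteins = {p for p in PROTEINS if p in tokens}
        diseases = {d for d in DISEASES if d in joined}
        counts.update(product(sorted(proteins), sorted(diseases)))
    return counts


def as_table(counts):
    """Step 4 (visualization): render the counts as a plain-text table."""
    rows = [f"{p:<8}{d:<16}{n}" for (p, d), n in sorted(counts.items())]
    return "\n".join(rows)


if __name__ == "__main__":
    print(as_table(cooccurrences(ABSTRACTS)))
```

A real pipeline would replace the dictionaries with named-entity recognition and the raw counts with a proper statistical model, but the division of labor between the steps stays the same.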

Text-mining has already proven its usefulness in biomedical scientific research: among other things, it has been used to discover associations between proteins and pathologies; to understand interactions between proteins or to elucidate the docking of certain drugs to their therapeutic target. However, most of the time, this technology is only implemented on the abstracts of articles, which considerably reduces its power in terms of reliability of the obtained data as well as the number of its applications.

So why not use the millions of scientific articles available online? New research hypotheses could be formulated, new therapeutic strategies could be created. This is technologically within reach, but scientific publishers seem to have decided differently for the moment. Here are some explanations.

The problems posed by scientific publishers :

At their emergence, at the end of the Second World War, scientific publishers had a real utility in the diffusion of science: the various learned societies had only weak means to diffuse the work and conclusions of their members. At that time, published articles were disseminated through paper journals, which were too expensive for most learned societies. Since the birth of this industry, and despite the considerable changes that the Internet has brought to the transmission of scientific knowledge, its business model has not evolved. It has become anachronistic, bringing gross margins to percentages that make online advertising giants like Google or Facebook look like unprofitable businesses. Scientific publishing is indeed the only industry in the world that obtains its raw material (scientific articles) for free from its customers (scientists from all over the world) and whose transformation step (peer-reviewing) is also carried out by its customers on a voluntary basis.


The triple-payment system set up by scientific publishers.

Scientific publishers have set up an “odd triple-payment system”, allowing private entities to capture public money intended for research and teaching. States finance the research leading to the writing of scientific articles, pay the salaries of the scientists who voluntarily participate in peer-reviewing and finally pay once again, through the subscriptions of universities and research laboratories, to access the scientific knowledge that they have already financed twice! Another model has been developing in parallel for a few years: the author-pays model, in which researchers pay publication fees to make their work more easily accessible to readers. Are we heading towards a quadruple-payment system?

The deleterious consequences of the system put in place by scientific publishers are not only financial but also impact the quality of the scientific publications produced and therefore the validity of potential artificial intelligence models based on the data in these articles. The business model based on journal subscriptions leads publishers to favor spectacular and deeply innovative discoveries over confirmatory work, which pushes some researchers, driven by the race for the “impact factor”, to commit fraud or to publish statistically unconsolidated results very early on. This is one of the reasons for the reproducibility crisis that science is currently experiencing, and also one of the possible causes of the insufficient publication of negative, yet highly informative, results: one out of every two clinical trials does not result in any publication.

Finally, and this is the point that interests us most in this article, scientific publishers are an obstacle to the development of text-mining on the huge databases of articles they possess, which ultimately has a colossal impact on our knowledge and understanding of the world as well as on the development of new drugs. Indeed, it is currently extremely difficult to perform text-mining on complete scientific articles on a large scale because publishers do not allow it, even when you have a subscription and are legally entitled to read the articles. Several countries have legislated so that research teams implementing text-mining are no longer required to seek permission from scientific publishers. In response to these legal developments, scientific publishers, taking advantage of their oligopolistic situation, have set up completely artificial technological barriers: for example, it has become impossible to download articles rapidly and in an automated way, the maximum rate imposed being generally 1 article every 5 seconds, which means that it would take about 5 years to download all the articles related to biomedical research. The interest of this system for scientific publishers is to be able to hold to ransom – the term is strong, but it is the right one – the big pharmaceutical companies that wish to remove these artificial technical barriers for their research projects.
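The order of magnitude behind the "5 years" figure can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch: the rate of 1 article every 5 seconds comes from the text above, while the corpus size of roughly 31.5 million articles is not stated anywhere in the article and is simply what that rate and duration imply. The function names are chosen for this example.

```python
# Average length of a year in seconds (365.25 days).
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def articles_downloadable(years, seconds_per_article=5.0):
    """Number of articles retrievable in `years` at a fixed throttled rate."""
    return int(years * SECONDS_PER_YEAR / seconds_per_article)


def years_to_download(n_articles, seconds_per_article=5.0):
    """Years of uninterrupted downloading needed for `n_articles`."""
    return n_articles * seconds_per_article / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Corpus size implied by "1 article every 5 seconds for 5 years".
    print(articles_downloadable(5))
    # Time needed for a hypothetical corpus of 10 million articles.
    print(round(years_to_download(10_000_000), 2))
```

At this rate, a single connection running around the clock retrieves a little over 6 million articles per year, which is why the throttle is so effective as a barrier to large-scale text-mining.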

The current system of scientific publications, as we have shown, benefits only a few companies at the expense of many actors – researchers from all over the world, and even more when they work from disadvantaged countries, governments and taxpayers, health industries and finally, at the end of the chain, patients who do not benefit from the full potential of biomedical research. Under these conditions, many alternatives to this model are emerging, some of which are largely made possible by technology.

Towards the disruption of scientific publishing ?

“You only really destroy what you replace.”

Napoléon III – 1848

Doesn’t every innovation initially come from a form of rebellion? So far, this has been especially true of the various initiatives undertaken to unleash the potential of free and open science, as these actions have often taken the form of piracy operations. Between manifestos and petitions (notably the call for a boycott launched by Mathematics researcher Timothy Gowers, based on the text “The cost of knowledge”), the protest movements led by scientists and the creation of open-source platforms have been numerous. However, few actions have had as much impact as those of Aaron Swartz, one of the main theorists of open source and open science, who tragically committed suicide at the age of 26, one month before a trial in which he faced 35 years of imprisonment for having pirated 4.8 million scientific articles, or, of course, those of Alexandra Elbakyan, the famous founder of the Sci-Hub website, which allows free – and illegal – access to most of the scientific literature.

Aaron Swartz and Alexandra Elbakyan

More recently, the proponents of the open-source movement have adapted to the radical turn of text-mining, notably through Carl Malamud’s project, which aims to take advantage of a legal grey area to offer academic research teams the opportunity to mine the gigantic database of 73 million articles he has built. The solution is interesting but not yet complete: for legal reasons, this database is for the moment not accessible from the Internet, and it is necessary to travel to India, where it is hosted, to access it.

These initiatives operate on more or less legal forms of capturing articles after their publication by scientific publishers. In the perspective of a more sustainable alternative, the ideal would be to go up the value chain and therefore work upstream with researchers. The advent of the blockchain technology – a technology for storing and exchanging information with the particularity of being decentralized, transparent and therefore highly secure, on which future articles of Resolving Pharma will come back in detail – is thus for many researchers and thinkers of the subject a great opportunity to definitively replace scientific publishers in a system inducing more justice and allowing the liberation of scientific information.

The transformation of the system will probably be slow – the prestige accorded by researchers to the names of large scientific journals belonging to the oligopoly will persist over time – perhaps it will not even happen, but the Blockchain has, if successfully implemented, the capacity to address the issues posed earlier in this article in a number of ways :

A fairer financial distribution

As we have seen, the business model of scientific publishers is not very virtuous, to put it mildly. At the other end of the spectrum, Open Access, despite its undeniable and promising qualities, can also pose certain problems, being sometimes devoid of peer-reviewing. The use of a dedicated cryptocurrency for the scientific publishing world could eliminate the triple-payment system, as each actor could be paid at the fair value of their contribution. A researcher’s institution would receive a certain amount of cryptocurrency when he or she publishes as well as when he or she participates in peer-reviewing another paper. As for the institutions’ access to publications, it would be paid for with an amount of cryptocurrency. Apart from the financial aspects, the copyright, which researchers currently waive, would be automatically registered in the blockchain for each publication. Research institutions would thus retain the right to decide at what price the fruits of their labor will be available. A system of this kind would allow, for example, anyone wishing to use a text-mining tool to pay a certain amount of this cryptocurrency, which would go to the authors and reviewers of the articles used. Large-scale text-mining would then become a commodity.

Tracking reader usage and defining a real “impact factor”

Currently, even if we try to count the number of citations to articles, the use of scientific articles is difficult to quantify, although it could be an interesting metric for the different actors of the research ecosystem. The Blockchain would make it possible to trace each transaction precisely. This tracing of readers would also bring a certain form of financial justice: one can imagine that, through a Smart Contract, a simple reading would not cost the same amount of cryptocurrency as the citation of the article. It would thus be possible to quantify the real impact of a publication and replace the “impact factor” system with the real-time distribution of “reputation tokens” to scientists, which can also be designed in such a way as not to discourage the publication of negative results (to alleviate this problem, researchers have already set up a platform dedicated to the publication of negative results).
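To make the pricing idea concrete, here is a toy in-memory ledger. It is not a smart contract and does not run on any blockchain: it only mimics the rule described above, in which reading and citing an article cost different amounts and the payment is redistributed to authors and reviewers. The prices, the 80/20 split and all names (`PRICES`, `AUTHOR_SHARE`, `Article`, `Ledger`) are assumptions made up for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical prices, in units of an imaginary token.
PRICES = {"read": 1, "cite": 5}
# Assumed split: 80% to authors, the remainder to reviewers.
AUTHOR_SHARE = 0.8


@dataclass
class Article:
    title: str
    authors: list
    reviewers: list


@dataclass
class Ledger:
    balances: dict = field(default_factory=lambda: defaultdict(float))
    usage: dict = field(default_factory=lambda: defaultdict(int))

    def record(self, article, user, action):
        """Charge `user` for `action` and pay the article's contributors."""
        price = PRICES[action]
        self.balances[user] -= price
        author_pool = price * AUTHOR_SHARE
        reviewer_pool = price - author_pool
        for author in article.authors:
            self.balances[author] += author_pool / len(article.authors)
        for reviewer in article.reviewers:
            self.balances[reviewer] += reviewer_pool / len(article.reviewers)
        # Usage counters play the role of a transparent, per-article metric.
        self.usage[(article.title, action)] += 1


if __name__ == "__main__":
    ledger = Ledger()
    paper = Article("A negative result", ["alice"], ["bob"])
    ledger.record(paper, "carol", "read")
    ledger.record(paper, "carol", "cite")
    print(dict(ledger.balances))
    print(dict(ledger.usage))
```

The usage counters show how reads and citations could be tracked separately, which is the raw material a "reputation token" scheme would need.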

With the recent development of Non-Fungible Tokens (NFT), we can even imagine tomorrow the emergence of a secondary market for scientific articles, which will be exchanged from user to user, as is already possible for other digital objects (video game elements, music tracks, etc.).

A way to limit fraud

Currently, the peer-reviewing system, in addition to being particularly long (it takes on average 12 months between the submission and the publication of a scientific article, compared to two weeks on a Blockchain-based platform such as ScienceMatters), is completely opaque to the final reader of the article, who has no access to the names of the researchers who took part in the process, nor even to the chronological iterations of the article. Through its unforgeable and chronological structure, the Blockchain could make it possible to record these different modifications. This is a topic that would deserve an article of its own, but the Blockchain would also make it possible to record the different data and metadata that led to the conclusions of the article, whether they come from preclinical or clinical trials for example, and thus avoid fraud while increasing reproducibility.
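The "unforgeable and chronological structure" mentioned above rests on a simple mechanism: each record stores the hash of the previous one, so any later modification of an earlier record breaks the chain. The sketch below shows only this hash-chain idea with Python's standard `hashlib`; it leaves out everything that makes a real blockchain decentralized (network, consensus, signatures), and the field names and sample revisions are invented for the example.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # conventional placeholder for "no previous block"


def block_hash(index, prev_hash, payload):
    """Hash a block's content together with the hash of its predecessor."""
    record = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def append_revision(chain, payload):
    """Append one revision (submission, review, correction...) to the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "prev": prev_hash,
            "payload": payload,
            "hash": block_hash(index, prev_hash, payload),
        }
    )
    return chain


def is_valid(chain):
    """Recompute every hash; any edit to an earlier block is detected."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


if __name__ == "__main__":
    history = []
    append_revision(history, {"event": "submission", "text": "draft v1"})
    append_revision(history, {"event": "review", "text": "please add controls"})
    append_revision(history, {"event": "revision", "text": "draft v2"})
    print(is_valid(history))
    history[1]["payload"]["text"] = "looks fine"  # tamper with the review
    print(is_valid(history))
```

The same structure could hold hashes of datasets and metadata rather than text, so that a reader can verify that the data behind a conclusion has not changed since it was recorded.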

Manuel Martin, one of the co-founders of Orvium, a Blockchain-based scientific publishing platform, believes: “by establishing a decentralized and competitive marketplace, blockchain can help align the goals and incentives of researchers, funding agencies, academic institutions, publishers, companies and governments.”

The use of the potential of artificial intelligence in the exploitation of scientific articles is an opportunity to create a real collective intelligence, to make faster and more efficient research happen and probably to cure many diseases around the world. The lock that remains to be broken is not technological but organizational. Eliminating scientific publishers from the equation will be a fight as bitter as it is necessary, which should bring together researchers, governments and big pharmaceutical companies, whose interests are aligned. If we can be relatively pessimistic about the cooperation capacities of these different actors, we cannot doubt the fantastic power of transparency of the Blockchain which, combined with the determination of some entrepreneurs like the founders of Pluto, Scienceroot, ScienceMatters or Orvium platforms, will be a decisive tool in this fight to revolutionize the access to scientific knowledge.

The words and opinions expressed in this article are those of the author. The other authors involved in Resolving Pharma are not associated with it.

To go further :
Stephen Buranyi ; Is the staggeringly profitable business of scientific publishing bad for science? ; The Guardian ; 27/06/2017;
The Cost of Knowledge :
Priyanka Pulla ; The plan to mine the world’s research papers ; Nature ; Volume 571 ; 18/07/2019 ; 316-318
Bérénice Magistretti ; Disrupting the world of science publishing ; TechCrunch ; 27/11/2016
Daniele Fanelli ; Opinion : Is science really facing a reproducibility crisis, and do we need it to ? ; PNAS March 13, 2018 115 (11) 2628-2631; first published March 12, 2018;
D.A. Eisner ; Reproducibility of science: Fraud, impact factors and carelessness ; Journal of Molecular and Cellular Cardiology, Volume 114, January 2018, Pages 364-368
Chris Hartgerink ; Elsevier stopped me doing my research ; 16 November 2015 ;
Joris van Rossum, The blockchain and its potential and academic publishing, Information Services & Use 38 (2018) 95-98 ; IOS Press
Douglas Heaven, Bitcoin for biological literature, Nature, 7/02/2019/ Volume 566
Manuel Martin ; Reinvent scientific publishing with blockchain technology ;
Sylvie Benzoni-Gavage ; The Conversation ; Comment les scientifiques s’organisent pour s’affranchir des aspects commerciaux des revues ;


Eroom’s Law, the pharmaceutical industry of tomorrow and Resolving Pharma

372. This is the number of days between the discovery of the first Covid-19 case in Wuhan and the vaccination of Margaret Keenan in central England, the first person to receive a dose of Covid vaccine after its commercialization. Never in its history has humanity been so quick to find a solution to a new disease. However, this dazzling success of the pharmaceutical industry should not blind us: the development of new drugs is becoming increasingly inefficient. More than ever, initiatives that enable the use of technology in the research and development of new drugs are essential to maintain innovation.

The pharmaceutical industry is broken. It will not be able, based on its current means and methods of R&D, to replicate the medical progress it has made in the past years. Each new molecule brought to the market will inevitably cost more to develop than the previous one. This is what Eroom’s Law states, describing empirically the decline in the efficiency of the pharmaceutical industry (1). For example, the R&D profitability of the world’s 12 largest pharmaceutical groups reached an all-time low of 1.9% in 2018, whereas it was still 10.1% in 2010 (2).

Figure 1 – An illustration of the decreasing efficiency of the pharmaceutical industry: every 9 years, the number of drugs approved by the FDA per billion dollars spent on R&D decreases by half (1).
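The halving rule in the caption above can be written as a simple exponential decay: if efficiency halves every 9 years, then after t years it is 0.5^(t/9) of its starting value. The sketch below only restates that rule; the function names are chosen for this example, and it assumes the decline is perfectly regular, which real data are not.

```python
import math


def relative_efficiency(years, halving_time=9.0):
    """Fraction of the initial drugs-per-billion-dollars left after `years`."""
    return 0.5 ** (years / halving_time)


def years_until_fraction(fraction, halving_time=9.0):
    """Years needed for efficiency to fall to `fraction` of its initial value."""
    return halving_time * math.log(fraction) / math.log(0.5)


if __name__ == "__main__":
    for t in (9, 18, 27, 36):
        print(t, relative_efficiency(t))
    # Time for R&D efficiency to be divided by ten under this rule.
    print(round(years_until_fraction(0.1), 1))
```

Under this rule, efficiency is divided by ten in about 30 years, which gives a sense of how quickly a steady halving compounds.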

Despite the many scientific advances we have witnessed recently (e.g. increase in the size of chemical libraries, identification of new therapeutic targets through DNA sequencing, three-dimensional protein databases, high-throughput screening, use of transgenic animals, etc.) and the fact that they allow us both to produce more drug candidates and to select them with greater acuity, various structural problems in the pharmaceutical industry have led over the years to a considerable increase in the amount of money needed to bring a new molecule to the market.

A quick review of the literature helped us to identify some causes of this phenomenon :

  • The structurally incremental nature of the quality of each new product proposed by the pharmaceutical industry: to be marketed and reimbursed, each new drug must be superior or at least non-inferior to the drug corresponding to the treatment of reference for the targeted disease.
  • The gradual tightening of regulations, which is difficult to fight against and is even, for patients and health systems, most likely a good thing.
  • The tendency for pharmaceutical companies to over-invest unnecessarily, based on past returns on investment.
  • The concentration of research projects in therapeutic areas corresponding to unmet medical needs, with higher failure rates and less well understood biological mechanisms (3).

From an economic point of view, it is conceivable that the cost of capital (corresponding to the rate of return required by capital providers within a company with regard to the remuneration they could obtain from an investment with the same risk profile on the market) becomes higher than the expected return on R&D: mechanically, available capital will decrease and companies will cut back on their research budgets, which will weaken the position of pharmaceuticals in the drug value chain.

Faced with this bleak outlook, the pharmaceutical industry has several possible answers: developing new models of collaboration with biopharmaceutical companies, subcontracting to specialized players, developing a policy of risk aversion, but also, and above all, developing new methods of innovation. This last point will be the focus of our attention.

Thus, Resolving Pharma will set out, first through a newsletter, to document the various technologies that will make developing new therapeutics and treating patients more efficient. In response to this problem, Resolving Pharma will attempt to unite diverse and complementary actors around a bold and radical approach to innovation and entrepreneurship. Topics will include artificial intelligence, blockchain, quantum computing, 3D printing and many others. Each issue, through articles and interviews, will explore the different opportunities that these disruptive technologies bring or could bring to a particular field of therapeutic development. It will also highlight the emergence of the “PharmaTech” field: technology companies providing services to the pharmaceutical industry.

The fight against the fatality of Eroom’s law is huge and uncertain, but it is our responsibility as health professionals to fight it for the tens of millions of patients around the world suffering from incurable diseases, for whom research and science are the only hope for a better future. Every long journey begins with a first step and Resolving Pharma is ours. We will see where it leads us.

(1) Scannell et al. «Diagnosing the decline in pharmaceutical R&D efficiency», Nature Reviews Drug Discovery, Volume 11/March 2012
(2) Unlocking R&D productivity – Measuring the return from pharmaceutical innovation 2018, Deloitte Centre for Health Solutions, 2019
(3) Pammolli et al. «The productivity crisis in pharmaceutical R&D» Nature Reviews Drug Discovery, Volume 10
