
Introduction to DeSci

How Science of the Future is being born before our eyes

« [DeSci] transformed my research impact from a low-impact virology article every other year to saving the lives and limbs of actual human beings » Jessica Sacher, Phage Directory co-founder

In a previous article, one of the very first published on Resolving Pharma, we looked at the problems posed by the centralizing role of scientific publishers, which, in addition to raising financial and ethical issues, acts as a brake on innovation and scientific research. At that time, beyond making this observation, we proposed ways of changing this model, mainly using NFTs and the Blockchain. For several months now, thanks to the popularization of Web3 and DAOs, initiatives have been emerging all over the world in favour of a science that facilitates collective intelligence, redesigns the methods of funding and scientific publication and, ultimately, considerably shortens the path between the laboratory and patients. It is time to explore this revolution, which is still in its infancy and which is called DeSci, for Decentralized Science.

The necessary emergence of DeSci

One story that illustrates the inefficiencies of current science is often taken as an example in the DeSci world: that of Katalin Kariko, a Hungarian biochemist who carried out numerous research projects from the 1990s onwards (on in vitro-transcribed messenger RNA) which, a few decades later, would be at the origin of several vaccines against Covid-19. Despite the innovative aspects of Kariko’s research, she was unable to obtain the research grants necessary to pursue her projects because of political rivalry: the University of Pennsylvania, where she was based, had chosen to give priority to research on therapeutics targeting DNA directly. This lack of resources led to a lack of publications, and K. Kariko was demoted in the hierarchy of her research unit. This example shows the deleterious consequences of centralized organization on funding allocation (mainly from public institutions and private foundations) and on the reputation of scientists (from scientific publishers). 

How many researchers spend more time looking for funding than working on research topics? How many applications do they have to fill in to access funding? How many promising but too risky, or unconventional, research projects are abandoned for lack of funding? How many universities pay scientific publishers a fortune to access the scientific knowledge they themselves have helped to establish? How many results, sometimes perverted by the publication logic of scientific journals, turn out to be non-reproducible? With all the barriers to data exchange related to scientific publication, is science still the collective intelligence enterprise it should be? How many scientific advances that can be industrialized and patented will not reach the market because of the lack of solid and financed entrepreneurial structures to support them (although considerable progress has been made in recent decades to enable researchers to create their own start-ups)? 

DeSci can be defined as a system for organizing Science that relies on Web3 technologies and tools to let anyone finance and take part in research and in its commercial development, in exchange for a return on investment or remuneration. It proposes to address all the problems mentioned above.

This article will first look at the technical foundations of Decentralized Science and then explore some cases in which decentralization could improve Science efficiency.

Understanding Web3, DAOs and Decentralized Science

In the early days of the Web, there were very high barriers to entry for users wishing to post information: before blogs, forums and social networks, one had to be able to write the code for one’s website or pay someone to do it in order to share content. 

With the advent of blogs and social networks, Web2 took on a different face: expression became considerably easier. This ease, however, has been accompanied by a great deal of centralization: social networking platforms now own the content that their users publish and exploit it commercially (mostly through advertising revenue) without paying them a cent.

Web3 is a new version of the Internet that introduces the notion of ownership: whereas Web2 was built on centralized infrastructures, Web3 relies on the Blockchain. Data exchanges are recorded in a Blockchain and can be rewarded in cryptocurrencies, which have a financial value but also, in certain cases, confer decision-making power over the platforms used by the contributors. Web3 is therefore a way of marking the ownership of content or easily rewarding a user’s action. Web3 is without doubt the most creative version of the Internet to date.

Finally, we cannot talk about Web3 without talking about Decentralized Autonomous Organizations (DAOs). These organizations are described by Vitalik Buterin, the iconic co-founder of the Ethereum blockchain, as: “entities that live on the Internet and have an autonomous existence, while relying on individuals to perform the tasks it cannot do itself”. In a more down-to-earth way, they are virtual assemblies whose rules of governance are automated and transparently recorded in a blockchain, enabling their members to act collectively, without a central authority or trusted third party, and to take decisions according to rules defined and recorded in smart contracts. Their aim is to make collective decision-making and action simpler, more secure, more transparent and tamper-proof. DAOs have not yet revealed their full potential, but they have already shown that they can operate as decentralized and efficient investment funds, companies or charities. In recent months, science DAOs have emerged, based on two major technological innovations.

The technological concepts on which DeSci relies:

To understand the inner workings of DeSci and especially its immense and revolutionary potential, it is important to clarify two concepts, which are rather uncommon in the large and growing Web3 domain, but which lie at the heart of a number of DeSci projects:

  • IP-NFTs: The concept of IP-NFTs was developed by the teams of the company Molecule (one can find their interview on Resolving Pharma). It is a meeting point between IP (intellectual property) and NFTs (non-fungible tokens): it allows scientific research to be tokenized. This means that a representation of a research project is placed on the Blockchain in the form of an exchangeable NFT. A legal agreement is automatically made between the investors (buyers of the NFT) and the scientist or institution conducting the research. The owners of the NFT will be entitled to remuneration for licensing the intellectual property resulting from the research or creating a start-up from this intellectual property.

Figure 1 – Operating diagram of the IP-NFT developed by Molecule (Source:

  • Data-NFTs: Many Blockchain projects are concerned with data ownership, but one of the most successful is Ocean Protocol. A Data-NFT represents a copyright (or an exclusive licence) registered in the Blockchain and relating to a data set. Users can thus exploit their data in several ways: by charging other users for temporary licences, by selling their datasets or by pooling them with other datasets in a “Data Union”.

These two concepts make it possible to make intellectual property liquid, and thus to create new models of financing and collaboration. To take a simple example, a researcher can present a project and raise funds from investors even before a patent is filed. In exchange, the investors have an IP-NFT that allows them to benefit from a certain percentage of the intellectual property and revenues that will potentially be generated by the innovation. 
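The arithmetic behind such an arrangement can be sketched in a few lines of Python. This is an illustration only, not Molecule's actual contract logic: the function name `split_revenue`, the 20% investor share, the idea of counting each investor's stake in units, and the holder names are all hypothetical assumptions chosen to make the example concrete.

```python
from fractions import Fraction

def split_revenue(revenue, investor_share, holdings):
    """Split licensing revenue between the researcher and IP-NFT investors.

    revenue        -- total revenue generated by the IP (any currency unit)
    investor_share -- fraction of revenue owed to investors (hypothetical term)
    holdings       -- mapping investor -> units of stake held (hypothetical)
    """
    total_units = sum(holdings.values())
    if total_units <= 0:
        raise ValueError("holdings must contain at least one unit")
    # Exact rational arithmetic avoids rounding drift when splitting money.
    investor_pool = Fraction(revenue) * Fraction(investor_share)
    payouts = {
        holder: investor_pool * units / total_units
        for holder, units in holdings.items()
    }
    # Whatever is not owed to investors stays with the researcher or institution.
    payouts["researcher"] = Fraction(revenue) - investor_pool
    return payouts

# Hypothetical example: 1,000,000 in licensing revenue, 20% owed to investors.
payouts = split_revenue(
    revenue=1_000_000,
    investor_share=Fraction(20, 100),
    holdings={"alice": 60, "bob": 40},
)
```

In a real IP-NFT the percentages and the payment mechanics would be fixed by the legal agreement attached to the token; the sketch only shows that the split is a simple pro-rata calculation.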

Let’s now turn to some DeSci examples.

Transforming scientific reviewing

When researchers want to communicate to the scientific community, they write an article and submit it to scientific publishers. If the publishers accept the research topic, they will seek out other researchers who verify the scientific validity of the article, and a process of exchange with the authors ensues: this is called peer-reviewing. The researchers taking part in this process are not paid by the publishers and are mainly motivated by their scientific curiosity.

This system, as currently organized (that is, centrally), gives rise to several problems:

  • It takes a long time: in some journals, it takes several months between the first submission of an article and its final publication. This avoidable delay can be very damaging to the progress of science (but we will come back to this later in the article!). Moreover, given the inflation in the number of scientific articles and journals, a system based on volunteer reviewers is unlikely to scale.
  • The article is subject to the bias of the editor as well as the reviewers, all in an opaque process, which makes the outcome extremely uncertain. Studies have shown that when a sample of previously published papers was resubmitted with the names and institutions of the authors changed, 89% of them were rejected (without the reviewers noticing that the papers had already been published).
  • The entire process is usually opaque and unavailable to the final reader of the paper.

Peer-reviewing in Decentralized Science will be entirely different. Several publications have demonstrated the possibility of using thematic scientific DAOs to make the whole process more efficient, fair and transparent. We can thus imagine that decentralization could play a role in different aspects: 

  • The choice of reviewers would no longer depend solely on the editor, but could be approved collectively.
  • Exchanges around the article could be recorded on the blockchain and thus be freely accessible.
  • Several remuneration systems, financial or not, can be imagined in order to attract quality reviewers. We can thus imagine that each reviewer could earn tokens allowing them to register in a reputation system (see below), to participate in the DAO’s decision-making process but also to participate in competitions with the aim of obtaining grants. 

Decentralized peer-reviewing systems are still in their infancy and, however promising they may be, there are still many challenges to be overcome, starting with interoperability between different DAOs.

Creating a new reputation system

The main value brought about by the centralized system of science is that of the reputation system of the actors. Why do you want to access prestigious schools and universities, and why are you sometimes prepared to go into debt over many years to do so? Having the name of a particular university on your CV will make it easier for you to access professional opportunities. In a way, companies have delegated some of their recruitment to schools and universities.  Another system of reputation, which we mentioned earlier in this article, is that of scientific publishers. Isn’t the quality of a researcher measured by the number of articles he or she has managed to have published in prestigious journals?

Despite their prohibitive cost (which allows scientific publishers to be one of the highest gross margin industries in the world – hard to do otherwise when you are selling something you get for free!), these systems suffer from serious flaws: does being accepted into a university and graduating accurately reflect the involvement you had during your studies and the skills you acquired through various experiences at the intersection of the academic and professional worlds? Is a scientist’s reputation proportional to his or her involvement in the ecosystem? Jorge Hirsch, the inventor of the H-index, which aims to quantify the productivity and scientific impact of a researcher according to the level of citation of his or her publications, has himself questioned the relevance of this indicator.  Peer-reviews, the quality of courses given, the support of young researchers and the real impact of science on society are not considered by the current system.

Within the framework of DeSci, it will be possible to imagine a Blockchain-based system that traces and authenticates a researcher’s actions (and not just the fact of publishing articles) in order to reward him or her through non-tradable reputation tokens. The main challenges of this reputation system will be its cross-cutting nature, its interoperability and its adoption by different DAOs. We can imagine that these tokens could be used to participate in votes (in the organization of conferences, in the choice of articles, etc.) and that they will themselves be allocated according to voting mechanisms (for example, students who have taken a course will be able to decide collectively on the number of tokens to allocate to the professor).
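A toy model helps to see what "non-tradable" and "allocated by vote" could mean in practice. The Python sketch below is an assumption-laden illustration, not the design of any existing DAO: the class `ReputationLedger`, the use of the median to aggregate votes, and the researcher and action names are all hypothetical, and an in-memory list stands in for on-chain records.

```python
from statistics import median

class ReputationLedger:
    """Toy ledger of non-tradable reputation tokens (hypothetical design)."""

    def __init__(self):
        self._balances = {}
        self._history = []  # append-only log, standing in for on-chain records

    def award(self, researcher, action, votes):
        """Award tokens for an action; the amount is the median of the votes.

        The median is an assumption of this sketch: it limits the influence
        of a single extreme vote on the collective decision.
        """
        if not votes:
            raise ValueError("at least one vote is required")
        amount = median(votes)
        self._balances[researcher] = self._balances.get(researcher, 0) + amount
        self._history.append((researcher, action, amount))
        return amount

    def balance(self, researcher):
        return self._balances.get(researcher, 0)

    def history(self, researcher):
        """Return the traced actions (and rewards) of one researcher."""
        return [(action, amount) for who, action, amount in self._history if who == researcher]

    # Deliberately no transfer() method: tokens cannot be sold or given away.

ledger = ReputationLedger()
# Five students vote on the reward for a course, three members on a peer review.
ledger.award("dr_lee", "taught virology course", votes=[8, 10, 9, 3, 9])
ledger.award("dr_lee", "peer review", votes=[4, 5, 6])
```

The absence of any transfer operation is what makes the tokens a reputation signal rather than a currency: a balance can only grow through recorded, voted-on contributions.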

Transforming the codes of scientific publication to bring out collective intelligence

Science is a collective and international work in which, currently, as a researcher, you can only communicate with other research teams around the world through:

  • Publications in which you cannot give access to all the data generated by your research and experiments (it is estimated that about 80% of the data is not published, which contributes to the crisis of scientific reproducibility)
  • Publications that other researchers cannot access without paying the scientific publishers (in the case of Open Science, it is the research team behind the publication that pays the publisher so that readers can access the article for free)
  • Publications which, because of their form and the problems linked to their access, make it very difficult to use Machine Learning algorithms which could accelerate research 
  • Finally, scientific publications which, because of the length of the editorial approval mechanisms, only reflect the state of your research with a delay of several months. Recent health crises such as COVID-19 have shown us how important it can be to have qualitative data available quickly.

The Internet has enabled a major transformation in the way we communicate. Compared to letters, which took weeks to reach their recipients in past centuries, e-mail and instant messaging allow us to communicate more often and, above all, to send shorter messages as we obtain the information they contain, without necessarily aggregating it into a complex form. Only scientific communication, even though most of it is now done via the Internet, resists this trend, to the benefit of scientific publishers and traditional forms of communication, but also and above all at the expense of the progress of science and patients in the case of biomedical research.

How, under these conditions, can we create the collective intelligence necessary for scientific progress? The company thinks it has the solution: micro-publications, consisting of a title designed to be easily exploited by an NLP algorithm, a single figure, a brief description and links giving access to all the protocols and data generated. 

Figure 2 – Structure of a micro-publication (Source:
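The four components listed above map naturally onto a small record type. The following Python sketch is only one possible representation: the class name `MicroPublication`, the field names, the 150-word limit and the example.org links are hypothetical assumptions, not the format actually used by the company.

```python
from dataclasses import dataclass

MAX_DESCRIPTION_WORDS = 150  # hypothetical limit standing in for "brief"

@dataclass(frozen=True)
class MicroPublication:
    title: str                   # short, machine-readable statement of the finding
    figure: str                  # path or link to the single figure
    description: str             # brief free-text description
    protocol_links: tuple = ()   # links to the protocols used
    data_links: tuple = ()       # links to all the data generated

    def __post_init__(self):
        # Enforce the structure: a title, exactly one figure, a short text.
        if not self.title.strip():
            raise ValueError("a micro-publication needs a title")
        if not self.figure.strip():
            raise ValueError("a micro-publication carries exactly one figure")
        if len(self.description.split()) > MAX_DESCRIPTION_WORDS:
            raise ValueError("the description should stay brief")

# Hypothetical example with placeholder links.
example = MicroPublication(
    title="Phage X lyses strain Y in vitro",
    figure="https://example.org/figures/lysis-curve.png",
    description="A single growth curve showing lysis of strain Y after exposure to phage X.",
    protocol_links=("https://example.org/protocols/lysis-assay",),
    data_links=("https://example.org/data/lysis-raw.csv",),
)
```

Keeping the record this small is the point of the format: one finding, one figure, and links to everything needed to reproduce it.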

Although not directly linked to the Blockchain, micro-publications allow information to be shared quickly and easily. They will therefore be a remarkable tool for collective intelligence and certainly the form of scientific communication best suited to the coming era of Decentralised Science. The objective will not be to replace traditional publications but rather to imagine a new way of doing science, in which the narrative of an innovation will be built collectively throughout successive experiments rather than after several years of work by a single research team. Contradictory voices will be expressed, and a consensus will be found, not fundamentally modifying the classic model of science but making it more efficient.

Facilitating the financing of innovation and the creation of biotechnology start-ups

Today, the financing of innovation, particularly in health, faces a double problem: 

  • From the point of view of scientists and entrepreneurs: despite the development of numerous funding ecosystems, non-dilutive grants and the maturation of venture capital funds, the issue of fundraising remains essential and problematic for most projects. Many projects do not survive the so-called “Valley of Death”, the period before the start of clinical studies, during which raising funds is particularly complicated. 
  • On the investor side: it is particularly difficult for an individual to participate in the financing of research and biotech companies in a satisfactory way. Each of the existing options has its drawbacks:
      • It is possible to be a Business Angel and to invest early in the capital of a promising start-up, but this is not accessible to everyone, as a certain amount of capital is required (and even more so if one wishes to diversify one’s investments to spread the risk).
      • It is possible to invest in listed biotech companies on the stock market, but the expected gain is then much lower, as the companies are already mature and their results consolidated.
      • It is possible to fund research through charities, but in this case no return on investment is possible and no control over the funded projects can be exercised.
      • It is possible to invest through crowdfunding sites, but here again there are structural problems: the choice of companies is limited, and the investors are generally in the position of lenders rather than shareholders: they do not really own shares in the company and are remunerated at a predefined annual rate.

These days, one of the pharmaceutical industry’s most fashionable mantras is to put the patient at the center of its therapeutics, so shouldn’t we also, for the sake of consistency, allow patients to be at the center of the systems for financing and developing therapeutics?

DeSci will allow everyone – patients, relatives of patients or simply (crypto)investors wishing to have a positive impact on the world – via IP-NFT, data-NFT or company tokenization systems to easily finance drug development projects whatever their stage, from the academic research of a researcher to a company already established. 

This system of tokenization of assets also makes it possible to generate additional income, both for the investor and for the project seeking to be financed:

  • The “Lombard loan” mechanisms present in DeFi will also allow investors to generate other types of income on their shares in projects. Indeed, DeFi has brought collateralized loans back into fashion: a borrower can deposit digital assets (cryptocurrencies, but also NFTs or tokenized real assets such as companies or real estate) in exchange for another asset, which represents only a fraction of the value deposited in order to protect the lender, and which they can invest according to different mechanisms specific to Decentralized Finance (which we will not go into in this article). By contrast, in a classic private equity system, the money invested in a start-up is locked up until an exit becomes possible and generates no returns other than those expected from the increase in the company’s value. In the new decentralized system, part of the money you have invested can be placed in parallel in the crypto equivalent of a savings account (let’s simplify things, this site is not dedicated to Decentralized Finance!).
  • Furthermore, another possibility for biotech projects, whether they are already incorporated or not, to generate additional revenues is to take advantage of the liquidity of the assets (which does not exist in the traditional financing system): it is quite possible to apply a fee of a few percent to each transaction of an IP-NFT or a data-NFT.
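Both mechanisms come down to simple arithmetic, sketched below in Python. The figures are purely illustrative assumptions: the 50% loan-to-value ratio, the 2.5% transaction fee, the asset values and the function names `max_loan` and `settle_transfer` do not come from any particular DeFi protocol.

```python
from decimal import Decimal

def max_loan(collateral_value, loan_to_value):
    """Amount a borrower can obtain against deposited collateral.

    The loan is only a fraction of the collateral's value, which protects
    the lender if the collateral loses value.
    """
    if not Decimal("0") < loan_to_value < Decimal("1"):
        raise ValueError("loan_to_value must be strictly between 0 and 1")
    return collateral_value * loan_to_value

def settle_transfer(sale_price, fee_rate):
    """Split the price of an NFT sale between the seller and the project."""
    if not Decimal("0") <= fee_rate < Decimal("1"):
        raise ValueError("fee_rate must be in [0, 1)")
    project_fee = sale_price * fee_rate
    return sale_price - project_fee, project_fee

# Hypothetical: an IP-NFT stake valued at 10,000 is deposited as collateral
# at a 50% loan-to-value ratio, freeing 5,000 to be invested elsewhere.
borrowed = max_loan(Decimal("10000"), Decimal("0.5"))

# Hypothetical: the same stake is later resold for 2,000 with a 2.5% fee
# flowing back to the research project on every transaction.
seller_receives, project_receives = settle_transfer(Decimal("2000"), Decimal("0.025"))
```

`Decimal` is used rather than floating-point numbers because monetary amounts should not pick up binary rounding errors.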

We are in a world where it is sometimes easier to sell a picture of a monkey for $3 or $4 million than to raise that amount to fight a deadly disease. It’s time to understand this and pull the right levers to get the money where it is – sometimes far off the beaten track. 

Conclusion: a nascent community, a lot of work and great ambitions

Despite the high-potential initiatives presented in this article, and the growing involvement of a scientific community throughout the world, DeSci is still young and has yet to be structured. One of the main challenges, apart from those related to the regulatory framework, will undoubtedly be education in the broadest sense, which is not yet addressed by the current projects. By using Web3 tools to reinvent the way in which a high-level curriculum can be built and financed (tomorrow you will be paid to take online courses, yes!), DeSci will give itself the means to integrate the most creative and entrepreneurial minds of its time, in the same way that large incubators or investment funds such as Y Combinator or Techstars have relied on education to create or accelerate the development of some of the most impressive companies of recent years. DeSci Collaborative Universities need to emerge, and the connection between Ed3 (education and learning in the Web3 era) and DeSci has yet to be implemented.

Figure 3 – Presentation of the embryonic DeSci ecosystem at the ETH Denver conference, February 17, 2022 (in the last 3 months, the burgeoning ecosystem has grown considerably with other projects)

Web 3.0 and DAOs have the great advantage of allowing people to be rewarded with equity, or the equivalent, for contributing their skills or financial resources to a project at any stage of its development.  Thus, in a decentralized world where skills and research materials are at hand, and where the interests of the individuals involved in a project are more aligned, the time between the emergence of an idea and its execution is significantly shorter than in a centralized world. This model, which can reinvent not only work but also what a company is, applies to all fields but is particularly relevant where collective intelligence is important and where advanced expertise of various kinds is needed, such as scientific research. 

In the same way that we can reasonably expect Bitcoin to become increasingly important in the international monetary system in the coming years and decades, we can expect DeSci, given its intrinsic characteristics and qualities, to become increasingly important in the face of what we may in the next few years call “TradSci” (traditionally organized Science). By allowing a perfect alignment of interests of its different actors, DeSci will probably constitute the most successful and viable large-scale and long-term collaborative tool of Collective Intelligence that Homo Sapiens will ever have. Whether it is the fight against global warming, the conquest of space, the eradication of all diseases, or the extension of human longevity, DeSci will probably be the catalyst for the next few decades of scientific innovation and, in so doing, will positively impact your life. Don’t miss the opportunity to be one of the first to take part!


Credits for the illustration of the article :
  • Background: @UltraRareBio @jocelynnpearl and danielyse_, Designed by @katie_koczera
  • Editing: Resolving Pharma


Artificial intelligence against bacterial infections: the case of bacteriophages

« If we fail to act, we are looking at an almost unthinkable scenario where antibiotics no longer work and we are cast back into the dark ages of medicine » – David Cameron, former UK Prime Minister

Hundreds of millions of lives are at stake. The WHO has made antibiotic resistance its number one global priority: it could lead to about 10 million deaths per year by 2050, and it already causes around 700,000 deaths per year, including 33,000 in Europe. Among the various therapeutic strategies that can be implemented is the use of bacteriophages, an old and neglected alternative approach that Artificial Intelligence could revive. Here is how.

Strategies that can be put in place to fight antibiotic resistance

The first pillar of the fight against antibiotic resistance is the indispensable public health actions and recommendations aimed at reducing the overall use of antibiotics. For example:

  • The continuation of communication campaigns aimed at combating the excessive prescription and consumption of antibiotics (in France, a famous slogan is: “Antibiotics are not automatic”).
  • Improving sanitary conditions to reduce the transmission of infections and therefore the need for antibiotics. This measure concerns many developing countries, whose inadequate drinking water supply causes, among other things, many cases of childhood diarrhea.
  • Reducing the use of antibiotics in animal husbandry, by banning the addition of certain antibiotics to the feed of food-producing animals.
  • Reducing environmental pollution with antibiotic molecules, particularly by establishing more stringent anti-pollution standards for manufacturing sites in the pharmaceutical industry.
  • The improvement and establishment of comprehensive structures for monitoring human and animal consumption of antibiotics and the emergence of multi-drug resistant bacterial strains.
  • More frequent use of diagnostic tests to limit the use of antibiotics and to select more precisely which molecule is needed.
  • Increased use of vaccination.

The second pillar of the fight is innovative therapeutic strategies to combat multi-drug resistant bacterial strains against which conventional antibiotics are powerless. These include:

  • Phage therapy: the use of bacteriophages, natural predatory viruses of bacteria. Phages can be used in therapeutic cases where they can be put directly in contact with bacteria (in the case of infected wounds, burns, etc.) but not in cases where they should be injected into the body, as they would be destroyed by the patient’s immune system.
  • The use of enzybiotics: enzymes, mainly derived from bacteriophages, such as lysins, that can be used to destroy bacteria. At the time of writing, this approach is still at an experimental stage.
  • Immunotherapy, including the use of antibodies: Many anti-infective monoclonal antibodies – specifically targeting a viral or bacterial antigen – are in development. Palivizumab directed against the F protein of the respiratory syncytial virus was approved by the FDA in 1998. The synergistic use of anti-infective antibodies and antibiotic molecules is also being studied.

Each of the proposed strategies – therapeutic or public health – can be implemented and their effect increased tenfold with the help of technology. One of the most original uses of Artificial Intelligence concerns the automation of the design of new bacteriophages.

Introduction to bacteriophages

Bacteriophages are capsid viruses that only infect bacteria. They are naturally distributed throughout the biosphere and their genetic material can be DNA, in the vast majority of cases, or RNA. Their discovery is not recent and their therapeutic use has a long history: they started to be used as early as the 1920s in human and veterinary medicine. Their use was gradually abandoned in Western countries, mainly because of the ease of use of antibiotics and the fact that relatively few clinical trials were conducted on phages, their use being essentially based on empiricism. In other parts of the world, such as Russia and other countries of the former USSR, the culture of using phages in human and animal health has remained very strong: they are often available without prescription and used as a first-line treatment.

The mechanism of bacterial destruction by lytic bacteriophages

There are two main types of bacteriophages:

  • On the one hand, lytic phages, which are the only ones used in therapeutics and those we will focus on, destroy the bacteria by hijacking the bacterial machinery in order to replicate.
  • On the other hand, temperate phages are not used therapeutically but are useful experimentally because they add genomic elements to the bacterium, potentially modulating its virulence. Their cycle is called lysogenic.

The diagram below shows the life cycle of a lytic phage:

This is what makes lytic phages so powerful: they are in a “host-parasite” relationship with bacteria and need to infect and destroy them in order to multiply. As with antibiotics, the evolution of bacteria will mainly select resistant strains. However, unlike antibiotics, which do not evolve (or rather “evolve” slowly, in step with the scientific discoveries of the human species), phages are also able to adapt in order to survive and continue to infect bacteria, in a kind of evolutionary race between the bacteria and the phages.

The possible use of Artificial Intelligence

One of the particularities of phages is that, unlike some broad-spectrum antibiotics, they are usually very specific to a bacterial strain. Thus, when one wishes to create or find appropriate phages for a patient, a complex and often relatively long process must be followed, even though the patient’s survival usually depends on a race against time: the bacteria must be identified, which involves culturing samples from the patient and characterizing the bacterial genome, and the phage most likely to fight the infection must then be determined. Until recently, this stage was an iterative process of in-vivo testing, which was very time-consuming, but as Greg Merril, CEO of the start-up Adaptive Phage Therapeutics (a company which is developing a phage selection algorithm based on bacterial genomes), points out: “When a patient is severely affected by an infection, every minute is important.”

Indeed, to make phage therapy applicable on a very large scale, it is necessary to determine quickly and at a lower cost which phage will be the most effective. This is what the combination of two technologies already allows and will increasingly allow: high-throughput sequencing and machine learning. The latter makes it possible to process the masses of data generated by genetic sequencing (the genome of the bacteriophage or of the bacterial strain) and to detect patterns in an experimental database recording that a phage with a genome X was effective against a bacterium with a genome Y. The algorithm is then able to estimate the chances of success of a whole library of phages against a given bacterium and determine which will be the best without performing long iterative tests. As with every test-and-learn domain, phage selection can be automated.
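The principle can be illustrated with a deliberately simple Python sketch. It is not the algorithm of Adaptive Phage Therapeutics or of any published tool: it swaps in a naive similarity-weighted nearest-neighbour heuristic on k-mer counts in place of a real machine learning model, and the function names, the toy "genomes" and the experimental records are all hypothetical.

```python
from collections import Counter
from math import sqrt

def kmer_profile(genome, k=3):
    """Count overlapping k-mers, a crude numeric fingerprint of a genome."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))

def cosine(a, b):
    """Cosine similarity between two k-mer profiles (0 = unrelated, 1 = same direction)."""
    dot = sum(count * b[kmer] for kmer, count in a.items())
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0

def predict_success(phage, bacterium, experiments, k=3):
    """Similarity-weighted average of past outcomes for comparable pairs."""
    p, b = kmer_profile(phage, k), kmer_profile(bacterium, k)
    weighted, total = 0.0, 0.0
    for past_phage, past_bacterium, effective in experiments:
        # A past experiment counts more when both its phage and its
        # bacterium resemble the pair being evaluated.
        weight = (cosine(p, kmer_profile(past_phage, k))
                  * cosine(b, kmer_profile(past_bacterium, k)))
        weighted += weight * (1.0 if effective else 0.0)
        total += weight
    return weighted / total if total else 0.5  # no comparable data: uninformative guess

def rank_phages(library, bacterium, experiments):
    """Score every phage in the library and sort from most to least promising."""
    scores = {name: predict_success(genome, bacterium, experiments)
              for name, genome in library.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy database: (phage genome X, bacterial genome Y, was X effective against Y?)
experiments = [
    ("AAAAAAAAAA", "CCCCCCCCCC", True),
    ("GGGGGGGGGG", "CCCCCCCCCC", False),
]
library = {"phi_1": "AAAAAAAAAT", "phi_2": "GGGGGGGGGT"}
ranking = rank_phages(library, "CCCCCCCCCA", experiments)
```

Here `phi_1` ranks first because it resembles the phage that worked against a similar bacterium. A real system would be trained on thousands of sequenced phage-host pairs with far richer features, but the workflow is the same: score the whole library computationally, then confirm only the top candidates in the laboratory.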

In addition to the determination of the best host for a given bacteriophage (and vice versa) discussed above, the main use cases described for artificial intelligence in the use of phages are:

  • Classification of bacteriophages: The body in charge of classification is the International Committee on Taxonomy of Viruses (ICTV). More than 5000 different bacteriophages are described and the main group is the order Caudovirales. Traditional approaches to the classification of bacteriophages are based on the morphology of the virion protein that is used to inject the genetic material into the target bacterium. These approaches are mainly based on electron microscopy techniques. A growing body of scientific literature suggests that Machine Learning is a relevant alternative for a more functional classification of bacteriophages.
  • Predicting the functionality of bacteriophage proteins: Machine Learning can be useful to elucidate the precise mechanisms of the PVP (Phage Virion Protein), involved, as mentioned above, in the injection of genetic material into the bacterium.
  • Determining the life cycle of bacteriophages: As discussed earlier in this article, there are two categories of phages: lytic and temperate. Traditionally, whether a phage belongs to one category or the other was determined by culture and in-vitro testing. The task is more difficult than one might think because, under certain stress conditions and in the presence of certain hosts, temperate phages are able to survive by performing lytic cycles. At present, PhageAI algorithms are reported to determine the phage category correctly in 99% of cases.

It is also possible, as illustrated in the diagram below, for rare and particularly resistant bacteria, to combine the techniques seen above with synthetic biology and bio-engineering techniques in order to rapidly create “tailor-made” phages. In this particular use case, Artificial Intelligence offers its full potential in the development of an ultra-personalised medicine.


Despite its usefulness, phage therapy is still complicated to implement in many Western countries. In France, this therapy is possible within the framework of a Temporary Authorisation for Use, provided that the patient’s life is at risk or their functional prognosis is threatened, that the patient is in a therapeutic impasse, and that they have a mono-microbial infection. The use of the therapy must also be validated by a Temporary Specialised Scientific Committee on Phagotherapy of the ANSM, and a phagogram – an in vitro test that studies the sensitivity of a bacterial strain to bacteriophages, in the manner of antibiograms – must be presented before treatment is started. Faced with these multiple difficulties, many patient associations are mobilizing to campaign for simplified access to phagotherapy. With the help of Artificial Intelligence, more and more phagotherapies can be developed, as illustrated in this article, and given the urgency and scale of the problem of antibiotic resistance, it is essential to prepare the regulatory framework within which patients will be able to access the various alternative treatments, including bacteriophages. The battle is not yet lost, and Artificial Intelligence will be a key ally.

Would you like to discuss the subject? Would you like to take part in writing articles for the newsletter? Would you like to participate in an entrepreneurial project related to PharmaTech?

Contact us at!


Clinic Exploratory research

Reshaping real-world data sharing with Blockchain-based system

“[Blockchain] is a complicated technology and one whose full potential is not necessarily understood by healthcare players. We want to demonstrate […] precisely that blockchain works when you work on the uses!” Nesrine Benyahia, Managing Director of DrData


Access to real-world health data is becoming an increasingly important issue for pharmaceutical companies and facilitating the acquisition of this data could make the development of new drugs faster and less costly. After explaining the practices of data acquisition in the pharmaceutical industry, and the current initiatives aiming at facilitating them, this article will then focus on the projects using the Blockchain, in the exchange, monetization and securing of these precious data.

Use of real-world data by the Pharmaceutical Industry, where do we stand?

Real-world data are commonly defined as data that are not collected in an experimental setting and without intervention in the usual way patients are managed, with the aim of reflecting current practice in care. These data can sometimes complement data from randomized controlled trials, which have the disadvantage of being true only in the very limited context of clinical trials. The use of real-world data is likely to grow for two key reasons. First, new technological tools allow us to collect them (connected medical devices, for example) while others allow us to analyze them (data science, text-mining, patient forums, exploitation of grey literature, etc.). Secondly, for a few years now, we have been observing a regulatory evolution that increasingly allows early access and clinical evidence based on small numbers of patients (especially in the case of cancer drug trials) and that tends to move the evidence cursor towards real-world data.

The uses of real-world data are varied. In the development of new drugs, they help to define new management algorithms or to discover unmet medical needs through the analysis of databases. For products already on the market, use cases include the monitoring of safety and use, and market access with conditional financial support or payment on performance. These data can inform the decisions of health authorities as well as the strategic decisions of pharmaceutical companies.

Current acquisition and use of real-world data:

Data sources are varied, with varying degrees of maturity and availability, as well as varying access procedures. Some of these data come directly from healthcare, such as data from medico-administrative databases or hospital information systems, while others are produced directly by patients, through social networks, therapy management applications and connected medical devices. Access to this data for the pharmaceutical industry takes place in various ways. Like many other countries, France is currently working to implement organizational and regulatory measures to facilitate access to this real-world data, and to organize its collection and use, notably with the creation of the Health Data Hub. However, to this day, in the French and European context, no platform allows patients to have access to all of their health data and to freely dispose of them in order to participate in a given research project.

Imagining a decentralized health data sharing system, the first steps:

As a reminder, blockchain is a cryptographic technology developed in the late 2000s that makes it possible to store, authenticate, and transmit information in a decentralized (without intermediaries or trusted third parties), transparent and highly secure way. For more information about how blockchain works, please refer to our previous article about this technology: “Blockchain, Mobile Applications: Will technology solve the problem of counterfeit drugs?” As we already explained in that article, the young Blockchain technology has so far mainly expressed its potential in the field of crypto currencies, but it is possible to imagine many other applications.

Thus, several research teams are working on how this technology could potentially address the major challenges of confidentiality, interoperability, integrity, and secure accessibility – among others – posed by the sharing of health data.

These academic research teams have envisioned blockchains that bring together different stakeholders: healthcare services, patients, and data users (who may be the patients themselves or other healthcare-producing organizations). These systems do not provide data to third parties (industrialists, for example); their only objectives are to improve the quality of care and to offer patients a platform that brings together their fragmented health data: in the United States, data is siloed because of the organization of the health system; in France, although the Social Security system has a centralizing role, the “Mon Espace Santé” service, which allows patients to access all of their data and is a descendant of the Shared Medical Record, is slow to be implemented.

These academic projects propose, on the one hand, to store medical information on a private blockchain and, on the other hand, to operate Smart Contracts with different uses. Smart Contracts are computerized equivalents of traditional contracts, but they differ in that their execution does not require a trusted third party or human intervention (they are executed when the conditions provided by the computer code are met). In these proposals for real-world data sharing systems, they make it possible, among other things, to authenticate the identity of the users and to guarantee the integrity of the data, their confidentiality, and the flexibility of their access (unauthorized persons cannot access the patient data).
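The logic such a Smart Contract enforces can be sketched in a few lines. The example below is a plain Python simulation, not code for a real blockchain, and it does not reproduce any of the academic proposals: the class name, its methods and the sample record are hypothetical. It only illustrates two of the guarantees mentioned above, access limited to authorized users and integrity checked through a stored hash.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(record: bytes) -> str:
    """SHA-256 digest used as the integrity reference for a record."""
    return hashlib.sha256(record).hexdigest()


@dataclass
class ConsentContract:
    """Hypothetical contract: only the hash of the record is stored here."""
    patient: str
    record_hash: str
    authorized: set[str] = field(default_factory=set)
    log: list[tuple[str, bool]] = field(default_factory=list)

    def grant(self, caller: str, user: str) -> None:
        """Only the patient may authorize a new user."""
        if caller != self.patient:
            raise PermissionError("only the patient can grant access")
        self.authorized.add(user)

    def revoke(self, caller: str, user: str) -> None:
        """Only the patient may withdraw an authorization."""
        if caller != self.patient:
            raise PermissionError("only the patient can revoke access")
        self.authorized.discard(user)

    def request_access(self, user: str, record: bytes) -> bool:
        """Accept only authorized users and unaltered records; log every attempt."""
        accepted = user in self.authorized and fingerprint(record) == self.record_hash
        self.log.append((user, accepted))
        return accepted


record = b"blood test 2021-03-04: CRP 12 mg/L"  # made-up sample record
contract = ConsentContract(patient="patient_1", record_hash=fingerprint(record))
contract.grant("patient_1", "dr_martin")
```

With this setup, `contract.request_access("dr_martin", record)` is accepted, while the same call with a modified record or an unknown user is refused and still written to the log. On a real blockchain the conditions would be the same, but the log and the stored hash would be replicated across nodes rather than held in one object.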

Despite their theoretical qualities, these academic projects do not give patients the possibility of sharing their data openly with different research projects. In the last part of this article, we will review two examples of start-ups seeking to address this issue using the Blockchain.

Examples of two blockchain projects that allow patients to share their health data:

Embleema is a startup that offers a platform where patients can upload their health data – ranging from their complete genome to the results of their medical tests, to data from connected medical devices. At the same time, pharmaceutical companies can express their needs, and an algorithm on the platform will then select patients who could correspond to this need, by their pathology or by the treatments they are prescribed. These patients will then be asked to sign a consent document to participate in an observational study, in exchange for which they will be paid (in the USA) or may choose a patient association that will receive funding (in France). The data produced by patients are stored on centralized servers of specialized health data hosts, and only the industrialists who have purchased them have access to them. The Ethereum blockchain and its system of smart contracts are used in the Embleema model only to certify compliance and organize the sharing of documents related to the study (collection of patient consent, etc.). We can therefore wonder about the added value of the blockchain in this model. Couldn’t these documents have been stored on centralized servers? And couldn’t the actions triggered by the smart contracts have been carried out from a centralized database, with Embleema acting as a trusted third party? How much of the use of the term Blockchain in this model is marketing? In any case, the Patient Truth platform developed by Embleema has the great merit of proposing a model in which patients have control over their health data, and the choice to get involved in this or that academic or industrial research project.


The second company we will focus on is MedicalVeda, a Canadian start-up in which blockchain plays a more central role, including the launch of an ERC-20 token (a standard cryptocurrency using the Ethereum blockchain that can be programmed to participate in a Smart Contract). The workings of this company, which seeks to solve several problems at once – access to healthcare data by the healthcare industries but also access to care on the patient side – are quite complex and conceptual, and we will try to simplify them as much as possible. MedicalVeda’s value proposition is based on several products:

  • The VEDA Health Portal, which is a platform to centralize patients’ health data for the benefit of caregivers and pharmaceutical industry research programs to which the patient can choose to provide access. Similar to the projects previously mentioned in this article, the goal is to overcome the challenge of data siloing. The data is secured by a private blockchain.
  • The Medical Veda Data Market Place, which aims to directly connect patients and pharmaceutical companies according to their needs. Transactions are made using the blockchain and are paid for in crypto-currencies.
  • Two other products are worth mentioning: the MVeda token, which is the cryptocurrency of the data sales platform, which pays patients, and Medfi Veda, a decentralized finance system that allows American patients to borrow money to fund medical interventions by collateralizing their MVeda crypto-currency tokens. This collateral lending system is classic in decentralized finance, but admittedly the details of the system developed by MVeda remain murky. The objective of the system is to allow patients to collateralize their health data in order to facilitate their access to healthcare.

In conclusion, Blockchain is still a young technology. It attracted a very high level of interest in the healthcare world in 2018, but that interest has gradually dried up since, mainly because of a misunderstanding of its potential and a lack of education of healthcare professionals on the subject on the one hand, and because of excessive marketing use of what had become a “buzz-word” on the other. The intrinsic qualities of this technology make it possible to imagine creative and ambitious models for sharing health data, which may be the source of accelerated development of new drugs in the future. For the time being, and despite courageous and intelligent initiatives, some of which have already been commercialized, no solution is fully functional on a very large scale; everything remains to be built.


Clinic Exploratory research Preclinical

3D printing and artificial intelligence: the future of galenics?

“Ten years from now, no patient will take the same thing as another million people. And no doctor will prescribe the same thing to two patients.”

Fred Paretti from the 3D drug printing startup Multiply Labs.

3D printing – also known as additive manufacturing – is one of the technologies capable of transforming pharmaceutical development, and will certainly play a role in the digitalization of the drug manufacturing sector. This short article will attempt to provide an overview of how 3D printing works, its various use cases in the manufacture of personalized medicines, the current regulatory framework for this innovative technology, and the synergies that may exist with Artificial Intelligence.

3D printing, where do we stand?

The principle of 3D printing, developed since the 1980s and now used in a large number of industrial fields, consists of superimposing layers of material in accordance with coordinates distributed along three axes (in three dimensions) following a digital file. This 3D file is cut into horizontal slices and sent to the 3D printer, allowing it to print one slice after another. The term “3D printing” covers techniques that are very different from each other:

  • Fused filament deposition, or extrusion: a plastic filament is heated until it melts and deposited at points of interest, in successive layers, which are bound together by the plastic solidifying as it cools. This is the most common technique used by consumer printers.
  • The photopolymerization of the resin: a photosensitive resin is solidified with the help of a laser or a very concentrated light source, layer by layer. This is one of the techniques that allows a very high level of detail.
  • Sintering or powder fusion: a laser is used to agglomerate the powder particles with the energy it releases. This technique is used to produce metal or ceramic objects.
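Whatever the technique, the slicing principle described above is the same. As a minimal sketch, assuming a simple object of known height and a constant layer thickness (both parameters are hypothetical, as is the function name), the heights at which the printer finishes each slice can be computed as follows:

```python
import math


def slice_heights(object_height_mm: float, layer_height_mm: float) -> list[float]:
    """Return the top height (in mm) of each horizontal slice.

    The last slice is clipped so that it never exceeds the object height.
    """
    if object_height_mm <= 0 or layer_height_mm <= 0:
        raise ValueError("heights must be positive")
    # Small tolerance so that floating-point noise does not add a spurious layer.
    n_layers = math.ceil(object_height_mm / layer_height_mm - 1e-9)
    return [
        round(min((i + 1) * layer_height_mm, object_height_mm), 6)
        for i in range(n_layers)
    ]


# A 4 mm tall tablet printed in 0.2 mm layers needs 20 slices.
tablet_layers = slice_heights(4.0, 0.2)
```

Real slicing software does much more (it computes the outline of the object in each slice and the path of the print head), but the number of layers and their heights follow this simple arithmetic.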

In the pharmaceutical industry, 3D printing is used in several ways, the main ones being:

  • The manufacture of medical devices, using the classic techniques of printing plastic or metallic compounds or more specific techniques that give medical devices original properties, like the prostheses of the start-up Lattice Medical, which allow adipose tissue to regenerate.
  • Bio-printing, which uses human cells to reconstitute organs such as skin or heart patches, as is done by another French start-up, Poietis.
  • Finally, and this is what will be discussed in this article, 3D printing also has a role to play in galenics by making it possible to print, from a mixture of excipient(s) and active substance(s), an orally administered drug.

What are the uses of 3D printing of medicines? 

3D printing brings an essential feature to drug manufacturing: flexibility. This flexibility is important for:

  • Manufacturing small clinical batches: clinical phases I and II often require small batches of experimental drugs for which 3D printing is useful: it is sometimes economically risky to make large investments in drug manufacturing at this stage. Moreover, it is often necessary to modify the active ingredient content of the drugs used, and 3D printing would enable these batches to be adapted in real time. Finally, 3D printing can also be useful for offering patients placebos that are as similar as possible to their usual treatments.
  • Advancing towards personalized medicine: 3D printing of drugs allows the creation of “à la carte” drugs by mixing several active ingredients with different contents for each patient. In the case of patients whose weight and absorption capacities vary over time (children or the elderly who are malnourished, for example), 3D printing could also adapt their treatments in real time according to changes in their weight, particularly in terms of dosage and speed of dissolution.
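The dose adaptation mentioned in the last point comes down to simple arithmetic, sketched below. The function name and all numerical values are hypothetical and purely illustrative; they are not clinical guidance. The sketch assumes a dose prescribed per kilogram of body weight and a printable mixture containing a known fraction of active ingredient.

```python
def personalized_tablet(weight_kg: float,
                        dose_mg_per_kg: float,
                        drug_loading: float) -> tuple[float, float]:
    """Return (dose in mg, tablet mass in mg) for one patient.

    drug_loading is the mass fraction of active ingredient in the printed
    mixture (between 0 and 1, exclusive of 0).
    """
    if weight_kg <= 0 or dose_mg_per_kg <= 0:
        raise ValueError("weight and dose must be positive")
    if not 0 < drug_loading <= 1:
        raise ValueError("drug_loading must be in (0, 1]")
    dose_mg = weight_kg * dose_mg_per_kg
    tablet_mass_mg = dose_mg / drug_loading
    return dose_mg, tablet_mass_mg


# Hypothetical child whose weight changes between two visits.
first_visit = personalized_tablet(weight_kg=20.0, dose_mg_per_kg=10.0, drug_loading=0.4)
second_visit = personalized_tablet(weight_kg=22.0, dose_mg_per_kg=10.0, drug_loading=0.4)
```

Between the two visits the required tablet mass rises from about 500 mg to about 550 mg, a change that a printer can apply to the next tablet, whereas a conventional production line would require a new batch.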

To address these issues, most major pharmaceutical companies are increasingly interested in 3D printing of drugs. They are investing massively in this field or setting up partnerships, like Merck, which is cooperating with the company AMCM in order to set up a printing system that complies with good manufacturing practices. The implementation of this solution has the potential to disrupt the traditional manufacturing scheme, as illustrated in the diagram below.

Figure 1 – Modification of the manufacturing steps of a tablet by implementing 3D printing (Source : Merck)


The first commercialized 3D printed drug was approved by the FDA in 2015. Its active ingredient is levetiracetam. The goal of using 3D printing for this drug was to achieve a more porous tablet that dissolves more easily and is more suitable for patients with swallowing disorders. Despite these initial approvals and market accesses, the regulatory environment has yet to be built, as it is still necessary to assess the changes in best practices that 3D printing technology may impose and determine what types of tests and controls should be implemented. Destructive quality controls are not particularly well suited to the small batches produced by the 3D printer technique. To our knowledge, there are currently no GMP-approved 3D printers for the manufacture of drugs.

Will the future of drug 3D printing involve artificial intelligence? 

A growing number of authors believe that 3D printing of drugs will only be able to move out of the laboratory and become a mainstream technology in industry if artificial intelligence is integrated. Indeed, as things stand at present, because of the great flexibility mentioned above, the use of 3D printing requires a long iterative phase: it is necessary to test thousands of factors concerning in particular the excipients used, but also the parameters of the printer and the printing technique to be selected. The choice of these different factors is currently made by the galenics team according to its objectives and constraints: what is the best combination of factors to meet a given pharmacokinetic criterion? Which ones minimize production costs? Which ones comply with a possible regulatory framework? Which ones allow for rapid production? This iterative phase is extremely time-consuming and capital-intensive, which contributes to making 3D printing of drugs incompatible with the imperatives of pharmaceutical development for the moment. Artificial Intelligence seems to be the easiest way to overcome this challenge and to make the multidimensional choice of parameters “evidence-based” according to the objectives. Artificial Intelligence could also be involved in the quality control of the batches thus manufactured.
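The multidimensional choice described above can be pictured as a search over combinations of parameters, each scored against the team’s objectives. The sketch below is only an illustration: the `predict` function uses invented toy formulas where a real system would use a model trained on experimental data, and the parameter names, grid values and weights are all assumptions.

```python
from itertools import product


def predict(excipient_fraction: float, nozzle_temp_c: float, speed_mm_s: float) -> dict[str, float]:
    """Hypothetical surrogate model: toy formulas standing in for a trained model."""
    return {
        "dissolution_min": 5 + 40 * excipient_fraction + 0.05 * (nozzle_temp_c - 150),
        "cost": 1.0 + 2.0 * excipient_fraction,
        "print_time_min": 200 / speed_mm_s,
    }


def score(props: dict[str, float], target_dissolution_min: float, weights: dict[str, float]) -> float:
    """Weighted sum of the objectives; lower is better."""
    return (weights["dissolution"] * abs(props["dissolution_min"] - target_dissolution_min)
            + weights["cost"] * props["cost"]
            + weights["time"] * props["print_time_min"])


def best_settings(grid: dict[str, list], target_dissolution_min: float,
                  weights: dict[str, float]) -> tuple:
    """Exhaustively score every combination in the grid and return the best one."""
    candidates = product(grid["excipient_fraction"], grid["nozzle_temp_c"], grid["speed_mm_s"])
    return min(candidates, key=lambda c: score(predict(*c), target_dissolution_min, weights))


GRID = {
    "excipient_fraction": [0.2, 0.4, 0.6],
    "nozzle_temp_c": [150, 170, 190],
    "speed_mm_s": [10, 20, 40],
}
choice = best_settings(GRID, target_dissolution_min=21.0,
                       weights={"dissolution": 1.0, "cost": 0.5, "time": 0.1})
```

Changing the weights changes the recommended settings, which is the point: the trade-off between pharmacokinetics, cost and speed becomes explicit. With thousands of factors an exhaustive grid is no longer feasible, which is where learned models and smarter search strategies come in.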

The use of Artificial Intelligence to design new drugs opens up the prospect of new technical challenges, particularly with regard to the availability of the data required for these Machine Learning models, which are often kept secret by pharmaceutical laboratories.  We can imagine that databases can be built by text-mining scientific articles and patents dealing with different galenic forms and different types of excipients and then completed experimentally, which will require a significant amount of time. In addition to these technical challenges, it will also be necessary to ask more ethical questions, particularly with regard to the disruption of responsibilities caused by the implementation of these new technologies: who would be responsible in the event of a non-compliant batch being released? The manufacturer of the 3D printer? The developer of the algorithm that designed the drug? The developer of the algorithm that validated the quality control? Or the pharmacist in charge of the laboratory?

All in all, 3D printing of medicines is a technology that is already well mastered, with a market growing by 7% each year and projected to reach 440 million dollars in 2025. Its usefulness is so far limited to certain use cases, but tomorrow, once its potential is unlocked through combination with Artificial Intelligence, it could allow us to achieve fully automated and optimized galenic development and manufacturing of oral forms, finally adapted to the ultra-customized medicine that is coming.


To go further:

  • Moe Elbadawi, Laura E. McCoubrey, Francesca K.H. Gavins, Jun J. Ong, Alvaro Goyanes, Simon Gaisford, and Abdul W. Basit ; Disrupting 3D Printing of medicines with machine learning ; Trends in Pharmacological Sciences, September 2021, Vol 42, No.9
  • Moe Elbadawi, Brais Muñiz Castro, Francesca K H Gavins, Jun Jie Ong, Simon Gaisford, Gilberto Pérez , Abdul W Basit , Pedro Cabalar , Alvaro Goyanes ; M3DISEEN: A novel machine learning approach for predicting the 3D printability of medicines ; Int J Pharm. 2020 Nov 30;590:119837
  • Brais Muñiz Castro, Moe Elbadawi, Jun Jie Ong, Thomas Pollard, Zhe Song, Simon Gaisford, Gilberto Pérez, Abdul W Basit, Pedro Cabalar, Alvaro Goyanes ; Machine learning predicts 3D printing performance of over 900 drug delivery systems ; J Control Release. 2021 Sep 10;337:530-545. doi: 10.1016/j.jconrel.2021.07.046
  • Les médicaments imprimés en 3D sont-ils l’avenir de la médecine personnalisée ? ; 3D Natives, le média de l’impression 3D
  • Les médicaments de demain seront-ils imprimés en 3D ? ; Le mag’ Lab santé Sanofi
  • Press Releases – Merck and AMCM / EOS Cooperate in 3D Printing of Tablets

Clinic Exploratory research Preclinical

Why are we still conducting meta-analyses by hand?

« It is necessary, while formulating the problems of which in our further advance we are to find solutions, to call into council the views of those of our predecessors who have declared an opinion on the subject, in order that we may profit by whatever is sound in their suggestions and avoid their errors. »

Aristotle, De anima, Book 1, Chapter 2

Systematic literature reviews and meta-analyses are essential tools for synthesizing existing knowledge and generating new scientific knowledge. Their use in the pharmaceutical industry is varied and will continue to diversify. However, they are particularly limited by the lack of scalability of their current methodologies, which are extremely time-consuming and prohibitively expensive. At a time when scientific articles are available in digital format and when Natural Language Processing algorithms make it possible to automate the reading of texts, should we not invent meta-analyses 2.0? Are meta-analyses boosted by artificial intelligence, faster and cheaper, allowing more data to be exploited, in a more qualitative way and for different purposes, an achievable goal in the short term or an unrealistic dream?

Meta-analysis: methods and presentation

A meta-analysis is basically a statistical analysis that combines the results of many studies. Meta-analysis, when done properly, is the gold standard for generating scientific and clinical evidence, as the aggregation of samples and information provides significant statistical power. However, the way in which the meta-analysis is carried out can profoundly affect the results obtained.
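To show where the statistical power comes from, here is a minimal sketch of the simplest way of combining studies, the fixed-effect inverse-variance method: each study is weighted by the inverse of the square of its standard error, so that precise studies count more. The article does not specify a method; this one is chosen for illustration, and the function name and the numbers are invented.

```python
import math


def pooled_effect(effects: list[float], std_errors: list[float]) -> tuple[float, float, tuple[float, float]]:
    """Fixed-effect inverse-variance pooling.

    Returns the pooled estimate, its standard error and a 95% confidence
    interval (normal approximation, hence the 1.96 factor).
    """
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("need one standard error per effect")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    interval = (estimate - 1.96 * pooled_se, estimate + 1.96 * pooled_se)
    return estimate, pooled_se, interval


# Two made-up studies with the same precision.
estimate, pooled_se, interval = pooled_effect([0.2, 0.4], [0.1, 0.1])
```

Here the pooled estimate is the midpoint, 0.3, and the pooled standard error (about 0.071) is smaller than that of either study alone (0.1). That narrowing of the uncertainty is what makes aggregation valuable.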

Conducting a meta-analysis therefore follows a very precise methodology consisting of different stages:

  • Firstly, a search protocol will be established in order to determine the question to be answered by the study and the inclusion and exclusion criteria for the articles to be selected. It is also at this stage of the project that the search algorithm is determined and tested.
  • In a second step, the search is carried out using the search algorithm on article databases. The results are exported.
  • Articles are selected on the basis of titles and abstracts. The reasons for exclusion of an article are mentioned and will be recorded in the final report of the meta-analysis.
  • The validity of the selected studies is then assessed on the basis of the characteristics of the subjects, the diagnosis, and the treatment.
  • The various biases are controlled for in order to avoid selection bias, data extraction bias, conflict of interest bias and funding source bias.
  • A homogeneity test will be performed to ensure that the variable being evaluated is the same for each study. It will also be necessary to check that the data collection characteristics of the clinical studies are similar.
  • A statistical analysis as well as a sensitivity analysis are conducted.
  • Finally, the results are presented from a quantitative and/or non-quantitative perspective in a meta-analysis report or publication. The conclusions are discussed.

The systematic literature review (SLR), unlike the meta-analysis, with which it shares a certain number of methodological steps, does not have a quantitative dimension but aims solely to organize and describe a field of knowledge precisely.

The scalability problem of a powerful tool

The scalability problem is simple to state and will only get worse over time: the volume of data generated by clinical trials to be processed in literature reviews is growing exponentially, while the methods used for extracting and processing these data have evolved little and remain essentially manual. The intellectual limits of humans are what they are, and humans cannot disrupt themselves.

As mentioned in the introduction to this article, meta-analyses are relatively costly in terms of human time. It is estimated that a minimum of 1000 hours of highly qualified human labor is required for a simple literature review and that 67 weeks elapse between the start of the work and its publication. Thus, meta-analyses are tools with a high degree of inertia and their time frame is not currently adapted to certain uses, such as strategic decision-making, which sometimes requires certain data to be available quickly. By contrast, some publications report full literature reviews completed in 2 weeks and 60 working hours with automation tools based on artificial intelligence.

“Time is money”, they say. Academics have calculated that, on average, each meta-analysis costs about $141,000. The same team also determined that the 10 largest pharmaceutical companies each spend about $19 million per year on meta-analyses. While this may not seem like a lot of money compared to the various other expenses of generating clinical evidence, it is not insignificant, and it is conceivable that a lower cost would allow more meta-analyses to be conducted, including meta-analyses of pre-clinical data, which could potentially reduce the failure rate of clinical trials – currently 90% of compounds entering clinical trials fail to demonstrate sufficient efficacy and safety to reach the market.

Reducing the problem of scalability in the methodology of literature reviews and meta-analyses would make it easier to work with data from pre-clinical trials. These data present a certain number of specificities that make their use in systematic literature reviews and meta-analyses more complex: the volumes of data are extremely large and evolve particularly rapidly, the designs of pre-clinical studies as well as the form of reports and articles are very variable and make the analyses and the evaluation of the quality of the studies particularly complex. However, systematic literature reviews and other meta-analyses of pre-clinical data have different uses: they can identify gaps in knowledge and guide future research, inform the choice of a study design, a model, an endpoint or the relevance or not of starting a clinical trial. Different methodologies for exploiting preclinical data have been developed by academic groups and each of them relies heavily on automation techniques involving text-mining and artificial intelligence in general.

Another recurring problem with meta-analyses is that they are conducted at a point in time and can become obsolete very quickly after publication, when new data have been published and new clinical trials completed. In some cases, after only a few months or weeks, the large amount of time and energy invested has produced conclusions that are inaccurate or partially false. We can imagine that the automated performance of meta-analyses would allow their results to be updated in real time.

Finally, the automation of meta-analyses could contribute to a more uniform assessment of the quality of the clinical studies included in the analyses. Many publications show that the quality of the selected studies, and the biases that may affect them, are rarely evaluated, and that when they are, it is done with various scores that take few parameters into account. For example, the Jadad score only considers 3 methodological characteristics. This is understandable: collecting this information, even for a small number of parameters, requires additional data extraction and processing effort.
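
As an illustration of how simple such a score is once the information has been extracted, here is a minimal sketch of the Jadad score (0 to 5 points, based on randomisation, blinding, and the reporting of withdrawals). The `TrialReport` structure and its field names are hypothetical. The difficult part in practice is filling in these fields from the article text, which is precisely what automated extraction tools aim to do.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrialReport:
    """Hypothetical structure holding the items needed for the Jadad score.

    For the two '*_method_appropriate' fields: True = method described and
    appropriate, False = described but inappropriate, None = not described.
    """
    randomized: bool
    randomization_method_appropriate: Optional[bool]
    double_blind: bool
    blinding_method_appropriate: Optional[bool]
    withdrawals_described: bool


def jadad_score(report: TrialReport) -> int:
    """Compute the Jadad score (0-5) from the extracted items."""
    score = 0
    if report.randomized:
        score += 1
        if report.randomization_method_appropriate is True:
            score += 1
        elif report.randomization_method_appropriate is False:
            score -= 1
    if report.double_blind:
        score += 1
        if report.blinding_method_appropriate is True:
            score += 1
        elif report.blinding_method_appropriate is False:
            score -= 1
    if report.withdrawals_described:
        score += 1
    return score


well_reported = TrialReport(True, True, True, True, True)
poorly_reported = TrialReport(True, False, False, None, False)
print(jadad_score(well_reported), jadad_score(poorly_reported))
```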

Given these scalability problems, what are the existing or possible solutions?

Many tools already developed

The automation of the various stages of meta-analyses is a field of research for many academic groups, and some tools have been developed. With no disrespect to these tools, some examples of which are given below, it is worth asking why they are not more widely used today. Is the market not mature enough? Are the tools, whose value propositions are very fragmented, unsuitable for carrying out a complete meta-analysis? Do these tools, developed by research laboratories, receive sufficient marketing? Do they have sufficiently user-friendly interfaces?

As mentioned above, most of the tools and prototypes developed focus on a specific task in the meta-analysis methodology. Examples include Abstrackr, which specialises in article screening, ExaCT, which focuses on data extraction, and RobotReviewer, which is designed to automatically assess bias in reports of randomised controlled trials.

Conclusion: improvement through automation?

Given the burgeoning academic work on automated meta-analysis and the various entrepreneurial initiatives in this field (including at least one very young start-up), we are strongly convinced that meta-analysis will increasingly become a task for robots, and that the role of humans will be limited to defining the research protocol, assisted by software that helps make the best possible choices in terms of scope and search algorithms. Apart from the direct savings made by automating meta-analyses, many indirect savings can be expected, particularly those made possible by better decisions, such as whether or not to start a clinical trial. All in all, the automation of meta-analyses will contribute to more efficient and faster drug invention.

Resolving Pharma, whose project is to link reflection and action, will invest in the coming months in the concrete development of meta-analysis automation solutions.


To go further:
  • Marshall, I.J., Wallace, B.C. Toward systematic review automation: a practical guide to using machine learning tools in research synthesis. Syst Rev 8, 163 (2019).
  • Clark J, Glasziou P, Del Mar C, Bannach-Brown A, Stehlik P, Scott AM. A full systematic review was completed in 2 weeks using automation tools: a case study. J Clin Epidemiol. 2020 May;121:81-90. doi: 10.1016/j.jclinepi.2020.01.008. Epub 2020 Jan 28. PMID: 32004673.
  • Beller, E., Clark, J., Tsafnat, G. et al. Making progress with the automation of systematic reviews: principles of the International Collaboration for the Automation of Systematic Reviews (ICASR). Syst Rev 7, 77 (2018).
  • Lise Gauthier, L’élaboration d’une méta-analyse : un processus complexe ! ; Pharmactuel, Vol.35 NO5. (2002) ;
  • Nadia Soliman, Andrew S.C. Rice, Jan Vollert ; A practical guide to preclinical systematic review and meta-analysis; Pain September 2020, volume 161, Number 9,
  • Matthew Michelson, Katja Reuter, The significant cost of systematic reviews and meta-analyses: A call for greater involvement of machine learning to assess the promise of clinical trials, Contemporary Clinical Trials Communications, Volume 16, 2019, 100443, ISSN 2451-8654,
  • Vance W. Berger, Sunny Y. Alperson, A general framework for the evaluation of clinical trial quality; Rev Recent Clin Trials. 2009 May ; 4(2): 79–88.
  • A start-up specializing in meta-analysis enhanced by Artificial Intelligence:
  • And finally, the absolute bible of meta-analysis: The handbook of research synthesis and meta-analysis, Harris Cooper, Larry V. Hedges and Jeffrey C. Valentine


Oligonucleotides and Machine Learning Tools

Today, oligonucleotides – short DNA or RNA molecules – are essential tools in molecular biology projects, but also in therapeutics and diagnostics. In 2021, ten or so antisense therapies are authorised on the market, and many more are in clinical trials.

The recent Covid-19 crisis has also brought PCR tests to public attention. These tests use small sequences of about 20 nucleotides to amplify and detect genetic material. Oligos have been so successful that, since their synthesis was automated, their market has grown steadily: it is estimated that it will reach $14 billion by 2026.

Oligonucleotides have an elegance in their simplicity. In the 1950s, Watson and Crick described the double helix that carries our genetic code, and the way in which the bases adenine/thymine and cytosine/guanine pair up. Thanks to this property, antisense therapies can in principle target virtually our entire genome and regulate its expression. Diseases that are difficult to treat, such as spinal muscular atrophy or Duchenne muscular dystrophy, now benefit from some therapeutic options (1).
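
The base-pairing rule is simple enough to express in a few lines. As a minimal sketch (the function name and the toy sequence are ours), the DNA antisense strand for a given RNA target is its complement read in the opposite direction:

```python
# Watson-Crick pairing between an RNA target and a DNA oligonucleotide.
_RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(target_rna: str) -> str:
    """Return the DNA antisense sequence (5'->3') for an RNA target (5'->3')."""
    target = target_rna.upper()
    try:
        complement = [_RNA_TO_DNA_COMPLEMENT[base] for base in target]
    except KeyError as exc:
        raise ValueError(f"unexpected base in RNA target: {exc.args[0]!r}") from None
    # The two strands are antiparallel, so the complement is reversed.
    return "".join(reversed(complement))


print(antisense_dna("AUGGCCAUUGUA"))  # toy target sequence, not a real gene
```

Real antisense design adds many constraints on top of this rule (chemical modifications, off-target matches, secondary structure), which is where the machine learning tools discussed in this article come in.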

This article does not aim to restate the history of oligonucleotides used in the clinic (many reviews are already available in the literature (2), (3), (4)), but to provide a quick overview of what has been developed in this area, with a Machine Learning tint.

We hope that the article will inspire some researchers, and that others may find new ideas for research and exploration. At a time when Artificial Intelligence has reached a certain maturity, it is particularly interesting to exploit it to streamline decision making in R&D projects.

This list is not exhaustive. If you have a project or article to share with us, please contact us: we will be happy to discuss it and include it in this article.

Using Deep Learning to design PCR primers

As the Covid-19 health crisis has shown, testing the population is essential to monitor and control a pandemic. Thanks to two primers of about twenty nucleotides, a specific sequence can be amplified and detected, even at a very low level (PCR is technically capable of detecting as few as 3 copies of a sequence of interest (5)).
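
Before any machine learning is involved, candidate primers are commonly screened on simple sequence properties. As a minimal sketch, the following computes the GC content and a rough melting temperature using the Wallace rule (2 °C per A/T and 4 °C per G/C), a textbook approximation for short oligonucleotides. It is not the method used in the work discussed below, and the example primer is made up.

```python
def _validated(primer: str) -> str:
    """Upper-case the primer and check that it only contains A, C, G, T."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty DNA sequence (A, C, G, T)")
    return seq


def gc_content(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = _validated(primer)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> float:
    """Rough melting temperature in degrees Celsius (Wallace rule)."""
    seq = _validated(primer)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


primer = "ATGCATGCATGCATGCATGC"  # made-up 20-nucleotide primer
print(gc_content(primer), wallace_tm(primer))
```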

A group from Utrecht University in the Netherlands (6) has developed a CNN (Convolutional Neural Network, a type of neural network particularly effective in image recognition) capable of revealing regions that are exclusive to a given genome. This allows the development of highly specific primers for the target of interest. In their case, they analysed more than 500 genomes of viruses from the Coronavirus family in order to train the algorithm to tell the different genomes apart. The primers designed by the model showed similar efficiency to the sequences used in practice. This tool could be used to develop PCR diagnostic tools more efficiently and more quickly.

Predicting the penetration power of an oligonucleotide

There are many peptides that improve the penetration of oligonucleotides into cells. These are called CPPs, for Cell Penetrating Peptides: small sequences of fewer than 30 amino acids. Using a random forest (an ensemble of randomised decision trees), a team from MIT (7) was able to predict the activity of CPPs for phosphorodiamidate morpholino oligonucleotides (PMOs). The scope of this model is limited, since many chemical modifications exist to date and PMOs cover only a small fraction of them, but it could be extended to larger chemical families. For example, the model was able to predict whether a CPP would improve the penetration of an oligonucleotide into cells by a factor of three, a prediction that was checked experimentally.

Optimising therapeutic oligonucleotides

Although oligonucleotides are known to be weakly immunogenic (8), they do not escape the toxicity associated with all therapies. “Everything is poison, nothing is poison: it is the dose that makes the poison.” (Paracelsus)

Toxicity is a key parameter for the future of a drug during its development. A Danish group (9) has developed a prediction model capable of estimating the hepatotoxicity of a nucleotide sequence in mouse models. Here again, “only” unmodified oligonucleotides and LNA-modified oligonucleotides were analysed (LNA, for Locked Nucleic Acid, is a chemical modification that stabilises the hybridisation of the therapeutic oligonucleotide to its target). It would be interesting to widen the chemical space studied and thus extend the possibilities of the algorithm. Nevertheless, it is this type of model that will eventually reduce attrition in the development of new drugs. From another perspective (10), a model has been developed to optimise the structure of LNA-modified oligonucleotides used as gapmers. Gapmers are hybrid oligonucleotide sequences with two chemically modified ends, which are resistant to degrading enzymes, and an unmodified central part that allows the target to be cleaved once the gapmer has hybridised to it. It is this final cleavage that generates the desired therapeutic effect. Using their model, the researchers were able to predict the gapmer design with the best pharmacological profile.
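
To give a feel for the design space such a model has to explore, here is a minimal sketch that enumerates gapmer layouts for a given length. The notation (`L` for an LNA-modified position, `d` for an unmodified DNA position), the function names and the `min_gap=6` default are illustrative assumptions of ours, not the representation or constraints used in the cited work.

```python
def gapmer_pattern(length: int, wing_5p: int, wing_3p: int, min_gap: int = 6) -> str:
    """Return a layout string: modified wings ('L') around a DNA gap ('d').

    min_gap is a hypothetical lower bound on the unmodified central part.
    """
    if wing_5p < 1 or wing_3p < 1:
        raise ValueError("both wings need at least one modified position")
    gap = length - wing_5p - wing_3p
    if gap < min_gap:
        raise ValueError("central gap is too short")
    return "L" * wing_5p + "d" * gap + "L" * wing_3p


def enumerate_designs(length: int, max_wing: int = 5, min_gap: int = 6) -> list:
    """List every wing/gap layout allowed for a given oligonucleotide length."""
    designs = []
    for wing_5p in range(1, max_wing + 1):
        for wing_3p in range(1, max_wing + 1):
            if length - wing_5p - wing_3p >= min_gap:
                designs.append(gapmer_pattern(length, wing_5p, wing_3p, min_gap))
    return designs


print(gapmer_pattern(16, 3, 3))
print(len(enumerate_designs(16)), "candidate layouts for a 16-mer")
```

Even this toy enumeration grows quickly with length and with the number of possible modifications, which is why a predictive model is useful for ranking candidates before synthesis.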

Accelerating the discovery of new aptamers

Also known as “chemical antibodies”, aptamers are DNA or RNA sequences capable of recognising and binding to a particular target with an affinity comparable to that of a monoclonal antibody. Excellent reviews on the subject are available here (11) or here (12). In the clinic, pegaptanib is the first aptamer to be approved for use. The compound is indicated for certain forms of age-related macular degeneration (AMD).

Current research methods, based on SELEX (Systematic Evolution of Ligands by Exponential Enrichment), have made it possible to generate aptamers directed against targets of therapeutic and diagnostic interest, such as nucleolin or thrombin. Although the potential of the technology is attractive, discovering new sequence/target pairs is difficult and time-consuming. To boost the search for new candidates, an American team (13) was able to train an algorithm to optimise an aptamer and reduce the size of its sequence, while maintaining or even increasing its affinity for its target. They were able to prove experimentally that the aptamer generated by the algorithm had a higher affinity than the reference candidate, while being 70% shorter. The interest here is to keep the experimental part (the SELEX part) and to combine it with these in silico tools in order to accelerate the optimisation of new candidates.

There is no doubt that the future of oligonucleotides is promising, and their versatility is such that they can be found in completely different fields, ranging from DNA-based nanotechnology to CRISPR/Cas technology. The latter two areas alone could be the subject of individual articles, as their research horizons are so important and exciting.

In our case, we hope that this short article has given you some new ideas and concepts, and inspired you to learn more about oligonucleotides and machine learning.

  1. Bizot F, Vulin A, Goyenvalle A. Current Status of Antisense Oligonucleotide-Based Therapy in Neuromuscular Disorders. Drugs. 2020 Sep;80(14):1397–415.
  2. Roberts TC, Langer R, Wood MJA. Advances in oligonucleotide drug delivery. Nat Rev Drug Discov. 2020 Oct;19(10):673–94.
  3. Shen X, Corey DR. Chemistry, mechanism and clinical status of antisense oligonucleotides and duplex RNAs. Nucleic Acids Res. 2018 Feb 28;46(4):1584–600.
  4. Crooke ST, Liang X-H, Baker BF, Crooke RM. Antisense technology: A review. J Biol Chem [Internet]. 2021 Jan 1 [cited 2021 Jun 28];296. Available from:
  5. Bustin SA, Benes V, Garson JA, Hellemans J, Huggett J, Kubista M, et al. The MIQE Guidelines: Minimum Information for Publication of Quantitative Real-Time PCR Experiments. Clin Chem. 2009 Apr 1;55(4):611–22.
  6. Lopez-Rincon A, Tonda A, Mendoza-Maldonado L, Mulders DGJC, Molenkamp R, Perez-Romero CA, et al. Classification and specific primer design for accurate detection of SARS-CoV-2 using deep learning. Sci Rep. 2021 Jan 13;11(1):947.
  7. Wolfe JM, Fadzen CM, Choo Z-N, Holden RL, Yao M, Hanson GJ, et al. Machine Learning To Predict Cell-Penetrating Peptides for Antisense Delivery. ACS Cent Sci. 2018 Apr 25;4(4):512–20.
  8. Stebbins CC, Petrillo M, Stevenson LF. Immunogenicity for antisense oligonucleotides: a risk-based assessment. Bioanalysis. 2019 Nov 1;11(21):1913–6.
  9. Hagedorn PH, Yakimov V, Ottosen S, Kammler S, Nielsen NF, Høg AM, et al. Hepatotoxic Potential of Therapeutic Oligonucleotides Can Be Predicted from Their Sequence and Modification Pattern. Nucleic Acid Ther. 2013 Oct 1;23(5):302–10.
  10. Papargyri N, Pontoppidan M, Andersen MR, Koch T, Hagedorn PH. Chemical Diversity of Locked Nucleic Acid-Modified Antisense Oligonucleotides Allows Optimization of Pharmaceutical Properties. Mol Ther – Nucleic Acids. 2020 Mar 6;19:706–17.
  11. Zhou J, Rossi J. Aptamers as targeted therapeutics: current potential and challenges. Nat Rev Drug Discov. 2017 Mar;16(3):181–202.
  12. Recent Progress in Aptamer Discoveries and Modifications for Therapeutic Applications | ACS Applied Materials & Interfaces [Internet]. [cited 2021 Jul 25]. Available from:
  13. Bashir A, Yang Q, Wang J, Hoyer S, Chou W, McLean C, et al. Machine learning guided aptamer refinement and discovery. Nat Commun. 2021 Apr 22;12(1):2366.


Health data: an introduction to the synthetic data revolution

Data, sometimes considered the black gold of the 21st century, are the essential fuel for artificial intelligence and are already widely used by the pharmaceutical industry. However, and especially because of the particular sensitivity of health data, their use has several limitations. Will synthetic data be one of the solutions to these problems?

What is synthetic data and why use it?

Synthetic data are data created artificially through the use of generative algorithms, rather than collected from real events. Originally developed in the 1990s to allow work on U.S. Census data, without disclosing respondents’ personal information, synthetic data have since been developed to generate high-quality, large-scale datasets.

These data are generally generated from real data, for example from patient files in the case of health data, and preserve their statistical distribution. Thus, it is theoretically possible to generate virtual patient cohorts that have no real identity but statistically match real cohorts in every respect. Researchers have also succeeded in synthesizing virtual patient records from publicly available demographic and epidemiological data. In this case, we speak of “fully synthetic data”, as opposed to “partially synthetic data”, which are synthetic data produced to replace missing values in real data sets collected in the traditional way.


Currently, and despite various initiatives aiming to democratize their use (such as the Health Data Hub in France, which we will come back to in future articles), many problems still limit the optimal and massive use of patient data, despite their ever-growing volume. Synthetic data are one of the solutions that can be used.

  • Health data privacy:

Naturally, health data are particularly sensitive in terms of confidentiality. The need to preserve patient anonymity leads to a number of problems in terms of accessibility and data processing costs. Many players do not have easy access to these data, and even when they do manage to gain access, processing them involves significant regulatory and cybersecurity costs. Access times are also often extremely long, which slows down research projects. For some databases, it is even a regulatory requirement to hire a third-party company that is accredited to handle these data.

To allow their use, patient data are generally anonymized using methods such as the deletion of identifying variables; their modification by the addition of noise; or the grouping of categorical variables in order to avoid certain categories containing too few individuals. However, the efficiency of these methods has been regularly questioned by studies showing that it was generally possible to trace the identity of patients, by making matches (probabilistic or deterministic) with other databases. Synthetic data generation can, in this context, be used as a safe and easy-to-use alternative.
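
The three classical anonymization techniques mentioned above can be sketched on a single toy record as follows. The field names, the noise level and the ten-year age bands are hypothetical choices of ours, and, as the text notes, this kind of processing does not by itself guarantee that a patient cannot be re-identified.

```python
import random

# Hypothetical names of the directly identifying fields.
DIRECT_IDENTIFIERS = {"name", "patient_id"}


def anonymize(record: dict, rng: random.Random, noise_sd: float = 1.0) -> dict:
    """Apply deletion, noise addition and grouping to one patient record."""
    # 1) Deletion of identifying variables.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # 2) Modification of a numeric variable by adding random noise.
    out["weight_kg"] = round(record["weight_kg"] + rng.gauss(0.0, noise_sd), 1)
    # 3) Grouping: replace the exact age with a ten-year band.
    decade = (record["age"] // 10) * 10
    out["age_band"] = f"{decade}-{decade + 9}"
    del out["age"]
    return out


# Made-up record, for illustration only.
patient = {"name": "Jane Doe", "patient_id": "P001", "age": 47, "weight_kg": 70.0}
print(anonymize(patient, random.Random(0)))
```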

  • Data quality:

The technique of synthetic data generation is commonly used to fill in missing data in real data sets that are impossible or very costly to collect again. These new data are representative of the statistical distribution of variables from the real data set.

  • The volume of health data datasets is too small to be exploited by artificial intelligence:

The training of Machine or Deep Learning models sometimes requires large volumes of data in order to obtain satisfactory predictions: a common rule of thumb is that about 10 times as many examples as the model has degrees of freedom are required. However, when Machine Learning is used in health care, the volume of data is often too small to give good results, for example in rare, poorly documented pathologies, or in sub-populations made up of few individuals. In such cases, synthetic data are part of the data scientist’s toolbox.

The use of synthetic data is an emerging field, and some experts believe it will help overcome some of the current limitations of AI. Among the advantages of synthetic data for AI: it is fast and inexpensive to create as much data as needed, without having to label them by hand as is often the case with real data, and the data can be modified repeatedly to make the model as effective as possible when it processes real data.

The different techniques for generating synthetic data

The generation of synthetic data involves several phases:

  • The preparation of the sample data from which the synthetic data will be generated: in order to obtain a satisfactory result, it is necessary to clean the data and to harmonize them if they come from different sources
  • The actual generation of the synthetic data (some of these techniques are detailed below)
  • The verification and the evaluation of the confidentiality offered by the synthetic data

Figure 1 – Synthetic Data Generation Schema

The methods of data generation are numerous, and the choice depends on the objective and on the type of data to be created: should the data be derived from existing data, and thus follow their statistical distributions? Or should they be fully virtual data that follow rules making them realistic (text, for example)? “Data-driven” methods, which take advantage of existing data, use generative Deep Learning models. “Process-driven” methods, in which mathematical models generate data from underlying physical processes, rely on what is called agent-based modelling.

Operationally, synthetic data are usually created in the Python language – very well known to Data Scientists. Different Python libraries are used, such as: Scikit-Learn, SymPy, Pydbgen and VirtualDataLab. A future Resolving Pharma article will follow up this introduction by presenting how to create synthetic health data using these libraries.
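
As a minimal, standard-library-only sketch of the “data-driven” idea (and deliberately not using the libraries named above or a deep generative model), the following fits a two-variable Gaussian to a toy set of “real” records and samples synthetic records that preserve the means, spreads and correlation. The records are made up and the function names are ours.

```python
import math
import random
import statistics

# Made-up "real" records: (age in years, systolic blood pressure in mmHg).
REAL_RECORDS = [(34, 118), (45, 125), (52, 130), (61, 142),
                (38, 121), (70, 150), (29, 115), (57, 135)]


def fit_bivariate(records):
    """Estimate means, standard deviations and correlation of two variables."""
    xs = [r[0] for r in records]
    ys = [r[1] for r in records]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in records) / (len(records) - 1)
    rho = max(-1.0, min(1.0, cov / (sd_x * sd_y)))  # clamp against rounding
    return mean_x, mean_y, sd_x, sd_y, rho


def sample_synthetic(params, n, rng):
    """Draw n synthetic records from the fitted bivariate Gaussian."""
    mean_x, mean_y, sd_x, sd_y, rho = params
    synthetic = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mean_x + sd_x * z1
        # Mixing z1 and z2 reproduces the correlation rho between x and y.
        y = mean_y + sd_y * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        synthetic.append((x, y))
    return synthetic


params = fit_bivariate(REAL_RECORDS)
synthetic_records = sample_synthetic(params, 1000, random.Random(42))
print(fit_bivariate(synthetic_records))
```

No synthetic record corresponds to a real patient, yet summary statistics computed on the synthetic cohort stay close to those of the original one. Real health data are rarely Gaussian, which is one reason more flexible generative models are used in practice.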

Evaluation of synthetic data

It is common to evaluate anonymized patient data according to two main criteria: the quality of the use that can be made with the data, and the quality of anonymization that has been achieved. It has been shown that the more the data is anonymized, the more limited the use is, since important but identifying features are removed, or precision is lost by grouping classes of values. There is a balance to be found between the two, depending on the destination of the data.

Synthetic data are evaluated according to three main criteria:

  • The fidelity of the data to the base sample
  • The fidelity of the data to the distribution of the general population
  • The level of anonymization allowed by the data

Different methods and metrics exist to evaluate these criteria.

By ensuring that the quality of the data generated is sufficient for its intended use, evaluation is an essential and central element of the synthetic data generation process.
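
As an illustration of what such metrics can look like, here is a minimal sketch of two commonly used ideas: a two-sample Kolmogorov-Smirnov statistic to compare the distribution of one variable in the real and synthetic sets (fidelity), and the distance from each synthetic record to its closest real record (a simple privacy check: distances of zero would mean that real records have been copied). The function names and toy values are ours, and these are examples rather than a complete evaluation framework.

```python
import math
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Largest gap between the two empirical cumulative distributions (0 to 1)."""
    if not real or not synthetic:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(real), sorted(synthetic)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in set(a) | set(b)
    )


def distance_to_closest_record(synthetic, real):
    """For each synthetic record, Euclidean distance to the nearest real one."""
    return [min(math.dist(s, r) for r in real) for s in synthetic]


# Toy values, for illustration only.
real_ages = [34, 45, 52, 61, 38]
synthetic_ages = [36, 44, 50, 63, 40]
print(ks_statistic(real_ages, synthetic_ages))
print(distance_to_closest_record([(36, 120)], [(34, 118), (45, 125)]))
```

In practice the variables would be rescaled before computing distances, so that a variable with large numeric values does not dominate the privacy metric.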

Which use cases for synthetic data in the pharmaceutical industry?

A few months ago, Accenture Life Sciences and Phesi, two companies providing services to pharmaceutical companies, co-authored a report urging them to integrate more techniques involving synthetic data into their activities. The use case mentioned in this report is that of synthetic control arms, which, however, generally use real data from different clinical trials that are statistically reworked.

Outside the pharmaceutical industry, in the world of Health, synthetic data are already used to train visual recognition models in imaging: researchers can artificially add pathologies to images of healthy patients and thus test their algorithms on their ability to detect the pathologies. Based on this use-case, it is also possible to create histological section data that could be used to train AI models in preclinical studies.


There is no doubt that the burgeoning synthetic data industry is well on its way to improving artificial intelligence as we currently know it, and its use in the health industry. This is particularly true when handling sensitive and difficult-to-access data. We can imagine, for example, a world where it is easier and more efficient for manufacturers to create their own synthetic data than to seek access to medical or medico-administrative databases. This technology would then be one of those that change the organization of innovation in the health industries, by giving real data a less central place.


Robots in the lab: will tomorrow’s researchers also be roboticists?

A laboratory often involves very repetitive and laborious tasks that could be automated. Whether it is compiling data in a spreadsheet or preparing a gel for electrophoresis, the ratio of value generated to time spent is rarely high. Derek Lowe, a chemist and author of the In the Pipeline blog (1), humorously recalls a time when a simple chromatography took an enormous amount of time to perform (a time now almost over), and correctly notes that the goal of automation is not to push the researcher out of the lab, but to reduce all those laborious tasks and support the scientist’s intellectual input as much as possible.

In Chemistry and Biology, many groups are trying to imagine the laboratory of the future, one that could carry out the synthesis and testing of a molecule end-to-end. From a technical point of view, the range of actions a robot would need to reproduce the work of a researcher is still far too wide for this to be effective today, but the various projects presented below are promising for the future.


In the field of diagnostics, it is becoming increasingly vital to automate tests in order to meet the demands of patients and clinicians. To give you an idea, in the UK, almost one million PCR tests are performed every day for Covid-19 alone (2).

The use of automated systems is particularly interesting in clinical microbiology, where protocols require a lot of time and attention from microbiologists. The WASP robot, designed by Copan (3), combines robotics and software and is capable of performing culture operations and bacterial isolation, and of monitoring whether growth is proceeding correctly thanks to a small camera installed in the robot. There is also Roche’s Cobas (4), which is capable of performing various molecular biology tests such as qPCR. The versatility of these robots allows them to be easily adapted for other diagnostic purposes.


In a more chemical context, a group at the University of Liverpool led by Prof. Andrew Cooper (5) has designed a robot capable of physically moving around the laboratory in order to optimise hydrogen production by photocatalysis. The advantage of this robot is that it is human-sized and can operate and move freely in any room. Although it took some time to set up such a system, it is estimated that the robot was 1000 times faster than if the work had been done manually.

The artificial intelligence implemented in most robots works by iterations: the results of each experiment are evaluated by the algorithm and allow it to design the next experiment.

Figure 1: The combination of Artificial Intelligence and robotics allows the creation of an iterative circuit, where each cycle analyses the results of the previous one, and adapts the parameters to optimise the process defined by the researcher
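
The iterative circuit described above can be sketched in a few lines. In this minimal example, `run_experiment` is a hypothetical stand-in for the robot (a made-up yield curve peaking at 80 °C), and the "algorithm" is a simple hill climb that halves its step when no neighbouring condition improves the result. The platforms cited in this article use far more sophisticated strategies, such as Bayesian optimisation, over many parameters at once.

```python
def run_experiment(temperature_c: float) -> float:
    """Hypothetical stand-in for the robot: returns a made-up reaction yield."""
    return 100.0 - (temperature_c - 80.0) ** 2 / 10.0


def optimise(start: float, step: float = 8.0, cycles: int = 20):
    """Closed loop: run, evaluate, then choose the next conditions to test."""
    best_x, best_y = start, run_experiment(start)
    history = [(best_x, best_y)]
    for _ in range(cycles):
        # Design the next experiments around the best conditions so far.
        results = [(x, run_experiment(x)) for x in (best_x - step, best_x + step)]
        history.extend(results)
        x, y = max(results, key=lambda r: r[1])
        if y > best_y:
            best_x, best_y = x, y  # keep the improvement
        else:
            step /= 2.0  # no improvement: search closer to the current best
    return best_x, best_y, history


best_temperature, best_yield, history = optimise(start=40.0)
print(best_temperature, best_yield, len(history), "experiments run")
```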

In the field of Materials Chemistry, Alán Aspuru-Guzik et al. (6) have developed an automated and autonomous platform capable of working with a large number of parameters in order to discover new materials, useful for solar panels or electronic consumables. In Organic Chemistry, Coley et al. (7) have used AI and robotics to synthesize small molecules by Flow Chemistry. All the chemist has to do is indicate the molecule they want to obtain, and the AI will work out its own retrosynthesis pathway and try to synthesize the compound. The automaton was able to synthesise 15 small therapeutic molecules, ranging from aspirin to warfarin.

Other initiatives can be noted, particularly from big pharmaceutical companies, such as AstraZeneca and its iLab (8), which aims to automate the discovery of therapeutic molecules via an iterative circuit of Design, Make, Test, Analyse. In Medicinal Chemistry, combinatorial chemistry methods enable the chemical space of a target to be explored very rapidly, thanks to controlled and optimised reactions. These projects bear witness to the progress towards totally autonomous synthesis systems.


It is probably fair to note that some researchers are wary of using robots, and sometimes feel threatened by the prospect of being replaced by a machine. As an apprentice chemist exploring the subject, I have said to myself on several occasions: “Hey, but this robot could work for me!” I remember the weeks I spent trying to optimise a reaction, testing different catalysts one after another, a job that an automaton (or a monkey!) could have done for me much more quickly and certainly more efficiently. Robotics has enormous potential to improve the productivity of researchers and to reduce tedious tasks that ultimately require little or no intellectual effort.

There are also tools that help the researcher to build experimental designs that optimise a given process in the most efficient way. For example, the EDA tool developed by NC3Rs (9) is useful for in vivo research projects, where one tries to obtain statistically powerful data while reducing the number of animals used. Other tools have been developed using Bayesian or Monte Carlo tree search models (10), and lead to optimal experimental designs. In the same vein, Aldeghi et al. have developed Golem (11), an open-source tool available on GitHub (12).

Cloud technologies (i.e. access to a service via the internet) also hold great promise for the laboratory of the future. They will allow researchers to carry out their research entirely from home, thanks to “a few” lines of code. Projects such as Strateos have initiated this practice and already allow researchers to program Chemistry, Biochemistry and Biology experiments remotely. Once the protocol is defined, the researcher simply launches the experiment from their computer, and a robot located thousands of kilometres away carries out the operation on their behalf. In a few years’ time, if the service is more widely adopted by the scientific community, everyone could have easy access to it for a reasonable price.

Figure 2: Cloud Lab principle. 1) The researcher sends his or her research protocol to the automaton, located on the other side of the world. 2) The automaton carries out the experiment designed by the researcher and 3) returns the results to the researcher as soon as the experiment is completed.
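
To give an idea of what “a few lines of code” might look like, here is a purely hypothetical sketch of the exchange in Figure 2: the protocol is described as structured data, serialised to JSON to be sent to the remote platform, and the returned results are parsed back. The field names, step vocabulary and result format are invented for illustration and do not correspond to the actual interface of Strateos or any other provider.

```python
import json


def build_protocol(name: str, steps: list) -> str:
    """Serialise a protocol (hypothetical schema) to JSON for a remote lab."""
    if not steps:
        raise ValueError("a protocol needs at least one step")
    return json.dumps({"protocol": name, "steps": steps}, indent=2)


def parse_results(payload: str) -> dict:
    """Turn a (hypothetical) JSON result payload into {well: absorbance}."""
    data = json.loads(payload)
    return {well["well"]: well["absorbance"] for well in data["wells"]}


# 1) The researcher describes the experiment (invented step vocabulary).
protocol = build_protocol(
    "enzyme_assay_demo",
    [
        {"op": "dispense", "reagent": "buffer", "volume_ul": 90, "wells": ["A1", "A2"]},
        {"op": "dispense", "reagent": "enzyme", "volume_ul": 10, "wells": ["A1", "A2"]},
        {"op": "incubate", "temperature_c": 37, "minutes": 30},
        {"op": "read_absorbance", "wavelength_nm": 405, "wells": ["A1", "A2"]},
    ],
)
print(protocol)

# 3) The platform returns results (made-up payload) that are parsed locally.
example_payload = '{"wells": [{"well": "A1", "absorbance": 0.42}, {"well": "A2", "absorbance": 0.40}]}'
print(parse_results(example_payload))
```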


Between progress and doubts, it is probably only a matter of time before the scientific community adopts a different mentality. There was a time when telephone switchboards were run entirely by people, until the day everything was replaced by automatons. A short documentary by David Hoffman captures this transition and the reaction of users hearing a robotic voice for the very first time. Although some of them were reluctant at first, the implementation of voice recognition has made the service much more efficient and less expensive for consumers. Won’t tomorrow’s researchers all be roboticists to some extent?


  1. Lab! Of! The! Future! | In the Pipeline [Internet]. 2021 [cited 2021 Jun 9]. Available from:
  2. Testing in the UK | Coronavirus in the UK [Internet]. [cited 2021 Jun 13]. Available from:
  3. Copan WASP DT: Walk-Away Specimen Processor [Internet]. [cited 2021 Jun 9]. Available from:
  4. Automation in Molecular Diagnostic Testing [Internet]. Diagnostics. [cited 2021 Jun 13]. Available from:
  5. Burger B, Maffettone PM, Gusev VV, Aitchison CM, Bai Y, Wang X, et al. A mobile robotic chemist. Nature. 2020 Jul;583(7815):237–41.
  6. MacLeod BP, Parlane FGL, Morrissey TD, Häse F, Roch LM, Dettelbach KE, et al. Self-driving laboratory for accelerated discovery of thin-film materials. Sci Adv. 2020 May 1;6(20):eaaz8867.
  7. Coley CW, Thomas DA, Lummiss JAM, Jaworski JN, Breen CP, Schultz V, et al. A robotic platform for flow synthesis of organic compounds informed by AI planning. Science [Internet]. 2019 Aug 9 [cited 2021 Jun 3];365(6453). Available from:
  8. The AstraZeneca iLab [Internet]. [cited 2021 Jun 27]. Available from:
  9. du Sert NP, Bamsey I, Bate ST, Berdoy M, Clark RA, Cuthill IC, et al. The Experimental Design Assistant. Nat Methods. 2017 Nov;14(11):1024–5.
  10. Dieb TM, Tsuda K. Machine Learning-Based Experimental Design in Materials Science. In: Tanaka I, editor. Nanoinformatics [Internet]. Singapore: Springer; 2018 [cited 2021 Jun 6]. p. 65–74. Available from:
  11. Aldeghi M, Häse F, Hickman RJ, Tamblyn I, Aspuru-Guzik A. Golem: An algorithm for robust experiment and process optimization. ArXiv210303716 Phys [Internet]. 2021 Mar 5 [cited 2021 Jun 9]; Available from:
  12. aspuru-guzik-group/golem [Internet]. Aspuru-Guzik group repo; 2021 [cited 2021 Jun 9]. Available from:



Exploratory research Preclinical

Organ-On-Chip: Towards totally miniaturized assays?

Before testing a new molecule in Humans, it is necessary to make toxicological and pharmacokinetic predictions with various preclinical models. Researchers try to reconstruct as best as they can what would happen in a specific tissue or organ. Among the most commonly used techniques are cell cultures, which, although effective, cannot fully simulate the dynamics of an organ or a pathology. There are also in vivo models, which are often more relevant but are not suited to high-throughput data generation. They also raise an ethical issue, since the animals must be sacrificed, and what is observed in animals is not always observed in Humans. Of the compounds that fail in the clinic, it is estimated that 60% fail for lack of efficacy in Humans and 30% because of unexpected toxicity [1]. Clearly, new biological models are needed.


Paradoxically, chemical libraries are growing, but the number of drugs reaching the market is shrinking. The scientific community must therefore continually rethink its models in order to generate reliable information more quickly. This problem is what gave rise to Organs-On-Chips (OOC). In 1995, Michael L. Shuler was the first to propose a prototype cell culture analogue connecting several compartments of different cells [2]. The term “organ-on-a-chip” appeared when these compartments were connected by microchannels.

OOCs are devices the size of a USB flash drive. This is made possible by microchannel technology, which handles volumes on the order of a nanoliter and below. OOCs have three characteristics that allow them to better model a tissue or an organ:

  1. The control of the 3D distribution of cells
  2. The integration of different cell populations
  3. And the possibility of generating and controlling biomechanical forces.

This allows physiological conditions to be reproduced much more faithfully than in a static two-dimensional cell culture on a flat surface. There is no single design for OOCs, but perhaps the easiest example to visualize is the lung OOC mimicking the air-alveolus interface (see Figure 1).

Figure 1 : Illustration of an OOC mimicking the air-lung interface. A semi-permeable membrane separates the external environment from the pulmonary cells. The vacuum chamber makes it possible to mimic the diaphragm.

To date, various OOCs have been designed, from liver models to models of chronic obstructive pulmonary disease. Riahi et al. have developed a liver OOC capable of assessing the chronic toxicity of a molecule by quantifying the evolution of certain biomarkers [3]. Compared to 2D cultures, the OOC lasts longer and generates data that could previously only be observed in vivo. Another interesting model was developed by Zhang et al. and focuses on the heart and its cardiomyocytes [4]. By integrating electrodes on the chip, the researchers were able to assess cell contraction and evaluate the effectiveness and cardiotoxicity of certain drugs. If the technology is successfully adopted, OOCs will first be used as a complement to cellular tests and animal models, and may eventually replace them.

Impressively, the versatility of the concept will allow clinicians to evaluate the response of a patient’s own cells to a specific treatment. By placing, for instance, a tumor sample from a patient in an OOC, it will be possible to observe and optimize the therapeutic response to a given molecule, and to translate these observations to the clinic [5]. This is a first step for OOCs towards personalized medicine.


Eventually, the different OOC models can be combined in order to group several organs and simulate an entire organism. This last idea, also known as “body-on-a-chip”, is extremely powerful and could capture both the effect of a drug and its associated toxicity on the various organs. Some models, such as that of Skardal et al., have made it possible to study the migration of tumour cells from a colon OOC to a liver OOC [6]. Edington et al. were able to connect up to 10 different OOCs, capturing some of their physiological functions. The system consisted of the liver, lungs, intestines, endometrium, brain, heart, pancreas, kidneys, skin and skeletal muscles, and remained functional for four weeks [7]. Even if such systems are not yet optimal, their exploration will enable the generation of much more relevant data, much faster, to boost Drug Discovery projects.

To go further :

Excellent reviews on the subject are available:


  1. Waring, M. J. et al. An analysis of the attrition of drug candidates from four major pharmaceutical companies. Nat. Rev. Drug Discov. 14, 475–486 (2015).
  2. Sweeney, L. M., Shuler, M. L., Babish, J. G. & Ghanem, A. A cell culture analogue of rodent physiology: Application to naphthalene toxicology. Toxicol. In Vitro 9, 307–316 (1995).
  3. Riahi, R. et al. Automated microfluidic platform of bead-based electrochemical immunosensor integrated with bioreactor for continual monitoring of cell secreted biomarkers. Sci. Rep. 6, 24598 (2016).
  4. Zhang, X., Wang, T., Wang, P. & Hu, N. High-Throughput Assessment of Drug Cardiac Safety Using a High-Speed Impedance Detection Technology-Based Heart-on-a-Chip. Micromachines 7, 122 (2016).
  5. Shirure, V. S. et al. Tumor-on-a-chip platform to investigate progression and drug sensitivity in cell lines and patient-derived organoids. Lab. Chip 18, 3687–3702 (2018).
  6. Skardal, A., Devarasetty, M., Forsythe, S., Atala, A. & Soker, S. A Reductionist Metastasis-on-a-Chip Platform for In Vitro Tumor Progression Modeling and Drug Screening. Biotechnol. Bioeng. 113, 2020–2032 (2016).
  7. Edington, C. D. et al. Interconnected Microphysiological Systems for Quantitative Biology and Pharmacology Studies. Sci. Rep. 8, 4530 (2018).


Exploratory research Generalities

Dismantling the scientific publishers’ cartel to free innovation ?

“Text mining of academic papers is close to impossible right now.”

Max Häussler – Bioinformatics researcher, UCSC

Faced with the explosion of published scientific articles and the exponential increase in computing capacities, the way we will read the scientific literature in the future will probably have nothing to do with the tedious, slow, and repetitive current reading work and will undoubtedly involve more and more the use of intelligent text-mining techniques. By increasing tenfold our analytical capacities, these techniques make it possible – and will make it even easier in the future – to unleash creativity and bring about scientific innovation faster and cheaper. For the time being, however, this bright outlook faces a major obstacle: the scientific publishing cartel – one of the world’s most lucrative industries, which is determined to not jeopardize its enormous profits.

Text-mining and its necessity :

Text-mining is a technology that aims to obtain key and previously unknown information very quickly from a very large quantity of text – in this case the biomedical literature. This technology is multi-disciplinary in nature, using machine learning, linguistic and statistical techniques.

The purpose of this article is not to constitute a technical study of text-mining, but it is nevertheless necessary, for the full understanding of the potential of this technology, to describe its main steps :

  • Selection and collection of texts to be analyzed : This first step consists of using search algorithms to automatically download abstracts of interest from scientific article databases (such as PubMed, for example, which alone references 12,000,000 scientific articles). A search of the grey literature can also be conducted to be as exhaustive as possible.
  • Preparation of the texts to be analyzed : The objective of this step is to put the texts to be analyzed in a predictable and analyzable form according to the task to be accomplished. There is a whole set of techniques to carry out this step which will make it possible to remove the “noise” of the text and to “tokenize” the words inside the sentences.
  • Analysis of data from the texts : The analysis of the data will largely depend on the preparation of the text. Different statistical and data science techniques can be used: support vector machines, hidden Markov models or, for example, neural networks.
  • Data visualization : The issue of data visualization is probably more important than one might think. Depending on the chosen option: tables or 3D models, for example, the information and meta-information to which the user of the model has access will be more or less relevant and explanatory.
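The four steps above can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the three abstracts are invented, the stop-word list is tiny, and simple co-occurrence counting stands in for the statistical models mentioned above (support vector machines, hidden Markov models, neural networks). It is not meant as a realistic biomedical pipeline.

```python
import re
from collections import Counter
from itertools import combinations

# Step 1 (collection): a hypothetical mini-corpus standing in for
# abstracts downloaded from a literature database.
ABSTRACTS = [
    "BRCA1 mutations are associated with breast cancer risk.",
    "Loss of BRCA1 function increases breast cancer incidence.",
    "TP53 is frequently mutated in lung cancer.",
]

# A deliberately tiny stop-word list; real pipelines use much larger ones.
STOP_WORDS = {"are", "with", "of", "is", "in", "the", "a", "and"}


def tokenize(text):
    """Step 2 (preparation): lowercase, split into tokens, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def cooccurrences(documents):
    """Step 3 (analysis): count how often two distinct tokens share a document."""
    counts = Counter()
    for doc in documents:
        unique_tokens = sorted(set(tokenize(doc)))
        counts.update(combinations(unique_tokens, 2))
    return counts


def report(counts, top=3):
    """Step 4 (visualization): a plain-text table of the most frequent pairs."""
    return [f"{a:<12} {b:<12} {n}" for (a, b), n in counts.most_common(top)]


if __name__ == "__main__":
    for line in report(cooccurrences(ABSTRACTS)):
        print(line)
```

On this toy corpus, the pair ("brca1", "breast") appears in two abstracts, which is the kind of gene-disease association signal a real system would look for at much larger scale and with far more careful statistics.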

Text-mining has already proven its usefulness in biomedical scientific research: among other things, it has been used to discover associations between proteins and pathologies; to understand interactions between proteins or to elucidate the docking of certain drugs to their therapeutic target. However, most of the time, this technology is only implemented on the abstracts of articles, which considerably reduces its power in terms of reliability of the obtained data as well as the number of its applications.

So why not use the millions of scientific articles available online? New research hypotheses could be formulated and new therapeutic strategies could be created. This is technologically within reach, but scientific publishers seem to have decided otherwise for the moment. Here are some explanations.

The problems posed by scientific publishers :

When they emerged, at the end of the Second World War, scientific publishers had a real utility in the diffusion of science: the various learned societies had only weak means of disseminating the work and conclusions of their members. At that time, published articles were disseminated through paper journals, which were too expensive for most learned societies to produce. Since the birth of this industry, and despite the considerable changes the Internet has brought to the transmission of scientific knowledge, its business model has not evolved. It has become anachronistic and has pushed gross margins to percentages that make online advertising giants like Google or Facebook look like unprofitable businesses. Scientific publishing is indeed the only industry in the world that obtains its raw material (scientific articles) for free from its customers (scientists from all over the world) and whose transformation (peer-reviewing) is also carried out by its customers on a voluntary basis.


The triple-payment system set up by scientific publishers.

Scientific publishers have set up an “odd triple-payment system”, allowing private entities to capture public money intended for research and teaching. The States finance the research leading to the writing of scientific articles, pay the salaries of the scientists who voluntarily participate in the peer-reviewing and finally pay once again, through the subscriptions of universities and research laboratories, to have access to the production of scientific knowledge that they have already financed twice! Another model, parallel to this one, has also been developing for a few years, the author-pays model in which researchers pay publication fees in order to make their work more easily accessible to readers…are we heading towards a quadruple-pay system?

The deleterious consequences of the system put in place by scientific publishers are not only financial: they also affect the quality of the scientific publications produced and therefore the validity of potential artificial intelligence models based on the data in these articles. The business model based on journal subscriptions leads publishers to favor spectacular and deeply innovative discoveries over confirmatory work, which pushes some researchers, driven by the race for the “impact factor”, to commit fraud or to publish statistically unconsolidated results very early on. This is one of the reasons for the reproducibility crisis that science is currently experiencing, and also one of the possible causes of the insufficient publication of negative, yet highly informative, results: one out of every two clinical trials does not result in any publication.

Finally, and this is the point that interests us most in this article, scientific publishers are an obstacle to the development of text-mining on the huge databases of articles they possess, which ultimately has a colossal impact on our knowledge and understanding of the world as well as on the development of new drugs. Indeed, it is currently extremely difficult to perform text-mining on complete scientific articles on a large scale because publishers do not allow it, even when you have a subscription and are legally entitled to read the articles. Several countries have legislated so that research teams implementing text-mining are no longer required to seek permission from scientific publishers. In response to these legal developments, scientific publishers, taking advantage of their oligopolistic situation, have set up completely artificial technological barriers: for example, it has become impossible to download articles rapidly and in an automated way, the maximum rate imposed being generally 1 article every 5 seconds, which means that it would take about 5 years to download all the articles related to biomedical research. The interest of this system for scientific publishers is to be able to hold to ransom – the term is strong, but it is the right one – the big pharmaceutical companies that wish to remove these artificial technical barriers for their research projects.
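The download-rate figure above can be checked with a few lines of arithmetic. The rate (one article every 5 seconds) and the duration (about 5 years) come from the text; the size of the corpus is not stated, so the sketch below back-calculates it rather than assuming an official figure.

```python
SECONDS_PER_ARTICLE = 5                     # rate limit quoted in the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600       # 31,557,600 seconds


def years_to_download(n_articles, seconds_per_article=SECONDS_PER_ARTICLE):
    """Time needed to fetch n_articles sequentially at the imposed rate."""
    return n_articles * seconds_per_article / SECONDS_PER_YEAR


def articles_in(years, seconds_per_article=SECONDS_PER_ARTICLE):
    """Number of articles that can be fetched in the given number of years."""
    return years * SECONDS_PER_YEAR / seconds_per_article


if __name__ == "__main__":
    # Corpus size implied by "about 5 years" at 1 article every 5 seconds.
    print(f"{articles_in(5):,.0f} articles")
```

At this rate, 5 years of uninterrupted downloading corresponds to roughly 31.6 million articles, which gives an order of magnitude for the corpus the text has in mind.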

The current system of scientific publications, as we have shown, benefits only a few companies at the expense of many actors – researchers from all over the world, and even more when they work from disadvantaged countries, governments and taxpayers, health industries and finally, at the end of the chain, patients who do not benefit from the full potential of biomedical research. Under these conditions, many alternatives to this model are emerging, some of which are largely made possible by technology.

Towards the disruption of scientific publishing ?

“You only really destroy what you replace.”

Napoléon III – 1848

Doesn’t every innovation initially come from a form of rebellion? This has been especially true, so far, of the various initiatives undertaken to unleash the potential of free and open science, as these actions have often taken the form of piracy operations. Manifestos, petitions (notably the call for a boycott launched by the mathematician Timothy Gowers, based on the text “The cost of knowledge”), protest movements led by scientists and newly created open-source platforms have been numerous. However, few actions have had as much impact as those of Aaron Swartz, one of the main theorists of open source and open science, who tragically committed suicide at the age of 26, one month before a trial in which he was facing 35 years of imprisonment for having pirated 4.8 million scientific articles, or of course, those of Alexandra Elbakyan, the famous founder of the Sci-Hub website, which allows free – and illegal – access to most of the scientific literature.

Aaron Swartz and Alexandra Elbakyan

More recently, the proponents of the open-source movement have adapted to the radical turn of text-mining, notably through Carl Malamud’s project, which takes advantage of a legal grey area to offer academic research teams the possibility of mining the gigantic database of 73 million articles he has built. The solution is interesting but incomplete: for legal reasons, this database is not yet accessible from the Internet, and it is necessary to travel to India, where it is hosted, to access it.

These initiatives operate on more or less legal forms of capturing articles after their publication by scientific publishers. In the perspective of a more sustainable alternative, the ideal would be to go up the value chain and therefore work upstream with researchers. The advent of the blockchain technology – a technology for storing and exchanging information with the particularity of being decentralized, transparent and therefore highly secure, on which future articles of Resolving Pharma will come back in detail – is thus for many researchers and thinkers of the subject a great opportunity to definitively replace scientific publishers in a system inducing more justice and allowing the liberation of scientific information.

The transformation of the system will probably be slow – the prestige accorded by researchers to the names of large scientific journals belonging to the oligopoly will persist over time – perhaps it will not even happen, but the Blockchain has, if successfully implemented, the capacity to address the issues posed earlier in this article in a number of ways :

A fairer financial distribution

As we have seen, the business model of scientific publishers is not very virtuous, to put it mildly. At the other end of the spectrum, Open Access, despite its undeniable and promising qualities, can also pose certain problems, as it sometimes lacks peer-reviewing. The use of a dedicated cryptocurrency for the scientific publishing world could eliminate the triple-payment system, as each actor could be paid the fair value of their contribution. A researcher’s institution would receive a certain amount of cryptocurrency when he or she publishes, as well as when he or she participates in peer-reviewing another paper. Institutions, in turn, would pay an amount of cryptocurrency to access publications. Apart from the financial aspects, the copyright, which researchers currently waive, would be automatically registered in the blockchain for each publication. Research institutions would thus retain the right to decide at what price the fruits of their labor are made available. A system of this kind would allow, for example, anyone wishing to use a text-mining tool to pay a certain amount of this cryptocurrency, which would go to the authors and reviewers of the articles used. Large-scale text-mining would then become a commodity.
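To make the payment flow described above more concrete, here is a minimal sketch of how a text-mining fee could be split between the institutions of the authors and of the reviewers. It is an in-memory toy model, not a blockchain or a real smart contract: the `Ledger` class, the `pay_for_text_mining` function, the institution names and the 70/30 split are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from fractions import Fraction


class Ledger:
    """Toy in-memory token ledger (no consensus, no cryptography)."""

    def __init__(self):
        self.balances = defaultdict(Fraction)

    def mint(self, account, amount):
        """Credit an account with newly issued tokens."""
        self.balances[account] += Fraction(amount)

    def transfer(self, sender, recipient, amount):
        """Move tokens between accounts, refusing overdrafts."""
        amount = Fraction(amount)
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.balances[sender] < amount:
            raise ValueError("insufficient funds")
        self.balances[sender] -= amount
        self.balances[recipient] += amount


def pay_for_text_mining(ledger, payer, article, fee, author_share=Fraction(7, 10)):
    """Split one access fee between author and reviewer institutions.

    The 70/30 default split is an arbitrary assumption for this sketch.
    """
    authors, reviewers = article["authors"], article["reviewers"]
    if not authors or not reviewers:
        raise ValueError("article needs at least one author and one reviewer")
    fee = Fraction(fee)
    if ledger.balances[payer] < fee:
        raise ValueError("insufficient funds")
    author_pool = fee * author_share
    reviewer_pool = fee - author_pool
    for institution in authors:
        ledger.transfer(payer, institution, author_pool / len(authors))
    for institution in reviewers:
        ledger.transfer(payer, institution, reviewer_pool / len(reviewers))


if __name__ == "__main__":
    ledger = Ledger()
    ledger.mint("pharma_lab", 100)
    article = {"authors": ["univ_a", "univ_b"], "reviewers": ["univ_c"]}
    pay_for_text_mining(ledger, "pharma_lab", article, fee=10)
    print({name: float(balance) for name, balance in ledger.balances.items()})
```

Using exact fractions avoids rounding drift when a fee is divided among several recipients; an actual token system would have to define its own smallest unit and rounding rule.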

Tracking reader usage and defining a real « impact factor »

Currently, even if we try to count the number of citations to articles, the use of scientific articles is difficult to quantify, although it could be an interesting metric for the different actors of the research ecosystem. The Blockchain would make it possible to trace each transaction precisely. This tracing of readers would also bring a certain form of financial justice: one can imagine that, through a Smart Contract, a simple reading would not cost the same amount of cryptocurrency as a citation of the article. It would thus be possible to quantify the real impact of a publication and replace the “impact factor” system with the real-time distribution of “reputation tokens” to scientists, which could also be designed so as not to discourage the publication of negative results (to alleviate this problem, researchers have moreover set up a platform dedicated to the publication of negative results).
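The idea of weighting a reading differently from a citation can be sketched as a simple tally. The weights below (1 token per reading, 10 per citation) and the event format are assumptions chosen for illustration; the text does not specify any values.

```python
from collections import Counter

# Hypothetical weights: the text only says a reading and a citation
# would not be worth the same amount.
EVENT_WEIGHTS = {"read": 1, "citation": 10}


def reputation(events):
    """Sum reputation tokens per article from (article_id, event_kind) pairs."""
    scores = Counter()
    for article_id, kind in events:
        scores[article_id] += EVENT_WEIGHTS[kind]
    return scores


if __name__ == "__main__":
    log = [
        ("article_a", "read"),
        ("article_a", "citation"),
        ("article_b", "read"),
        ("article_a", "read"),
    ]
    print(reputation(log))
```

Because every event is recorded individually, the score can be recomputed at any time, or under a different weighting, for instance one that rewards negative results.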

With the recent development of Non-Fungible Tokens (NFT), we can even imagine tomorrow the emergence of a secondary market for scientific articles, which will be exchanged from user to user, as is already possible for other digital objects (video game elements, music tracks, etc.).

A way to limit fraud

Currently, the peer-reviewing system, in addition to being particularly long (it takes on average 12 months between the submission and the publication of a scientific article, compared to two weeks on a Blockchain-based platform such as ScienceMatters), is completely opaque to the final reader of the article, who has no access to the names of the researchers who took part in the process, nor even to the successive versions of the article. Through its unforgeable and chronological structure, the Blockchain could record these different modifications. This is a topic that would deserve an article of its own, but the Blockchain could also record the data and metadata that led to the conclusions of the article, whether from preclinical or clinical trials for example, and thus help prevent fraud while increasing reproducibility.
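The “unforgeable and chronological structure” mentioned above rests on chaining hashes: each record stores the hash of the previous one, so altering an earlier version invalidates everything after it. The sketch below shows only this mechanism, using Python’s standard `hashlib`; it is not a full blockchain (there is no network or consensus), and the field names are assumptions made for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash preceding the first revision


def _digest(record):
    """SHA-256 of a record serialized with sorted keys for determinism."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_revision(chain, author, text, timestamp):
    """Append a manuscript revision linked to the previous one by its hash."""
    record = {
        "index": len(chain),
        "author": author,
        "text": text,
        "timestamp": timestamp,
        "previous_hash": chain[-1]["hash"] if chain else GENESIS_HASH,
    }
    record["hash"] = _digest(record)  # computed before the "hash" key exists
    chain.append(record)
    return record


def is_valid(chain):
    """Check that no revision was altered and that the links are intact."""
    previous_hash = GENESIS_HASH
    for record in chain:
        body = {key: value for key, value in record.items() if key != "hash"}
        if record["previous_hash"] != previous_hash:
            return False
        if _digest(body) != record["hash"]:
            return False
        previous_hash = record["hash"]
    return True


if __name__ == "__main__":
    history = []
    append_revision(history, "author", "Initial submission", "2021-01-10")
    append_revision(history, "reviewer_1", "Requested extra controls", "2021-01-17")
    print(is_valid(history))       # chain is intact
    history[0]["text"] = "Edited after the fact"
    print(is_valid(history))       # tampering is detected
```

A reader given such a history could verify the order and content of each iteration without having to trust whoever hosts it, provided the latest hash is anchored somewhere that cannot be quietly rewritten.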

Manuel Martin, one of the co-founders of Orvium, a Blockchain-based scientific publishing platform, believes: “by establishing a decentralized and competitive marketplace, blockchain can help align the goals and incentives of researchers, funding agencies, academic institutions, publishers, companies and governments.”

The use of the potential of artificial intelligence in the exploitation of scientific articles is an opportunity to create a real collective intelligence, to make faster and more efficient research happen and probably to cure many diseases around the world. The lock that remains to be broken is not technological but organizational. Eliminating scientific publishers from the equation will be a fight as bitter as it is necessary, which should bring together researchers, governments and big pharmaceutical companies, whose interests are aligned. If we can be relatively pessimistic about the cooperation capacities of these different actors, we cannot doubt the fantastic power of transparency of the Blockchain which, combined with the determination of some entrepreneurs like the founders of Pluto, Scienceroot, ScienceMatters or Orvium platforms, will be a decisive tool in this fight to revolutionize the access to scientific knowledge.

The words and opinions expressed in this article are those of the author. The other authors involved in Resolving Pharma are not associated with it.

To go further :
Stephen Buranyi ; Is the staggeringly profitable business of scientific publishing bad for science? ; The Guardian ; 27/06/2017;
The Cost of Knowledge :
Priyanka Pulla ; The plan to mine the world’s research papers ; Nature ; Volume 571 ; 18/07/2019 ; 316-318
Bérénice Magistretti ; Disrupting the world of science publishing ; TechCrunch ; 27/11/2016
Daniele Fanelli ; Opinion : Is science really facing a reproducibility crisis, and do we need it to ? ; PNAS March 13, 2018 115 (11) 2628-2631; first published March 12, 2018;
D.A. Eisner ; Reproducibility of science: Fraud, impact factors and carelessness ; Journal of Molecular and Cellular Cardiology, Volume 114, January 2018, Pages 364-368
Chris Hartgerink ; Elsevier stopped me doing my research ; 16 November 2015 ;
Joris van Rossum, The blockchain and its potential and academic publishing, Information Services & Use 38 (2018) 95-98 ; IOS Press
Douglas Heaven, Bitcoin for biological literature, Nature, 7/02/2019/ Volume 566
Manuel Martin ; Reinvent scientific publishing with blockchain technology ;
Sylvie Benzoni-Gavage ; The Conversation ; Comment les scientifiques s’organisent pour s’affranchir des aspects commerciaux des revues ;
