Striking a Balance Between Public Health and Commercial Interests
Exploring how intellectual property law should govern bioinformatics and genetic data in a way that balances innovation incentives with public health imperatives.
Advances in genomics and bioinformatics have transformed both scientific research and healthcare, generating vast troves of genetic data and novel computational tools. The central problem addressed in this dissertation is how intellectual property (IP) law should regulate bioinformatics and genetic data in a manner that incentivizes innovation and commercial investment while safeguarding public health imperatives.
The tension between public health (which demands broad access to genetic information for research, diagnosis, and treatment) and commercial interests (which seek exclusivity and returns on investment through IP rights) is at the heart of this inquiry.
High-profile controversies highlight this tension. For example, patents on the BRCA1/2 breast cancer genes enabled a single company (Myriad Genetics) to monopolize diagnostic testing at a cost of roughly $3,000 per test, raising barriers for patients and researchers until the patents were invalidated.
The transformative impact of computational biology was highlighted when Demis Hassabis and John Jumper of Google DeepMind shared the 2024 Nobel Prize in Chemistry for AlphaFold, their AI-driven protein structure predictor. AlphaFold has predicted structures for over 200 million proteins, its database is openly available through a partnership with EMBL-EBI, and it has become a standard tool with 30,000+ citations. Used by 2M+ researchers worldwide, AlphaFold demonstrates the importance of open science, balancing accessibility with biosecurity considerations.
At its core, bioinformatics can be viewed as comprising three primary components:
DNA, RNA, and protein sequences that encode biological information and functions.
Systems that store and organize biological sequences and related information.
Algorithms and applications used to access, analyze, and interpret genetic data.
Modern IP regimes struggle to properly regulate bioinformatics and genetic data. Patents, copyrights, database rights, and trade secrets each offer partial solutions but also create frictions. Overly broad IP protection of genetic data can stifle research collaboration and restrict access to health-related knowledge. Conversely, too little protection could dissuade private investment in genetic innovation and data curation, slowing down breakthroughs.
How can IP law harmonize commercial interests and public health in the fields of bioinformatics and genetic data? In essence: how can we protect and incentivize valuable bioinformatics innovations while ensuring that critical genetic information remains accessible for the public good?
Companies may keep genomic data or algorithms secret to maintain a competitive edge. Proprietary databases protected by confidentiality and click-wrap user agreements are common in genomics. For example, Myriad Genetics maintained a private BRCA mutation database after losing patent protection.
Global IP treaties set baseline rules. The WTO's TRIPS Agreement mandates patent protection in all fields of technology (including biotechnology), but allows some exclusions (e.g., plants, natural discoveries).
Other instruments like the 1992 Convention on Biological Diversity emphasize fair sharing of benefits from genetic resources, including sharing research results, royalties or joint IP rights when utilizing genetic material.
Normative declarations (e.g., UNESCO's 1997 Universal Declaration on the Human Genome and Human Rights) assert that "the human genome in its natural state shall not give rise to financial gains".
Exploring the foundational legal framework including major cases, statutes, and policy principles.
The intersection of IP law and genetic data is characterized by evolving legal doctrines and high-profile disputes that frame the conceptual approach of this research. This section outlines the foundational legal framework including major cases, statutes, and policy principles.
In Diamond v. Chakrabarty (1980), the U.S. Supreme Court allowed a patent on a human-made, oil-degrading bacterium, emphasizing that Congress intended patentable subject matter to "include anything under the sun that is made by man."
The TRIPS Agreement (1994) set the baseline by requiring WTO member countries to make patents available "for any inventions, in all fields of technology" (which includes biotechnological inventions), "provided the inventions are new, involve an inventive step, and are capable of industrial application".
The EU Biotechnology Directive 98/44/EC (1998) explicitly confirms that biological material which is "isolated or produced by means of a technical process", including gene sequences, "may constitute a patentable invention, even if the structure of that element is identical to that of a natural element."
In Association for Molecular Pathology v. Myriad Genetics, Inc. (2013), the U.S. Supreme Court unanimously decided that "naturally occurring DNA sequences, even when isolated from the human body, are unpatentable because they are products of nature." However, complementary DNA (cDNA) remained patentable.
In D'Arcy v Myriad Genetics Inc (2015), the High Court of Australia followed the U.S. lead and held that isolated human DNA is not a patentable invention under Australian law, despite prior practice to the contrary.
The United States went through a significant shift on gene patents during the late 20th and early 21st centuries. For a time, the U.S. Patent and Trademark Office granted patents on isolated DNA sequences, viewing the act of isolation as human "making." By around 2005, an estimated 20% of human genes had been claimed in U.S. patents.
The watershed moment came when the Supreme Court unanimously decided in Association for Molecular Pathology v. Myriad Genetics, Inc. (2013) that "naturally occurring DNA sequences, even when isolated from the human body, are unpatentable because they are products of nature." However, the Court drew a line between two kinds of DNA: isolated genomic DNA, which is not patent eligible, and complementary DNA (cDNA), which remains patent eligible because it is a lab-created construct that does not occur naturally.
In the European Union, the legal framework was cemented by the Biotechnology Directive 98/44/EC. This Directive explicitly confirms that biological material which is "isolated or produced by means of a technical process", including gene sequences, "may constitute a patentable invention, even if the structure of that element is identical to that of a natural element."
Under EU law, an isolated human gene is patentable if the patentability criteria are met and the industrial application (utility) of that gene is disclosed in the application. The European Patent Office (EPO) applies this approach in examining applications.
The Directive and EPC also maintain that mere discoveries are not patentable: one cannot patent the gene as it exists in the human body, only the isolated form with a specified use. Additionally, ethical limits exist: the Directive bars patents on processes for cloning human beings, processes for modifying the human germ line, and uses of human embryos for commercial purposes (Article 6, the morality clause).
India has taken a more restrictive stance consistent with protecting public health and indigenous resources. The Indian Patents Act 1970, as amended, explicitly excludes "the mere discovery of any living thing or non-living substance occurring in nature" from patentable subject matter (Section 3(c)). This means that a "naturally occurring gene" or protein isn't patentable in India, even if isolated.
Additionally, Section 3(j) bars patents on plants and animals (including parts thereof such as seeds, varieties, and species) other than microorganisms. The combined effect is that naturally existing gene sequences are unpatentable in India, aligning with the principle that discoveries of existing biological material lack the inventive element.
Only if a gene is significantly modified or used in a novel way (for instance, as part of a recombinant DNA construct, or a new synthetic sequence with substantial human intervention and a specific industrial application) can it potentially be patentable.
Examining how copyright and data licensing frameworks complement or conflict with patent protection in bioinformatics.
While patent law has been the flashpoint for gene-related IP disputes, copyright and data rights form another piece of the framework. Copyright's role in protecting bioinformatics arises notably in software and databases.
Bioinformatics algorithms and software tools (for example, those used for DNA sequence alignment or for identifying gene variants) are typically protected by copyright as computer programs. The authors (or their employers) hold rights that can prevent unauthorized copying or distribution of the software.
However, because copyright does not block independent creation and doesn't protect underlying ideas (such as the mathematical methods in the algorithm), its power in creating monopolies is limited compared to patents.
Large aggregations of genetic data often come with terms of use or data-sharing policies that function as a form of "private ordering" of IP rights.
For example, international genomic consortia or public databases sometimes require users to agree not to appropriate the data for commercial patents or to respect certain openness principles.
The National Institutes of Health (NIH) encourages broad sharing of genomic data and explicitly "discourages the use of patents to block access to genomic or genotype-phenotype data".
In practice, this means research projects funded by NIH often have to release genomic datasets into open repositories (like GenBank or dbGaP) and cannot assert IP that would prevent others from using that data for further research.
This policy embodies the idea that genomic data, especially when generated with public funds, should be treated as a public resource to advance science, rather than a proprietary asset.
The Bermuda Principles mandated rapid public release of Human Genome Project data, treating sequence data as a common public good to accelerate discovery.
During the COVID-19 pandemic, scientists rapidly published the SARS-CoV-2 virus genome openly. Researchers deliberately chose not to patent the viral genome sequence, instead posting it to public databases, which enabled labs worldwide to immediately start building tests and vaccines.
This instance powerfully illustrates how open access to genetic information can directly serve public health.
The openness ideal sometimes clashes with commercial realities. Private entities that invest heavily in sequencing or data aggregation may seek returns via exclusivity.
These strategies effectively create a fenced garden around data, even if formal IP law (like copyright) might not directly apply to the raw data.
The balance between openness and protection in bioinformatics often comes down to policy choices: institutions like NIH push towards the public health side (open data for collective benefit), whereas private firms lean towards proprietary models to recoup investment. Law and policy must mediate these approaches.
Exploring how legal frameworks function in practice through landmark cases and comparative regulatory approaches.
As we transition from examining the theoretical legal frameworks governing intellectual property in bioinformatics, this section explores how these frameworks function in practice through landmark cases and comparative regulatory approaches. The analysis reveals both the practical consequences of existing IP regimes and potential alternative models that might better balance innovation incentives with public health imperatives.
The dispute centered on Myriad's patents covering the BRCA1 and BRCA2 genes, mutations in which significantly increase risk for breast and ovarian cancer. Myriad's monopoly enabled them to charge approximately $3,000-$4,000 per test and prevent other laboratories from offering alternative testing services.
In a unanimous decision, the Supreme Court distinguished between naturally occurring genetic sequences and human-created modifications, holding that "a naturally occurring DNA segment is a product of nature and not patent eligible merely because it has been isolated." However, the Court upheld the patentability of complementary DNA (cDNA) as this lab-created construct does not occur naturally.
Within days of the decision, many laboratories announced competing BRCA testing services at significantly lower prices. Analysis by Zhuo Chen et al. documented that the average price for BRCA testing fell substantially following the ruling as competition increased.
For Myriad, the decision necessitated a strategic pivot toward proprietary databases and algorithmic tools for variant interpretation, demonstrating how exclusivity strategies adapt to changing legal frameworks.
Conley, Cook-Deegan, and Lázaro-Muñoz analyzed how Myriad shifted toward maintaining their variant database as a trade secret after losing patent protection, which they term "the proprietary data dilemma." This shift from transparent patent exclusivity to opaque trade secrecy has potential long-term implications for scientific collaboration.
Contrary to industry fears of decreased investment, empirical assessments indicate that patent applications for molecular diagnostics shifted toward method claims and modified approaches rather than declining overall. The decision recalibrated the boundary between unpatentable discoveries and patentable inventions in biotechnology.
The Human Genome Project established a precedent for open-access approaches to genetic information through the Bermuda Principles, which mandated rapid public release of sequence data. This collaborative model has evolved through contemporary initiatives such as the Global Alliance for Genomics and Health (GA4GH), which develops frameworks for responsible genomic data sharing that balance openness with privacy concerns.
The GA4GH's Data Use Ontology and Machine Readable Consent frameworks are significant technical innovations that help balance open science with appropriate controls on sensitive genetic information. These frameworks create standardized, computer-interpretable consent categories that enable automated compliance with ethical and legal requirements, potentially reducing transaction costs for data sharing while preserving necessary restrictions.
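To illustrate what computer-interpretable consent categories might look like in practice, here is a minimal Python sketch. It is not the GA4GH specification: the category names (`general_research`, `health_medical`, `disease_specific`), the `Dataset` and `AccessRequest` types, and the matching rule in `is_permitted` are simplified assumptions introduced here for illustration. The actual Data Use Ontology defines its own term identifiers and semantics.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, simplified consent categories (NOT real DUO term identifiers).
GENERAL_RESEARCH = "general_research"
HEALTH_MEDICAL = "health_medical"
DISEASE_SPECIFIC = "disease_specific"

# Assumed rule: which request purposes each consent category covers.
COVERS = {
    GENERAL_RESEARCH: {GENERAL_RESEARCH, HEALTH_MEDICAL, DISEASE_SPECIFIC},
    HEALTH_MEDICAL: {HEALTH_MEDICAL, DISEASE_SPECIFIC},
    DISEASE_SPECIFIC: {DISEASE_SPECIFIC},
}


@dataclass(frozen=True)
class Dataset:
    """A dataset tagged with the consent its participants gave."""
    name: str
    consent: str
    disease: Optional[str] = None      # only meaningful for disease-specific consent
    non_commercial_only: bool = False  # consent modifier


@dataclass(frozen=True)
class AccessRequest:
    """A researcher's declared purpose for using a dataset."""
    purpose: str
    disease: Optional[str] = None
    commercial: bool = False


def is_permitted(dataset: Dataset, request: AccessRequest) -> bool:
    """Return True if the request falls within the dataset's consent terms."""
    # The declared purpose must be covered by the consent category.
    if request.purpose not in COVERS[dataset.consent]:
        return False
    # Disease-specific consent only covers research on that same disease.
    if dataset.consent == DISEASE_SPECIFIC and dataset.disease != request.disease:
        return False
    # A non-commercial modifier blocks commercial requesters.
    if dataset.non_commercial_only and request.commercial:
        return False
    return True


# Example: a cohort consented only for non-commercial breast cancer research.
cohort = Dataset("brca_cohort", DISEASE_SPECIFIC, disease="breast cancer",
                 non_commercial_only=True)
academic = AccessRequest(DISEASE_SPECIFIC, disease="breast cancer")
company = AccessRequest(DISEASE_SPECIFIC, disease="breast cancer", commercial=True)

print(is_permitted(cohort, academic))
print(is_permitted(cohort, company))
```

Because both the consent terms and the request are structured data, a repository could in principle evaluate access requests automatically instead of negotiating each one by hand, which is the transaction-cost saving described above.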
The COVID-19 pandemic demonstrated the effectiveness of rapid and open genomic data sharing. The GISAID platform facilitated the sharing of SARS-CoV-2 genome sequences, enabling global surveillance of viral evolution while maintaining contributors' ability to pursue related innovations. As documented by Rourke et al., this model succeeded in promoting both public health responses and commercial vaccine development by creating a protected commons where sequences were accessible but contributors retained recognition rights.
The Open COVID Pledge represents another innovative approach, with companies temporarily waiving enforcement of patents related to COVID-19 technologies to enable collaborative pandemic response. This emergency measure demonstrated the feasibility of coordinated IP sharing in public health contexts and has sparked discussion of similar permanent mechanisms for genetic diagnostics.
The BioBricks Foundation's development of the BioBrick Public Agreement created a standardized legal framework for sharing synthetic biology components while preventing downstream patent thickets. Similarly, the Structural Genomics Consortium operates on a completely open-access model for protein structures, requiring all participants to keep or place their findings in the public domain, a model that has successfully attracted both academic and pharmaceutical industry participation.
Patent pools offer another promising approach for addressing IP fragmentation in bioinformatics. The Medicines Patent Pool has expanded to include technologies beyond HIV treatments, centralizing licensing to enhance access while maintaining innovation incentives through structured royalty arrangements. These initiatives point toward possible frameworks that move beyond the binary choice between maximum exclusivity and total openness.
| Jurisdiction | Approach to Gene Patents | Key Features | Guiding Principles |
|---|---|---|---|
| United States | Post-Myriad approach: naturally occurring DNA unpatentable, cDNA patentable | | Distinction between unpatentable discoveries and patentable applications |
| United States | Method claims based on natural correlations can be patent eligible | | Distinguishing between unpatentable diagnostic correlations and patentable treatment methods that apply those correlations |
| European Union | Permits patenting of isolated biological material when industrial application is shown | | Balance between commercial certainty and ethical limitations |
| India | Restrictive approach: excludes naturally occurring substances from patentable subject matter | | Prioritization of public health access over incentivizing private innovation |
These comparative approaches reveal fundamentally different prioritizations of interests. The post-Myriad U.S. model prioritizes research freedom for basic genetic information while protecting engineered constructs. The European approach provides more certainty for commercial entities through broader patentability but introduces counterbalancing through data sharing mandates. India's system most explicitly prioritizes public access over exclusivity through expanded subject matter exclusions. No single jurisdiction has developed a comprehensive solution that satisfactorily balances all competing interests.
Exploring the ethical frameworks that should inform IP policy for bioinformatics and genetic data.
Genetic information occupies an ethically distinctive position that complicates traditional property frameworks. Unlike conventional inventions, genetic sequences simultaneously constitute:
This multidimensional character raises fundamental questions about the appropriateness of exclusive ownership models.
"The human genome in its natural state shall not give rise to financial gains."
(UNESCO Universal Declaration on the Human Genome and Human Rights)
Contemporary bioethicists increasingly conceptualize genetic information within a "stewardship" rather than "ownership" framework, where those who develop genetic knowledge hold responsibilities as caretakers rather than absolute owners.
The distribution of the benefits of genetic innovation raises important concerns of justice. As seen in the Myriad case, IP rights can create barriers to diagnostic access with direct health consequences. This dynamic extends globally, raising ethical questions about how IP regimes influence health inequities between wealthy and resource-limited settings.
Only 10% of global health research addresses conditions affecting 90% of the world's population. This extends to genomics research, where over 80% of participants in large-scale genomic studies are of European ancestry, creating significant blind spots in understanding genetic factors in diverse populations.
Recent initiatives such as the H3Africa (Human Heredity and Health in Africa) Consortium have established research networks generating African genomic data while implementing novel IP frameworks that balance local benefit-sharing with global research accessibility.
The 2001 Doha Declaration on TRIPS and Public Health, reinforced by subsequent WTO decisions, affirmed that countries maintain the right to issue compulsory licenses to protect public health, explicitly recognizing that IP rights should not trump access to essential medicines or diagnostic tools.
For many indigenous communities, genetic resources and associated knowledge are viewed as collective and intergenerational assets rather than potential objects of individual ownership.
Genetic resources and traditional knowledge are sometimes appropriated without permission or benefit-sharing, representing an ongoing ethical challenge. There are fundamental conceptual differences between Western intellectual property frameworks and indigenous understandings of knowledge as communal and intergenerational.
The Nagoya Protocol on Access and Benefit-sharing established principles requiring prior informed consent and mutually agreed terms for accessing genetic resources. However, implementation remains inconsistent with significant gaps between theoretical rights and practical protections.
Bioinformatics developments further complicate traditional knowledge protection by enabling digital sequence information to be separated from physical biological materials, potentially circumventing access and benefit-sharing requirements.
The CARE Principles for Indigenous Data Governance (Collective Benefit, Authority to Control, Responsibility, Ethics) provide standards that specifically address these concerns while emphasizing indigenous peoples' rights to control how their genetic information is used and commercialized.
The convergence of artificial intelligence with genomics introduces novel ethical challenges for IP frameworks. Machine learning algorithms trained on genetic datasets can now generate insights that may qualify as inventions, raising questions about appropriate attribution and ownership.
DeepMind's AlphaFold system, which used AI to predict the three-dimensional structures of nearly all known human proteins, exemplifies these boundary-blurring innovations. The black box nature of many AI systems compounds these challenges, undermining the social bargain at patent law's core: exclusive rights in exchange for teaching society how the invention works.
The U.S. Patent and Trademark Office's 2024 guidance on AI-assisted inventions confirms that AI systems cannot qualify as inventors under current law, but allows patenting of AI-assisted innovations where humans have made a significant contribution. This policy clarification addresses immediate concerns but leaves open deeper questions about the fundamental purpose of patent incentives in an era where machines increasingly contribute to the innovative process.
Advances in gene editing technologies like CRISPR and synthetic biology raise additional ethical questions about appropriate IP boundaries. While earlier genetic technologies focused on reading or analyzing existing genetic code, today's technologies enable writing and modifying genetic sequences, blurring distinctions between discovery and invention.
The CRISPR patent dispute between the Broad Institute and University of California highlighted how complex overlapping patent claims in foundational technologies can restrict downstream innovation and access. However, the resolution of this dispute through a licensing arrangement demonstrates one potential approach that balances exclusivity and access. The agreement established a joint licensing platform that allows non-commercial research use while preserving commercial licensing rights for the patent holders.
Drawing insights from diverse ethical considerations, several key principles emerge that can guide more balanced IP policy for bioinformatics and genetic data:
Exclusive rights should be proportionate to the actual innovative contribution rather than extending to discoveries of natural phenomena. This principle follows from the Myriad approach of distinguishing between unpatentable natural sequences and potentially patentable engineered constructs.
This principle acknowledges legitimate commercial interests while ensuring that essential health applications remain accessible. This could involve differential protection levels based on application context: stronger exclusivity for non-essential applications, weaker or time-limited rights for critical diagnostics.
This principle recognizes contributions from research participants, indigenous communities, and public funding. It supports mandatory contribution to data commons, reach-through benefits to communities, and recognition of collective interests in genetic resources.
Actual scientific progress requires not just access but also understanding. This supports robust measures such as disclosure requirements, limitations on black box proprietary algorithms in healthcare contexts, and research exemptions.
This principle acknowledges that genetic heritage is not limited to current stakeholders. It supports precautionary approaches to permanent modifications and maintenance of genetic diversity as common heritage.
Synthesizing insights and providing recommendations for future IP policy in bioinformatics and genetic data.
The preceding sections traced the uneasy co-evolution of intellectual-property doctrines and the fast-moving sciences of genomics, bioinformatics, and synthetic biology. We mapped the current doctrinal terrain: patents, at once an engine of investment and a brake on downstream experimentation; database rights and copyright, which protect the expensive curation of datasets but can hard-fence facts; and trade-secret or contractual strategies that flourish whenever formal IP protection recedes.
The Myriad case illustrated how aggressive exclusivity can chill diagnostic access yet also how firms pivot to proprietary data once sequence claims fall away. European law, by contrast, still extends patents to isolated sequences while tempering that breadth with disclosure and morality clauses, and India's Section 3(c) exclusion underscores a public-health-first model.
What is needed is layered governance in which:
These insights converge on a core proposition: a future-proof IP regime for bioinformatics is necessarily plural, conditional, and integrated with adjacent regulatory fields (privacy, insolvency, conservation).
The next decade will confront lawmakers with questions that did not exist when Myriad was decided. Two recent flash-points, de-extinction inventions (e.g., the claimed resurrection of dire wolves) and the insolvency of a major consumer-genomics platform (23andMe), expose blind spots in today's patchwork. Building on the dissertation's findings, the following recommendations aim to anticipate such shocks while preserving incentives for genuine innovation.
23andMe's Chapter 11 petition has left 15 million customers unsure whether their permanent genetic identifiers could be sold to the highest bidder.
Problem: IP law treats databases as assets; bankruptcy law treats assets as transferable unless exempted; privacy statutes offer patchy, ex post remedies.
Recommendations:
Colossal Biosciences' announcement of three CRISPR-engineered "dire wolf" pups and the ensuing debate over patent claims on reconstructed genes.
Problem: De-extinction straddles discovery (ancient DNA) and invention (synthetic sequence and cloning process). Existing patent exclusions for "products of nature" may be under-inclusive, while ABS frameworks such as Nagoya assume the prior existence of living source material.
Recommendations:
Large-language-model architectures now design proteins and small molecules at scale, collapsing the temporal gap between in silico insight and wet-lab realization.
Recommendations:
Legal architecture should move beyond binary debates (open versus proprietary) toward mission-oriented layering:
Operationalizing this stack will require coordinated reform across patent statutes, data-protection law, and biodiversity treaties, but its virtue is systemic resilience: a shock in one layer (bankruptcy, an ecological mis-fire, an AI paradigm shift) does not collapse the entire edifice.
The dissertation's journey from doctrinal mapping through empirical case studies to ethical interrogation demonstrates that intellectual property is a steering mechanism, not an end in itself. Used narrowly, it can still underwrite the costly ingenuity required to decode, edit, and deploy the genome. Used reflexively, it can privatize the informational foundations of global public health. The recommendations offered here translate that insight into practical levers. Implemented together, they envision an IP ecosystem that is anticipatory rather than reactive, plural rather than monolithic, and ultimately aligned with a 21st-century vision of health, equity, and planetary stewardship.