The Role of IP Law in Regulating Bioinformatics and Genetic Data

Striking a Balance Between Public Health and Commercial Interests

Intellectual Property Rights
Bioinformatics
Genetic Data
Public Health
$ researcher --info
[AUTHOR] Akshat V. Tenneti
[DEGREE] LL.M. in Intellectual Property Rights and Technology Law
[INSTITUTION] Jindal Global Law School
[SUPERVISOR] Ms. Shreya Shreekant
$ dissertation --execute

Introduction and Conceptual Framework

Exploring how intellectual property law should govern bioinformatics and genetic data in a way that balances innovation incentives with public health imperatives.

Advances in genomics and bioinformatics have transformed both scientific research and healthcare, generating vast troves of genetic data and novel computational tools. The central problem addressed in this dissertation is how intellectual property (IP) law should regulate bioinformatics and genetic data in a manner that incentivizes innovation and commercial investment while safeguarding public health imperatives.

Core Tension

The tension between public health (which demands broad access to genetic information for research, diagnosis, and treatment) and commercial interests (which seek exclusivity and returns on investment through IP rights) is at the heart of this inquiry.

High-profile controversies highlight this tension. For example, patents on the BRCA1/2 breast cancer genes enabled a single company (Myriad Genetics) to monopolize diagnostic testing at a cost of roughly $3,000 or more per test, raising barriers for patients and researchers until the U.S. Supreme Court invalidated the isolated-gene claims in 2013.

Contemporary Example: AlphaFold

2024 Nobel Prize in Chemistry

The transformative impact of computational biology was highlighted when the 2024 Nobel Prize in Chemistry recognized AlphaFold, DeepMind's AI-driven protein structure predictor: Demis Hassabis and John Jumper were honored for protein structure prediction. AlphaFold has predicted structures for over 200 million proteins, made its database openly available in partnership with EMBL-EBI, and become a standard tool with 30,000+ citations. Used by 2M+ researchers worldwide, it demonstrates the importance of open science while balancing accessibility with biosecurity considerations.

Components of Bioinformatics

At its core, bioinformatics can be viewed as comprising three primary components:

Biological Sequences

DNA, RNA, and protein sequences that encode biological information and functions.

Databases

Systems that store and organize biological sequences and related information.

Software Tools

Algorithms and applications used to access, analyze, and interpret genetic data.

Problem Statement

Modern IP regimes struggle to properly regulate bioinformatics and genetic data. Patents, copyrights, database rights, and trade secrets each offer partial solutions but also create frictions. Overly broad IP protection of genetic data can stifle research collaboration and restrict access to health-related knowledge. Conversely, too little protection could dissuade private investment in genetic innovation and data curation, slowing down breakthroughs.

Research Question

How can IP law harmonize commercial interests and public health in the fields of bioinformatics and genetic data? In essence: how can we protect and incentivize valuable bioinformatics innovations while ensuring that critical genetic information remains accessible for the public good?

Research Sub-Questions

  1. How do existing IP laws – particularly patent, copyright, database, and trade secret law – currently apply to bioinformatics innovations and genetic data?
  2. What are the tensions between IP protection and public health with regard to genetic data?
  3. Where have legal systems drawn the line between public and private interests in notable cases or legislation, and what guiding principles emerge from these precedents?
  4. What legal or policy mechanisms could better balance innovation incentives with the need for accessibility in bioinformatics?

Key IP Concepts Relevant to the Study

Trade Secrets & Contracts

Companies may keep genomic data or algorithms secret to maintain a competitive edge. Proprietary databases protected by confidentiality and click-wrap user agreements are common in genomics. For example, Myriad Genetics maintained a private BRCA mutation database after losing patent protection.

International Agreements

Global IP treaties set baseline rules. The WTO's TRIPS Agreement mandates patent protection in all fields of technology (including biotechnology), but allows some exclusions (e.g., plants and animals other than micro-organisms) and, because it covers only "inventions", leaves members room to exclude mere discoveries of natural substances.

Other instruments like the 1992 Convention on Biological Diversity emphasize fair sharing of benefits from genetic resources, including sharing research results, royalties or joint IP rights when utilizing genetic material.

Normative declarations (e.g., UNESCO's 1997 Universal Declaration on the Human Genome and Human Rights) assert that "the human genome in its natural state shall not give rise to financial gains".

Lessons from Practice—Case Analyses and Comparative Perspectives

Exploring how legal frameworks function in practice through landmark cases and comparative regulatory approaches.

This section moves from the theoretical legal frameworks governing intellectual property in bioinformatics to how those frameworks function in practice, through landmark cases and comparative regulatory approaches. The analysis reveals both the practical consequences of existing IP regimes and potential alternative models that might better balance innovation incentives with public health imperatives.

Association for Molecular Pathology v. Myriad Genetics (2013)

U.S. Supreme Court 2013

The dispute centered on Myriad's patents covering the BRCA1 and BRCA2 genes, mutations in which significantly increase risk for breast and ovarian cancer. Myriad's monopoly enabled it to charge approximately $3,000-$4,000 per test and to prevent other laboratories from offering alternative testing services.

In a unanimous decision, the Supreme Court distinguished between naturally occurring genetic sequences and human-created modifications, holding that "a naturally occurring DNA segment is a product of nature and not patent eligible merely because it has been isolated." However, the Court upheld the patentability of complementary DNA (cDNA) as this lab-created construct does not occur naturally.

Impact of the Myriad Decision

Immediate Effect

Market Response

Within days of the decision, many laboratories announced competing BRCA testing services at significantly lower prices. Analysis by Zhuo Chen et al. documented that the average price for BRCA testing fell substantially following the ruling as competition increased.

Strategic Shift

From Patents to Trade Secrets

For Myriad, the decision necessitated a strategic pivot toward proprietary databases and algorithmic tools for variant interpretation, demonstrating how exclusivity strategies adapt to changing legal frameworks.

Long-term Effect

The Proprietary Data Dilemma

Conley, Cook-Deegan, and Lázaro-Muñoz analyzed how Myriad shifted toward maintaining its variant database as a trade secret after losing patent protection, which they term "the proprietary data dilemma." This shift from transparent patent exclusivity to opaque trade secrecy has potential long-term implications for scientific collaboration.

Industry Impact

Innovation Response

Contrary to industry fears of decreased investment, empirical assessments indicate that patent applications for molecular diagnostics shifted toward method claims and modified approaches rather than declining overall. The decision recalibrated the boundary between unpatentable discoveries and patentable inventions in biotechnology.

Alternative Models and Collaborative Approaches

Human Genome Project & The Bermuda Principles

The Human Genome Project established a precedent for open-access approaches to genetic information through the Bermuda Principles, which mandated rapid public release of sequence data. This collaborative model has evolved through contemporary initiatives such as the Global Alliance for Genomics and Health (GA4GH), which develops frameworks for responsible genomic data sharing that balance openness with privacy concerns.

The GA4GH's Data Use Ontology and Machine Readable Consent framework represent significant technical innovations that help balance open science with appropriate controls on sensitive genetic information. These frameworks create standardized, computer-interpretable consent categories that enable automated compliance with ethical and legal requirements, potentially reducing transaction costs for data sharing while preserving necessary restrictions.
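To make the idea of computer-interpretable consent concrete, the following is a minimal sketch of how a data-access request might be checked automatically against a dataset's consent terms. The category names, the ConsentProfile and AccessRequest classes, and the is_permitted function are all hypothetical simplifications for illustration; they are not actual Data Use Ontology terms or any GA4GH API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, simplified consent categories (NOT real Data Use Ontology IDs).
GENERAL_RESEARCH = "general_research"
HEALTH_RESEARCH = "health_research"
DISEASE_SPECIFIC = "disease_specific"


@dataclass(frozen=True)
class ConsentProfile:
    """Machine-readable consent terms attached to a dataset (illustrative)."""
    permitted_use: str
    disease: Optional[str] = None       # only used with DISEASE_SPECIFIC
    non_commercial_only: bool = False   # modifier restricting commercial use


@dataclass(frozen=True)
class AccessRequest:
    """A researcher's declared purpose for using the dataset (illustrative)."""
    purpose: str
    disease: Optional[str] = None
    commercial: bool = False


def is_permitted(consent: ConsentProfile, request: AccessRequest) -> bool:
    """Return True if the declared purpose falls within the consent terms."""
    # A non-commercial modifier blocks any commercial request outright.
    if consent.non_commercial_only and request.commercial:
        return False
    # General research consent covers every research purpose.
    if consent.permitted_use == GENERAL_RESEARCH:
        return True
    # Health research consent covers health and disease-specific purposes.
    if consent.permitted_use == HEALTH_RESEARCH:
        return request.purpose in (HEALTH_RESEARCH, DISEASE_SPECIFIC)
    # Disease-specific consent covers only research on the named disease.
    if consent.permitted_use == DISEASE_SPECIFIC:
        return (request.purpose == DISEASE_SPECIFIC
                and request.disease == consent.disease)
    return False


# Example: a dataset consented only for non-commercial breast cancer research.
brca_consent = ConsentProfile(DISEASE_SPECIFIC, disease="breast cancer",
                              non_commercial_only=True)
academic = AccessRequest(DISEASE_SPECIFIC, disease="breast cancer")
company = AccessRequest(DISEASE_SPECIFIC, disease="breast cancer",
                        commercial=True)
print(is_permitted(brca_consent, academic))  # True
print(is_permitted(brca_consent, company))   # False
```

Because the consent terms are data rather than free-text contract language, the same check can run across many datasets without case-by-case legal review, which is the sense in which such frameworks may reduce transaction costs.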

COVID-19 Response Models

The COVID-19 pandemic demonstrated the effectiveness of rapid and open genomic data sharing. The GISAID platform facilitated the sharing of SARS-CoV-2 genome sequences, enabling global surveillance of viral evolution while maintaining contributors' ability to pursue related innovations. As documented by Rourke et al., this model succeeded in promoting both public health responses and commercial vaccine development by creating a protected commons where sequences were accessible but contributors retained recognition rights.

The Open COVID Pledge represents another innovative approach, with companies temporarily waiving enforcement of patents related to COVID-19 technologies to enable collaborative pandemic response. This emergency measure demonstrated the feasibility of coordinated IP sharing in public health contexts and has sparked discussion of permanent mechanisms of this kind for genetic diagnostics.

Collaborative IP Models in Biotechnology

The BioBricks Foundation's development of the BioBrick Public Agreement created a standardized legal framework for sharing synthetic biology components while preventing downstream patent thickets. Similarly, the Structural Genomics Consortium operates on a completely open-access model for protein structures, requiring all participants to keep or place their findings in the public domain, a model that has successfully attracted both academic and pharmaceutical industry participation.

Patent pools offer another promising approach for addressing IP fragmentation in bioinformatics. The Medicines Patent Pool has expanded to include technologies beyond HIV treatments, centralizing licensing to enhance access while maintaining innovation incentives through structured royalty arrangements. These initiatives point toward possible frameworks that move beyond the binary choice between maximum exclusivity and total openness.

Comparative Jurisdictional Approaches

Jurisdiction: United States
Approach to Gene Patents: Post-Myriad approach: naturally occurring DNA unpatentable, cDNA patentable
Key Features:
  • 2019 USPTO guidance clarifying subject matter eligibility
  • Federal Circuit developments in cases like Illumina v. Ariosa (2020)
  • Method claims tied to genetic testing remain patentable when involving specific treatment steps
Guiding Principles: Distinction between unpatentable discoveries and patentable applications

Jurisdiction: United States
Approach to Gene Patents: Method claims based on natural correlations can be patent eligible
Key Features:
  • Vanda Pharmaceuticals Inc. v. West-Ward Pharmaceuticals (2018) - Federal Circuit upheld patents for personalized medicine method that used genetic testing to determine drug dosage
  • Court distinguished from Mayo case by emphasizing the treatment steps were "more than" a law of nature
  • Created pathway for diagnostic method patents that include specific treatment steps
  • Established that method patents connecting genetic testing with specific medical actions remain viable
Guiding Principles: Distinguishing between unpatentable diagnostic correlations and patentable treatment methods that apply those correlations

Jurisdiction: European Union
Approach to Gene Patents: Permits patenting of isolated biological material when industrial application is shown
Key Features:
  • EU Biotechnology Directive allows patents on isolated genes
  • Morality exclusions for certain technologies
  • Focus on data governance through Directive on Open Data
Guiding Principles: Balance between commercial certainty and ethical limitations

Jurisdiction: India
Approach to Gene Patents: Restrictive approach: excludes naturally occurring substances from patentable subject matter
Key Features:
  • Patents Act Section 3(c) excludes discovery of natural substances
  • Section 3(j) bars patents on plants and animals
  • Section 3(d) prevents patents on new forms of known substances unless they demonstrate "enhanced efficacy"
  • Traditional Knowledge Digital Library to prevent biopiracy
Guiding Principles: Prioritization of public health access over incentivizing private innovation

Comparative Insights

These comparative approaches reveal fundamentally different prioritizations of interests. The post-Myriad U.S. model prioritizes research freedom for basic genetic information while protecting engineered constructs. The European approach provides more certainty for commercial entities through broader patentability but introduces counterbalancing through data sharing mandates. India's system most explicitly prioritizes public access over exclusivity through expanded subject matter exclusions. No single jurisdiction has developed a comprehensive solution that satisfactorily balances all competing interests.

Ethical and Normative Considerations for Policy Development

Exploring the ethical frameworks that should inform IP policy for bioinformatics and genetic data.

The Unique Ethical Status of Genetic Information

Genetic information occupies an ethically distinctive position that complicates traditional property frameworks. Unlike conventional inventions, genetic sequences simultaneously constitute:

  • Personal information about individuals
  • Shared heritage across populations
  • Scientific knowledge about biological processes

This multidimensional character raises fundamental questions about the appropriateness of exclusive ownership models.

"The human genome in its natural state shall not give rise to financial gains."

— UNESCO Universal Declaration on the Human Genome and Human Rights

Contemporary bioethicists increasingly conceptualize genetic information within a "stewardship" rather than "ownership" framework, where those who develop genetic knowledge hold responsibilities as caretakers rather than absolute owners.

Global Health Equity and Distributive Justice

The distribution of genetic innovation benefits raises important concerns of justice. As seen in the Myriad case, IP rights can create barriers to diagnostic access with direct health consequences. This dynamic extends globally, raising ethical questions about how IP regimes influence health inequities between wealthy and resource-limited settings.

The "10/90 Gap"

Only about 10% of global health research spending addresses conditions that account for 90% of the global disease burden. A similar imbalance extends to genomics research, where over 80% of participants in large-scale genomic studies are of European ancestry, creating significant blind spots in understanding genetic factors in diverse populations.

Recent initiatives such as H3Africa (Human Heredity and Health in Africa) Consortium have established research networks generating African genomic data while implementing novel IP frameworks that balance local benefit-sharing with global research accessibility.

The 2001 Doha Declaration on TRIPS and Public Health, reinforced by subsequent WTO decisions, affirmed that countries maintain the right to issue compulsory licenses to protect public health, explicitly recognizing that IP rights should not trump access to essential medicines or diagnostic tools.

Indigenous and Community Rights

For many indigenous communities, genetic resources and associated knowledge are viewed as collective and intergenerational assets rather than potential objects of individual ownership.

The Problem of "Biopiracy"

Genetic resources and traditional knowledge are sometimes appropriated without permission or benefit-sharing, representing an ongoing ethical challenge. There are fundamental conceptual differences between Western intellectual property frameworks and indigenous understandings of knowledge as communal and intergenerational.

The Nagoya Protocol on Access and Benefit-sharing established principles requiring prior informed consent and mutually agreed terms for accessing genetic resources. However, implementation remains inconsistent, with significant gaps between theoretical rights and practical protections.

Bioinformatics developments further complicate traditional knowledge protection by enabling digital sequence information to be separated from physical biological materials, potentially circumventing access and benefit-sharing requirements.

The CARE Principles

The CARE Principles for Indigenous Data Governance (Collective Benefit, Authority to Control, Responsibility, Ethics) provide standards that specifically address these concerns while emphasizing indigenous peoples' rights to control how their genetic information is used and commercialized.

Emerging Ethical Challenges: Technological Convergence

The convergence of artificial intelligence with genomics introduces novel ethical challenges for IP frameworks. Machine learning algorithms trained on genetic datasets can now generate insights that may qualify as inventions, raising questions about appropriate attribution and ownership.

AlphaFold: AI Meets Genomics

AI System 2021-Present

DeepMind's AlphaFold system, which used AI to predict the three-dimensional structures of nearly all known human proteins, exemplifies these boundary-blurring innovations. The black box nature of many AI systems compounds these challenges, undermining the social bargain at patent law's core: exclusive rights in exchange for teaching society how the invention works.

The U.S. Patent and Trademark Office's February 2024 guidance on AI-assisted inventions confirms that AI systems cannot be named as inventors under current law, but allows patenting of AI-assisted innovations where a human has made a significant contribution to the invention. This policy clarification addresses immediate concerns but leaves open deeper questions about the fundamental purpose of patent incentives in an era where machines increasingly contribute to the innovative process.

Gene Editing and Synthetic Biology

Advances in gene editing technologies like CRISPR and synthetic biology raise additional ethical questions about appropriate IP boundaries. While earlier genetic technologies focused on reading or analyzing existing genetic code, today's technologies enable writing and modifying genetic sequences, blurring distinctions between discovery and invention.

The CRISPR patent dispute between the Broad Institute and University of California highlighted how complex overlapping patent claims in foundational technologies can restrict downstream innovation and access. The dispute itself has been contested through patent interference proceedings and appeals rather than resolved by agreement. However, the licensing practices surrounding it point to one potential approach that balances exclusivity and access: allowing non-commercial research use while preserving commercial licensing rights for the patent holders.

Towards an Integrated Ethical Framework

Drawing insights from diverse ethical considerations, several key principles emerge that can guide more balanced IP policy for bioinformatics and genetic data:

Proportionality Principle

Exclusive rights should be proportionate to the actual inventive contribution rather than extending to discoveries of natural phenomena. This principle follows from the Myriad approach of distinguishing between unpatentable natural sequences and potentially patentable engineered constructs.

Tiered Access Principle

This principle acknowledges legitimate commercial interests while ensuring that essential health applications remain accessible. This could involve differential protection levels based on application context—stronger exclusivity for non-essential applications, weaker or time-limited rights for critical diagnostics.

Benefit-sharing Principle

This principle recognizes contributions from research participants, indigenous communities, and public funding. It supports mandatory contribution to data commons, reach-through benefits to communities, and recognition of collective interests in genetic resources.

Transparency Principle

Actual scientific progress requires not just access but also understanding. This supports robust measures such as disclosure requirements, limitations on black box proprietary algorithms in healthcare contexts, and research exemptions.

Intergenerational Responsibility

This principle acknowledges that genetic heritage is not limited to current stakeholders. It supports precautionary approaches to permanent modifications and maintenance of genetic diversity as common heritage.

Conclusion

Synthesizing insights and providing recommendations for future IP policy in bioinformatics and genetic data.

Synthesis of Research Insights

The preceding sections traced the uneasy co-evolution of intellectual-property doctrines and the fast-moving sciences of genomics, bioinformatics, and synthetic biology. We mapped the current doctrinal terrain: patents, at once an engine of investment and a brake on downstream experimentation; database rights and copyright, which protect the expensive curation of datasets but can fence off the underlying facts; and trade-secret or contractual strategies that flourish whenever formal IP protection recedes.

The Myriad case illustrated how aggressive exclusivity can chill diagnostic access yet also how firms pivot to proprietary data once sequence claims fall away. European law, by contrast, still extends patents to isolated sequences while tempering that breadth with disclosure and morality clauses, and India's Section 3(c) exclusion underscores a public-health–first model.

What is needed is layered governance in which:

  1. Narrowly-tailored patents reward clearly inventive manipulation of genetic information
  2. Open-science pledges and data commons preserve the research substrate
  3. Privacy, benefit-sharing, and anti-trust rules check enclosure that would frustrate public-health goals

Cross-cutting Insights

  1. Separation of discovery from invention remains the most robust doctrinal fulcrum. Wherever courts draw that line clearly (as Myriad did for naturally occurring DNA; as the EU does for patent claims that fail the "specific industrial application" test), diagnostic prices fall and follow-on research accelerates without a demonstrable collapse in private R&D.
  2. Data is becoming the decisive asset. From Myriad's post-patent pivot to trade-secret variant databases to AlphaFold's open but temporally-gated release model, competitive advantage now resides in curating, cleaning, and algorithmically interpreting sequence data rather than in owning the sequences themselves.
  3. Ethical legitimacy is a moving target. Public-health crises (SARS-CoV-2), equity critiques (the "missing diversity" problem), and Indigenous claims to genetic resources have already pressured legislators toward hybrid solutions, e.g., compulsory-licensing triggers tied to WHO emergencies, and the patent-disclosure obligations built into the 2024 WIPO Treaty on Intellectual Property, Genetic Resources and Associated Traditional Knowledge (GRATK).

These insights converge on a core proposition: a future-proof IP regime for bioinformatics is necessarily plural, conditional, and integrated with adjacent regulatory fields (privacy, insolvency, conservation).

Future Directions & Recommendations

The next decade will confront lawmakers with questions that did not exist when Myriad was decided. Two recent flash-points—de-extinction inventions (e.g., the claimed resurrection of dire wolves) and the insolvency of a major consumer-genomics platform (23andMe)—expose blind spots in today's patchwork. Building on the dissertation's findings, the following recommendations aim to anticipate such shocks while preserving incentives for genuine innovation.

Genomic Data Fiduciaries & Insolvency "Firewalls"

Signal Event

23andMe's Chapter 11 petition has left 15 million customers unsure whether their permanent genetic identifiers could be sold to the highest bidder.

Problem: IP law treats databases as assets; bankruptcy law treats assets as transferable unless exempted; privacy statutes offer patchy, ex post remedies.

Recommendations:

  1. Statutory "genomic data trusts." Mandate that any entity collecting DNA at scale hold raw genotypes and associated health data in a purpose-bound trust whose fiduciary duty runs to data subjects and the public, not to creditors. Beneficial ownership of that trust would never vest in the estate, rendering the data non-transferable in insolvency absent affirmative, court-supervised consent.
  2. Compulsory escrow of interpretive algorithms. To avoid locking data into a single analytics stack, require firms above a sequencing-volume threshold to deposit annotated pipelines in escrow, to be released under open-source or FRAND-style terms if the firm ceases to operate.
  3. Priority exit licensing. Where transfer is unavoidable, impose a first-refusal right for non-profit biobanks or public-health authorities under cost-based terms, ensuring continuity of beneficial research uses while deterring speculative data purchases.

A Sui Generis De-Extinction & Synthetic-Rescue Regime

Signal Event

Colossal Biosciences' announcement of three CRISPR-engineered "dire wolf" pups and the ensuing debate over patent claims on reconstructed genes.

Problem: De-extinction straddles discovery (ancient DNA) and invention (synthetic sequence and cloning process). Existing patent exclusions for "products of nature" may be under-inclusive, while ABS frameworks such as Nagoya assume the prior existence of living source material.

Recommendations:

  1. Process-centric patentability with mandatory open-sequence publication. Permit claims on genuinely inventive assembly or editing processes, but require deposition of the resulting sequence in a publicly searchable registry within five years, after which sequence claims expire.
  2. Digital ABS license. Extend benefit-sharing obligations to digital sequence information derived from extinct species; royalties (or data-sharing quotas) should flow into a global "Synthetic Biodiversity Fund" earmarked for ecosystem restoration, aligning incentives with conservation rather than enclosure.

Adaptive IP/AI Convergence Governance

Signal Trend

Large-language-model architectures now design proteins and small molecules at scale, collapsing the temporal gap between in silico insight and wet-lab realization.

Recommendations:

  1. Mandatory algorithmic-transparency annexes for patents claiming AI-generated bioinformatic inventions—enforceable not through forced disclosure of source code (which may harbor trade secrets) but via auditable model cards, training-data provenance statements, and reproducible test suites.
  2. Dynamic claim scope tied to training data openness. Offer broader claim breadth or term extensions where applicants make underlying training genomic datasets FAIR (Findable, Accessible, Interoperable, Re-usable), and narrower scope where data remain proprietary.
  3. Human-in-the-loop inventorship tests. Codify that where a model's output constitutes the sole inventive step, the public domain is the default owner unless a human demonstrates a non-obvious, shaping contribution.
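The first recommendation above (transparency annexes built from model cards, training-data provenance statements, and reproducible test suites) can be pictured as a structured record that a patent office or auditor checks for completeness. The sketch below is purely illustrative: the TransparencyAnnex and DatasetProvenance classes, their field names, and the missing_items function are hypothetical and do not correspond to any existing filing format or patent-office requirement.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetProvenance:
    """Hypothetical provenance statement for one training dataset."""
    name: str
    source: str    # where the data came from (e.g., a named biobank)
    license: str   # terms under which it was used


@dataclass
class TransparencyAnnex:
    """Hypothetical annex filed alongside an AI-assisted patent claim."""
    model_card: str = ""                       # plain-language model summary
    training_data: List[DatasetProvenance] = field(default_factory=list)
    test_suite_sha256: str = ""                # hex digest of the test suite


def missing_items(annex: TransparencyAnnex) -> List[str]:
    """List the annex components that are absent or malformed."""
    missing = []
    if not annex.model_card.strip():
        missing.append("model card")
    if not annex.training_data:
        missing.append("training-data provenance")
    # A SHA-256 digest written in hexadecimal is exactly 64 characters long.
    if len(annex.test_suite_sha256) != 64:
        missing.append("reproducible test suite digest")
    return missing


# Example: an annex with a model card but nothing else.
partial = TransparencyAnnex(model_card="Protein design model, v1 summary")
print(missing_items(partial))
```

The design choice mirrors the recommendation's own trade-off: the record discloses what an auditor needs to verify the claim (what the model is, what it was trained on, and a fixed reference to its tests) without requiring the source code itself.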

Toward a Layered, Mission-Oriented Bioinformatics Commons

Legal architecture should move beyond binary debates (open versus proprietary) toward mission-oriented layering:

  • Open pre-competitive layers (reference genomes, epidemiological dashboards), supported by public funding and compulsory data-sharing mandates.
  • Middle layers of conditional exclusivity (validated diagnostic algorithms, clinical-grade variant databases), licensed on FRAND-plus-health-equity terms that include price caps in low- and middle-income jurisdictions.

Operationalizing this stack will require coordinated reform across patent statutes, data-protection law, and biodiversity treaties, but its virtue is systemic resilience: a shock in one layer (bankruptcy, an ecological misfire, an AI paradigm shift) does not collapse the entire edifice.

Final Insight

The dissertation's journey from doctrinal mapping through empirical case studies to ethical interrogation demonstrates that intellectual property is a steering mechanism, not an end in itself. Used narrowly, it can still underwrite the costly ingenuity required to decode, edit, and deploy the genome. Used reflexively, it can privatize the informational foundations of global public health. The recommendations offered here translate that insight into practical levers. Implemented together, they envision an IP ecosystem that is anticipatory rather than reactive, plural rather than monolithic, and ultimately aligned with a 21st-century vision of health, equity, and planetary stewardship.