Science fiction has probably done a huge disservice to the emerging bioeconomy, depicting any engagement with biology-based entities or systems with what always seems to be the requisite unseemly, fear-inducing, salivary-esque, sometimes lugubrious, blob-like, purple or green, mucousy, gelatinous substance. Or Tom Cruise crawling around on all fours to put his eyeball in his socket as Pre-crime agent John Anderton in the seminal Minority Report. We argue that the bioeconomy will lead all industry sectors into a fascinating (and anything but fear-inducing) future filled with abundance and promise – including DNA-based data storage capabilities. DNA Data Storage? Yes. Let’s start there.
There is a definite “North Star” of the 2024 OODA Loop research agenda. Based on insights garnered from our post-OODAcon 2023 internal Hot Wash, our “collective OODA Loop,” and some great strategic insights provided by our fellow travelers at ARK Invest, we have put bioengineering and synthetic biology in this north star, or pace car, position for a few reasons:
We are not trying to hard sell you here, but for strategic purposes, your organization needs to start internalizing that everything will plug and play into the bioeconomy over time. Forget brick and mortar, steel, the gold standard, cotton, coal mines, steel mills, and gold bars. Think enzymes, proteins, biomaterials, biocarbon conversion, DNA sequencing, AI analytics, and LLMs based on massive biological datasets, bio-foundry services, and microbial factory tools. And, yes, DNA-based data storage.
Integrating DNA-based data storage with the Internet and synthetic biology could usher in a new era of biodefense and public health systems. The question then becomes how DNA data storage will be made viable responsibly and ethically.
DNA-based data storage represents one of the most fascinating frontiers in the intersection of biology and technology, offering a paradigm shift in how we conceive of and manage data storage in the digital age. At its core, DNA-based data storage involves encoding digital information into the sequences of DNA, nature’s own storage medium, which has evolved over billions of years to store genetic information in a compact, durable form. This method capitalizes on the incredible density of DNA, which can theoretically store exabytes of data in a fraction of the space required by traditional storage media, and its longevity, with the potential to preserve data for thousands of years under the right conditions.
The process of DNA-based data storage involves converting digital binary data (0s and 1s) into the four nucleotide bases of DNA (adenine, cytosine, guanine, and thymine, represented as A, C, G, T), synthesizing these sequences into physical DNA, and then reading the data back by sequencing the DNA and converting it back into digital form. This technology is not just a theoretical exercise; it has been demonstrated in practice, with researchers successfully encoding various forms of data, including text, images, and even videos, into DNA.
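To make the conversion step concrete, here is a minimal Python sketch of one possible mapping from binary data to nucleotide bases and back. The two-bits-per-base table below is our own illustrative assumption, not a scheme attributed to any of the researchers or companies discussed here; practical encodings typically add error correction and avoid problematic sequences such as long runs of a single base.

```python
# Hypothetical mapping: each pair of bits becomes one base (2 bits per base).
# This table is an illustrative assumption, not a published standard.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Convert bytes into a string of A/C/G/T, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Convert a string of A/C/G/T back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"OODA")
    print(strand)          # CATTCATTCACACAAC
    print(decode(strand))  # b'OODA'
```

In this sketch every byte becomes four bases, so the sequence to be synthesized is four times as long, in bases, as the data is in bytes. Synthesis and sequencing, the physical steps in between, are not modeled.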
The following is a sampling of recent headlines and news summaries, along with an analysis of DNA’s potential as a data storage platform, with a blog post from SynBioBeta contributor Jennifer Tsang, Ph.D., as the centerpiece of the analysis.
“Current methods for storing digital data can’t keep up with demand. DNA data storage poses an eco-friendly mechanism for our growing data needs.”
If SynBioBeta and its annual Global Conference in San Jose are not already on your radar, they should be on your tracking and sensemaking hit list.
As a pre-read for the Reading, Writing, and Editing DNA track at the upcoming SynBioBeta 2024 conference in early May, Tsang recently contributed the following:
Cloud storage. Social media platforms. Streaming services. Large language models. These platforms all contribute to a growing challenge: the need to store lots and lots of data. In 2010, we generated three zettabytes of data. That number is predicted to grow to 180 zettabytes just next year. “It’s hard to fathom how much data that is,” says Mike Kamdar, chair of the Reading, Writing, and Editing DNA track at the SynBioBeta 2024 conference and most recently, president and CEO of Molecular Assemblies. “Data is being generated at such an exponential rate that current storage systems can’t keep up,” says Emily Leproust, CEO of Twist Bioscience. Our current ways of storing data come at a cost to the environment. Therefore, many biotechs are turning towards a less environmentally disruptive solution to our growing data storage problems: storing data in DNA.
Using DNA as a data medium requires six steps: encoding, DNA synthesis, physical storage, data retrieval, sequencing, and decoding. As digital data is currently stored as strings of 0’s and 1’s, the first step is to convert this information into the bases that make up DNA. Then, this DNA is synthesized and encapsulated for long-term physical storage. When the data needs to be retrieved, the DNA is sequenced and subsequently decoded to turn the DNA sequence into digital data.
Our current methods of storing data, including hard drives, solid-state drives, and tape, have many environmental impacts. “Data centers in the US consume an average of 5,000 MW a day, whereas global centers consume an average of 14,000 MW daily,” says Leproust. By 2030, our current means of digital storage could consume 3-13% of the total global electricity.
“When you start looking at some of these data farms, they have an incredible footprint as it relates to use of electricity and land,” says Kamdar. “When you look at DNA, you can store your entire genetic code on your fingertip.”
“Despite significant progress made in reducing their environmental impact, the storage units in data centers are enormous consumers of energy, rare earth elements, and water during their manufacture and use,” says Erfan Arwani, CEO and co-founder of Biomemory, a start-up that recently launched a “DNA card” that stores 1 kilobyte of data.
The current environmental impacts of data storage are especially compounded by the fact that the data needs to be rewritten every 2 to 5 years, depending on the type of drive, to prevent data degradation.
In contrast, DNA has many qualities that make it ideal for data storage. “DNA does not require electricity. It’s fairly stable over long periods of time at room temperature,” says Kamdar.
“With DNA, IT professionals can use a ‘store it and forget it’ mentality,” says Leproust. “DNA is the most sustainable solution for long-term storage.”
The ability to store data in DNA requires the synthesis of an abundance of DNA. While chemical methods of DNA synthesis require large amounts of harsh chemicals that aren’t environmentally friendly, companies like Molecular Assemblies and Twist Bioscience are optimizing ways to synthesize DNA enzymatically, a method with less environmental impact because it can be done in an aqueous environment. Enzymatic synthesis also enables the creation of the longer pieces of DNA needed for use as a storage mechanism.
Twist Bioscience has also miniaturized DNA synthesis by using a semiconductor-based DNA synthesis platform, which allows for fewer reagents and increased throughput and scalability – something that is beneficial for data storage applications. “On the production chip today, which is a little larger than an iPhone, we make more than one million short pieces of DNA, or oligos, at a time,” says Leproust. In contrast, traditional methods generate only 96 oligos in the same amount of space.
A four-year initiative is working to build a small device able to write data onto DNA-like polymers and a parallel device capable of reading the polymer-stored data. The initiative seeks to “present a clear and commercially viable path to future deployment at the exabyte scale” within ten years.
Prior projections for data storage requirements estimated a global need for about 12 million petabytes of capacity by 2030. The research firm Gartner recently issued new projections, raising that estimate by 20 million petabytes. The world is not on track to produce enough of today’s storage technologies to fill that gap. (Source: Gartner)
Data is piling up exponentially, and the rate of information production is increasing faster than the storage density of tape, which will only be able to keep up with the deluge of data for a few more years. The research firm Gartner predicts that by 2030, the shortfall in enterprise storage capacity alone could amount to nearly two-thirds of demand, or about 20 million petabytes. If we continue down our current path, in coming decades we would need not only exponentially more magnetic tape, disk drives, and flash memory, but exponentially more factories to produce these storage media, and exponentially more data centers and warehouses to store them. Even if this is technically feasible, it’s economically implausible.
…DNA exceeds by many times the storage density of magnetic tape or solid-state media. It has been calculated that all the information on the Internet—which one estimate puts at about 120 zettabytes—could be stored in a volume of DNA about the size of a sugar cube, or approximately a cubic centimeter. Achieving that density is theoretically possible, but we could get by with a much lower storage density. An effective storage density of “one Internet per 1,000 cubic meters” would still result in something considerably smaller than a single data center housing tape today.
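The gap between the two densities quoted above is easier to appreciate with the arithmetic written out. The Python sketch below uses only the figures in the excerpt (120 zettabytes, one cubic centimeter, and 1,000 cubic meters); the decimal definitions of zettabyte and terabyte and the variable names are our own choices for illustration.

```python
# Figures taken from the excerpt above; decimal (SI) units are assumed.
ZETTABYTE = 10**21
TERABYTE = 10**12
INTERNET_BYTES = 120 * ZETTABYTE      # "about 120 zettabytes"

CM3_PER_M3 = 100**3                   # 1 cubic meter = 1,000,000 cubic centimeters

theoretical_volume_cm3 = 1            # "about the size of a sugar cube"
relaxed_volume_cm3 = 1_000 * CM3_PER_M3   # "one Internet per 1,000 cubic meters"

# How much less dense the relaxed target is than the theoretical one.
relaxation_factor = relaxed_volume_cm3 // theoretical_volume_cm3

# Effective density of the relaxed target, in terabytes per cubic centimeter.
relaxed_tb_per_cm3 = INTERNET_BYTES // relaxed_volume_cm3 // TERABYTE

if __name__ == "__main__":
    print(f"Relaxed target is {relaxation_factor:,}x less dense than the theoretical figure")
    print(f"Relaxed density: {relaxed_tb_per_cm3} TB per cubic centimeter")
```

Under these assumptions, the "one Internet per 1,000 cubic meters" target is a billion times less dense than the sugar-cube figure, yet it still works out to roughly 120 terabytes per cubic centimeter.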
The DNA Data Storage Alliance introduced its inaugural specifications for DNA-based data storage this week. This specification outlines a method for encoding essential information within a DNA data archive, crucial for developing and commercializing an interoperable storage ecosystem.
DNA data storage uses short strings of deoxyribonucleic acid (DNA) called oligonucleotides (oligos) mixed together without a specific physical ordering scheme. This storage media lacks a dedicated controller and an organizational means to understand the proximity of one media subcomponent to another. DNA storage differs significantly from traditional media like tape, HDD, and SSD, which have fixed structures and controllers that can read and write data from the structured media. DNA’s lack of physical structure requires a unique approach to initiate data retrieval, which brings its peculiarities regarding standardization.
To address this, the SNIA DNA Archive Rosetta Stone (DARS) working group, part of the DNA Data Storage Alliance, has developed two specifications, Sector Zero and Sector One, to facilitate the process of starting a DNA archive.
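To illustrate why an unordered pool of oligos needs some form of addressing, here is a minimal Python sketch in which each oligo carries a short index prefix so that a shuffled pool can be put back in order. This is not the Sector Zero or Sector One format, whose details are not described here; the prefix length, payload length, and function names are all hypothetical choices made for illustration.

```python
import random

BASES = "ACGT"
INDEX_BASES = 4     # hypothetical: 4 index bases address up to 4**4 = 256 oligos
PAYLOAD_BASES = 8   # hypothetical payload length per oligo


def index_to_bases(index: int) -> str:
    """Write an integer as a fixed-width base-4 number using A, C, G, T."""
    digits = []
    for _ in range(INDEX_BASES):
        index, digit = divmod(index, 4)
        digits.append(BASES[digit])
    return "".join(reversed(digits))


def bases_to_index(prefix: str) -> int:
    """Read a base-4 index prefix back into an integer."""
    value = 0
    for base in prefix:
        value = value * 4 + BASES.index(base)
    return value


def to_oligos(strand: str) -> list[str]:
    """Split a long sequence into short oligos, each prefixed with its index."""
    chunks = [strand[i:i + PAYLOAD_BASES] for i in range(0, len(strand), PAYLOAD_BASES)]
    if len(chunks) > 4**INDEX_BASES:
        raise ValueError("sequence too long for the chosen index width")
    return [index_to_bases(i) + chunk for i, chunk in enumerate(chunks)]


def from_oligos(pool: list[str]) -> str:
    """Reassemble the sequence from oligos read back in arbitrary order."""
    ordered = sorted(pool, key=lambda oligo: bases_to_index(oligo[:INDEX_BASES]))
    return "".join(oligo[INDEX_BASES:] for oligo in ordered)


if __name__ == "__main__":
    original = "CATTCATTCACACAACGGTTAACC"
    pool = to_oligos(original)
    random.shuffle(pool)  # the physical pool has no ordering
    print(from_oligos(pool) == original)
```

The sketch only shows the ordering problem. Knowing how to interpret the index and payload in the first place is the kind of bootstrap information a specification for starting an archive would have to settle.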
Scientists announced [in December of 2019] that they may have uncovered a new method for mixing genetically encoded data into manufacturing materials after they stored DNA data in a plastic 3-D printed bunny. The scientists sealed the synthetic DNA data inside microscopic glass beads to protect the information as the plastic for the toy was heated. Using a small portion of the DNA-infused plastic, they could extract the embedded instructions, seal them in the glass beads, and remake the figure flawlessly. The experiment’s findings were reported in the journal Nature Biotechnology, where the authors explain the process in depth.
The new DNA data storage technique could store digital information in items of any shape or size in the future. It could also be used to make devices that contain their own blueprints for replication or to embed electronic health records within different devices and drugs. Scientists have shown that hiding information in common objects could be feasible in the foreseeable future.
“The duo successfully stored 214 petabytes of data per gram of DNA, encoding a total number of six files, which include:
From Tsang at SynBioBeta:
Many in the industry have turned towards DNA for archival storage as it is where most of our data lies. “DNA data storage isn’t far away. At Twist, we’ve completed multiple proof of concept studies, including storing a premium series called Biohackers in DNA for Netflix,” says Leproust. “We’re planning to launch early access of our terabyte storage solution in 2025.”
While archival storage has been the focus of many companies, it’s also possible that DNA can be used in computing. “Storage alone is not useful unless the data can be utilized, which is where the potential for DNA in computation becomes equally important,” says Arwani. Arwani notes that DNA has a lot of potential when it comes to reading data. “The reading speed of DNA is expected to far exceed the best flash memories with the use of solid-state nanopores,” says Arwani. Biomemory is working towards making DNA data storage mainstream for data centers beginning in 2030.
“It’s still early days,” says Kamdar. “I think interest is growing and continuing to advance.”
From Rob Carlson in the IEEE piece:
“Eventually, DNA storage technology will completely alter the economics of reading and writing all kinds of genetic information. Even if the performance bar is set far below that of a tape drive, any commercial operation based on reading and writing data into DNA will have a throughput many times that of today’s DNA synthesis industry, with a vanishingly small cost per base.
At the same time, advances in DNA synthesis for DNA storage will increase access to DNA for other uses, notably in the biotechnology industry, and will thereby expand capabilities to reprogram life. Somewhere down the road, when a DNA drive achieves a throughput of 2 gigabases per second (or 120 gigabases per minute), this box could synthesize the equivalent of about 20 complete human genomes per minute. And when humans combine our improving knowledge of how to construct a genome with access to effectively free synthetic DNA, we will enter a very different world.
The conversations we have today about biosecurity, who has access to DNA synthesis, and whether this technology can be controlled are barely scratching the surface of what is to come. We’ll be able to design microbes to produce chemicals and drugs, as well as plants that can fend off pests or sequester minerals from the environment, such as arsenic, carbon, or gold. At 2 gigabases per second, constructing biological countermeasures against novel pathogens will take a matter of minutes. But so too will constructing the genomes of novel pathogens. Indeed, this flow of information back and forth between the digital and the biological will mean that every security concern from the world of IT will also be introduced into the world of biology. We will have to be vigilant about these possibilities.
We are just beginning to learn how to build and program systems that integrate digital logic and biochemistry. The future will be built not from DNA as we find it, but from DNA as we will write it.”
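The throughput figures in the excerpt above can be checked with a few lines of arithmetic. In the Python sketch below, the 2 gigabases per second rate comes from the quote; the genome size of roughly 6 billion bases (a diploid human genome) and the idealized 2 bits per base are our assumptions, chosen because they are consistent with the "about 20 complete human genomes per minute" figure.

```python
# From the quote: a DNA drive writing 2 gigabases per second.
BASES_PER_SECOND = 2_000_000_000

# Assumption: a diploid human genome is roughly 6 billion bases.
DIPLOID_GENOME_BASES = 6_000_000_000

# Assumption: an idealized 2 bits per base, with no error-correction overhead.
BITS_PER_BASE = 2

bases_per_minute = BASES_PER_SECOND * 60
genomes_per_minute = bases_per_minute // DIPLOID_GENOME_BASES
megabytes_per_second = BASES_PER_SECOND * BITS_PER_BASE / 8 / 10**6

if __name__ == "__main__":
    print(f"{bases_per_minute / 10**9:.0f} gigabases per minute")
    print(f"about {genomes_per_minute} human genomes per minute")
    print(f"about {megabytes_per_second:.0f} MB/s of raw data at 2 bits per base")
```

Under these assumptions, the same drive would write on the order of 500 megabytes of raw data per second, which gives a sense of the scale of synthesis capacity the quote is describing.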
From an OODA Loop perspective, one of the most compelling aspects of DNA-based data storage is the previously mentioned potential to revolutionize data archiving. Given the exponential growth of data and the limitations of current storage technologies in terms of lifespan and density, DNA offers a solution that could effectively address these challenges. For instance, in theory, the entire digital universe could be stored in a volume of DNA that fits within a few cubic meters.
Despite its promise, significant technical and economic hurdles must be overcome. The costs of synthesizing and sequencing DNA are prohibitively high for widespread adoption, and the process is slower than electronic data storage methods. Advances in synthetic biology and gene synthesis technologies, as evidenced by the dramatic reduction in the cost of sequencing a human genome from $10,000 to $600 in just a decade, hint at a future where these obstacles might be surmounted.
Integrating DNA-based data storage with the internet and synthetic biology could usher in a new era of biodefense and public health systems, enabling rapid response to pandemics and other biological threats. Yet, this raises ethical and security concerns, particularly regarding the potential for misuse and the need for robust safeguards to protect against biosecurity risks.
When contemplating the future of DNA-based data storage, one is reminded of its broader implications for society and the economy. As we stand on the brink of potentially solving one of the digital age’s most pressing challenges, the question becomes not just how we can make DNA data storage viable but also how we can do so responsibly and ethically, ensuring that the benefits are widely accessible and that the risks are adequately managed.
For more OODA Loop News Briefs and Original Analysis, see OODA Loop: Biotechnology | Genetics | Genomics | Healthcare | Medical Tech
OODA Special Report: Executive’s Guide To The Revolution in Biology: An update to our report on what executives need to know about the business impact of biological sciences.
Bioengineering, Health and Business: A follow-up that dives deeper into this trend of bioengineering with a focus on the healthcare sector and business. See: Bioengineering, Health and Business.
Bioengineering Beyond Health: Many revolutionary advancements go beyond healthcare. Examine them at Bioengineering Beyond Health.
Contextualizing Advancements in Bioengineering To Your Business Operations: Here we dive more into the “so what” of the revolution in bio science with a focus on how to inform and change your business strategy. See: Contextualizing Advancements in Bioengineering To Your Business Operations
The Future Now: The State of the Bioeconomy in 2023: The Bioeconomy in 2023 is showing clear signs of opportunities for advantage created by the exponential disruption of the industrial base (including that of defense), coupled with exponential biotechnological innovation to build the bioeconomy of the future. The State of the Bioeconomy in 2023 includes Exponential Organizational Ecosystems at Speed and Scale; Blockchain Technologies; Artificial Intelligence in Biotechnology, Genomics, Healthcare, and Medical Tech; Biomanufacturing in Cislunar Space; and Health Security and Cybersecurity Challenges. Details of current breakthroughs and strategic directions for each category can be found here.
Innovative Blockchain Technology Case Studies (by Industry Sector) – Blockchain Technologies in The Bioeconomy, Biotechnology, and Healthcare: Over the course of 2022 and 2023, The OODA Loop Blockchain Series has explored blockchain disruption in the market and new opportunities created by blockchain technologies in both the public and private sectors. Innovative blockchain technology efforts (by industry sector) – with a focus on how the blockchain enables new business models, opportunities for innovative value proposition design, and decentralized governance – are listed here.
Exponential Innovation and Building the Bioeconomy of the Future: Last year, we launched the Opportunities for Advantage Series to explore how exponential disruption and innovation require organizations to focus efforts to gain advantage. In a recent review of the series, we found that some patterns and groupings deserved to be highlighted to jumpstart the series for this year. To start, we found that the future of biotechnology was a cluster in the series, pointing to the opportunities for advantage created by the exponential disruption of the industrial base (including that of defense) coupled with exponential biotechnology innovation to build the bioeconomy of the future. The following posts are a primer on the potential of such an effort – including the challenges, threats, risks, and opportunities ahead for your organization in this technology and business ecosystem of the future.
The New Tech Trinity: Artificial Intelligence, BioTech, Quantum Tech: These technologies will make monumental shifts in the world. This new Tech Trinity will redefine our economy, both threaten and fortify our national security, and revolutionize our intelligence community. None of us are ready for this. This convergence requires a deepened commitment to foresight preparation and planning on a level that is not occurring anywhere. See: The New Tech Trinity.
The Revolution in Biology: This post provides an overview of key thrusts of the transformation underway in biology and offers seven topics business leaders should consider when updating business strategy to optimize opportunity because of these changes. For more see: The Executive’s Guide To The Revolution in Biology
Materials Science Revolution: Room-temperature ambient pressure superconductors represent a significant innovation. Sustainability gets a boost with reprocessable materials. Energy storage sees innovations in solid-state batteries and advanced supercapacitors. Smart textiles pave the way for health-monitoring and self-healing fabrics. 3D printing materials promise disruptions in various sectors. Perovskites offer versatile applications, from solar power to quantum computing. See: Materials Science
Planning for a Continuous Pandemic Landscape: COVID-19’s geopolitical repercussions are evident, with recent assessments pointing to China’s role in its spread. Regardless of the exact origins, the same conditions that allowed COVID-19 to become a pandemic persist today. Therefore, businesses must be prepared for consistent health disruptions, implying that a substantial portion of the workforce might always operate remotely, even though face-to-face interactions remain vital for critical decisions. See: COVID Sensemaking
“AI for Enterprise”: Lessons Learned from Healthcare, Hugging Face and Clinical Language Models: Healthcare is already in the midst of an AI revolution – with an applied technology market maturity which outpaces most other industry sectors which are in a reactive mode to the AI hype cycle. Explore these AI healthcare use cases and apply them to your organization using design and systems thinking.
The Future of Biosafety and the Global Gain-of-Function Research Ecosystem: Researchers from the Center for Security and Emerging Technology (CSET) recently mapped “the gain- and loss-of-function global research landscape.” We contextualize the CSET findings relative to the biosafety levels in U.S. biomedical laboratories. A looming question: do other countries have an adequate commitment to health security and biosafety measures in their high risk pathogen research?
An OODAcast Conversation – Joe Tranquillo on the Revolution in Biological Science: Joe Tranquillo is a Professor of Biomedical Engineering at Bucknell University and a provost at the school. He is also an author and speaker with a knack for helping make new and at times complex subjects understandable. In this OODAcast we discuss many aspects of the revolution in biological sciences with Joe, including topics like: new ways of delivering medicines that target specific tissues; discovery of the structure of almost every human protein; methods to synthesize biomolecules, which can result in ways to manufacture a wide range of materials like therapeutics, flavors, fabrics, food, and fuels; and new ways of growing food that are more productive and take fewer pesticides and fertilizers.
AI-powered Genomics: The convergence of machine learning, deep learning, and genomics, especially in the area of AI-powered genomic health prediction, while remarkably promising, will also present remarkably challenging unintended consequences. A recent report suggests areas that need to be explored – starting now – as “the issues posed by the…technologies become harder to predict, more complex and more numerous.”