Launched in 1990 and completed in 2003, the Human Genome Project provided the basis for a wide range of studies of human biology and medicine. In the years that followed, advances in next-generation sequencing (NGS), coupled with data from the Human Genome Project, led to a better understanding of how DNA sequences vary from person to person, aiding physicians in uncovering genetic alterations that may be benign, pathogenic, or of unknown clinical significance. Today, the next revolution in “data rich” sequencing technologies is expected to increase our collective understanding of how inherited DNA differences can contribute to disease risk and response to treatment.
Genes are small sections of DNA within the genome that code for proteins. They contain the instructions, in essence a recipe, for our individual (hereditary) characteristics, like brown eyes or black hair. And because genes also contain information related to disease, gene expression profiles provide insights into new ways to prevent, diagnose, or treat human disease. Understanding the relationships between specific genes and particular diseases has further led to novel diagnostic and treatment approaches, including targeted, individualized, and personalized medicine.
For example, oncologists today treat some types of cancer that were untreatable a decade ago. Liquid biopsies, combined with novel genetic sequencing technologies, have become routine, helping physicians to understand an individual patient’s risk of developing a specific illness.
A Genetic Atlas of All Cells
To query the brain, geneticist Steven A. McCarroll, Ph.D., of the Department of Genetics at Harvard Medical School, proposed going beyond the data collected in the Human Genome Project and developing an atlas of all the cells in the human body, including the brain.
Such an atlas, McCarroll argues, will help physicians understand, in precise detail, how specific genes work and reveal their interactions within the brain.
To accomplish this ambitious undertaking, scientists in McCarroll’s lab developed “Drop-seq,” a novel technology that enables simultaneous analysis of RNA expression in thousands of individual cells, a scale that was not previously possible. By profiling each individual cell, the approach could point to new ways of treating mental illnesses such as schizophrenia.
According to McCarroll, this data-rich approach, which may reveal the biological basis of mental disease, has turned an ever-larger “repository” of information into a “Big Data” problem, one in which progress in our understanding can be dramatically accelerated by using powerful computers, math, and statistics in tandem with biology.
Harnessing Big Data
Today, the collaboration of McCarroll’s team with more than 50,000 scientists from 30 countries is generating large datasets. These should not be defined as “big” based on size alone. Lieutenant Colonel Josh Helms, a data analyst serving as a Research Fellow in the Army’s Training With Industry program, describes Big Data as “a phenomenon that is the result of the rapid acceleration and exponential growth in the expanding volume of high-velocity, complex, and diverse types of data.”
Managing these complex and diverse data types without proper data management tools will limit the benefit of the original, underlying research.
EvidentIQ’s eClinical software solutions give scientists access to a powerful data management system, allowing them to collect data according to Good Clinical Practice (GCP) standards and prepare it for analysis and review.
Image Credit: National Cancer Institute | visualsonline.cancer.gov