Genetic Data, Privacy, and Peril

A call for greater protections amid the US’s fractured legal landscape

Bobbie Dousa
January 16, 2020
February 10, 2020

Sixteen years ago, the first full DNA sequence of an individual’s genome was completed. This endeavor, the Human Genome Project, took thirteen years and three billion dollars to achieve. While nearly two decades ago one had an approximately one in six billion chance of knowing someone who had had their DNA sequenced, today rapid advances in sequencing technologies have made it possible for individuals to sequence their genome in mere hours for less than $1,000. The advent of what sociologist Jenny Reardon hails as our “postgenomic era” has seen the application of genomic data across a myriad of domains including direct-to-consumer testing services, biomedical research, healthcare, and forensics. The possibilities for the application of such data are only growing. The proliferation of this highly personalized data has brought compelling ethical questions regarding civil liberties and autonomy, privacy and intrusion to the fore of American public life.

A plethora of studies indicate that patients, professionals, and the general public vary widely in the degree to which they are concerned about what is sometimes referred to as “genetic privacy.” A 2018 study including nearly 50,000 participants found that when participants were asked, “Are you worried about genetic privacy?” the vast majority responded in the affirmative. This research, however, also signals the complexity and confusion that pervades the notion of “genetic privacy.” The authors note that respondents frequently conflated privacy with confidentiality, control, and security. In addition, participants varied widely in how much control they desired regarding the use of their genetic data. This study and numerous others (including my current work) indicate that people are often more comfortable sharing their genetic data with physicians and researchers than they are sharing this data with the government, commercial entities, or insurers. Ethicists and genetic privacy experts are beginning to allot greater attention to how sociocultural factors influence people’s opinions and decisions with regard to their genetic data. One domain in which scholars are actively engaged in appraising the intricacies and pitfalls of this issue is, appropriately, the realm of law. The current legal landscape, these experts warn, is riddled with potential and existent perils.

The Legal Landscape: What is and isn’t protected

In the United States, federal and state law on genetic data constitutes a legal landscape that is both poorly understood and fractured. With no uniform protections across all fifty states, different laws govern genetic data depending on what the data is being used for and where it is stored. Attesting to the contention and lack of clarity characterizing these laws, the National Institutes of Health (NIH) is funding LawSeq, a multi-year project in which legal and medical experts endeavor to analyze existing US law and regulation on translational genomics, convene national conferences, generate consensus, and publish recommendations regarding what the laws surrounding this issue should be. The LawSeq working group has also constructed a searchable public database of every state and federal ordinance, guideline, and statute pertaining to genetic data. Despite these efforts, pitfalls abound in navigating these haphazard regulations, and most experts agree that current legal protection for genetic information is inadequate.

There are a limited number of federal laws governing the use of genetic data within certain circumstances. The most significant statutes include:

  • GINA
  • The Affordable Care Act
  • The US Common Rule
  • The 21st Century Cures Act.

Enacted in 2008, GINA, or the Genetic Information Nondiscrimination Act, is a law that attempts to prohibit discrimination based on a person’s genetic data with respect to employment and health insurance. GINA forbids employers from using a person’s genetic information as the basis for decisions regarding hiring, firing, placement, and promotion. It also forbids health insurers from denying coverage to healthy persons or charging a person higher premiums if they demonstrate genetic predispositions to developing a disease in the future. GINA does not apply to disability, life, or long-term care insurers; schools; the Indian Health Service; the Veterans Health Administration; members of the military; athletic programs; mortgage lenders; or the health benefits of federal employees. While GINA fails to offer its protections to patients who are already manifesting symptoms, the Affordable Care Act (ACA) of 2010 prohibits health insurance companies from denying coverage to a person based on pre-existing conditions. Many legal experts, such as law professor Mark Rothstein, deem the ACA to be the “best genetic nondiscrimination act” the US has ever enacted. Predictably, the Trump administration is currently challenging the legality of the Affordable Care Act within the federal court system.

The US Common Rule and the 21st Century Cures Act regulate the use of genetic data in the realm of research. The Common Rule, published in 1991, was designed to reinforce federal regulations for human subjects research. It requires all federally funded projects to obtain informed consent from participants in genomics studies. That is, participants must be informed of all possible risks, the uses of research results, and with whom their information might be shared. This statute is limited in that it applies only to research funded by the national government and does not apply to any de-identified genetic information. A provision of the 21st Century Cures Act of 2016 attempts to bolster the Common Rule’s protections by providing participants in federally funded research with a Certificate of Confidentiality guaranteeing that researchers are prohibited from collecting a subject’s genetic data and releasing it to either government agencies or law enforcement.

HIPAA, or the Health Insurance Portability and Accountability Act of 1996, established protections for patients’ identifiable health data, such as genetic information within a person’s electronic health record, by designating this data as Protected Health Information and limiting when and with whom this information may be shared. HIPAA applies only to health care providers and insurance companies. These entities are forbidden from disclosing protected genetic health information for underwriting purposes. This data may not be given to schools or employers under HIPAA; however, law enforcement agencies may access a person’s health information without a warrant if that person is a victim or suspect in a criminal investigation. HIPAA does not apply to academic institutions, direct-to-consumer genetic testing firms or other companies, scientific consortia, or federal agencies. Furthermore, HIPAA’s protections do not extend to personal health data that has been de-identified.

None of these laws explicitly ensures privacy or security with regard to how genetic information can be accessed, disclosed, or utilized. For over a decade, the US government has failed to enact broad legislation that uniformly regulates the protection of genetic information. In contrast, Europe and the United Kingdom have already begun harmonizing the legal parameters that surround the use and collection of genetic data. Implemented in May 2018, the European Union’s General Data Protection Regulation, or GDPR, designates genetic information, broadly defined, as personal data and limits how it can be collected and used in healthcare and research (this legal framework will continue to apply to the UK regardless of Brexit). Ensuring uniform compliance with this recent law is an ongoing effort, as its scope and definitions require further legal interpretation by the courts of member states. For example, the GDPR enumerates conditions for processing personal data that falls under research exemptions and decrees that this data shall be subject to appropriate “safeguards.” Yet it remains the responsibility of member states to define and determine the terms of those safeguards.

Potential Perils Pertaining to a Lack of Strong Regulations

Although US legal scholars contend the concept of “privacy” itself is assailed by the disarray of meanings it calls forth, most in the medical, scientific, and legal communities agree that genetic data is especially sensitive and worth protecting. They caution that the misuse of genetic information can engender severe social, psychological, political, and economic consequences. Although some implications of misuse remain hypothetical, a wealth of cautionary examples is already available as concrete evidence in their calls for stronger protections.

Genetic information differs critically from other forms of personal information. As Stanford Law professor and Director of Consumer Privacy at the Center for Internet and Society, Jen King, explains:

“DNA is unlike any other data that you share or have collected. It is uniquely identifiable, and it’s unchangeable. It’s yours. Forever. You could change your social security number. You could change your name. You can’t change your DNA. I think people are used to sharing and giving away a lot of info about themselves, but this is different. You could infer things about other people from your data. DNA is shared with your family. When you give that up, you potentially give up other people’s identifiability and privacy as well.”

As King explains, the nature of genetic information can expose you and your kin, across generations and geography, to issues pertaining to identity tracing or re-identification. In the last two years, the contentiousness of this issue has surfaced in several ways. One such way is through the increasing accounts of “paternity breaches,” whereby men who anonymously donated to a sperm bank decades ago are found and contacted by their kin. These relatives discover the donors’ identities by comparing their own sequenced DNA to that of others housed in free online databases (e.g., GEDmatch), where users upload their sequenced DNA from direct-to-consumer companies (e.g., 23andMe, MyHeritage, Ancestry), and then sleuthing through public records. King foresees other possible legal quandaries related to paternity tests. For example, a person could undertake a direct-to-consumer DNA test and find that:

“His father is not his father because he has a bunch of cousins that aren’t related to any part of his family. Let’s say his father is still alive. Can he use it to sue his father or compel him to take a test? That’s probably more of what we might see. I can imagine someone saying have my parents fraudulently kept information from me my whole life. That I was adopted or one parent wasn’t my parent, and we have a bad relationship and I want to sue them.”

In 2018, Science published a study declaring that 90 percent of white Americans will be identifiable through genetic genealogy within a year or two. Yaniv Erlich, the lead author of the study, stresses that this is “not the distant future, it’s the near future.” Over 15 million people have given samples of their DNA to be sequenced by companies such as Ancestry and 23andMe in the past several years alone. This study also found that an “anonymized” genetic profile obtained from a medical data set could be uploaded to GEDmatch and subsequently identified. Under the US’s current legal framework (i.e., HIPAA), provided that a person’s name, birthdate, social security number, and fourteen other identifiers are removed from their sequenced DNA, their genomic data can be legally shared between researchers, bought and sold by data mining companies, and posted to public databases. Over the past decade, genomic privacy researchers have repeatedly demonstrated that with enough data, it is possible to re-identify a supposedly “anonymized” genome.

King and other scholars, such as UC Davis law professor Elizabeth Joh, are also disturbed by the lack of legal limits surrounding genetic genealogy as it develops into a popular instrument for law enforcement agencies. In April 2018, police arrested a Californian man accused of committing a series of unsolved murders and sexual assaults stretching back to the 1970s. By taking a DNA sample from one of the crime scenes, testing it for DNA markers, and uploading the data to GEDmatch, police were able to home in on the alleged “Golden State Killer.” By comparing the data from the crime scenes to the million-plus profiles on GEDmatch, authorities found several distant cousins of the assailant. They then compared this information with genealogical records and crime locations to identify and arrest him. Since the explosive arrest of the alleged Golden State Killer, police have arrested over fifty suspects in the past year using this novel method. One of these suspects is a 17-year-old high school student whose DNA was collected by a school resource officer from discarded milk and juice boxes that the student had thrown in his school cafeteria’s trash, and then provided to the police. The high school student was charged, not with murder or rape, but with assault. Joh explains that genetic genealogy, especially as employed in this particular case, raises urgent questions. If this technique is used to arrest an individual on assault charges, what else might police use it for? Might they use it in cases of trespassing, or even shoplifting? How are the courts to parse the issue of consent? Currently, databases like GEDmatch and direct-to-consumer companies only require a user to click a button in their terms of service to opt in or out of allowing law enforcement to access their data for genetic genealogy purposes. As it stands, one person’s consent concurrently exposes their siblings, aunts, uncles, parents, cousins, and other distant kin to this method without their knowledge or consent.

Beyond local law enforcement agencies, the federal government (including the US military and the State Department) might also request genetic information to exploit this method. In King’s view, this scenario is much more likely to occur than an employer exploiting an employee via a loophole in the GINA statute. In January 2019, journalists from BuzzFeed revealed that the Federal Bureau of Investigation had been searching not just GEDmatch but also the genealogical database of the prominent genetic testing company FamilyTreeDNA. The FBI already maintains a database of nearly 14 million DNA profiles, which includes people convicted of crimes, criminal suspects, people on parole, people on probation, and individuals who were merely arrested. A global vantage points to the troubling implications of government-run, mass genetic surveillance. In February of this year, journalists reported that the American biotech company Thermo Fisher had been selling equipment to the Chinese government. Chinese officials were using this equipment (including sequencing machines) to build a DNA database of the country’s marginalized Uighur population using blood samples collected under the guise of a free health checkup program. In an interview with CNBC, Marcy Darnovsky, the executive director of the Center for Genetics and Society, explained the threat this might pose to marginalized communities in the US:

“There’s great concern in the law-enforcement context both about civil liberties in general and about disproportionate impact on communities of color, because they are already disproportionately in contact with police.” Given that genomics research and the genetic testing industry are already plagued by a lack of diversity in their samples, King further contends: “To the extent that poverty/low income is intertwined with the criminal justice system…a focus on using these databases to identify criminals will create unease or distrust, especially among historically targeted populations.”

Outside of GINA’s protections, there are no restrictions or use-purpose limitations that pertain to consumer-level DNA testing. King cautions:

“While all the sites today are not selling that data, there’s no reason these companies couldn’t change their mind at some point and decide, ‘Hey, we’re going to sell it.’”

Companies in this industry have already partnered with data brokers and leading pharmaceutical companies to offer customers’ genomic data to aid private research endeavors, provided that their customers have consented to this on their website. 23andMe alone has partnered with Pfizer, Genentech, and GlaxoSmithKline. King points out that:

“People do think they are helping the world, helping society, even though they may not as an individual benefit…but if your DNA helps develop a drug for a pharmaceutical company, there is nothing governing what they do. It could be a drug they sell at a high profit but doesn’t help the world become a better place.”

Furthermore, the law does not explicitly prohibit direct-to-consumer testing companies from selling customers’ genomic data to insurers or sharing it with them.

In addition, consumer genetic testing companies and databases are vulnerable to potential security attacks. For example, over 92 million MyHeritage accounts were hacked last year, although the company reported that the DNA data of these profiles was not breached. In a broader sense, computer scientists have also expressed concern that machine learning algorithms that work with genomic data are potentially susceptible to security breaches. Consumer-level DNA testing companies can also go out of business or be sold, which creates further security threats. Darnovsky insists that what happens to customers’ genetic data in those scenarios is rarely transparent. Often, these companies reserve the right to amend their policies regarding data privacy, ownership, and distribution at any time, though they may ask their users to agree to the new terms. Finally, genetic data remains especially sensitive as we simply do not know all that can be discerned from a person’s sequenced DNA. It remains uncertain what future discoveries the science of genomics might illuminate, as well as what future risks these discoveries might precipitate.

Proposed Interventions

Legal scholars, biomedical researchers, computer scientists, and genetic privacy experts are sounding the call for a legal overhaul of the statutes that protect genetic information. Researchers affiliated with LawSeq, such as Barbara Evans, the director of the Center for Biotechnology and Law at the University of Houston, reason that as genetic information is no longer adequately safeguarded by the protections of HIPAA and GINA, Congress and other legislative bodies may need to pass a broadly applicable, special-purpose genetic privacy law. These researchers also deem it necessary for US policymakers to address the issue of de-identified genetic data. Although legislatures could regulate DNA as personal identifying information in an attempt to close the legal loopholes of genetic genealogy, LawSeq affiliates caution that such a law would not prevent individuals from adding their personal genomes to online databases for ancestry purposes. As a result, Joh and other legal scholars assert that state legislatures and attorneys general can and must act to set up guidelines concerning genetic surveillance and policing by law enforcement agencies, while Congress and the Federal Trade Commission could address the privacy and security issues of consumer genetic data. Although legal experts like Jen King do not necessarily advocate for stricter controls on genetic data within biomedical contexts, they do stress the need to regulate the practices of commercial genetic testing companies and data mining firms. Fortunately, many consumer testing companies are invested in preserving the trust of their customers. A few have formed an inter-market privacy coalition, re-committed to strengthening their consent clauses, and released public statements declaring they are opposed to willingly cooperating with law enforcement.

Within the realm of biomedical research, moreover, ethicists and researchers similarly stress the need to rethink current regulations for securing consent. They advocate for a shift from the paradigm of mere consent to frameworks of accountability that attend to participants’ evolving concerns and adhere to ongoing commitments of responsible use of participant samples. They argue that as political surroundings, public opinion, the type of information collected, and the application of this data necessarily shift, researchers must build responsive systems of consent. Consent practices, they argue, must integrate not only ongoing assessments of the risks and implications of their research but also frequent monitoring of patient attitudes, beliefs, and perspectives. Given that people’s participation in research is often framed as an act of participating in society predicated on contributing to the common good, ethicists assert that more needs to be done to guarantee reciprocity or, in other words, to ensure that these participants are also benefiting from the research. This begins, they contend, with a willingness to address historical injustices that have contributed to the mistrust that certain groups continue to hold with respect to biomedical research. For some, distributing broad benefits in genetics and genomics research involves making research and research instruments publicly available so that they are not tethered to the limited access that often characterizes commercial arrangements. Ethicists also explain that research organizations can engage in capacity-building, in which more richly resourced research organizations join forces and share resources with less well-resourced organizations and community participants.

Finally, given that it is virtually impossible to ensure anonymity for genetic information, researchers in medicine, law, and computer science also recommend establishing restrictions on how genetic data is stored and repurposed. Some, like Yaniv Erlich, endorse the idea of attaching cryptographic signatures to genetic profiles and using blockchain technology to curb potential abuses. Others advocate for utilizing methods of obfuscation. One of these methods is referred to as “differential privacy.” In this method, noise is introduced to portions of the genetic profile to prevent re-identification and repurposing of the data as well as to control access. Whatever the technical approach, experts across the fields of law, biomedical science, healthcare, and computer science broadly agree on the urgency of stronger legislative protections.
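
To make the idea of differential privacy a little more concrete, here is a minimal sketch of one textbook technique, randomized response, in Python. It is not the method of any researcher named above. It assumes each person's profile is reduced to a single hypothetical 0/1 marker (carrier or non-carrier of some variant), and the function names, cohort size, and the value of epsilon are all illustrative assumptions. Each marker is reported truthfully with probability e^epsilon / (1 + e^epsilon) and flipped otherwise, so no single report can be trusted, yet the overall carrier rate can still be estimated.

```python
import math
import random


def keep_probability(epsilon):
    """Probability of reporting a marker truthfully under randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomize_markers(markers, epsilon, rng):
    """Return a noisy copy of a list of 0/1 markers; each one is flipped
    independently with probability 1 - keep_probability(epsilon)."""
    p = keep_probability(epsilon)
    return [m if rng.random() < p else 1 - m for m in markers]


def estimate_carrier_rate(noisy_reports, epsilon):
    """Debias the observed share of 1s to estimate the true carrier rate.

    If f is the true rate and p the keep probability, the expected observed
    share is f * p + (1 - f) * (1 - p); solving for f gives the line below.
    Requires epsilon > 0 so that 2p - 1 is positive.
    """
    p = keep_probability(epsilon)
    observed = sum(noisy_reports) / len(noisy_reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Hypothetical cohort (an assumption for illustration): 10,000 people,
# 30% of whom carry the variant, one marker per person.
rng = random.Random(0)
cohort = [1] * 3000 + [0] * 7000
reports = randomize_markers(cohort, epsilon=1.0, rng=rng)
print(round(estimate_carrier_rate(reports, epsilon=1.0), 3))
```

A smaller epsilon means more flips and stronger privacy, at the cost of a noisier estimate. Note that this sketch protects one marker per person; a real genetic profile contains many correlated markers, and the privacy loss accumulates across them, which is part of why obfuscation alone is not treated as a substitute for legal protection.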

  • Written by Roberta Dousa, Patient Experience Researcher at
  • Edited by Belle Taylor, Strategic Communications and Partnerships Manager at
  • Thanks to John Cassidy and Geoffroy Dubourg Felonneau for valuable discussions
