
Scientists are using data from 100,000 people to find serial killers 

It is now possible to do everything from finding genetic predictors of disease to tracking murderers.

by Yewande Pearse

Humans have an inherent social drive, and in this age of social media, we are more connected than ever. However, by constructing the world’s largest family tree, comprising 125 million people, computational geneticist Yaniv Erlich has shown that some of these connections run deeper, down into our genes. Erlich, who is a professor and researcher at Columbia University and CSO of MyHeritage.com, is revolutionizing the field of genomics by linking genealogical data provided online by volunteers to DNA with striking accuracy. Earlier this year, Erlich and his colleagues sent a shock wave through the field of genetics by showing that it is possible to uncover the identities of males who have taken part in “anonymous” genetic research without ever matching their data to a sample of their DNA. All you really need is the internet.

This story originally appeared on Massive Science, an editorial partner site that publishes science stories by scientists.

“Smoking … determines 10 years of our life expectancy, which is twice as much as what our genetics determines.”

Genomic data is incredibly powerful. It can reveal migration patterns, or uncover interesting details like the distance people move from their place of birth to procreate. But more importantly, genomic data allows us to ask questions about human health, like how much genetic variation accounts for differences in individual life spans. Large family trees allow us to analyze both close relatives and distant relatives, teasing apart the effects of genetic variation from those of environmental factors. Erlich, for example, found that genes account for only 15 percent of the differences in individual life spans, on average about five years. Speaking about these surprising findings, Erlich says, “I think there is this notion that there is some fountain of youth in our genome, and we just have to find the gene to unlock it. But it doesn’t seem this is the case.” Erlich explains that since 1960, life spans have increased linearly by about two months every year. Despite two World Wars and the many other catastrophes of the 20th century, life spans continued to grow steadily. Erlich says these findings mean that our actions might matter more than our genes. “Smoking, for example, determines 10 years of our life expectancy, which is twice as much as what our genetics determines.”

While genes seem to have relatively little impact on our life span, genomic data has allowed us to identify risk factors for a number of diseases. Using genome-wide association studies (GWAS), it’s possible to link genetic variants in different individuals to particular traits. When the results are plotted across the genome, the more statistically significant the links are, the more the data looks like the skyline of Manhattan. Ten years ago, Erlich says, these Manhattan plots actually looked more like the skyline of Los Angeles. But bigger sample sizes have become easier for researchers to access, thanks to initiatives like the UK Biobank, and an increasing number of genetic risk factors are being identified as a result. Using data from more than 100,000 donors, obtained through the website DNA.land, Erlich has himself been able to discover the genetic bases for several traits in Israeli families.

A Manhattan plot. The bars that rise higher than the rest are the ones of interest.

Ikram et al 2010 PLoS Genetics
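For readers who want to see what those bars measure, here is a minimal sketch in Python. It assumes the usual convention that each bar's height is the negative base-10 logarithm of a variant's p-value, so smaller p-values make taller bars. The variant names and p-values are made up for illustration, and the 5e-8 cutoff is a threshold commonly used in GWAS, not a figure taken from Erlich's work.

```python
import math

# Commonly used genome-wide significance cutoff (an assumption here;
# the article does not state which threshold any study used).
GENOME_WIDE_P = 5e-8


def manhattan_heights(p_values):
    """Convert p-values to Manhattan-plot bar heights: -log10(p)."""
    return [-math.log10(p) for p in p_values]


def significant_variants(variants, threshold=GENOME_WIDE_P):
    """Return the names of variants whose p-value falls below the threshold."""
    return [name for name, p in variants if p < threshold]


# Hypothetical variants and p-values, for illustration only.
variants = [
    ("variant_a", 0.2),
    ("variant_b", 1e-3),
    ("variant_c", 1e-9),
    ("variant_d", 3e-12),
]

heights = manhattan_heights([p for _, p in variants])
hits = significant_variants(variants)

for (name, _), height in zip(variants, heights):
    marker = "  <-- skyscraper" if name in hits else ""
    print(f"{name}: height {height:.1f}{marker}")
```

With larger samples, real associations yield smaller p-values, which is why the skyline grows taller as databases grow.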

With the help of civilian genealogy enthusiasts, genomic data is changing not only the landscape of health care, but forensics too. In April, thanks to the website GEDmatch, the FBI was able to link DNA from the unidentified Golden State Killer to one of the suspect’s third cousins, who had voluntarily provided their own DNA to the free online genealogy database. By building a large family tree and scanning its branches until they found a profile that exactly matched what they knew about the serial killer, they were able to track down the suspect, test his DNA, and charge him.
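The tree-scanning step can be pictured with a short Python sketch. This is not the investigators' actual method or software; it only illustrates the idea of walking every branch below a shared ancestor and keeping the people who fit a known profile. The `Person` class, the names, and the profile fields (sex and birth-year range) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    name: str
    sex: str
    birth_year: int
    children: List["Person"] = field(default_factory=list)


def descendants(person):
    """Yield every descendant of `person`, walking each branch depth-first."""
    for child in person.children:
        yield child
        yield from descendants(child)


def candidates(ancestor, sex, earliest_birth, latest_birth):
    """Names of descendants matching the (hypothetical) known profile."""
    return [
        p.name
        for p in descendants(ancestor)
        if p.sex == sex and earliest_birth <= p.birth_year <= latest_birth
    ]


# A made-up family: one shared ancestor with two branches.
tree = Person("Ancestor", "F", 1875, [
    Person("Ada", "F", 1900, [
        Person("Carl", "M", 1925, [Person("Dan", "M", 1945)]),
    ]),
    Person("Ben", "M", 1902, [
        Person("Eve", "F", 1928, [
            Person("Fay", "F", 1950),
            Person("Gus", "M", 1947),
        ]),
    ]),
])

shortlist = candidates(tree, sex="M", earliest_birth=1940, latest_birth=1950)
print(shortlist)
```

In this toy tree the scan narrows seven relatives down to two, each of whom would still need to be confirmed with a direct DNA test, as the suspect was.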

Erlich is impressed by the power of genomics to improve demography, healthcare, and forensics. But he agrees there are many issues that still need to be addressed. For example, since these databases primarily contain people of European descent, non-European populations with certain genetic risk factors are missed, while risk factors identified in these European populations may not have the same implications for other groups. The most obvious reason for this disparity is economics. But many genealogy websites are free, and the price of DNA tests has dropped to as little as $49. Another reason may be access to family records. As Erlich says, “My family died in the Holocaust, so I have no means to go beyond a certain number of generations. It’s all lost.” A lack of record-keeping is also a problem for many populations. There’s also the question of social influence. “If I know someone who is doing genealogy, I’m now more willing to also do it. When you start with one community, it spreads from that community unequally.” Erlich does not have the answers for how to remedy the issue of diversity in databases, but believes that governments, at least in countries equipped with the resources, should take greater responsibility for driving genomic medicine.

Serge Melki via Wikimedia Commons

Another complex issue is privacy. When it comes to genetic information, many of us are concerned that employers and insurance companies may use this information unethically. According to the Genetic Information Nondiscrimination Act of 2008 (GINA), employers and insurance companies cannot use our genetic information without our consent. But there are some major loopholes; for example, GINA doesn’t apply to life insurance. There’s also the question of how law enforcement should be allowed to use genetic information. The Golden State Killer case in particular raises many questions about privacy. Interestingly, 60 percent of Americans of European heritage (because they are over-represented in databases) have relinquished genetic information that could be used by law enforcement, and within three years, this number is expected to rise to 99 percent. Erlich says he’s not scared of these techniques being abused. He’s more worried about national security. “I’m more concerned about foreign governments using the same techniques to identify US individuals. Think about CIA operation in some countries. The whole point is that it’s covert: you don’t know the identities of these people. It’s very easy to disguise your face and get a fake passport, but you can’t change your DNA.” At the end of the day, there are no easy answers. “It’s a tricky question of justice, and how to define that,” he says, pointing toward the need to make genetic information part of a public good, rather than being used for monetary gain. But the limits may be hard to find. He says, “I don’t know what’s the right answer.”
