Zip Code vs. Genetic Code

Illustration by Dan Page

When considering the risk of a given disease—cancer, cardiovascular problems, Alzheimer’s—what matters more: the genes inherited from parents and grandparents, or the environment? Is disease influenced more by DNA, or by factors such as air pollution levels, socioeconomic status, or even regional weather conditions?

It’s common to think of disease and health “as this tension of ZIP code versus genetic code,” explains Chirag Patel, assistant professor of biomedical informatics at Harvard Medical School.

But a study by Patel and his research team challenges this “either-or” thinking, using Big Data to tease apart the complex interplay of environment, genes, and other factors in disease. They analyzed an insurance database of almost 45 million people in the United States, Patel explains, zeroing in on 700,000 pairs of non-twin siblings and 56,000 pairs of twins, in what is likely the largest study of twin pairs to date. Studying identical twins is a common way to consider nature-versus-nurture questions because such siblings have identical genes and often grow up in the same environment. In typical twin studies, researchers must recruit participants and examine just one or two diseases at a time. But this massive preexisting database enabled Patel and his team to consider 560 different diseases at the same time.

“You have this huge sample size, which we all love in science,” he says, “but these types of data are not meant for this work.” Preparing the database for study was therefore a challenge. Because the data did not specify which siblings were twins, for example, postdoctoral fellow Chirag Lakhani, who led the analyses, isolated the twins by searching for family members born on the same day. The team also had to determine which twins were identical (with identical DNA) and which fraternal. Male-female twin pairs cannot be identical, but same-sex twins have an equal chance of being identical or fraternal. Working with colleagues at the University of Queensland in Australia, the Harvard team developed a statistical technique for estimating which of the same-sex pairs were identical. When they compared their findings with previous small-scale studies on twins and disease, “we found by and large that there was a strong correlation with the things that we were seeing,” Patel says.
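The twin-finding step described above, grouping family members who share a birth date, can be illustrated with a minimal Python sketch. It is not the team's actual code: the record layout (family ID, member ID, birth date, sex) and the function name `find_twin_pairs` are hypothetical, chosen only for illustration, and the sketch does not attempt the statistical estimate of which same-sex pairs are identical.

```python
from collections import defaultdict

def find_twin_pairs(members):
    """Return (id_a, id_b, same_sex) for siblings sharing a family and birth date.

    `members` is an iterable of hypothetical records:
    (family_id, member_id, birth_date, sex).
    """
    # Bucket every member by family and birth date.
    groups = defaultdict(list)
    for family_id, member_id, birth_date, sex in members:
        groups[(family_id, birth_date)].append((member_id, sex))

    pairs = []
    for siblings in groups.values():
        # Exactly two members born the same day are treated as a twin pair;
        # larger groups (triplets and beyond) are skipped in this sketch.
        if len(siblings) == 2:
            (id_a, sex_a), (id_b, sex_b) = siblings
            # Opposite-sex pairs must be fraternal; same-sex pairs stay ambiguous.
            pairs.append((id_a, id_b, sex_a == sex_b))
    return pairs
```

A pair flagged as opposite-sex can be labeled fraternal right away, while the same-sex pairs are the ones that would need a further statistical estimate like the one the article mentions.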

Of the 560 diseases studied, 40 percent had some genetic component, while the shared environment (elements such as air quality and average temperatures) played a role in at least 20 percent of the diseases. Unsurprisingly, most diseases involved a mix of genetic and environmental factors. But some conditions stood out for the strength of their genetic links, including pervasive developmental disorders such as attention deficit hyperactivity disorder, and psychiatric diseases such as schizophrenia or depression. In contrast, lead poisoning and eye diseases such as myopia and astigmatism were the most heavily influenced by environment.

The researchers acknowledge some gaps in their work. For example, all people in the study were covered by employer-sponsored health insurance, so at least one person in the family had a job, which made it complicated to sort out the influence of income on disease. “Trying to dig deeper into that question is a priority for us,” Patel says. In the future he hopes to do similar work with Medicare or Medicaid data, “which has coverage for people who would be facing health disparities.” Moreover, none of the subjects were more than 24 years old, so the study couldn’t capture how the influence of genes and environment might change as people enter middle age and beyond. Nor could the researchers explore how changes in an environment over time might influence health.

The work is important for confirming that large datasets can help researchers examine how numerous genetic and environmental factors interact at the same time, although Lakhani stresses that it takes painstaking effort to ensure that the data are used accurately. But the research also raises intriguing questions about additional disease factors. “For diseases that have neither a large shared environment, nor genetic, component,” Patel says, “we, the scientific community, need to get more serious about measuring specific environmental factors, such as diet, that can make twins different, or figure out how much is actually due to random chance.” 

Read more articles by: Erin O'Donnell
