VT College of Science Magazine Annual 2014 | Page 19
“My collaborators are research medical doctors and genetic epidemiologists who investigate the genetics and the molecular basis
of complex diseases – in particular cardiovascular disease, obesity,
and type II diabetes – and I provide statistics and statistical genetics expertise for designing their studies and integratively analyzing the resulting data,” Hoeschele said. “We know, for instance, that
obesity leads to type II diabetes, but we don’t understand the precise mechanism of how it happens, so they collect data that I help
analyze and interpret.”
When Hoeschele says “they collect data,” she uses the phrase casually,
but data collection has changed significantly since she
started analyzing pedigrees in humans and animals, trying to find
genes that segregated in families. Today’s data come from much larger
populations, span the entire genome, and include not just whether
or not a disease is present, but also genome-wide gene expression
and its potential epigenetic regulators.
“We measure different regulatory mechanisms of gene expression
on a genome-wide scale, such as micro-RNA expression and DNA
methylation – a chemical modification of DNA that doesn’t change
the sequence but can be inherited,” she said. The link between environment and genes can be seen in smokers, for example, who have
vast changes in their DNA methylation patterns.
“We can collect vast amounts of data, but the difficulty is interpreting them and determining what it is they tell us.”
up with a lot of false positives with so much data being analyzed.
At the same time, multiple high-dimensional datasets can provide
information that classical (e.g., single response) data cannot, for
example on how to fit a model that accounts for all major technical
and biological sources of variation. Using all the available information in the data, maximizing the power of discovery, and controlling
the rate of false positives is what modern statistics and statistical
genetics is all about.”
Hoeschele continues to make her mark in the quest she began as
a teenager to learn more about the genetics of disease, knowing
that one day the work she does in front of a computer will prove
invaluable to breakthrough discoveries leading to novel therapies
for human diseases.
“We collect data on thousands of people, millions of genetic markers,
tens of thousands of genes, and hundreds of thousands to millions
of epigenetic markers – and we have to interpret this vast amount
of data.”
Today, Hoeschele figures she can do in a day what would have
taken years to do just two decades ago. “The computational analysis
is the bottleneck in (human) genomics research today. The big issue
is that the data can be generated fairly easily, but trying to make
sense of so much information can make it hard to find out what is
really going on. If you test so many things at the same time, you can
have low power to actually find out