Methodological Research

The methodological research of the faculty of the Department of Biostatistics and Bioinformatics involves developing and applying statistical methodology in search of answers to medical and public health questions. BIOS faculty members, post-docs, and students are involved in a variety of traditional (such as clinical trials and Bayesian methodology) and contemporary (such as imaging statistics and bioinformatics) research areas. Details on these methodological research areas and examples of specific research projects in each area can be found below.

Agreement studies have wide and important applications in biomedical research and clinical practices since data from multiple observers or measurement methods commonly occur in such settings. For example, when a new assay or instrument is developed, it is important to assess whether the new method can reproduce the results of a traditional method or of a gold standard. Our department has established a strong research program in developing agreement methodology for complex outcomes in biomedical studies. 

Research conducted by our faculty includes:

  • Development of new statistical methods to extend the existing agreement paradigm to handle multiple scale (continuous/ordinal) measurements
  • Development of nonparametric as well as parametric approaches to assess agreement for survival outcomes which involve censored or truncated observations
  • Development of agreement methods to investigate the alignment between traditional behavior/clinical outcomes and the emerging high-dimensional neuroimaging data
  • Development of agreement methods for assessing comparability among brain images acquired from multi-center neuroimaging studies
  • Agreement methods for data resulting from studies where each observer makes replicated or repeated readings on each study subject

Faculty: Ying Guo, Michael Haber, Amita Manatunga, Limin Peng

Bayesian methods have become increasingly popular due to computational advances. Bayesian inference focuses on probability distributions characterizing unknown parameters through both prior knowledge and the data. One key advantage of Bayesian methods is the ability to integrate various sources of information in a model-based framework that can address complicated dependence structures in data often encountered in today’s real-world problems.

Bayesian approaches are particularly useful in settings where: (1) the statistical model is highly complex, (2) the sample size is small, (3) prior knowledge needs to be incorporated in the analysis, and (4) a comprehensive approach to quantify different sources of uncertainty is needed.
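
As a minimal sketch of how prior knowledge and data combine in a small-sample setting, consider a binary outcome with a conjugate Beta prior. The model choice, function names, and numbers below are illustrative assumptions, not a method attributed to the faculty listed here.

```python
def beta_binomial_posterior(prior_a, prior_b, successes, failures):
    """Update a Beta(prior_a, prior_b) prior with binomial data.

    With a Beta prior on a response probability and binomial data,
    the posterior is again a Beta distribution whose parameters are
    the prior parameters plus the observed counts.
    """
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical example: prior belief centered on a 20% response rate,
# followed by a small study with 3 responders out of 5 patients.
prior_a, prior_b = 2.0, 8.0
post_a, post_b = beta_binomial_posterior(prior_a, prior_b, successes=3, failures=2)

print("prior mean:    ", beta_mean(prior_a, prior_b))
print("posterior mean:", beta_mean(post_a, post_b))
```

The posterior mean falls between the prior mean and the observed proportion, which is the sense in which the prior stabilizes estimates when the sample size is small.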

Our department has an active group of faculty working in the development and application of Bayesian methodology for a wide range of applications, including bioinformatics (Dr. Zhaohui Qin), causal inference (Dr. Qi Long), clinical trials (Dr. Nelson Chen), environmental statistics (Drs. Howard Chang and Lance Waller), and high-dimensional data analysis (Drs. Suprateek Kundu and Qi Long).

Faculty: Howard Chang, Nelson Chen, Suprateek Kundu, Qi Long, Zhaohui (Steve) Qin, Lance Waller

Bioinformatics is an interdisciplinary field developing methods and software tools to analyze high-dimensional data generated from biomedical experiments. With biomedical data sets becoming larger, more diverse and more complex, information science and biostatistics play a larger role in the biomedical sciences. Bioinformatics is a very diverse field, with applications ranging across DNA sequence analysis, protein structure assessment, models of molecular evolution, and imaging analysis. Research in our department concentrates primarily on omics: genomics, epigenomics, and metabolomics.

High throughput technologies, such as next generation sequencing and mass spectrometry, produce massive amounts of data compounded with substantial noise and uncertainties, and present a rich variety of fascinating and challenging scientific questions for biostatistics researchers. Our faculty’s research has resulted in novel statistical methodologies, as well as powerful and efficient algorithms and software developed for biomedical researchers. In addition, our faculty members collaborate extensively with biologists and clinicians at Emory and elsewhere to assist efforts to identify novel biological insights from these rapidly expanding experimental data.

Faculty: Karen Conneely, Jeanne Kowalski, Suprateek Kundu, Zhaohui (Steve) Qin, Hao Wu, Tianwei Yu

Our work in the area of biostatistics and bioinformatics in cancer research continues to grow with the support of faculty members such as Drs. Yuan Liu, Jeffrey Switchenko, and Zhengjia Chen from the Biostatistics and Bioinformatics Shared Resource (overseen by Dr. Jeanne Kowalski at the Winship Cancer Institute). Faculty research is motivated by state-of-the-art cancer data arising from ongoing retrospective and prospective study designs.

Dr. Chen has extensive experience with Phase I trial development and applications in cancer. Drs. Liu and Switchenko have experience with the analysis of large cancer databases, such as the National Cancer Data Base and the Surveillance, Epidemiology, and End Results (SEER) Program. Dr. Kowalski has experience in clinical bioinformatics through her work on designing and analyzing studies that examine the clinical relevance of results from large genomic studies and that integrate genomic data types to test hypothesized patterns of molecular change.

Faculty: Zhengjia (Nelson) Chen, Jeanne Kowalski, Qi Long, Yuan Liu, Jeffrey Switchenko

The goal of causal inference is to make inference on causation and treatment effects using data often collected from observational studies (subject to complications such as selection bias and confounding) as well as data from complex clinical trials. Our faculty members are actively engaged in methodological and collaborative research in this area.

Examples include:

  • Development of a general class of hybrid trial designs that combine features of treatment randomization and patient choice of treatments; such trials are useful in behavioral intervention studies where treatment assignments cannot be blinded and strong motivation is often required to maintain compliance
  • Extension of propensity score approaches to non-binary treatment regimens
  • Semiparametric methods for estimating the effect of non-binary treatment regimens

Faculty: Qi Long, Bob Lyles

Clinical trials (prospective studies to evaluate the effect of interventions in humans under pre-specified conditions) represent the gold standard for quantifying the health impacts of given treatments and interventions. Our faculty members maintain active research in advancing the practice of clinical trials by enhancing experimental designs, developing novel analytic tools, and providing user-friendly statistical software.  Examples include:

  • Improving the accuracy of maximum tolerated dose estimation and trial efficiency of Phase I clinical trials of toxicity by treating toxicity response as a quasi-continuous variable and fully utilizing all toxicities
  • Extending Phase I trials with dose escalation with overdose control designs to allow under-dose control, flexible patient administration, and full utilization of partially completed data
  • Improving the power and efficiency of Phase II trials by treating tumor clinical response as a continuous variable (percentage of tumor shrinkage or increase) instead of a categorical variable
  • Increasing the success rate of Phase III clinical trials in cancer by establishing a stronger relationship between tumor responses in Phase II trials and survival outcomes in Phase III trials
  • Proposing hybrid trial designs that combine features of randomization and patient choice in the assignment of treatments
  • Proposing Bayesian methods for monitoring and predicting patient accrual in ongoing clinical trials in order to incorporate multiple levels of uncertainty

In addition, department faculty members have been involved in the development and testing of a measure of treatment success through the Illness Density Index. This index uses the longitudinal area under the response curve and is particularly useful for medical device studies, where switching treatments is less feasible and long-term outcomes are of great interest.

Faculty: Zhengjia (Nelson) Chen, Mary Kelley, Jeanne Kowalski, Michael Kutner, Qi Long, Reneé Moore

Neuroimaging techniques have become an increasingly important tool in clinical research to help diagnose, treat, and even prevent brain diseases. To garner a richer understanding of the human brain, there has been a sharp increase in national and international neuroimaging projects and initiatives. In recent years, imaging statistics has emerged as one of the fastest growing research areas in biostatistics.

The main goal of imaging statistics is to develop and apply state-of-the-art statistical methods to help extract the most relevant and accurate information from imaging data to advance scientific understanding of human brain function in healthy as well as diseased subjects. Our department hosts one of the first imaging statistics research centers in the country, the Center for Biomedical Imaging Statistics (CBIS). CBIS currently develops statistical methods for data acquired from various imaging modalities including functional and structural magnetic resonance imaging, magnetic resonance spectroscopic imaging, and positron emission tomography. CBIS faculty and students have conducted statistical methodological research in:

  • Brain network analysis using analytical tools such as independent component analysis and graphical models to understand brain architecture and neural circuits
  • Imaging-based predictive modeling that aims to extract relevant features in imaging data to predict individual disease status and treatment response
  • Reproducibility of imaging studies using analytical tools such as agreement methodologies and meta-analysis
  • Imaging genetics, which integrates neuroimaging and genetic data to investigate how genetic variations affect brain structure and function and, in turn, lead to alterations in subjects’ behavioral and psychiatric outcomes

In addition to methodological research, CBIS has collaborated with imaging researchers from Emory's Departments of Psychiatric and Behavioral Sciences, Radiology, and Biomedical Engineering, as well as the Winship Cancer Institute.

Faculty: Ying Guo, Suprateek Kundu

Biostatisticians have made and continue to make important contributions to the study of infectious diseases. Stochastic models are used to estimate the rates at which individuals move from susceptible (S) to infected (I) and from infected to recovered (R), as in the classic SIR model. Statistical methods evaluate how well vaccines and other interventions prevent infections, morbidity, and mortality. They are also used to predict the incidence of infectious diseases in different locations over time.
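
The S to I to R transitions can be illustrated with a small simulation. The sketch below is a simple discrete-time stochastic (chain-binomial style) SIR model; the function name, the parameter names `beta` and `gamma`, and all numeric values are assumptions chosen for illustration, not a model used by the faculty.

```python
import math
import random


def simulate_sir(s0, i0, r0, beta, gamma, steps, seed=0):
    """Simulate a discrete-time stochastic SIR epidemic.

    beta  : assumed transmission rate per time step
    gamma : assumed per-step probability that an infected person recovers
    Returns a list of (S, I, R) counts, one tuple per time point.
    """
    rng = random.Random(seed)
    n = s0 + i0 + r0
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _step in range(steps):
        # Probability that a given susceptible person is infected this step,
        # which grows with the current fraction of infected people.
        p_infect = 1.0 - math.exp(-beta * i / n)
        new_infections = sum(1 for _person in range(s) if rng.random() < p_infect)
        new_recoveries = sum(1 for _person in range(i) if rng.random() < gamma)
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical population of 1,000 with 10 initial infections.
history = simulate_sir(990, 10, 0, beta=0.3, gamma=0.1, steps=100, seed=1)
print("final (S, I, R):", history[-1])
```

Because each run draws random infections and recoveries, repeated simulations with different seeds give a distribution of epidemic trajectories rather than a single curve.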

Statistical analyses of infectious disease data are challenging since standard assumptions of independence among individuals often do not apply. Complicating issues even further, infection and illness status can be misclassified easily. Moreover, exposure to an infectious agent is often difficult to quantify (we usually do not know who infected whom). Faculty at the Department of Biostatistics and Bioinformatics use statistical models and methods to address these and related issues.

Faculty: Julie Clennon, Michael Haber, Vicki Hertzberg, Christina Mehta, Lance Waller

Missing data are ubiquitous in medical and epidemiologic research. Specific examples include survey nonresponse, missed clinical visits by study subjects, patient drop out, respondents refusing to answer certain items on a questionnaire, or data lost in transcription. Inadequate handling of missing data is known to lead to biased and less precise results.

Consequently, the development and application of methods dealing with missing data draws substantial interest and remains a very active area of research. To this end, many statistical methods have been developed for conducting appropriate statistical inference in the presence of missing data. Four common methodologies for handling missing data are likelihood-based approaches, multiple imputation, Bayesian methods, and semi-parametric methods, including those based on inverse-probability weighted estimating equations.
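
To make the inverse-probability weighting idea concrete, the sketch below estimates a mean when some outcomes are missing, weighting each observed value by the inverse of its probability of being observed. It uses the normalized (Hajek-type) form of the weighted mean and assumes the response probabilities are known; in practice they would be estimated, for example with a logistic regression model. All names and numbers are hypothetical.

```python
def ipw_mean(values, observed, response_prob):
    """Inverse-probability weighted mean of partially observed outcomes.

    values        : outcomes (entries for unobserved subjects are ignored)
    observed      : True where the outcome was observed
    response_prob : assumed probability that each outcome is observed
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for y, seen, p in zip(values, observed, response_prob):
        if seen:
            # Each observed subject stands in for 1/p subjects like them.
            weighted_sum += y / p
            weight_total += 1.0 / p
    return weighted_sum / weight_total


# Hypothetical data: the second subject's outcome is missing.
values = [1.0, None, 3.0]
observed = [True, False, True]
response_prob = [0.5, 0.5, 1.0]
print("IPW mean:", ipw_mean(values, observed, response_prob))
```

Subjects who were unlikely to be observed receive larger weights, which corrects the bias of the simple complete-case mean when the response probabilities are correctly specified.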

Measurement error in covariates and misclassification of categorical data are also common issues in medical and epidemiologic research. Examples include measures of CD4 count and viral load in HIV/AIDS studies, blood pressure in cardiovascular disease research, and dietary intake in cancer prevention. Ignoring such error can result in substantial estimation bias in analyses and misinterpretation of results.

Statistical methods to account for covariate measurement error have been under active development—particularly functional modeling methods for nonlinear models. Such methods do not impose distributional assumptions on the unobserved true covariates and are thus appealing for their robustness. Structural measurement error models also permit flexibility in such settings, despite the need for further modeling or distributional assumptions. Similar methods have been extended to handle complex missing data mechanisms based on supplemental sampling designs.

Our faculty is particularly interested in advancing statistical methodology in:

  • Robust imputation methods
  • Imputation methods for big data
  • Functional modeling methods for covariate measurement error
  • Structural modeling methods to handle missing data, misclassification, and measurement error via efficient validation or reassessment study designs

Faculty: John Hanfelt, Yijian (Eugene) Huang, Qi Long, Bob Lyles

Public health data are increasingly being collected with geospatial information. The analysis of spatially referenced data provides opportunities for a wide variety of methodological and applied statistical research. These approaches often involve the use of spatially correlated random effects within generalized linear mixed models to accurately estimate fixed effects accounting for the presence of spatial correlation.

Research typically uses geographic information systems to manage and visualize data, and Bayesian hierarchical models to examine associations between outcomes and possible explanatory variables. The department’s faculty members are involved in numerous research projects developing spatio-temporal models for a wide range of applications. Examples include:

  • Infectious disease (spatial dynamics of raccoon rabies, malaria, and schistosomiasis)
  • Ecology (spatial patterns in sea turtle nesting)
  • Epidemiology (measuring and mapping disparities in disease burdens and accessibility to health care/sanitation)
  • Exposure assessment (data assimilation of satellite imagery and ground monitor exposure data)
  • Environmental health (estimating the health impacts of air quality, extreme heat, and climate change)

Faculty: Howard Chang, Julie Clennon, Lance Waller

Statistical genetics focuses on disease gene mapping, that is, linking inherited genetic variations to disease and drawing inference on genetic drivers of disease risk. Statistical genetics is mostly centered on human genetics, where results of the Human Genome Project are changing the practice of medicine and public health, allowing human genetics to play a more central role in all the biomedical sciences.

Our faculty members develop new statistical methodology to explore these issues as well as new algorithms and popular software. Specific research foci include methods for design and analysis of large-scale data sets from genome-wide association studies and next-generation sequencing studies, using both unrelated and related individuals. These developments present remarkable opportunities for the prevention and cure of human diseases, allowing investigators to work at the interface between human genetics and the mathematical sciences. In addition, our faculty members collaborate closely with geneticists, molecular biologists, clinicians, and bioinformaticians to address real-world questions of human health and disease.

Faculty: Karen Conneely, Michael Epstein, Yijuan Hu, Yi-An Ko, Glen Satten

In addition to methodological work directly related to the health sciences, our faculty members also engage in fundamental research in statistical theory. One area of theoretical research addresses the problem of “many nuisance parameters”, which arises when there is substantial heterogeneity in the population that is not of main scientific interest but that must be accounted for in order to arrive at valid inference and robust conclusions. The presence of many nuisance parameters is pathological: it invalidates standard methods of statistical inference.

Department faculty members have developed approaches to reduce or eliminate the harmful effects of nuisance parameters in either the full likelihood context (e.g., relaxed conditional likelihood under a rectangular array asymptotic setting) or the estimating function context (e.g., composite conditional score functions, orthogonal second-order locally ancillary estimating functions, and G-ancillary estimating functions). These methods are designed to be computationally feasible and robust while avoiding unnecessary modeling assumptions.

Another area of research in statistical theory concerns the application and adaptation of empirical process methods to provide accurate and reliable inference for complex data structures. Examples include the development of classification strategies and the study of global quantile regression in high-dimensional settings.

Faculty: John Hanfelt, Yijian (Eugene) Huang, Limin Peng

Epidemiology and environmental health represent two highly important traditional disciplines in the broad field of public health. Our faculty members are engaged in methodological and collaborative research motivated by studies of air pollution and health, HIV and cancer epidemiology, and associations between biomarker levels (e.g., in blood or urine) and reproductive health outcomes.

Key ongoing public health problems provide the motivation for much of the methodological research conducted by department faculty in this area. Areas of particular interest include but are not limited to:

  • Methods for causal inference in observational studies
  • Methods for handling missing and mismeasured data
  • Methods for assessing agreement between multiple biomarkers of exposure and/or disease
  • Spatial analysis and geographic information systems
  • Modeling the dynamics of outbreaks of infectious disease in space and time
  • Survival analysis and quantile regression relating to limit-of-detection of environmental exposures
  • High-dimensional measures of lifetime exposures (the exposome)
  • Statistical genetics and estimation of gene-environment interactions
  • Imaging statistics, including remote sensing images as markers of environmental exposure

Collaborative ties with faculty in Rollins' Department of Epidemiology and Department of Environmental Health enhance the opportunities and breadth of impact associated with ongoing research in these areas.  

Faculty: Howard Chang, Julie Clennon, Bob Lyles, Amita Manatunga, Limin Peng, Tianwei Yu, Lance Waller

Survival analysis addresses time-to-event data, which arise routinely in clinical trials and observational follow-up studies. One distinguishing focus of survival analysis is the ability to draw information from incomplete observations of time-to-event responses in real data settings, addressing complications known as censoring, competing risks, and truncation.

Methodology has been well established for traditional types of survival data, where assumptions (such as independent censoring and independent truncation) are deemed reasonable. Techniques such as the Kaplan-Meier curve, log-rank test, and Cox’s proportional hazards regression model have been well accepted and are widely used in many areas across biomedical research. Despite the success of these standard survival analysis techniques, there has been increasing attention to their limitations in practical scenarios where their underlying assumptions are considered unrealistic.
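
As an illustration of how these standard techniques draw information from censored observations, the following is a minimal sketch of the Kaplan-Meier estimator written from its textbook definition. The function name and the toy data are assumptions for illustration, and independent censoring is assumed.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    idx = 0
    while idx < len(data):
        t = data[idx][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects whose follow-up ends at the same time t.
        while idx < len(data) and data[idx][0] == t:
            n_events += data[idx][1]
            n_leaving += 1
            idx += 1
        if n_events > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without changing the estimate.
        n_at_risk -= n_leaving
    return curve


# Hypothetical data: the subject followed to time 2 is censored.
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
for t, s in curve:
    print(t, s)
```

The censored subject contributes to the risk set at time 1 but not afterward, so partial follow-up still informs the estimate instead of being discarded.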

There are also many interesting research problems arising from the rapid development of new, high-dimensional data structures applied to new investigative goals. Examples of such problems include assessment of dynamic survival processes, screening and selection of high-dimensional survival predictors, and delineation of fine-tuned or personalized treatment effects on survival. These challenges provide an exciting outlook for survival analysis methodological research in the future, requiring creative integration with other modern developments of statistical techniques.

Dynamic regression provides another research direction currently under active development by department faculty. Classical models, including the proportional hazards model and accelerated failure time model, presume constant effects of covariates. Such a constancy assumption, however, is not realistic in many applications where the effects of covariates may evolve over time. For instance, the effectiveness of an AIDS drug typically erodes over time due to drug resistance. To address this issue, quantile regression provides a popular and flexible approach that allows covariate effects to vary across data quantiles. Department faculty members are actively involved in developing quantile regression methods that can appropriately handle special features of survival data.
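
The building block of quantile regression is the check (pinball) loss, whose minimizer is a quantile rather than a mean. The sketch below shows this for the simplest case of a model with no covariates and no censoring; it does not address the special features of survival data that the methods above are designed for. Function names and data are hypothetical.

```python
def check_loss(residual, tau):
    """Check (pinball) loss for quantile level tau in (0, 1)."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def best_constant(values, tau):
    """Find the data point that minimizes the total check loss.

    For a constant-only model, a minimizer of the check loss is always
    attained at one of the observed values, so a search over the data
    points is sufficient.
    """
    def total_loss(candidate):
        return sum(check_loss(y - candidate, tau) for y in values)

    return min(values, key=total_loss)


# Hypothetical data with one extreme value.
values = [1, 2, 3, 4, 100]
print("tau = 0.5:", best_constant(values, 0.5))
print("tau = 0.9:", best_constant(values, 0.9))
```

Changing `tau` moves the fitted value through the distribution of the data; in a regression model with covariates, the same loss lets each covariate effect differ from one quantile level to another.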

Faculty: Ying Guo, Yijian (Eugene) Huang, Amita Manatunga, Limin Peng