Biography
I am an Instructor in the Division of Biostatistics in the Department of Population Health Sciences (formerly known as the Department of Healthcare Policy and Research) at Weill Cornell Medicine. I received my PhD in Biostatistics in 2019 from The Ohio State University (dissertation) and completed my postdoctoral training in Mental Health Data Science at Weill Cornell.
My research these days mostly focuses on using big data (e.g., EHR, insurance claims data) to study mental health, especially disparities in youth mental health care. I recently received a K99 award to study risk and protective factors for Black youth suicide and suicidal thoughts and behaviors (K99 MH130713).
My doctoral dissertation focuses on the statistical analysis of co-location networks, which are a special type of two-mode networks where ties are only defined between the nodes of different modes. In the case of co-location networks, nodes represent individuals and geographic locations and ties indicate that the individual visits the location. I use co-location networks to study routine activity patterns of individuals within an urban setting. The primary objective, in particular, is to detect ecological communities of individuals based on the set of locations where they spend time, instead of their locations of residence. I use two approaches to achieve the goal: 1) latent Dirichlet allocation (LDA), a well-developed method in text data mining, and 2) hierarchical Bayesian non-negative matrix factorization (NMF), which is similar to LDA but can be adapted to make use of sparse finite mixture techniques to automatically determine the number of communities in a co-location network.
Besides my dissertation research, I was a Graduate Research Associate (GRA) at Ohio State’s Center of Excellence in Regulatory Tobacco Science (OSU-CERTS) from September 2015 to August 2018, where I provided data management and statistical analysis for two ongoing surveys: the Buckeye Teen Health Study (BTHS) and the Tobacco User Adult Cohort (TUAC).
Research Focus
Mental Health, Co-Location Networks, Social Networks, Spatial Statistics, Bayesian Methods
Education
PhD, Biostatistics
Defended: Jul 2019; conferred: Dec 2019
The Ohio State University, Columbus, OH
- Advisor: Catherine A. (Kate) Calder, PhD, FASA, FAAAS
- Dissertation: Community Structure in Co-Location Networks
MS, Statistics
Dec 2018
The Ohio State University, Columbus, OH
MS, Public Health
Aug 2014
The Ohio State University, Columbus, OH
MS, Biomathematics
Jun 2012
University of Science and Technology of China, Hefei, China
BS, Mathematics
Jun 2010
Northwest Normal University, Lanzhou, China
Work Experience
Instructor
Aug 2022 – Present
Weill Cornell Medicine, New York, NY
- Department of Population Health Sciences
Postdoctoral Associate
Aug 2019 – Jul 2022
Weill Cornell Medicine, New York, NY
- Department of Population Health Sciences (FKA Department of Healthcare Policy and Research)
Research Associate
Sep 2015 – Aug 2018
The Ohio State University, Columbus, OH
- Center of Excellence in Regulatory Tobacco Science (OSU-CERTS)
Research Associate
May 2015 – Aug 2015
The Ohio State University, Columbus, OH
- College of Public Health
Teaching Associate
Aug 2014 – May 2015
The Ohio State University, Columbus, OH
- College of Public Health
Research Intern
May 2013 – Aug 2013
The Ohio State University, Columbus, OH
- Department of Otolaryngology
Research Assistant
Feb 2012 – Jun 2012
University of Science and Technology of China
- Department of Mathematics
Teaching Assistant
Feb 2011 – Jan 2012
University of Science and Technology of China
- Department of Mathematics
Research
My postdoctoral training was at the intersection of mental health, data science, and health informatics. I was supported by an NIH-funded study, “Predicting Self-Harm, Suicide Attempt, and Suicidal Death using Longitudinal EHR, Claims and Mortality Data” (R01 MH119177). The majority of my work and training focused on using health insurance claims data to study suicidal ideation (SI) and suicide attempts (SA) at the US national level. Specifically, I summarized national healthcare utilization patterns of patients before and after a psychiatric hospitalization admitted through the emergency department (ED), and found that SI was highly prevalent at the time of the index event (29.88%). To further investigate geographic variation and neighborhood impact, I studied the effects of geographic region and neighborhood-level social deprivation on the clinical and demographic risk factors for SI and SA among US youth and adults (details in the neighborhood section below). While working on these projects, I became aware of the differences between the youth and adult populations, and started to focus more on studying suicidal thoughts and behaviors in youth.
In the summer of 2022, I independently mentored three master's-level students on their capstone project, which focused on identifying behavioral factors associated with Black youth’s suicidal thoughts and behaviors (STBs) using the 2019 Youth Risk Behavior Survey (YRBS) data. We found that feeling sad or hopeless and being bullied at school or electronically were positively associated with Black youth’s suicidal ideation, suicide planning, and suicide attempts, whereas ever use of cocaine or heroin was positively associated with Black youth’s attempt-related injuries. In the past two decades, among all racial/ethnic groups, Black youth have experienced the fastest increase in the suicide rate. To understand the unique factors contributing to the rise of Black youth suicide, from 2022 to 2023, I independently mentored another master's-level student on her portfolio project, which studied the temporal trends of risk factors for Black youth STBs in the US from 1991 to 2021. Using LASSO regressions, we found that the factors identified for all four outcomes (ideation, planning, attempts, injuries) were similar and consistent over time, including violent behaviors, substance use, body image concerns, and sex. The effects of all factors, however, remained unchanged over time, indicating that the increase in prevalence might be driven by other factors not captured by the study. A senior-authored manuscript is currently under revision.
Building upon my experiences, my K99 focuses on using EHR data to study Black youth suicide and STBs in NYC. I found that, among youth who contacted the healthcare system for a first SA diagnosis, Black youth, compared to White youth, were younger and had more all-cause healthcare encounters before and after the first SA but fewer mental health-related encounters after the SA. In addition, Black youth reported more physical comorbidities, especially pain- and breathing-related diagnoses, both before and after the first SA. To better pinpoint the root of youth suicidality, I started collaborating with child social worker Dr. Xiafei Wang to study the effects of Adverse Childhood Experiences (ACEs) on youth mental health and STBs. In a recent study, I conducted a moderated mediation analysis to study how the mediation effect of mental health problems is moderated by race. I found that mental health problems fully mediated the association between ACEs and STBs in White and Black youth, partially mediated it in Latinx youth, and had no mediation effect in Asian youth. We have submitted one R21 grant proposal on this topic.
People with substance use disorders have an elevated risk of STBs. However, engagement with health care services among this vulnerable population remains under-investigated. Using EHR data, we examined patterns of health care use, identified risk factors in seeking treatment, and assessed associations between outpatient service use and ED visits. We found that opioid use disorder (OUD) was associated with an increased use of outpatient, ED, and inpatient services, yet only one in ten OUD patients received medications for OUD (MOUD) treatment. Using outpatient mental health services was associated with reduced suicide-related ED visits.
In the summers of 2021 and 2024, I mentored three undergraduate students on studying the temporal trends and associations between STBs and substance use in US adolescents. We identified an upward trend in the effects of electronic vaping and marijuana use on STBs, and the associations were consistent across all racial/ethnic groups. In addition, we found that STBs were most prevalent among adolescents with illicit substance use, and least prevalent among those without substance use, suggesting that substance use is positively associated with STBs, with illicit substances having the strongest effect. An earlier version of the results was presented at the 2022 American Psychiatric Association Annual Meeting, and a senior-authored manuscript is in preparation (analysis complete).
Very recently, I started collaborating with social worker Dr. Jonathan Prince to (1) identify communities with a high need for naloxone access, and (2) evaluate the awareness and comfort level of non-traditional first responders who received training to administer naloxone during an opioid overdose.
My interest in the intersection of neighborhood/geography and health disparities (i.e., neighborhood-level social determinants of health) is deeply rooted in my pre-doctoral research on the adverse effects of social and racial segregation on youth’s short- and long-term health, as well as my PhD training in spatial statistics and network analysis. My dissertation research on identifying ecological communities revealed that residents of predominantly White neighborhoods of the city are more attached to an ecological community than residents of Black or mixed-race neighborhoods.
In tobacco regulatory research, my work has found that rural adolescents had higher self-reported tobacco marketing exposures; however, urban adolescents and those living in neighborhoods with a higher percentage of poverty had more potential exposures to tobacco retailers along their path between home and school.
Using claims data, I investigated the geographic variations of risk factors for suicidal ideation and attempts, as well as the impact of neighborhood-level social disparities on COVID-19 hospitalization and STBs among commercially insured psychiatric patients. I found that neighborhood social deprivation level affected STBs in youth and adult populations differently, suggesting that future community-based suicide prevention initiatives need to be tailored according to participants’ age and neighborhood deprivation level. In particular, more attention should be paid to youth residing in neighborhoods with middle levels of deprivation, a demographic group usually overlooked in suicide prevention.
Recently, we developed a data-driven methodology for defining ethnic enclaves using the readily available neighborhood-level survey/census data. By implementing our methodology, we identified Asian and Latinx enclaves in NYC and studied the effects of ethnic enclaves on self-reported depression and anxiety among senior Asian and Latinx residents. The analysis is complete and the manuscript is currently in preparation.
In an ongoing project, where I am the senior author of an invited paper, we are studying the effects of individual neighborhood-level social determinants on STBs in youth with schizophrenia.
In the summer of 2023, I joined, as a spatial epidemiologist, a cross-disciplinary research team consisting of physicians, urban environmental engineers, climatologists, foresters, environmentalists, and geographers, with the common goal of protecting the health of vulnerable communities during extreme heat events by developing evidence-based urban greening strategies. As a team, we have submitted one R21 (on which I am a co-I), one F32 (on which I am a collaborator), and one co-authored manuscript (currently under revision).
Teaching
@ Weill Cornell Medicine
HBDS 5018 Data Science I
Course Director, Autumn 2022 - 2024
This course provides an introduction to data science using the R programming language. In this course, students will gain experience working directly with data to pose and answer questions. Topics covered include reproducible research, exploratory data analysis, data manipulation, data visualization techniques, simulation design, and unsupervised learning methods.
HBDS 5020 Big Data in Medicine
Instructor, Spring 2024
Module II: Big administrative data for healthcare
There has been an explosion of big data in medicine and healthcare. There are four main sources of such big data: 1) administrative databases in healthcare, such as electronic health records and health insurance claims; 2) biomedical imaging (e.g., MRI, CT scan, X-ray); 3) sensors in smartphones and in wearable and implantable devices; and 4) genetics and genomics. It is difficult to navigate and critically assess the statistical methods and analytic tools that are needed to conduct analytics and research with such big biomedical data. This course will introduce these four important sources of big data in medical studies, discuss the nuances and intricacies of how such data are generated, and introduce tools to navigate, visualize, and describe such databases.
Advanced Statistical Methods for Observational Studies
Guest Lecturer
- Complex Survey Analysis, Spring 2022 - 2024
- Analysis Methods for Pre-Post Studies with Missing Post-Test Data, Spring 2022
SYSEN 5610 Introduction to the US Healthcare System, Data, and Interoperability
Guest Lecturer
- Health Insurance Claims Data – HCCI, Autumn 2024
Talks
Invited Talks
- (Forthcoming) ENAR 2025 Spring Meeting, New Orleans, LA, Mar 2025.
- "Challenges and Methodological Advances in Utilizing Big and Small Data for Depression Treatment"
- (Forthcoming) 2025 International Conference on Health Policy Statistics (ICHPS), San Diego, CA, Jan 2025.
- "Predicting Preventable and Psychiatric Hospitalizations in Late Middle-Aged Adults With Depression"
- 10th Annual Thomas R. Ten Have Symposium on Statistics in Mental Health, Boston, MA, Jun 2023.
- "Analysis of Big Data in Mental Health Research: Opportunities and Challenges"
Oral Presentations
- (Forthcoming) American Academy of Child and Adolescent Psychiatry (AACAP) 2024 Annual Meeting, Seattle, WA, Oct 2024.
- "Characteristics of Black and White Children and Youth With Suicide Attempts in New York City From 2016 to 2023"
- 11th Annual Thomas R. Ten Have Symposium on Statistics in Mental Health, New York, NY, Jun 2024.
- "Adverse Childhood Experiences, Mental Health Disorders, and Suicidality in Youth: A Moderated Mediation Analysis"
- Suicide Research Symposium (SRS), Virtual, Apr 2024.
- "Trends of Risk Factors for Black Youth Suicidality in the US From 1991 to 2021"
- Suicide Research Symposium (SRS), Virtual, Apr 2023.
- "Factors Associated With Black Youth Suicidality"
- Joint Statistical Meetings (JSM), Virtual, Aug 2021.
- "Overall AUC for Survival Models"
- American Medical Informatics Association (AMIA) 2020 Annual Symposium, Virtual, Nov 2020.
- "Healthcare Contacts Among Patients with Psychiatric Hospitalization Admitted Through the Emergency Department"
- Joint Statistical Meetings (JSM), Virtual, Aug 2020.
- "EHR Phenotyping of Depressed Patients: A Hierarchical Bayesian Latent Variable Modeling Approach"
- Joint Statistical Meetings (JSM), Denver, CO, Jul 2019.
- "Overlapping Activity Patterns and Community Detection in Ecological Networks"