Phi Le

About Me

My full name in Vietnamese is Lê Long Phi. I am from Vietnam. In my free time, I enjoy watching action movies and reading.

Education

  • PhD in Applied Mathematics (2016): University of Missouri - Columbia.

  • MS in Biostatistics and Data Science (2020): University of Mississippi Medical Center

Employment

  • Data Scientist (Biostatistician, Bioinformatician, Machine Learning, Deep Learning): University of California - San Francisco: 2022-present

  • Biostatistician: University of Mississippi Medical Center: 2018-2022

  • Mathematician: Syracuse University: 2016-2018

Recent Projects

Feature selection model applied to gene expressions

We developed a novel ensemble method that combines a Group Lasso model with a permutation-assisted (PA) technique to find the features associated with clinical outcomes of interest. Across various simulation scenarios, the PA technique allowed our model to achieve high accuracy while selecting fewer irrelevant features. Prediction models built on our selected features performed better, and the predictions remained robust across different types of prediction models.
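The permutation idea can be sketched in a few lines. This is a minimal illustration, not the published method: it swaps the Group Lasso for a plain absolute Pearson correlation score, and the function names, the toy data, and the 95% quantile threshold are all assumptions made for the example. A feature is kept only if its observed score beats the scores obtained after the outcome has been shuffled.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def permutation_select(features, outcome, n_perm=200, quantile=0.95, seed=0):
    """Keep features whose |correlation| exceeds a permutation-based threshold.

    features: list of columns, one list of values per feature (hypothetical layout).
    """
    rng = random.Random(seed)
    observed = [abs(pearson(col, outcome)) for col in features]
    null_max = []
    shuffled = list(outcome)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real feature-outcome link
        null_max.append(max(abs(pearson(col, shuffled)) for col in features))
    null_max.sort()
    threshold = null_max[int(quantile * (n_perm - 1))]
    return [j for j, score in enumerate(observed) if score > threshold]

# Toy data (made up): feature 0 tracks the outcome, features 1-4 are noise.
rng = random.Random(1)
outcome = [rng.gauss(0, 1) for _ in range(60)]
features = [[y + rng.gauss(0, 0.1) for y in outcome]]
features += [[rng.gauss(0, 1) for _ in range(60)] for _ in range(4)]
selected = permutation_select(features, outcome)
```

Taking the maximum score over all features in each permutation gives a threshold that controls how often any noise feature slips through, which is one simple way to reduce irrelevant selections.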

T-cell receptor - antigen peptide recognition prediction

Predicting the binding of T-cell receptors (TCRs) to peptides is essential for understanding the immune system and developing new immunotherapy treatments for diseases such as cancer. To achieve this, we developed various models using Deep Learning and Graph Neural Networks to encode letter-based amino acid sequences of TCRs into numerical values, which increases data variation. Additionally, we built a high-performance Bayesian classification model that estimates the probability of binding between TCRs and a list of antigen peptides and reports the uncertainty of its predictions.
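To make "encoding letter-based sequences into numerical values" concrete, here is a minimal one-hot encoding sketch. It is only the simplest baseline encoding, not the learned Deep Learning or Graph Neural Network encoders described above; the function name, the padding length, and the example sequence are assumptions for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot_encode(sequence, max_len):
    """Encode an amino acid string as max_len rows of 20 zeros and ones.

    Positions past the end of the sequence are padded with all-zero rows.
    """
    rows = []
    for pos in range(max_len):
        row = [0] * len(AMINO_ACIDS)
        if pos < len(sequence):
            row[INDEX[sequence[pos]]] = 1
        rows.append(row)
    return rows

# Hypothetical CDR3-like sequence, used only as an example input.
encoded = one_hot_encode("CASSLGQAYEQYF", max_len=20)
```

Each sequence becomes a fixed-size 20 x 20 matrix, which is the kind of numerical input a downstream neural network can consume.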

Gene clusters and their associations with ASD

I used weighted gene co-expression network analysis (WGCNA) and differential gene expression (DGE) analysis to identify gene modules associated with Autism Spectrum Disorder (ASD) and to detect significant gene expression differences relative to individuals without ASD.

Single cell analysis pipeline

We developed a pipeline that runs single-cell RNA-seq analysis from FASTQ files through to cell annotation using marker genes. Additionally, I performed statistical analysis to examine changes in immune cells across different treatments and time points.

Neoadjuvant CD40 Agonism Remodels the Tumor Immune Microenvironment in Locally Advanced Esophageal/Gastroesophageal Junction Cancer

In this project, I contributed to analyzing B-cell receptor similarities through network analysis and performed statistical analysis across different time points and features of B cells.

Effect of geographic and racial disparities on continuity of care and healthcare utilization among patients with obesity-associated chronic conditions

In this project, we applied a meta-analysis method involving practice-based research networks (PBRNs) for three large academic medical centers and one large integrated health care system in Tennessee, Mississippi, and Louisiana to study the effect of geographic and racial disparities on continuity of care and healthcare utilization among patients with obesity-associated chronic conditions.
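As a rough illustration of how site-level results can be combined in a meta-analysis, the sketch below pools per-site effect estimates with inverse-variance (fixed-effect) weights. This is an assumed, textbook pooling scheme rather than the exact procedure used in the project, and the four site estimates and standard errors are made-up numbers.

```python
import math

def fixed_effect_pool(estimates, std_errors):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]  # precise sites weigh more
    total = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)

# Hypothetical effect estimates (e.g. log hazard ratios) from four sites.
site_estimates = [0.10, 0.20, 0.30, 0.40]
site_std_errors = [0.1, 0.2, 0.1, 0.2]
pooled, pooled_se = fixed_effect_pool(site_estimates, site_std_errors)
```

Only summary statistics leave each site in this scheme, which is why pooling of this kind suits multi-institution studies where patient-level data cannot be shared.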

Spatial correlation and its effects on the continuity of care, length of stay at the hospital and readmission within 30 days

In our cohort study conducted in Mississippi, we examined whether increasing continuity of care protected patients with chronic diseases against emergency visits and hospital readmissions. To investigate this, we employed spatial-temporal models that took into account the influence of geographic location on the risk of emergency visits and hospitalizations. Notably, our approach diverged from classical linear regression by incorporating neighborhood correlation effects, which arise from Social Determinants of Health factors. This allowed for a more comprehensive analysis of the data and yielded insights into the potential benefits of improved continuity of care for patients with chronic diseases.
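One common way to check whether neighboring areas have similar values is Moran's I. The sketch below is a diagnostic illustration only, not the spatial-temporal model used in the study; the binary adjacency weights, the function name, and the four-region example are assumptions.

```python
def morans_i(values, neighbors):
    """Moran's I with binary adjacency weights.

    values: one measurement per region.
    neighbors: dict mapping a region index to the indices of adjacent regions.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0
    weight_total = 0
    for i, nbrs in neighbors.items():
        for j in nbrs:
            cross += dev[i] * dev[j]  # similar neighbors contribute positively
            weight_total += 1
    return (n / weight_total) * cross / sum(d * d for d in dev)

# Four hypothetical regions in a row, with rates rising from one end to the other.
rates = [1.0, 2.0, 3.0, 4.0]
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
moran = morans_i(rates, chain)
```

A value clearly above zero, as in this toy example, suggests that nearby regions resemble each other, which is the situation where ignoring spatial correlation in a regression can mislead.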

Medical tags from patients' descriptions of health questions

I used the BERT model to build a natural language processing framework that automatically assigns tags to users' health questions, based on their descriptions of medical issues.

Integrating TCR sequences and scRNA-seq data to cluster T cells

Using a GNN, we developed a novel model that integrates amino acid sequence-based TCR data with numerical gene expression counts. Our results showed that the model better clusters T cells that bind to the same antigen peptide, both on real data provided by 10X and on mouse data.

Some Observations of Gastric Cancer from the SEER database

In this retrospective cohort survival study, we used the Cox proportional hazards model to investigate whether undergoing surgery for non-related cancer issues would increase the lifespan of patients. In addition to our primary model, which was adjusted for age, race, and gender, we also fitted univariate models for age, gender, race, and surgery status, as well as a model for the interaction of surgery status and age. Our results showed that males had a 3% higher risk of death compared to females. Regarding race, the median survival times were 24 months for Asian/Pacific Islanders, 16 months for White individuals, 13 months for Black individuals, and the lowest at 12 months for American Indian/Native Alaskan individuals. We found conclusive evidence that surgery positively impacted patients' lifespans. Patients who underwent surgery had a median survival time of 36 months, while those who did not undergo surgery and those who were not recommended for surgery had median survival times of 11 months and 7 months, respectively.
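Median survival times like those quoted above are usually read off a Kaplan-Meier curve. The following is a minimal sketch of that calculation, assuming right-censored data; it is not the study's code, and the six follow-up times are invented for the example. Exact fractions are used so the comparison with one half is not affected by floating-point rounding.

```python
from fractions import Fraction

def km_median(times, events):
    """Return the Kaplan-Meier median survival time, or None if not reached.

    times: follow-up time per patient; events: 1 = death observed, 0 = censored.
    """
    survival = Fraction(1)
    at_risk = len(times)
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= Fraction(at_risk - deaths, at_risk)
            if survival <= Fraction(1, 2):
                return t
        at_risk -= leaving
    return None

# Hypothetical follow-up times in months; 0 marks a censored patient.
times = [5, 8, 12, 12, 20, 25]
events = [1, 1, 1, 0, 1, 0]
median_months = km_median(times, events)
```

In this toy cohort the estimated survival drops to exactly one half at 12 months, so that is the reported median; if survival never falls to one half, the median is undefined and the function returns None.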

Publications

    1. Le P, Ung L, Yang H, Huang A, He T, Zhang L. Prediction of Multi-Class Peptides by T-cell Receptor Sequences with Deep Learning. Pre-print. Forthcoming.
    2. Le P, Gong X, Ung L, Yang H, Keenan B, Zhang L, He T. A robust ensemble feature selection approach to prioritize genes associated with survival outcome in high-dimensional gene expression data. Front. Syst. Biol., 20 March 2024, Sec. Integrative Genetics and Genomics, Volume 4 - 2024. https://doi.org/10.3389/fsysb.2024.1355595
    3. Phi Le, Josh Mann, Jeannette Simino, Fazlay Faruque. Continuity of care indices and their associations with unplanned hospital readmission within 30 days. In preparation.
    4. Maira Soto, Erin L. Filbert, Hai Yang, Stephanie Starzinski, Alec Starzinski, Marissa Gin, Brandon Chen, Phi Le, Tony Li, Brandon Bol, Alexander Cheung, Li Zhang, Frank J. Hsu, Andrew Ko, Lawrence Fong, Bridget P. Keenan. Neoadjuvant CD40 agonism remodels the tumor immune microenvironment in locally advanced esophageal/gastroesophageal junction cancer. To appear in Cancer Research Communications, 2023.
    5. Phi Le, Joshua Mann, Satya Surbhi, Hao Mei, Yilu Lin, Lizheng Shi, Eboni G. Price-Haywood, Jeff Burton, Sohul A. Shuvo, Christopher D. Jackson, Ming Chen, James E. Bailey. Effect of geographic and racial disparities on continuity of care and healthcare utilization among patients with obesity-associated chronic conditions. In preparation.
    6. Satya Surbhi, Ming Chen, Sohul A. Shuvo, Eboni Price-Haywood, Lizheng Shi, Joshua Mann, Yilu Lin, Phi L. Le, Jeffrey H. Burton, James E. Bailey. Effect of continuity of care on emergency department and hospital visits for obesity-associated chronic conditions: A federated cohort meta-analysis. Journal of the National Medical Association, https://doi.org/10.1016/j.jnma.2022.07.001
    7. Lin Y, Bailey JE, Surbhi S, Shuvo SA, Jackson CD, Chen M, Price-Haywood EG, Mann J, Fort D, Burton J, Sandlin R, Castillo A, Mei H, Smith P, Leak C, Le P, Monnette AM, Shi L. Continuity of Care for Patients with Obesity-Associated Chronic Conditions: Protocol for a Multisite Retrospective Cohort Study. JMIR Res Protoc. 2020 Sep 9;9(9):e20788. PubMed Central PMCID: PMC7511855.
    8. Finsler Trudinger-Moser inequalities on $\mathbf{R}^2$, with Duy, N.T., Sci. China Math. (2021). https://doi.org/10.1007/s11425-020-1820-5
    9. Cylindrical Hardy type inequalities with Bessel pairs, with Duy Nguyen, Operators and Matrices Volume 15, Number 2 (2021), 485–495 doi:10.7153/oam-2021-15-34
    10. Continuity of Care for Patients with Obesity-Associated Chronic Conditions: Protocol for a Multisite Retrospective Cohort Study, with others, JMIR Res Protoc 2020;9(9):e20788, PMID: 32902394 PMCID: PMC7511855 DOI: 10.2196/20788
    11. Hardy and Caffarelli-Kohn-Nirenberg inequalities with nonradial weights, with Nguyen Tuan Duy, Nguyen Thanh Son, Electronic Journal of Differential Equations, Vol. 2020 (2020), No. 33, pp. 1–10. ISSN: 1072-6691. URL: http://ejde.math.txstate.edu or http://ejde.math.unt.edu
    12. A note on the second order geometric Rellich inequality on half-space, with Duy, N.T., Lam, N., Monatsh Math 195, 233–248 (2021). https://doi.org/10.1007/s00605-020-01490-9
    13. Sharp affine Trudinger-Moser inequalities: A new argument, with N Duy and N Lam, Canadian Mathematical Bulletin, 1-14, 2020, DOI: https://doi.org/10.4153/S0008439520000806
    14. Quantum divergences with p-power means, with N Lam, Linear Algebra and its Applications 609, 289-307, 2020, https://doi.org/10.1016/j.laa.2020.09.009
    15. Hardy Inequalities and Caffarelli-Kohn-Nirenberg inequalities with radial derivative, with Nguyen Tuan Duy, Weijia Yin, Journal of Mathematical Inequalities, Volume 14, Number 2 (2020), 501–523, dx.doi.org/10.7153/jmi-2020-14-32
    16. Sharp Trudinger-Moser inequalities with homogeneous weights, with Duy, Nguyen Tuan; Nghia, Le Trung; Electron. J. Differential Equations 2019, No. 105, https://ejde.math.txstate.edu/Volumes/2019/105/duy.pdf
    17. Carleson measure estimates and the Dirichlet problem for degenerate elliptic equations, with Steve Hofmann and Andrew Morris, Analysis & PDE, Volume 12, No. 8, 2019, DOI: 10.2140/apde.2019.12.2095
    18. BMO solvability and absolute continuity of harmonic measure, with Steve Hofmann, The Journal of Geometric Analysis, Volume 28, Issue 4, pp 3278–3299, https://doi.org/10.1007/s12220-017-9959-0
    19. The weak-$A_{\infty}$ property of harmonic and p-harmonic measures implies uniform rectifiability, with Steve Hofmann, Kaj Nystrom, Jose Maria Martell, Analysis & PDE, Vol. 10, No. 3, 2017, DOI: 10.2140/apde.2017.10.513
    20. Nonlinear versions of Stampacchia and Lax-Milgram theorems and applications, with Duong M. Duc and Nguyen H. Loc, Nonlinear Anal. 68 (2008), no. 4, 925–931, https://doi.org/10.1016/j.na.2006.11.048

Miscellaneous: Analysis pipelines, code, and others

Please visit my GitHub for code and pipelines covering single-cell RNA-seq analysis, WGCNA, deep learning models, spatial statistical analysis, etc.