Research Summary

In April 2015, I became the inaugural Director of the Bakar Computational Health Sciences Institute at the University of California, San Francisco, charged with recruiting new computationally focused faculty to UCSF.

My prior decade was spent at Stanford University, where I advanced from Assistant to Full Professor and Division Chief by developing and using bioinformatics methods to integrate, leverage, and reason over genomic and other molecular and clinical data sets to yield tools for physicians and patients. Examples of this approach include work on cancer drug discovery (PNAS, 2000), type 2 diabetes (PNAS, 2003, 2012), fat cell formation (Nature Cell Biology, 2005), obesity (Bioinformatics, 2007), and transplantation (PNAS, 2009). To facilitate this work, we developed tools to index public genomic data sets (Nature Biotechnology, 2006), reuse gene expression data (Nature Methods, 2007, 2010; Nature Communications, 2015), and use cloud computing (Nature Biotechnology, 2010). With these methods, we explore human physiology using electronic health record data (Science, 2008; Science Translational Medicine, 2014), estimate medical risk with whole genomes (Lancet, 2010), and computationally reposition drugs (Science Translational Medicine, 2011). In newer work, we are studying entire medical systems through real-world clinical data (Journal of Clinical Investigation, 2020).

My research lab currently has 3 graduate students, 7 post-doctoral research fellows, and 3 staff members. I have successfully administered multiple research projects, including the NIAID ImmPort data archival repository, collaborated with many other researchers around the world, and continue to produce many peer-reviewed publications from each project. I have been heavily invested in teaching and mentoring. I am currently training or have trained 30 post-doctoral scholars in my research lab, many of whom obtained prestigious research positions after departing. Twelve graduate students are completing or have completed their PhD work in the lab, including two members of underrepresented minorities.

Research Funding

  • April 1, 2020 - March 31, 2025 - Computational models of naturally acquired immunity to falciparum malaria, Co-Principal Investigator. Sponsor: NIH, Sponsor Award ID: U01AI150741
  • September 1, 2015 - August 31, 2021 - Stanford and Northrop Grumman proposal for the Oncology Models Forum, Principal Investigator. Sponsor: NIH, Sponsor Award ID: U24CA195858
  • September 26, 2016 - April 30, 2021 - Integrative Analysis of Genomic, Epigenomic and Phenotypic Data for Disease Stratification of Endometriosis, Co-Principal Investigator. Sponsor: NIH, Sponsor Award ID: R01HD089511
  • April 15, 2014 - August 31, 2020 - Biorepository of Human iPSCs for Studying Dilated and Hypertrophic Cardiomyopathy, Co-Principal Investigator. Sponsor: NIH, Sponsor Award ID: R24HL117756
  • September 1, 2003 - March 31, 2019 - Adaptive and Innate Immunity, Memory and Repertoire in Vaccination and Infection, Co-Investigator. Sponsor: NIH, Sponsor Award ID: U19AI057229
  • September 27, 2016 - March 31, 2018 - California Precision Medicine Consortium, Co-Principal Investigator. Sponsor: NIH, Sponsor Award ID: OT2OD024611
  • September 30, 2006 - March 31, 2017 - Enabling new translational discoveries using a genomic data-driven nosology, Principal Investigator. Sponsor: NIH, Sponsor Award ID: R01GM079719
  • July 12, 2010 - June 30, 2016 - Vaccination and infection: indicators of immunological health and responsiveness, Co-Investigator. Sponsor: NIH, Sponsor Award ID: U19AI090019
  • September 23, 2008 - July 31, 2014 - Comparative functional genomics for lung cancer gene discovery, Co-Principal Investigator. Sponsor: NIH, Sponsor Award ID: R01CA138256
  • September 30, 2008 - September 29, 2013 - Integrating Microarray and Proteomic Data by Ontology-based Annotation, Principal Investigator. Sponsor: NIH, Sponsor Award ID: R01LM009719
  • January 15, 2005 - May 31, 2008 - Creation and Application of a Diabetes Knowledge Base, Principal Investigator. Sponsor: NIH, Sponsor Award ID: K22LM008261

Education

Brown University, Providence, RI, A.B./Honors, 1987–1991, Computer Science
Brown University Medical School, RI, M.D., 1991–1995, Medicine
Children’s Hospital and Harvard, Boston, MA, Residency, 1995–1998, Pediatrics
Children’s Hospital and Harvard, Boston, MA, Fellowship, 1998–2001, Pediatric Endocrinology
Massachusetts Institute of Technology, MA, Sc.M., 1998–2002, Medical Informatics
Harvard Medical School and MIT, MA, Ph.D., 2002–2004, Health Sciences and Technology

Honors & Awards

  • 2002, 2003
    Outstanding Speaker, American Association for Clinical Chemistry (awarded twice)
  • 2006
    PhRMA Foundation Informatics Research Starter Grant
  • 2006
    Howard Hughes Medical Institute Physician-Scientist Early Career Award
  • 2007
    Genome Technology magazine “Tomorrow’s Principal Investigator” award
  • 2008
    American Medical Informatics Association New Investigator Award
  • 2009
    Elected into the American College of Medical Informatics
  • 2010
    Young Investigator Award, Society for Pediatric Research
  • 2011
    National Human Genome Research Institute (NHGRI) Genomic Advance of the Month
  • 2012
    Recognized for Outstanding Scientific Accomplishment and Lectureship by the NIH Director (Wednesday Afternoon Lecture Series, WALS)
  • 2013
    Elected into the American Society for Clinical Investigation (ASCI)
  • 2013
    Awarded White House Champion of Change in Open Science
  • 2014
    Kavli Frontiers of Science Invited Fellow for the Indonesian-American Symposium, National Academy of Sciences
  • 2014
    E. Mead Johnson Award, Society for Pediatric Research
  • 2015
    Elected to the National Academy of Medicine (NAM)

Selected Publications

  1. Miao BY, Chen IY, Williams CYK, Davidson J, Garcia-Agundez A, Sun S, Zack T, Saria S, Arnaout R, Quer G, Sadaei HJ, Torkamani A, Beaulieu-Jones B, Yu B, Gianfrancesco M, Butte AJ, Norgeot B, Sushil M. The MI-CLAIM-GEN checklist for generative artificial intelligence in health. Nat Med. 2025 Feb 06.
  2. Frouard J, Telwatte S, Luo X, Elphick N, Thomas R, Arneson D, Roychoudhury P, Butte AJ, Wong JK, Hoh R, Deeks SG, Lee SA, Roan NR, Yukl S. HIV-seq reveals global host gene expression differences between HIV-transcribing cells from viremic and suppressed people with HIV. bioRxiv. 2024 Dec 20.
  3. Chen LC, Zack T, Demirci A, Sushil M, Miao B, Kasap C, Butte A, Collisson EA, Hong JC. Assessing Large Language Models for Oncology Data Inference From Radiology Reports. JCO Clin Cancer Inform. 2024 Dec; 8:e2400126.
  4. Winge MCG, Nasrallah M, Jackrazi LV, Guo KQ, Fuhriman JM, Szafran R, Ramanathan M, Gurevich I, Nguyen NT, Siprashvili Z, Inayathullah M, Rajadas J, Porter DF, Khavari PA, Butte AJ, Marinkovich MP. Repurposing an epithelial sodium channel inhibitor as a therapy for murine and human skin inflammation. Sci Transl Med. 2024 Dec 11; 16(777):eade5915.
  5. Kim LY, Schüssler-Fiorenza Rose SM, Mengelkoch S, Moriarity DP, Gassen J, Alley JC, Roos LG, Jiang T, Alavi A, Thota DD, Zhang X, Perelman D, Kodish T, Krupnick JL, May M, Bowman K, Hua J, Liao YJ, Lieberman AF, Butte AJ, Lester P, Thyne SM, Hilton JF, Snyder MP, Slavich GM. California Stress, Trauma, and Resilience Study (CalSTARS) protocol: A multiomics-based cross-sectional investigation and randomized controlled trial to elucidate the biology of ACEs and test a precision intervention for reducing stress and enhancing resilience. Stress. 2024 Jan; 27(1):2401788.
  6. Sun S, Zack T, Williams CYK, Butte AJ, Sushil M. Revealing the impact of social circumstances on the selection of cancer therapy through natural language processing of social work notes. JAMIA Open. 2024 Dec; 7(4):ooae073.
  7. Williams CYK, Miao BY, Kornblith AE, Butte AJ. Evaluating the use of large language models to provide clinical recommendations in the Emergency Department. Nat Commun. 2024 Oct 08; 15(1):8236.
  8. Sushil M, Zack T, Mandair D, Zheng Z, Wali A, Yu YN, Quan Y, Lituiev D, Butte AJ. A comparative study of large language model-based zero-shot inference and task-specific supervised classification of breast cancer pathology reports. J Am Med Inform Assoc. 2024 Oct 01; 31(10):2315-2327.
  9. Deshpande D, Chhugani K, Ramesh T, Pellegrini M, Shiffman S, Abedalthagafi MS, Alqahtani S, Ye J, Liu XS, Leek JT, Brazma A, Ophoff RA, Rao G, Butte AJ, Moore JH, Katritch V, Mangul S. The evolution of computational research in a data-centric world. Cell. 2024 Aug 22; 187(17):4449-4457.
  10. Franklin JB, Marra C, Abebe KZ, Butte AJ, Cook DJ, Esserman L, Fleisher LA, Grossman CI, Kass NE, Krumholz HM, Rowan K, Abernethy AP, JAMA Summit on Clinical Trials Participants. Modernizing the Data Infrastructure for Clinical Research to Meet Evolving Demands for Evidence. JAMA. 2024 Aug 05.
  11. Bains JK, Williams CYK, Johnson D, Schwartz H, Sabbineni N, Butte AJ, Kornblith AE. Enhancing emergency department charting: Using Generative Pre-trained Transformer-4 (GPT-4) to identify laceration repairs. Acad Emerg Med. 2025 Jan; 32(1):94-97.
  12. Binvignat M, Miao BY, Wibrand C, Yang MM, Rychkov D, Flynn E, Nititham J, Tamaki W, Khan U, Carvidi A, Krueger M, Niemi E, Sun Y, Fragiadakis GK, Sellam J, Mariotti-Ferrandiz E, Klatzmann D, Gross AJ, Ye CJ, Butte AJ, Criswell LA, Nakamura MC, Sirota M. Single-cell RNA-Seq analysis reveals cell subsets and gene signatures associated with rheumatoid arthritis disease activity. JCI Insight. 2024 Jul 02; 9(16).
  13. Khera R, Oikonomou EK, Nadkarni GN, Morley JR, Wiens J, Butte AJ, Topol EJ. Transforming Cardiovascular Care With Artificial Intelligence: From Discovery to Practice: JACC State-of-the-Art Review. J Am Coll Cardiol. 2024 Jul 02; 84(1):97-114.
  14. Williams CYK, Zack T, Miao BY, Sushil M, Wang M, Kornblith AE, Butte AJ. Use of a Large Language Model to Assess Clinical Acuity of Adults in the Emergency Department. JAMA Netw Open. 2024 May 01; 7(5):e248895.
  15. Ong JCL, Chang SY, William W, Butte AJ, Shah NH, Chew LST, Liu N, Doshi-Velez F, Lu W, Savulescu J, Ting DSW. Ethical and regulatory challenges of large language models in medicine. Lancet Digit Health. 2024 Jun; 6(6):e428-e432.
  16. Behr M, Kumbier K, Cordova-Palomera A, Aguirre M, Ronen O, Ye C, Ashley E, Butte AJ, Arnaout R, Brown B, Priest J, Yu B. Learning epistatic polygenic phenotypes with Boolean interactions. PLoS One. 2024; 19(4):e0298906.
  17. Mehandru N, Miao BY, Almaraz ER, Sushil M, Butte AJ, Alaa A. Evaluating large language models as agents in the clinic. NPJ Digit Med. 2024 Apr 03; 7(1):84.
  18. Silverman AL, Bhasuran B, Mosenia A, Yasini F, Ramasamy G, Banerjee I, Gupta S, Mardirossian T, Narain R, Sewell J, Butte AJ, Rudrapatna VA. Accurate, Robust, and Scalable Machine Abstraction of Mayo Endoscopic Subscores From Colonoscopy Reports. Inflamm Bowel Dis. 2024 Mar 26.
  19. Patel PV, Zhang A, Bhasuran B, Ravindranath VG, Heyman MB, Verstraete SG, Butte AJ, Rosen MJ, Rudrapatna VA, ImproveCareNow Pediatric IBD Learning Health System. Real-world effectiveness of ustekinumab and vedolizumab in TNF-exposed pediatric patients with ulcerative colitis. J Pediatr Gastroenterol Nutr. 2024 May; 78(5):1126-1134.
  20. Silverman AL, Sushil M, Bhasuran B, Ludwig D, Buchanan J, Racz R, Parakala M, El-Kamary S, Ahima O, Belov A, Choi L, Billings M, Li Y, Habal N, Liu Q, Tiwari J, Butte AJ, Rudrapatna VA. Algorithmic Identification of Treatment-Emergent Adverse Events From Clinical Notes Using Large Language Models: A Pilot Study in Inflammatory Bowel Disease. Clin Pharmacol Ther. 2024 Jun; 115(6):1391-1399.
