Research Summary

In April 2015, I became the inaugural Director of the Bakar Computational Health Sciences Institute at the University of California, San Francisco, charged with recruiting new computationally focused faculty to UCSF.

My prior decade was spent at Stanford University, where I advanced from Assistant to Full Professor and Division Chief by developing and using bioinformatics methods to integrate, leverage, and reason over genomic and other molecular and clinical data sets to yield tools for physicians and patients. Examples of this approach include work on cancer drug discovery (PNAS, 2000), type 2 diabetes (PNAS, 2003, 2012), fat cell formation (Nature Cell Biology, 2005), obesity (Bioinformatics, 2007), and transplantation (PNAS, 2009). To facilitate this, we developed tools for indexing public genomic data sets (Nature Biotechnology, 2006), reusing gene expression data (Nature Methods, 2007, 2010; Nature Communications, 2015), and cloud computing (Nature Biotechnology, 2010). With these methods, we explore human physiology using electronic health record data (Science, 2008; Science Translational Medicine, 2014), estimate medical risk with whole genomes (Lancet, 2010), and computationally reposition drugs (Science Translational Medicine, 2011). In newer work, we are studying entire medical systems through real-world clinical data (Journal of Clinical Investigation, 2020).

My research lab currently has 3 graduate students, 7 post-doctoral research fellows, and 3 staff members. I have successfully administered multiple research projects, including the NIAID ImmPort data archival repository, collaborated with many other researchers around the world, and continue to produce many peer-reviewed publications from each project. I have been heavily invested in teaching and mentoring. I am currently training or have trained 30 post-doctoral scholars in my research lab, with many obtaining prestigious research positions after departing. Twelve graduate students are completing or have completed their PhD work in the lab, including two members of underrepresented minorities.

Research Funding

  • April 1, 2020 - March 31, 2025 - Computational models of naturally acquired immunity to falciparum malaria, Co-Principal Investigator. Sponsor: NIH, Sponsor Award ID: U01AI150741
  • September 1, 2015 - August 31, 2021 - Stanford and Northrop Grumman proposal for the Oncology Models Forum, Principal Investigator. Sponsor: NIH, Sponsor Award ID: U24CA195858
  • September 26, 2016 - April 30, 2021 - Integrative Analysis of Genomic, Epigenomic and Phenotypic Data for Disease Stratification of Endometriosis, Co-Principal Investigator. Sponsor: NIH, Sponsor Award ID: R01HD089511
  • April 15, 2014 - August 31, 2020 - Biorepository of Human iPSCs for Studying Dilated and Hypertrophic Cardiomyopathy, Co-Principal Investigator. Sponsor: NIH, Sponsor Award ID: R24HL117756


Education

Brown University, Providence, RI, A.B./Honors, 1987–1991, Computer Science
Brown University Medical School, RI, M.D., 1991–1995, Medicine
Children’s Hospital and Harvard, Boston, MA, Residency, 1995–1998, Pediatrics
Children’s Hospital and Harvard, Boston, MA, Fellowship, 1998–2001, Pediatric Endocrinology
Massachusetts Institute of Technology, MA, Sc.M., 1998–2002, Medical Informatics
Harvard Medical School and MIT, MA, Ph.D., 2002–2004, Health Sciences and Technology

Honors & Awards

  • 2002, 2003
    Outstanding Speaker, American Association for Clinical Chemistry (awarded twice)
  • 2006
    PhRMA Foundation Informatics Research Starter Grant
  • 2006
    Howard Hughes Medical Institute Physician-Scientist Early Career Award
  • 2007
    Genome Technology magazine “Tomorrow’s Principal Investigator” award
  • 2008
    American Medical Informatics Association New Investigator Award
  • 2009
    Elected into the American College of Medical Informatics
  • 2010
    Young Investigator Award, Society for Pediatric Research
  • 2011
    National Human Genome Research Institute (NHGRI) Genomic Advance of the Month
  • 2012
    Recognized for Outstanding Scientific Accomplishment and Lectureship by the NIH Director (Wednesday Afternoon Lecture Series, WALS)
  • 2013
    Elected into the American Society for Clinical Investigation (ASCI)
  • 2013
    Awarded White House Champion of Change in Open Science
  • 2014
    Kavli Frontiers of Science Invited Fellow for the Indonesian-American Symposium, National Academy of Sciences
  • 2014
    E. Mead Johnson Award, Society for Pediatric Research
  • 2015
    Elected to the National Academy of Medicine (NAM)

Selected Publications

  1. Padula WV, Kreif N, Vanness DJ, Adamson B, Rueda JD, Felizzi F, Jonsson P, IJzerman MJ, Butte A, Crown W. Machine Learning Methods in Health Economics and Outcomes Research-The PALISADE Checklist: A Good Practices Report of an ISPOR Task Force. Value Health. 2022 07; 25(7):1063-1080.
  2. Wong DR, Tang Z, Mew NC, Das S, Athey J, McAleese KE, Kofler JK, Flanagan ME, Borys E, White CL, Butte AJ, Dugger BN, Keiser MJ. Deep learning from multiple experts improves identification of amyloid neuropathologies. Acta Neuropathol Commun. 2022 04 28; 10(1):66.
  3. Binvignat M, Pedoia V, Butte AJ, Louati K, Klatzmann D, Berenbaum F, Mariotti-Ferrandiz E, Sellam J. Use of machine learning in osteoarthritis research: a systematic literature review. RMD Open. 2022 03; 8(1).
  4. Maruthamuthu S, Rajalingam K, Kaur N, Morvan MG, Soto J, Lee N, Kong D, Hu Z, Reyes K, Ng D, Butte AJ, Chiu C, Rajalingam R. Individualized Constellation of Killer Cell Immunoglobulin-Like Receptors and Cognate HLA Class I Ligands that Controls Natural Killer Cell Antiviral Immunity Predisposes COVID-19. Front Genet. 2022; 13:845474.
  5. Hu Z, van der Ploeg K, Chakraborty S, Arunachalam P, Mori D, Jacobson K, Bonilla H, Parsonnet J, Andrews J, Hedlin H, de la Parte L, Dantzler K, Ty M, Tan G, Blish C, Takahashi S, Rodriguez-Barraquer I, Greenhouse B, Butte A, Singh U, Pulendran B, Wang T, Jagannathan P. Early immune responses have long-term associations with clinical, virologic, and immunologic outcomes in patients with COVID-19. Res Sq. 2022 Feb 02.
  6. Nelson CA, Bove R, Butte AJ, Baranzini SE. Embedding electronic health records onto a knowledge network recognizes prodromal features of multiple sclerosis and predicts diagnosis. J Am Med Inform Assoc. 2022 01 29; 29(3):424-434.
  7. Kaur N, Oskotsky B, Butte AJ, Hu Z. Systematic identification of ACE2 expression modulators reveals cardiomyopathy as a risk factor for mortality in COVID-19 patients. Genome Biol. 2022 01 10; 23(1):15.
  8. Hu Z, Bhattacharya S, Butte AJ. Application of Machine Learning for Cytometry Data. Front Immunol. 2021; 12:787574.
  9. Bishara A, Chiu C, Whitlock EL, Douglas VC, Lee S, Butte AJ, Leung JM, Donovan AL. Postoperative delirium prediction using machine learning models and preoperative electronic health record data. BMC Anesthesiol. 2022 01 03; 22(1):8.
  10. Cefalu WT, Andersen DK, Arreaza-Rubín G, Pin CL, Sato S, Verchere CB, Woo M, Rosenblum ND, Symposium planning committee, moderators, and speakers:. Heterogeneity of Diabetes: β-Cells, Phenotypes, and Precision Medicine: Proceedings of an International Symposium of the Canadian Institutes of Health Research's Institute of Nutrition, Metabolism and Diabetes and the U.S. National Institutes of Health's National Institute of Diabetes and Digestive and Kidney Diseases. Diabetes Care. 2022 01 01; 45(1):3-22.
  11. Bishara A, Wong A, Wang L, Chopra M, Fan W, Lin A, Fong N, Palacharla A, Spinner J, Armstrong R, Pletcher MJ, Lituiev D, Hadley D, Butte A. Opal: an implementation science tool for machine learning clinical decision support in anesthesia. J Clin Monit Comput. 2022 10; 36(5):1367-1377.
  12. Cefalu WT, Andersen DK, Arreaza-Rubín G, Pin CL, Sato S, Verchere CB, Woo M, Rosenblum ND, Symposium planning committee, moderators, and speakers:, Rosenblum N, Cefalu W, Andersen DK, Arreaza-Rubín G, Dhara C, James SP, Makarchuk MJ, Pin CL, Sato S, Verchere B, Woo M, Powers A, Estall J, Hoesli C, Millman J, Linnemann A, Johnson J, Pin CL, Hawkins M, Woo M, Gloyn A, Cefalu W, Rosenblum N, Huising MO, Benninger RKP, Almaça J, Hull-Meichle RL, MacDonald P, Lynn F, Melero-Martin J, Yoshihara E, Stabler C, Sander M, Evans-Molina C, Engin F, Thompson P, Shalev A, Redondo MJ, Nadeau K, Bellin M, Udler MS, Dennis J, Dash S, Zhou W, Snyder M, Booth G, Butte A, Florez J. Heterogeneity of Diabetes: β-Cells, Phenotypes, and Precision Medicine: Proceedings of an International Symposium of the Canadian Institutes of Health Research's Institute of Nutrition, Metabolism and Diabetes and the U.S. National Institutes of Health's National Institute of Diabetes and Digestive and Kidney Diseases. Diabetes. 2021 Nov 13.
  13. Kornblith AE, Addo N, Dong R, Rogers R, Grupp-Phelan J, Butte A, Gupta P, Callcut RA, Arnaout R. Development and Validation of a Deep Learning Strategy for Automated View Classification of Pediatric Focused Assessment With Sonography for Trauma. J Ultrasound Med. 2022 Aug; 41(8):1915-1924.
  14. Zhang B, Silverman AL, Bangaru S, Arneson D, Dasharathy S, Nguyen N, Rodden D, Shih J, Butte AJ, El-Nachef WN, Boland BS, Rudrapatna VA. Case-control study of the association of chronic acid suppression and social determinants of health with COVID-19 infection. Sci Rep. 2021 10 25; 11(1):20987.
  15. Butte KD, Bahmani A, Butte AJ, Li X, Snyder MP. Five-year pediatric use of a digital wearable fitness device: lessons from a pilot case study. JAMIA Open. 2021 Jul; 4(3):ooab054.
  16. Hong JC, Butte AJ. Assessing Clinical Outcomes in a Data-Rich World-A Reality Check on Real-World Data. JAMA Netw Open. 2021 07 01; 4(7):e2117826.
  17. Kaur N, Bhattacharya S, Butte AJ. Big Data in Nephrology. Nat Rev Nephrol. 2021 Oct; 17(10):676-687.
  18. Mahendra M, Luo Y, Mills H, Schenk G, Butte AJ, Dudley RA. Impact of Different Approaches to Preparing Notes for Analysis With Natural Language Processing on the Performance of Prediction Models in Intensive Care. Crit Care Explor. 2021 Jun; 3(6):e0450.
  19. Liu X, Anstey J, Li R, Sarabu C, Sono R, Butte AJ. Rethinking PICO in the Machine Learning Era: ML-PICO. Appl Clin Inform. 2021 03; 12(2):407-416.
  20. Rudrapatna VA, Glicksberg BS, Butte AJ. Utility of routinely collected electronic health records data to support effectiveness evaluations in inflammatory bowel disease: a pilot study of tofacitinib. BMJ Health Care Inform. 2021 May; 28(1).
