Exploring the Application of Artificial Intelligence in the Context of Healthcare and Precision Medicine
- Jan 14
Written by: Flora Maran
Edited by: Jennifer Chen and Felix Yuan
Illustrated by: Ariel Brown-Ogha

The COVID-19 pandemic dramatically increased the need for greater access to care, forcing healthcare practitioners to think quickly during an unprecedented era of social distancing [1]. Suddenly, healthcare workers grappled with the challenge of maintaining patient care amidst widespread public health uncertainties. In response, telehealth, characterized by the remote delivery of medical services via telecommunication technologies, emerged as a pivotal solution [2]. The pandemic made it evident that medicine and healthcare delivery needed to embrace flexibility and personalized patient care. The rise of telehealth as a form of healthcare technology paved the way for other technology-driven medical services, prompting considerations about how to manage the growing influx of patient data [1]. This scenario has set the stage for the integration of artificial intelligence (AI) to address existing gaps.
A key component of these technological healthcare advancements centers around enhancing the patient care experience and distributing care to medically underserved areas (MUAs) or “geographic areas and populations with a shortage of primary care health services” [3]. Historically, access to healthcare services has been a driving force in unequal health outcomes among marginalized groups in the United States [4]. Various factors, including limited transportation, health insurance coverage, job flexibility, language and cultural barriers, and discriminatory housing policies, have contributed to inequalities in both healthcare access and health outcomes [4].
The COVID-19 pandemic has underscored the importance of technological solutions as a means for healthcare providers to bridge gaps in healthcare delivery and better understand the diverse identities and needs of patients. Due to the virtual nature of telehealth and similar technologies, AI, as a broad technological concept, holds significant potential for numerous applications in healthcare. AI-powered tools have the potential to support clinical decision-making and improve patient care by minimizing human errors, reducing time and costs by automating tasks, and personalizing healthcare delivery to the patient [5].
Broadly, AI encompasses an umbrella of methods by which computer algorithms recognize patterns in data and generate new predictions based on both the given information and what programmers judge as “meaningful and informative” identifiers [6]. For instance, in categorizing a collection of images of vehicles, AI might identify the common trait of wheels among them. The allure surrounding AI lies in its capacity to mimic human intelligence by facilitating rapid computer-based decisions using information about patterns [7]. AI’s decision-making ability is also transforming many industries, including entertainment, finance, and transportation, to name a few [5]. In healthcare, AI presents an opportunity to empower both patients and clinicians to make informed treatment plans, while also raising challenges around bias, privacy, and the need for human expertise alongside technology [5].
A major instance of artificial intelligence implemented in healthcare involves deep learning (DL), a specialized subset of AI that uses labeled data to discern patterns within new datasets [8]. We can compare how DL makes decisions based on patterns to how we choose the clothes we wear based on the weather: when it is cold, you wear a sweater, and perhaps boots, a coat, and winter accessories depending on the severity. In the same way you decide how to dress because you have gathered data on the weather and recognized its patterns, DL uses the patterns in large datasets to predict a certain outcome based on the context.
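The intuition above can be sketched in a few lines of Python. This is a toy illustration, not a real deep learning model: the invented temperatures and outfits stand in for labeled training data, and prediction works by finding the most similar labeled example, the way DL generalizes from patterns it has seen.

```python
# Toy "learning from labeled examples": predict an outfit from
# temperature by finding the closest labeled example.

# Labeled training data: (temperature in °F, outfit worn).
training_data = [
    (20, "coat and boots"),
    (35, "sweater"),
    (60, "light jacket"),
    (80, "t-shirt"),
]

def predict_outfit(temperature):
    """Return the outfit from the labeled example closest in temperature."""
    closest = min(training_data, key=lambda pair: abs(pair[0] - temperature))
    return closest[1]

print(predict_outfit(25))  # closest example is 20 °F -> "coat and boots"
print(predict_outfit(75))  # closest example is 80 °F -> "t-shirt"
```

A real DL model differs in scale, not in spirit: instead of one number per example, it learns from millions of examples with thousands of features each.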
This raises the question: How does AI fit into the clinical context? One area where DL may effectively assist clinicians is in creating models that analyze ultrasound images for metastatic lymph nodes, or nodes to which cancer has spread, in patients [9]. For example, DL can be used for breast cancer imaging. In the United States, breast cancer is the most commonly diagnosed cancer among American women, with approximately 285,000 new cases in 2021 [10]. Researchers estimate an average survival rate of 90% for breast cancer [10], emphasizing the importance of accurate cancer staging for patient management, as the survival rate decreases when the cancer is detected at a later stage (i.e., the local, regional, or distant stage).
Amidst the ever-evolving realm of medical diagnostics, the gold standard for scrutinizing lymph node metastases, particularly among breast cancer patients, involves intricate procedures such as aspiration cytology to withdraw cells using a needle, a sentinel lymph node biopsy to check whether cancer has spread to nearby lymph nodes, and axillary lymph node surgery to remove lymph nodes from the armpit [11]. Traditionally, these processes are labor-intensive, susceptible to errors, and can carry clinical complications such as infection and edema, or tissue swelling. Moreover, some patients may undergo surgeries that could have been avoided: research indicates that a considerable number of individuals with sentinel lymph node metastasis never develop further metastases in other lymph nodes, making their additional surgical procedures unnecessary [11].
AI may offer a non-invasive alternative for evaluating the status of lymph node metastasis, reducing the need for invasive and risky procedures. Specifically, clinicians may automate and standardize the assessment of lymph node metastasis by utilizing ultrasound images of the soft tissue inside the body [12] and allowing the trained DL model to detect metastatic cells from the slides [9].
The DL approach involves training an algorithm with labeled images of both malignant (cancerous) and benign (noncancerous) lymph nodes. Through this process, the algorithm learns to distinguish between the two types of lymph nodes based on visual features depicted in the images. Malignant lymph nodes appear to have a globular shape, distorted blood vessels, and denting of the cortex or the outer portion of the lymph node. In contrast, normal lymph nodes appear clear, oval-shaped, and have a symmetrical cortex [10]. Equipped with an algorithm adept at distinguishing between these metastatic and benign traits, clinicians can seamlessly upload extensive sets of ultrasound images for the DL model to identify potential abnormalities in the lymph nodes of real patients.
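A heavily simplified sketch of this training idea follows. Here a single invented "roundness" score stands in for the visual traits described above (globular malignant nodes score high, oval benign nodes score low), and the "model" learns a decision boundary from labeled examples; real DL models instead learn thousands of features directly from raw pixels, and all numbers here are illustrative.

```python
# Sketch of supervised training on labeled examples: learn a decision
# boundary from (feature, label) pairs, then classify new cases.

# Hypothetical labeled training set: (roundness score, label).
labeled_nodes = [
    (0.9, "malignant"), (0.8, "malignant"), (0.85, "malignant"),
    (0.3, "benign"), (0.4, "benign"), (0.2, "benign"),
]

def train_threshold(examples):
    """Place the decision boundary midway between the two class averages."""
    malignant = [s for s, y in examples if y == "malignant"]
    benign = [s for s, y in examples if y == "benign"]
    mean_m = sum(malignant) / len(malignant)
    mean_b = sum(benign) / len(benign)
    return (mean_m + mean_b) / 2

def classify(score, threshold):
    """Label a new, unlabeled node by which side of the boundary it falls on."""
    return "malignant" if score >= threshold else "benign"

t = train_threshold(labeled_nodes)
print(classify(0.88, t))  # high roundness -> "malignant"
print(classify(0.25, t))  # low roundness  -> "benign"
```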
Google’s AutoML is an excellent example of an AI model able to identify ultrasound images of malignant breast tissue based on histopathological, or tissue cell, results, with an accuracy of 85-92% [10]. One study found that the Google AutoML model performed similarly to experienced radiologists in predicting metastatic lymph nodes of breast cancer patients [13]. Examining 317 ultrasounds of axillary lymph nodes, both the algorithm and the radiologists differentiated between benign and malignant images by identifying features associated with normal and abnormal presentations. Although the study was limited by its retrospective design, relying on data collected from past patients and non-standardized imaging, researchers remain optimistic about enrolling prospective patients, employing multi-modality imaging techniques, and conducting more extensive assessments of AI efficacy with larger sample sizes [10].
As experts continue to explore the possibilities of AI, critics raise significant questions about the use of big data, which is defined by volume, variety, and velocity [7]. Big data refers to extremely large datasets that contain a large volume of information, a variety of mediums (e.g., text, photos, videos), and velocity, or the speed at which information is generated over time [14]. The scale of big data makes it difficult for humans to process using traditional methods, as no amount of human-driven calculation could keep up with the constant stream of data [15]. Hence, AI algorithms relieve the computing process by capturing and structuring the information in a fraction of the time humans would require.
Such access to vast quantities of patient health data comes with opportunities for innovation as well as risks of data breaches. To protect privacy, many datasets attempt to remove personal identifiers [16]. However, research by Rocher, Hendrickx, and de Montjoye found that even with attempts at anonymization, nearly all Americans could be identified using just 15 demographic traits [17]. Think of it like this: imagine you are carrying a bag filled with your groceries, carefully selected fruits, veggies, snacks, and drinks. As you are walking home, the bag suddenly tears open, scattering your purchases for everyone to see. Each item in your bag represents a unique aspect of your preferences, much like demographic data. Despite efforts to hide your identity, these details can still give you away. In this analogy, the grocery bag symbolizes the infrastructure protecting personal health information. If it is weak, like the torn paper bag, it cannot safeguard the vast amount of sensitive data we generate, leaving individuals vulnerable to privacy breaches and the risk of being identified.
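The re-identification risk can be demonstrated with a few invented records. The sketch below shows the core mechanism behind findings like Rocher and colleagues': the more demographic attributes you combine, the more records become unique, and a unique record is an identifiable one, even with the name removed.

```python
# Toy demonstration of re-identification risk: count how many records
# are uniquely pinned down by a combination of demographic attributes.
# All records here are invented for illustration.
from collections import Counter

records = [
    {"zip": "02139", "age": 34, "sex": "F", "job": "nurse"},
    {"zip": "02139", "age": 34, "sex": "F", "job": "teacher"},
    {"zip": "02139", "age": 34, "sex": "M", "job": "nurse"},
    {"zip": "94110", "age": 51, "sex": "F", "job": "engineer"},
]

def unique_fraction(records, attributes):
    """Fraction of records uniquely identified by this attribute combination."""
    keys = [tuple(r[a] for a in attributes) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)

print(unique_fraction(records, ["zip", "age"]))                # 0.25
print(unique_fraction(records, ["zip", "age", "sex", "job"]))  # 1.0
```

With only two attributes, most records blend into a crowd; with four, every record in this tiny dataset is unique, which is the pattern that scales up to the 15 traits in the cited study.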
The automation of data processing has sparked concerns among critics regarding potential biases embedded in AI decision-making [18]. Large language models (LLMs), which extend beyond the realm of medicine, are a type of DL model capable of understanding and generating human language by recognizing patterns in vast amounts of text [19]. In a 2023 study evaluating the clinical decision-making skills of four general LLMs – Bard, ChatGPT-3.5, Claude, and GPT-4 – researchers uncovered troubling patterns [18].
They presented each model with nine distinct questions, repeated five times, resulting in forty-five responses per model. Across all the models, the researchers observed responses that perpetuated race-based medicine, the assumption that patients differ genetically or physiologically based on their race [20], revealing both racial and gender bias [18].
This observation mirrors historical instances of bias in medicine, as highlighted in a 2020 Lancet article by Cerdeña, Plaisime, and Tsai [20]. The authors detailed how race-based medical assumptions have historically harmed patients. One notable case involved the assessment of kidney function in Black patients using the estimated glomerular filtration rate (eGFR). It was previously assumed that Black patients had higher muscle mass and creatinine levels, resulting in a falsely elevated eGFR compared to White patients. This misconception placed Black patients at a heightened risk of kidney failure and led to delayed dialysis treatment [20]. Despite efforts to dismantle race-based medical practices, remnants of this bias persist.
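For a concrete sense of how a race coefficient skews a clinical formula, the sketch below uses the older MDRD eGFR equation, which multiplied the estimate by 1.212 for Black patients based on the muscle-mass assumption described above. The coefficients follow the published MDRD study equation as best understood here; the patient values are invented, and this is an illustration of the bias mechanism, not a clinical calculator.

```python
# How a race coefficient inflates eGFR: identical labs and age produce
# a higher reported kidney function for Black patients under the older
# MDRD equation, which can delay referral for dialysis.

def egfr_mdrd(creatinine_mg_dl, age, female, black):
    """Older MDRD eGFR estimate (mL/min/1.73 m^2); illustrative only."""
    egfr = 175 * creatinine_mg_dl ** -1.154 * age ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212  # the race coefficient at the heart of the bias
    return egfr

# Same hypothetical patient values, with and without the coefficient:
print(round(egfr_mdrd(1.5, 60, female=False, black=False), 1))
print(round(egfr_mdrd(1.5, 60, female=False, black=True), 1))
```

The second estimate is 21.2% higher on identical inputs, which is exactly the kind of baked-in assumption an LLM trained on older medical text can reproduce.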
Regrettably, the study on LLMs revealed that these models have not completely abandoned outdated biases. When researchers inquired about eGFR calculation, both ChatGPT-3.5 and GPT-4 perpetuated race-based medicine by asserting false claims about Black individuals having different muscle mass and consequently higher creatinine levels [18]. Similarly, assumptions about lung capacity in minority groups further underscored the prevalence of race and gender biases in AI [18, 20]. These findings raise significant concerns about the future of medically trained AI and its ability to navigate and overcome entrenched stereotypes. As we strive for equitable healthcare, addressing and mitigating biases in AI systems becomes paramount.
The data that LLMs are trained on plays the biggest role in the responses they output. The researchers acknowledge that all the LLMs studied promoted race-based medicine to some extent because they were “trained in an unsupervised fashion on large-scale corpora from the internet and textbooks,” so the responses reaffirmed antiquated and biased information that is no longer accepted [18]. As LLMs and other AI models emerge as decision-making tools, experts caution clinicians to assess for potential biases before they are used for medical education or patient care [21]. A possible solution for mitigating biases is a quantitative method called sensitivity analysis, which can assess random sources of error and help researchers better understand the uncertainty of results [22]. By systematically probing the robustness of AI models to different inputs and assumptions, sensitivity analysis helps identify and address potential biases, enabling researchers to refine algorithms and decision-making processes to enhance fairness and reliability in medical AI applications [22].
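The core move of a sensitivity analysis can be sketched in a few lines: sweep an uncertain assumption across its plausible range and record how much the output moves. The risk model and all numbers below are hypothetical; the point is the probing procedure, not the formula.

```python
# Minimal sensitivity-analysis sketch: vary one uncertain input of a
# made-up risk model and report the resulting range of outputs. A
# result that swings widely is sensitive to that assumption and
# deserves scrutiny before clinical use.

def risk_score(creatinine, age, muscle_mass_factor):
    """Hypothetical risk model; muscle_mass_factor is the uncertain assumption."""
    return (creatinine * muscle_mass_factor) + 0.01 * age

def sensitivity(creatinine, age, factor_range):
    """Range of model outputs as the uncertain factor sweeps plausible values."""
    outputs = [risk_score(creatinine, age, f) for f in factor_range]
    return min(outputs), max(outputs)

low, high = sensitivity(creatinine=1.2, age=50, factor_range=[0.9, 1.0, 1.1])
print(f"risk ranges from {low:.2f} to {high:.2f}")
```

In a real audit the same idea is applied systematically: every contested assumption (like a race coefficient) is perturbed, and conclusions that survive the sweep are the ones researchers can trust.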
Imagine a healthcare landscape where your personal health data not only enhances medical predictions but also empowers you with valuable insights into your own health. Enter initiatives like the All of Us Research Program (All of Us), an American research study funded by the National Institutes of Health that seeks to understand the intricate interplay between our genetic makeup and various social factors in shaping our health outcomes [23]. With a goal of enrolling at least one million participants from diverse backgrounds across the United States, the program aims to bridge longstanding gaps in medical research representation [24].
What sets All of Us apart is its concerted effort to include communities historically underrepresented in biomedical studies. As of 2024, a substantial portion of its participants hail from these communities, thanks to strategic partnerships with advocacy groups such as the Asian & Pacific Islander American Health Forum and the Center for Black Equity [25]. In fact, “77% of participants are from communities that are historically underrepresented in biomedical research and 46% are individuals from underrepresented racial and ethnic minorities” [24].
Participation in the program is straightforward yet impactful. To join the study, participants complete consent forms, health surveys, and provide a one-time collection of biospecimens, including basic measurements like height, weight, and a blood and urine sample [26]. In return, consenting participants gain access to a treasure trove of insights about their genetic ancestry and traits, whether they have a greater risk of developing certain diseases or conditions, and how their body may respond to different medications [27].
Ultimately, the researchers hope to bridge the gap between clinical medicine and sociodemographic factors that may impact health and disease by augmenting the personalized approach of precision medicine, which delves beyond traditional medicine to encompass multifaceted determinants of health [28]. These health determinants span from the tangible, such as one's residential environment and lifestyle choices, to the intangible, including familial health history and genetic predispositions [29]. At the core of this progressive initiative lies a philosophy grounded in the principles of open science, reproducible research, and equitable access [30], highlighting the recent movement towards a far more inclusive and enlightening era of biomedical research.
Large health data research studies like All of Us will soon set the stage for precision medicine so that medical treatments move away from the “one-size-fits-all” approach and toward individualized care [31]. Aligned with the principles of precision medicine, All of Us aims to tailor healthcare interventions to individual or group-specific factors [23]. With its extensive data, precision medicine has the potential to enhance patient care by training AI algorithms to predict optimal treatment strategies and outcomes [32]. Employing precision medicine is no small feat, but clinicians and researchers acknowledge that offering personalized treatments may revolutionize how we see medicine [32].
One ophthalmologic project demonstrates how AI can take advantage of the data collected by the program. The researchers leveraged early access to this data and a subset of predictive AI modeling called random forest to determine the need for surgery among patients with glaucoma [34]. Using electronic health records, the random forest model identified several key factors influencing the need for glaucoma surgery [34, 35]. Notably, access to a healthcare system emerged as a significant predictor, alongside systolic and diastolic blood pressure, pulse, and body measurements. This finding underscores the crucial role of access to healthcare in managing glaucoma, as individuals without such access may experience delayed diagnosis and treatment, leading to advanced disease stages that necessitate surgical intervention [34].
Although the relationship between health outcomes for ophthalmologic patients and access to care is complex and not fully understood, the model's accuracy was 0.99, indicating a high likelihood that the features identified by the model were associated, to some extent, with the need for glaucoma surgery [34]. By applying random forest modeling to data collected by All of Us, researchers identified multiple predictors of the need for glaucoma surgery among ophthalmologic patients, including a lack of healthcare access.
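The mechanics of a random forest can be sketched with a miniature version: an ensemble of one-rule "stumps," each trained on a bootstrap resample of the data, that votes on the outcome. The patient records, features, and thresholds below are invented; they merely echo the kinds of predictors (healthcare access, blood pressure) named in the study, and real random forests use full decision trees and random feature subsets.

```python
# Toy random forest: decision stumps trained on bootstrap resamples,
# combined by majority vote. All data is invented for illustration.
import random

random.seed(0)

# Each record: ((healthcare_access, systolic_bp), needs_surgery_label).
patients = [
    ((1, 120), 0), ((1, 118), 0), ((1, 125), 0), ((1, 130), 0),
    ((0, 150), 1), ((0, 145), 1), ((0, 155), 1), ((0, 140), 1),
]

def train_stump(sample):
    """Pick the one-feature threshold rule with the fewest errors on the sample."""
    best, best_err = None, float("inf")
    for f in range(2):                      # feature index
        for t in sorted({x[f] for x, _ in sample}):
            for polarity in (0, 1):         # which side predicts "surgery"
                err = sum((polarity if x[f] <= t else 1 - polarity) != y
                          for x, y in sample)
                if err < best_err:
                    best, best_err = (f, t, polarity), err
    return best

def stump_predict(stump, x):
    f, t, polarity = stump
    return polarity if x[f] <= t else 1 - polarity

def train_forest(data, n_trees=15):
    """Each stump trains on a bootstrap resample, like trees in a random forest."""
    return [train_stump([random.choice(data) for _ in data])
            for _ in range(n_trees)]

def forest_predict(forest, x):
    """Majority vote across the ensemble."""
    votes = sum(stump_predict(s, x) for s in forest)
    return 1 if votes > len(forest) / 2 else 0

forest = train_forest(patients)
print(forest_predict(forest, (0, 148)))  # no healthcare access, high blood pressure
print(forest_predict(forest, (1, 119)))  # healthcare access, normal blood pressure
```

Averaging many resampled trees is what makes the method robust to noise in any single patient record, and inspecting which features the trees split on is how studies like this surface predictors such as healthcare access.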
In navigating this new frontier, it is imperative to acknowledge AI's potential to transform healthcare delivery, such as its non-invasive approach to diagnosing malignant lymph nodes in cancer patients [36]. The introduction of AI promises revolutionary advancements in precision medicine, harnessing big data and the expertise of clinicians and AI experts. Amidst the excitement for innovation, critical ethical considerations loom large, including privacy protection, bias mitigation, and addressing healthcare disparities. Despite these challenges, initiatives like the All of Us Research Program exemplify endeavors to promote inclusivity and equity in medical research, laying the groundwork for precision medicine tailored to individual needs. As we move forward, we must strike a balance between harnessing the power of AI for healthcare transformation and upholding the highest standards of ethical practice, ensuring that progress is made responsibly and equitably for the benefit of all.
References
[1] Wosik, J., Fudim, M., Cameron, B., Gellad, Z. F., Cho, A., Phinney, D., Curtis, S., Roman, M., Poon, E. G., Ferranti, J., Katz, J. N., & Tcheng, J. (2020). Telehealth transformation: COVID-19 and the rise of virtual care. Journal of the American Medical Informatics Association, 27(6), 957–962. https://doi.org/10.1093/jamia/ocaa067
[2] Telehealth. (n.d.). National Institute of Biomedical Imaging and Bioengineering. Retrieved April 8, 2024, from https://www.nibib.nih.gov/science-education/science-topics/telehealth
[3] HPSA and MUA/P: Shortage Designation Types. (n.d.). HHS Guidance Portal. https://www.hhs.gov/guidance/document/hpsa-and-muap-shortage-designation-types
[4] Lee, E. K., Donley, G., Ciesielski, T. H., Freedman, D. A., & Cole, M. B. (2023). Spatial availability of federally qualified health centers and disparities in health services utilization in medically underserved areas. Social Science & Medicine, 328, 116009. https://doi.org/10.1016/j.socscimed.2023.116009
[5] Alowais, S. A., Alghamdi, S. S., Alsuhebany, N., Alqahtani, T., Alshaya, A. I., Almohareb, S. N., Aldairem, A., Alrashed, M., Bin Saleh, K., Badreldin, H. A., Al Yami, M. S., Al Harbi, S., & Albekairy, A. M. (2023). Revolutionizing Healthcare: The role of Artificial Intelligence in Clinical Practice. BMC Medical Education, 23(1). https://doi.org/10.1186/s12909-023-04698-z
[6] What is Data Labeling? - Data Labeling Explained - AWS. (n.d.). Amazon Web Services, Inc. https://aws.amazon.com/what-is/data-labeling/
[7] O’Leary, D. E. (2013a). Artificial Intelligence and Big Data. IEEE Intelligent Systems, 28(2), 96–99. https://doi.org/10.1109/mis.2013.39
[8] Castiglioni, I., Rundo, L., Codari, M., Di Leo, G., Salvatore, C., Interlenghi, M., Gallivanone, F., Cozzi, A., D’Amico, N. C., & Sardanelli, F. (2021). AI applications to medical images: From machine learning to deep learning. Physica Medica, 83, 9–24. https://doi.org/10.1016/j.ejmp.2021.02.006
[9] Caldonazzi, N., Rizzo, P. C., Eccher, A., Girolami, I., Fanelli, G. N., Naccarato, A. G., Bonizzi, G., Fusco, N., d'Amati, G., Scarpa, A., Pantanowitz, L., & Marletta, S. (2023). Value of Artificial Intelligence in Evaluating Lymph Node Metastases. Cancers, 15(9), 2491. https://doi.org/10.3390/cancers15092491
[10] Tahmasebi, A., Qu, E., Sevrukov, A., Liu, J.-B., Wang, S., Lyshchik, A., Yu, J., & Eisenbrey, J. R. (2021). Assessment of axillary lymph nodes for metastasis on ultrasound using artificial intelligence. Ultrasonic Imaging, 43(6), 329–336. https://doi.org/10.1177/01617346211035315
[11] Windsor, G. O., Bai, H., Lourenco, A. P., & Jiao, Z. (2023). Application of artificial intelligence in predicting lymph node metastasis in breast cancer. Frontiers in Radiology, 3. https://doi.org/10.3389/fradi.2023.928639
[12] Ultrasound: What it is, purpose, procedure & results. Cleveland Clinic. (n.d.). https://my.clevelandclinic.org/health/diagnostics/4995-ultrasound
[13] Zeng, Y., & Zhang, J. (2020). A machine learning model for detecting invasive ductal carcinoma with Google Cloud AutoML vision. Computers in Biology and Medicine, 122, 103861. https://doi.org/10.1016/j.compbiomed.2020.103861
[14] Hofmann, E. (2017). Big data and supply chain decisions: the impact of volume, variety and velocity properties on the bullwhip effect. International Journal of Production Research, 55(17), 5108-5126.
[15] Dash, S., Shakyawar, S. K., Sharma, M., & Kaushik, S. (2019). Big data in healthcare: management, analysis and future prospects. Journal of big data, 6(1), 1-25. https://doi.org/10.1186/s40537-019-0217-0
[16] Seh, A. H., Zarour, M., Alenezi, M., Sarkar, A. K., Agrawal, A., Kumar, R., & Ahmad Khan, R. (2020, May). Healthcare data breaches: insights and implications. In Healthcare (Vol. 8, No. 2, p. 133). MDPI.
[17] Rocher, L., Hendrickx, J. M., & de Montjoye, Y.-A. (2019). Estimating the success of re-identifications in incomplete datasets using Generative Models. Nature Communications, 10(1). https://doi.org/10.1038/s41467-019-10933-3
[18] Omiye, J. A., Lester, J. C., Spichak, S., Rotemberg, V., & Daneshjou, R. (2023). Large language models propagate race-based medicine. Npj Digital Medicine, 6(1). https://doi.org/10.1038/s41746-023-00939-z
[19] Chang, Y., Wang, X., Wang, J., Yuan, W., Yang, L., Zhu, K., Chen, H., Yi, X., Wang, C., Wang, Y., Ye, W., Zhang, Y., Chang, Y., Yu, P. S., Yang, Q., & Xie, X. (2024). A Survey on Evaluation of Large Language Models. ACM Transactions on Intelligent Systems and Technology. https://doi.org/10.1145/3641289
[20] Cerdeña, J. P., Plaisime, M. V., & Tsai, J. (2020). From race-based to race-conscious medicine: how anti-racist uprisings call us to act. The Lancet, 396(10257), 1125–1128. https://doi.org/10.1016/S0140-6736(20)32076-6
[21] Artsi, Y., Sorin, V., Konen, E., Glicksberg, B. S., Nadkarni, G., & Klang, E. (2024). Large language models for generating medical examinations: systematic review. BMC Medical Education, 24(1), 1-11.
[22] Greenland, S. (1996). Basic methods for sensitivity analysis of biases. International Journal of Epidemiology, 25(6), 1107–1116. https://academic.oup.com/ije/article/25/6/1107/672425
[23] All of Us Research Program | National Institutes of Health (NIH). (2020). All of Us Research Program | NIH. https://allofus.nih.gov/
[24] Bick, A. G., Metcalf, G. A., Mayo, K. R., Lichtenstein, L., Rura, S., Carroll, R. J., Musick, A., Linder, J. E., Jordan, I. K., Nagar, S. D., Sharma, S., Meller, R., Basford, M., Boerwinkle, E., Cicek, M. S., Doheny, K. F., Eichler, E. E., Gabriel, S., Gibbs, R. A., & Glazer, D. (2024). Genomic data in the All of Us Research Program. Nature. https://doi.org/10.1038/s41586-023-06957-x
[25] Communications and Engagement Partners. (2020, June 24). All of Us Research Program | NIH. https://allofus.nih.gov/funding-and-program-partners/communications-and-engagement-partners
[26] Participation | All of Us Research Program | NIH. (n.d.-a). https://allofus.nih.gov/get-involved/participation
[27] What Participants Receive | Join All of Us. (n.d.). Www.joinallofus.org. Retrieved April 10, 2024, from https://www.joinallofus.org/what-participants-receive#
[28] The All of Us Research Program Genomics Investigators. (2024). Genomic data in the All of Us Research Program. Nature, 627, 340–346. https://doi.org/10.1038/s41586-023-06957-x
[29] Gray, I. D., Kross, A. R., Renfrew, M. E., & Wood, P. (2020). Precision medicine in lifestyle medicine: the way of the future?. American Journal of Lifestyle Medicine, 14(2), 169-186.
[30] Framework for Access to All of Us Data Resources. (2021, August). https://www.researchallofus.org/wp-content/themes/research-hub-wordpress-theme/media/data&tools/data-access-use/AoU_Data_Access_Framework_508.pdf
[31] FDA. (2018, September). Precision Medicine | FDA. U.S. Food and Drug Administration. https://www.fda.gov/medical-devices/in-vitro-diagnostics/precision-medicine
[32] Johnson, K. B., Wei, W. Q., Weeraratne, D., Frisse, M. E., Misulis, K., Rhee, K., ... & Snowdon, J. L. (2021). Precision medicine, AI, and the future of personalized health care. Clinical and translational science, 14(1), 86-93.
[34] Baxter, S. L., Saseendrakumar, B. R., Paul, P., Kim, J., Bonomi, L., Kuo, T.-T., Loperena, R., Ratsimbazafy, F., Boerwinkle, E., Cicek, M., Clark, C. R., Cohn, E., Gebo, K., Mayo, K., Mockrin, S., Schully, S. D., Ramirez, A., & Ohno-Machado, L. (2021). Predictive analytics for glaucoma using data from the All of Us Research Program. American Journal of Ophthalmology, 227, 74–86. https://doi.org/10.1016/j.ajo.2021.01.008
[35] Rigatti, S. J. (2017). Random forest. Journal of Insurance Medicine, 47(1), 31-39.
[36] Sebastian, A. M., & Peter, D. (2022). Artificial Intelligence in Cancer Research: Trends, Challenges and Future Directions. Life, 12(12), 1991.