Background: Bacteremia is a life-threatening condition requiring prompt diagnosis and treatment. Clinical signs of bacteremia often overlap with non-infectious inflammatory responses, leading to diagnostic uncertainty and potentially inappropriate empirical antibiotic use. Artificial intelligence (AI) offers a novel approach to enhance early detection of bacteremia and support clinical decision-making in ICU settings.
Aim and Objectives: Aim: To evaluate the utility of an AI-based model in predicting bacteremia in critically ill patients. Objectives: To develop and externally validate an AI-based Bacteremia Prediction Model (AI-BPM). To compare AI predictions with blood culture reports to assess diagnostic accuracy.
Methodology: A prospective observational study was conducted in the Medical ICU of GG Hospital from January 2024 to January 2025. A total of 566 patients with at least one blood culture sample were included. A Random Forest Classifier was developed using Python 3.7.3. The dataset was split into training (80%) and testing (20%) sets. Input variables included clinical signs, lab markers, comorbidities, APACHE IV score, and exposure history. Model predictions were compared against culture-confirmed bacteremia to assess performance.
Results: The AI-BPM demonstrated excellent diagnostic performance: AUROC: 0.93, Sensitivity: 90%, Specificity: 100%, Precision (PPV): 100%, F1 Score: 95%. The model showed strong agreement with blood culture results and outperformed several previously published AI-based models in sensitivity and specificity.
Conclusion: The AI-BPM is a reliable, data-driven tool for early identification of bacteremia in ICU patients. Its high accuracy and specificity can support timely, evidence-based decisions on antimicrobial use, contributing to better patient outcomes and antimicrobial stewardship. Wider validation and real-time integration into clinical practice are recommended.
Infection prevention and control (IPC) has long been central to reducing healthcare-associated infections (HAIs) and antimicrobial resistance (AMR). Since the 19th century, IPC has evolved into an evidence-based discipline with global health significance. The World Health Organization (WHO) and the European Centre for Disease Prevention and Control (ECDC) have outlined core components for effective IPC programs, emphasizing national strategies, evidence-based guidelines, health worker education, real-time surveillance, and continuous monitoring [1,2]. These IPC measures extend beyond hospital settings and also address community-associated infections (CAIs), which are increasingly contributing to the global burden of resistant infections [3,4].
Despite the progress in IPC, bloodstream infections like bacteremia continue to cause significant morbidity and mortality in critically ill patients. Early and accurate identification of bacteremia is essential for initiating timely antimicrobial therapy and improving outcomes. Conventional diagnostic methods such as blood cultures remain the gold standard but are time-consuming and may delay appropriate clinical interventions. As a result, the integration of novel predictive tools is increasingly necessary in critical care settings.
Artificial intelligence (AI) is emerging as a powerful tool in healthcare, offering data-driven support in disease prediction, personalized treatment, and clinical decision-making. AI applications in infectious disease control include early outbreak detection, hospital infection surveillance, and antimicrobial stewardship [1,5]. Several machine learning (ML) models, including extreme gradient boosting (XGBoost), logistic regression, random forests, and deep learning approaches, have been evaluated for predicting bacteremia. For example, a real-time AI model developed in Taiwan using XGBoost achieved an area under the curve (AUC) of 0.81 in the derivation dataset and 0.76 in the prospective validation cohort of adult febrile emergency department (ED) patients [6].
However, most existing models have been developed in high-income countries and may not be directly applicable to resource-constrained settings. Additionally, there is limited research on the comparative efficacy of multiple ML algorithms such as multilayer perceptron (MLP) and Light Gradient Boosting Machine (LightGBM) for predicting bacteremia in intensive care unit (ICU) patients. While scoring tools like the quick Sequential Organ Failure Assessment (qSOFA) score are widely used to detect sepsis, they are not specifically designed to predict bacteremia, leaving a gap in early risk stratification for bloodstream infections [6].
Given this background, the present study aimed to develop and externally validate an AI-based Bacteremia Prediction Model (AI-BPM) among patients admitted to the medical ICU (MICU) of a tertiary care hospital in Southern India. The model’s predictions were compared against laboratory-confirmed blood culture results to assess diagnostic accuracy. This approach has the potential to support real-time clinical decision-making, reduce diagnostic delays, and improve antimicrobial stewardship efforts in critical care environments.
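Study Objectives
To develop and externally validate an AI-based Bacteremia Prediction Model (AI-BPM).
To compare AI predictions with blood culture reports to assess diagnostic accuracy.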
Study Design and Setting
This prospective observational study was conducted in the Department of Critical Care Medicine (Medical ICU), GG Hospital, a tertiary care referral center in Southern India. The study spanned a period of one year, from January 2024 to January 2025.
Study Population
All patients admitted to the Medical ICU during the study period were considered eligible if at least one set of blood cultures was obtained during their ICU stay. Patients with incomplete clinical or laboratory data were excluded from model development and validation.
Sample Size and Data Partitioning
A total of 566 eligible patients were enrolled in the study. The dataset was randomly partitioned into a training set (80%, n = 452) and a test set (20%, n = 114).
Development of the AI Model
The AI-BPM was developed using the Random Forest Classifier, a supervised machine learning algorithm known for its robustness and ensemble-based decision-making. The model was implemented using Python version 3.7.3 with libraries including scikit-learn, pandas, and numpy.
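The development step described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the study's code: the feature matrix, the outcome, the number of trees (`n_estimators=200`) and both `random_state` seeds are assumptions introduced here, since the source does not report hyperparameters or preprocessing.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the study cohort: 566 patients, 10 binary features.
# The values are random and carry no clinical meaning.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(566, 10))
# Hypothetical outcome loosely tied to the first two features, with label noise.
noise = rng.random(566) < 0.1
y = ((X[:, 0] & X[:, 1]).astype(bool) ^ noise).astype(int)

# Random 80/20 partition; random_state is an assumed seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# n_estimators and random_state are assumed values, not taken from the study.
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)               # binary prediction per patient
y_score = model.predict_proba(X_test)[:, 1]  # estimated probability of class 1

print(X_train.shape[0], X_test.shape[0])
```

With 566 rows and `test_size=0.2`, scikit-learn rounds the test set up to 114 rows and leaves 452 for training, which is consistent with an 80/20 split of this cohort.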
Model Input Features
The clinical, demographic, and laboratory features used as input variables to train the AI model comprised clinical signs, laboratory markers, comorbidities, the APACHE IV score, and exposure history.
Data Collection and Ethical Considerations
Clinical and laboratory data were collected prospectively from patient medical records. The decision to initiate antibiotics or to send blood cultures was entirely at the discretion of the treating physician and not influenced by the AI predictions. The study adhered to ethical standards and was approved by the Institutional Ethics Committee. Patient confidentiality was maintained throughout the study.
Model Output
The AI-BPM was designed to produce a binary prediction for each patient: bacteremia predicted or bacteremia not predicted.
These predictions were then compared with blood culture reports (considered the reference standard) to evaluate model performance.
Model Evaluation Metrics
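Model performance was assessed on the test dataset using accuracy, precision (positive predictive value), recall (sensitivity), specificity, F1 score, and the area under the receiver operating characteristic curve (AUROC).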
The predictive capacity of the AI-BPM was evaluated to determine its potential role in early risk stratification for bacteremia in ICU settings.
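As a rough illustration of how such metrics are derived from predictions and reference labels, the sketch below computes them in plain Python. The function names (`confusion_counts`, `summarize`, `auroc`) and the ten toy cases are hypothetical and are not the study's data; the code also assumes both classes are present and that at least one case is predicted positive.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = bacteremia)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def summarize(tp, fp, fn, tn):
    """Derive the standard diagnostic metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)     # positive predictive value
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


def auroc(y_true, scores):
    """AUROC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1, 0.1]

metrics = summarize(*confusion_counts(y_true, y_pred))
auc = auroc(y_true, scores)
print(metrics, auc)
```

In practice the same quantities are available from `sklearn.metrics`; the pairwise AUROC above is quadratic in the number of cases and is written for clarity rather than speed.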
AI-Based Bacteremia Prediction Model Performance
A Random Forest Classifier was trained on 80% of the dataset (n = 452) and tested on 20% (n = 114) to predict the likelihood of bacteremia in ICU patients using clinical, demographic, and laboratory variables. The model’s performance was assessed against confirmed blood culture results.
Table 1: Performance Metrics of the AI-BPM Model on the Test Dataset (n = 114)
Metric | Value | Interpretation |
Accuracy | 98% | The model correctly predicted 98% of the total test cases (bacteremia and non-bacteremia). |
Precision (Positive Predictive Value) | 100% | Every case predicted as bacteremia by the model was actually bacteremia. No false positives occurred. |
Recall (Sensitivity) | 90% | 90% of all true bacteremia cases were correctly identified, indicating good detection capability. |
F1 Score | 95% | Balanced performance metric that considers both precision and recall. |
AUROC | 0.93 | The model had excellent ability to distinguish between bacteremia and non-bacteremia cases. |
The AI-based Bacteremia Prediction Model (AI-BPM) demonstrated high diagnostic performance, with an AUROC of 0.93, indicating excellent discriminatory ability. The precision of 100% implies that the model did not misclassify any non-bacteremia cases as bacteremia, which is critical in avoiding unnecessary antibiotic use. Meanwhile, a recall of 90% shows the model effectively captured the majority of actual bacteremia cases, making it suitable for early risk stratification and clinical decision support. The F1 score of 95% further confirms the model’s strong and balanced performance. Overall, the AI-BPM shows potential for integration into ICU workflows to support clinicians in early identification of patients at risk of bloodstream infections.
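The reported figures can be checked against one another with the standard definitions. The worked step below uses only the precision ($P$) and recall ($R$) values from Table 1; the symbols $TP$, $FP$, and $TN$ (true positives, false positives, true negatives) are introduced here for illustration, as the source does not report the confusion matrix.

```latex
% F1 score from the reported precision (P = 1.00) and recall (R = 0.90)
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 1.00 \times 0.90}{1.00 + 0.90}
    = \frac{1.80}{1.90}
    \approx 0.947 \;(\approx 95\%)

% A precision of 100% means there were no false positives (FP = 0),
% which in turn gives a specificity of 100%
P = \frac{TP}{TP + FP} = 1 \;\Rightarrow\; FP = 0
\;\Rightarrow\; \text{Specificity} = \frac{TN}{TN + FP} = 1
```

This also shows why the 100% precision in Table 1 and the 100% specificity reported elsewhere in the paper are two views of the same result: neither can hold without the other once there are no false positives.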
Table 2: Association of Clinical and Laboratory Features with Blood Culture Positivity (n = 566)
Feature | Category | Culture Negative | Culture Positive | % Positive | p-value |
Signs of localisations | No | 444 | 98 | 18.1% | 0.485 |
| Yes | 21 | 3 | 12.5% | |
Desaturation | No | 111 | 66 | 37.3% | <0.0001 |
| Yes | 354 | 35 | 9.0% | |
Leukocytosis/Leukopenia | No | 104 | 1 | 1.0% | <0.0001 |
| Yes | 361 | 100 | 21.7% | |
High neutrophil count | No | 187 | 1 | 0.5% | <0.0001 |
| Yes | 278 | 100 | 26.5% | |
Raised CRP | No | 91 | 1 | 1.1% | <0.0001 |
| Yes | 370 | 100 | 21.3% | |
Raised lactate | No | 110 | 0 | 0.0% | <0.0001 |
| Yes | 355 | 101 | 22.1% | |
URE | No | 218 | 31 | 12.4% | 0.003 |
| Yes | 247 | 70 | 22.1% | |
Raised LFT/RFT | No | 119 | 5 | 4.0% | <0.0001 |
| Yes | 346 | 96 | 21.7% | |
Recent surgical interventions | No | 428 | 83 | 16.2% | 0.002 |
| Yes | 37 | 18 | 32.7% | |
Immunosuppressive medication use | No | 422 | 81 | 16.1% | 0.002 |
| Yes | 43 | 20 | 31.7% | |
Contact with healthcare settings | No | 370 | 71 | 16.1% | 0.042 |
| Yes | 95 | 30 | 24.0% | |
An analysis of clinical and laboratory features identified several factors significantly associated with bacteremia among ICU patients (Table 2). Desaturation status was significantly associated with culture positivity; in the tabulated counts, the positivity rate was 37.3% among patients without desaturation and 9.0% among those with desaturation (p < 0.0001). Leukocytosis or leukopenia was strongly associated with bacteremia, which was observed in 21.7% of patients with abnormal white cell counts versus only 1.0% of those without (p < 0.0001). A high neutrophil count was another strong marker, with a 26.5% bacteremia rate compared with 0.5% in patients with normal counts (p < 0.0001). Elevated CRP and lactate levels were both significantly associated with bacteremia, with positivity rates of 21.3% and 22.1%, respectively, in patients with raised values, versus 1.1% and 0.0% in those with normal values (p < 0.0001 for both).
Further, abnormal liver or renal function tests (LFT/RFT) were associated with bacteremia in 21.7% of patients compared with 4.0% of those with normal values (p < 0.0001). Other significant factors included URE (22.1% vs. 12.4%; p = 0.003), recent surgical interventions (32.7% vs. 16.2%; p = 0.002), use of immunosuppressive medications (31.7% vs. 16.1%; p = 0.002), and recent contact with healthcare settings (24.0% vs. 16.1%; p = 0.042). In contrast, signs of localisation, recorded in only 24 patients, were not significantly associated with bacteremia (12.5% vs. 18.1%; p = 0.485).
These findings indicate that specific laboratory markers and clinical exposures significantly contribute to the likelihood of bacteremia and underscore the importance of incorporating such features into predictive models like the AI-BPM.
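The source does not state which statistical test produced the p-values in Table 2. As one plausible reading, the sketch below applies a Pearson chi-square test without continuity correction to the "Recent surgical interventions" row. The function name `chi_square_2x2` is hypothetical, and the choice of test is an assumption rather than the authors' stated method.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the test statistic and the two-sided p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts from Table 2, "Recent surgical interventions":
# No  -> 428 culture negative, 83 culture positive
# Yes ->  37 culture negative, 18 culture positive
chi2, p = chi_square_2x2(428, 83, 37, 18)

pct_no = 100 * 83 / (428 + 83)   # % positive without recent surgery
pct_yes = 100 * 18 / (37 + 18)   # % positive with recent surgery
print(round(pct_no, 1), round(pct_yes, 1), round(chi2, 2), p)
```

Under this assumption the p-value comes out at roughly 0.002, in line with the tabulated value. A continuity-corrected or exact test would give a somewhat different figure.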
Bacteremia remains a critical and life-threatening condition in hospitalized patients, particularly those in intensive care units (ICUs). Timely diagnosis and treatment are paramount to improving outcomes. However, the clinical decision-making process is often complicated by the overlapping symptoms of systemic infections and non-infectious inflammatory conditions. This diagnostic uncertainty frequently leads to the empirical use of broad-spectrum antibiotics, which, while potentially life-saving, also contributes to the growing problem of antimicrobial resistance (AMR) [1,2].
In this prospective study, we explored the utility of an Artificial Intelligence-based Bacteremia Prediction Model (AI-BPM) using a Random Forest classifier developed on a dataset of 566 ICU patients (452 for training and 114 for testing). The AI model demonstrated excellent diagnostic performance, achieving an area under the receiver operating characteristic curve (AUROC) of 0.93, with a sensitivity of 90% and specificity of 100%. These metrics highlight the model’s strong potential for clinical use, particularly in aiding decision-making where uncertainty exists regarding the initiation of empirical antibiotic therapy.
The comparatively higher AUROC in our study is likely attributable to the inclusion of a broad and diverse set of input features, including clinical signs, laboratory biomarkers (e.g., CRP, lactate, leukocytosis), physiological scores (APACHE IV), and patient exposure factors (e.g., prior healthcare contact, surgical interventions). This multi-dimensional approach appears to enhance the model's ability to accurately discriminate between bacteremia and non-bacteremia cases.
Our findings align with and in some cases surpass the results of earlier studies. For instance, Roimi et al. developed a machine learning-based model for early bacteremia detection in ICU patients and reported AUROCs of 0.89 ± 0.01 and 0.92 ± 0.02 in two medical centers [7]. Pai et al. reported AUROCs between 0.821 and 0.855 for AI-based bacteremia prediction in ICU populations [8], while Murri et al. developed a machine learning model yielding an AUROC of 0.74, indicating moderate discriminatory capacity [9].
In contrast, Choi et al. developed an AI bacteremia model with a lower AUROC of 0.754; it showed high sensitivity (0.917) but poor specificity (0.340), which could lead to excessive false positives and unnecessary antibiotic use [9]. Our model, by achieving both high sensitivity and specificity, strikes a more clinically useful balance, potentially avoiding both under- and over-treatment.
Importantly, our model is strain-independent and site-agnostic: its inputs do not depend on the specific bacterial pathogen or the source of infection. This may make it adaptable across departments and clinical settings, although such generalizability remains to be confirmed in other centers. Furthermore, the model is designed to be dynamic, in that daily clinical data can be re-entered to allow real-time reassessment of bacteremia risk. This iterative use could facilitate timely therapeutic decisions, improve patient outcomes, and support antimicrobial stewardship programs by guiding judicious antibiotic use [1,7].
With increasing validation of AI in clinical practice, such models hold promise in transforming how clinicians approach infection diagnostics, especially in resource-constrained settings. By improving the precision of empirical therapy, AI can reduce unnecessary antimicrobial exposure, ultimately impacting both antimicrobial resistance trends and healthcare costs [2,7].
The integration of AI-based predictive tools like the AI-BPM in critical care workflows has the potential to revolutionize early diagnosis of bacteremia, enhance patient care, and contribute to global AMR mitigation efforts. However, multicenter validation, continuous model training with updated data, and real-time integration into hospital information systems will be essential steps for successful implementation.
This study developed an Artificial Intelligence-based Bacteremia Prediction Model (AI-BPM) using a Random Forest algorithm in a tertiary care ICU setting and validated it on a held-out test set (20% of the cohort). The model demonstrated excellent diagnostic performance, with an AUROC of 0.93, a sensitivity of 90%, and a specificity of 100%, addressing the first objective of accurately predicting the likelihood of bacteremia using clinical, laboratory, and patient history parameters.
In addressing the second objective, the AI model’s predictions showed strong concordance with blood culture results, reinforcing its potential role as a reliable decision-support tool for early identification of bacteremia. The findings suggest that AI-BPM can aid clinicians in timely risk stratification and guide more judicious use of empirical antibiotics, thereby supporting antimicrobial stewardship and improving patient outcomes.
Further validation in larger and multicentric cohorts is recommended to confirm its generalizability and facilitate real-world clinical integration.