Abstract
Lung and colon cancers are two of the most common and deadly tumors worldwide, posing significant public health concerns. Artificial intelligence (AI) and machine learning (ML) have substantially advanced cancer research, particularly in early detection, histopathological analysis, and personalized therapy planning. However, despite their remarkable accuracy, ML models often lack transparency, making explainability crucial in medical applications. Although various ML-based cancer classification models exist, their predictions are frequently difficult to interpret. The present study addresses this diagnostic gap by developing a highly accurate system that uses Explainable Artificial Intelligence (XAI) methods to clarify its predictions. We used Kaggle's LC25000 dataset, which contains histopathological images of human lung and colon tumors. To determine the best cancer classification strategy, we evaluated several machine learning algorithms, including Random Forest, Decision Tree, Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost). Furthermore, XAI approaches such as LIME (Local Interpretable Model-Agnostic Explanations) and SHAP (SHapley Additive exPlanations) were applied to examine model predictions and identify the features driving classification outcomes. XGBoost achieved the highest accuracy of 99.80% among the models tested, confirming its effectiveness in identifying colon and lung cancer. The XAI techniques also provided useful insight into the most significant features: SHAP analysis highlighted LBP and color-histogram features as key to distinguishing lung and colon tissues, while LIME corroborated their importance by identifying the critical image regions influencing predictions.
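The workflow summarized above (tree-based classification over handcrafted image features, followed by a feature-importance explanation) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it uses synthetic stand-ins for the precomputed LBP/color-histogram feature vectors and scikit-learn's GradientBoostingClassifier in place of XGBoost, and a global importance ranking in place of SHAP/LIME, which would add per-prediction explanations on top of a trained model like this.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic stand-in for precomputed texture (LBP) and color-histogram
# features; the real study extracts these from LC25000 histology images.
n_samples, n_features = 600, 10
X = rng.normal(size=(n_samples, n_features))
# Hypothetical two-class label (e.g., lung vs. colon tissue) driven
# mostly by feature 0, with a weaker contribution from feature 3.
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Gradient-boosted trees as a stand-in for XGBoost.
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
acc = accuracy_score(y_te, clf.predict(X_te))

# Rank features by the model's impurity-based importances; SHAP/LIME
# would instead attribute each individual prediction to the features.
ranking = np.argsort(clf.feature_importances_)[::-1]
print(f"accuracy={acc:.3f}, most important feature index={ranking[0]}")
```

On this toy data the classifier recovers the dominant feature, mirroring how SHAP surfaced LBP and color-histogram features as the key discriminators in the study.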
Keywords
Lung cancer · LC25000 dataset · Colon cancer · Machine learning · Explainable AI