|Ahead of print publication
Deep learning approach to detect high-risk oral epithelial dysplasia: A step towards computer-assisted dysplasia grading
C Nandini1, Shaik Basha2, Aarchi Agarawal2, R Parikh Neelampari3, Krishna P Miyapuram4, R Jadeja Nileshwariba3
1 Department of Oral Pathology and Microbiology, Gujarat University, Ahmedabad, Gujarat, India
2 Department of Computer Science Engineering, Indian Institute of Technology (IIT), Gandhinagar, Gujarat, India
3 Department of Oral and Maxillofacial Pathology, Karnavati School of Dentistry, Karnavati University, Gandhinagar, Gujarat, India
4 Centre for Cognitive and Brain Sciences, Indian Institute of Technology, Gandhinagar, Gujarat, India
|Date of Submission||07-Feb-2022|
|Date of Acceptance||01-Apr-2022|
|Date of Web Publication||07-Sep-2022|
4, New Pushpak Society, Behind Avsar Party Plot, Hansol, Ahmedabad, Gujarat
Source of Support: None, Conflict of Interest: None
Introduction: Oral epithelial dysplasia (OED) is associated with high interobserver and intraobserver disagreement. With the exponential increase in the applicability of artificial intelligence tools such as deep learning (DL) in pathology, it would now be possible to achieve high accuracy and objectivity in grading of OED. In this research work, we have proposed a DL approach to epithelial dysplasia grading by creating a convolutional neural network (CNN) model from scratch. Materials and Methods: The dataset includes 445 high-resolution ×400 photomicrographs captured from histopathologically diagnosed cases of high-risk dysplasia (HR) and normal buccal mucosa (NBM) that were used to train, validate and test the two-dimensional CNN (2DCNN) model. Results: The whole dataset was divided into 60% training set, 20% validation set and 20% test set. The model achieved training accuracy of 97.21%, validation accuracy of 90% and test accuracy of 91.30%. Conclusion: The DL model was able to distinguish between normal epithelium and HR epithelial dysplasia with high grades of accuracy. These results are encouraging for researchers to formulate DL models to grade and classify OED using various grading systems.
Keywords: Convolutional neural network, deep learning, epithelial dysplasia, oral cancer
|How to cite this URL:|
Nandini C, Basha S, Agarawal A, Neelampari R P, Miyapuram KP, Nileshwariba R J. Deep learning approach to detect high-risk oral epithelial dysplasia: A step towards computer-assisted dysplasia grading. Adv Hum Biol [Epub ahead of print] [cited 2022 Sep 27]. Available from: https://www.aihbonline.com/preprintarticle.asp?id=355695
| Introduction|| |
Malignant neoplasm, more commonly referred to as “cancer”, is rapidly becoming a global burden on human health. With the significant advancements in the management of cardiovascular diseases, cancer is expected to become the most common cause of death in many parts of the world. Of all the types, oral cancer, or oral squamous cell carcinoma, is one of the most common forms of cancer, especially in countries such as India, where the high incidence is attributed to the extensive consumption of tobacco and its products. The majority of oral cancers are preceded by potentially malignant disorders that histopathologically show a spectrum of abnormalities ranging from hyperplasia to intraepithelial neoplasia, which constitute the features of oral epithelial dysplasia (OED).
The term dysplasia was introduced by Reagon in 1958 in a study where he described the features of dysplasia. Several grading systems of epithelial dysplasia have been introduced in the past, and the binary grading system proposed by Kujan et al. using the 2005 WHO criteria is the most recently introduced system.
Unfortunately, the grading systems of epithelial dysplasia suffer from significant interobserver and intraobserver variability because of the high subjectivity of the criteria used for grading. A few methods have been suggested to reduce such variability and increase objectivity and reproducibility: adopting digital image analysis and morphometric approaches to identify and study comparatively reliable criteria of OED, and using artificial intelligence (AI) tools such as machine learning (ML) and deep learning (DL) algorithms, which are known to work without bias. Moreover, with the advent of digital pathology techniques, abundant digital pathological data are being produced, which can now be used for innovative research through the application of AI tools such as ML and DL.
ML and DL have been used in many recent research studies in medical imaging, especially in the detection of cancer-related abnormalities. Nahid et al. proposed a histopathological breast cancer image classification model using basic ML techniques followed by DL and achieved an accuracy of 91%, while Fu et al. proposed a DL approach to detect oral cavity squamous cell carcinoma from photographic images with an accuracy comparable to that of a specialist panel who classified the images manually. A few authors proposed a convolutional neural network (CNN)-based approach to perform classification, segmentation and visualisation of cancer from microscopic biopsy images. Further, Adel et al. proposed a computer-aided diagnostic approach to detect and classify OED using basic ML techniques with an accuracy of 92.8%, whereas Gupta et al. used a DL model to classify dysplastic tissue images with an accuracy of 89.3%. In most of the existing works, ML and DL models are implemented to classify histopathological images of patients suffering from oral dysplasia into different classes based on severity.
In the present study, the aim was to train the model to identify and distinguish between two very distinct histopathological entities: high-risk (HR) epithelial dysplasia, which is the nearest precursor to malignant transformation, and normal buccal mucosa (NBM). Hence, a two-dimensional CNN (2DCNN) model was created to detect HR epithelial dysplasia in a mixed set of images of HR epithelial dysplasia and normal epithelium from the buccal mucosa. As a sequel to this study, the model would be trained to distinguish HR epithelial dysplasia from both low-risk epithelial dysplasia and normal epithelium.
| Materials and Methods|| |
For the current study, haematoxylin and eosin (H and E)-stained histopathological slides of 50 patients were selected from the archives of the Department of Oral Pathology, Karnavati School of Dentistry, Gandhinagar, out of which 30 were histopathologically confirmed cases of HR epithelial dysplasia and 20 were histopathologically diagnosed as normal epithelium of the buccal mucosa. Cases that showed consistent interobserver agreement were included in the study, whereas cases with consistent interobserver variability and H and E sections with insufficient length of epithelium, oblique orientation or insufficient depth were excluded from the analysis.
High-resolution static photomicrographs of all the representative areas in the slides were captured at ×400 magnification, forming a dataset of 445 images in total. All the photomicrographs were captured using a binocular light microscope (Lawrence and Mayo, India) mounted with a Nikon D3300 digital single-lens reflex camera with the help of a camera adaptor (Amscope.com). The images were captured at ISO 400 and a shutter speed of 1/10 s, as these camera settings were found to produce the most vivid colours and uniform histograms. [Figure 1]a and [Figure 1]b show photomicrographs of normal epithelium of buccal mucosa (NBM) and HR epithelial dysplasia, respectively.
|Figure 1: (a) Normal epithelium of buccal mucosa. (b) High-risk epithelial dysplasia.|
A total of 445 static photomicrographic images of very high resolution (6000 × 4000 × 3 pixels) belonging to both HR epithelial dysplasia and NBM were selected for analysis.
To reduce the computational requirement, the images were resized during the preprocessing phase. The performance of the CNN classification model was evaluated with input images resized to various sizes to identify the optimal input size in terms of both performance and computational complexity. In our experiments, resizing inputs to 600 × 400 × 3 gave the best performance and also reduced the training time several-fold. All the input images were labelled 0 for HR epithelial dysplasia and 1 for NBM; label 0 is considered the positive label as it indicates HR epithelial dysplasia. To use the ImageDataGenerator functionality provided by Keras, which dynamically fetches only the images of the required batch during training, the data were distributed into train, validation and test folders. Hence, the dataset was randomly divided into training data (60%), validation data (20%) and test data (20%).
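The random 60/20/20 partition described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `split_dataset`, the fixed seed and the placeholder file names are assumptions, and only the proportions and the total of 445 images come from the text.

```python
import random


def split_dataset(paths, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly partition file paths into train/validation/test lists."""
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * fractions[0])
    n_val = round(len(shuffled) * fractions[1])
    return {
        "train": shuffled[:n_train],
        "validation": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # the remainder goes to the test set
    }


# Placeholder names standing in for the 445 photomicrographs (hypothetical).
image_paths = [f"image_{i:03d}.jpg" for i in range(445)]
splits = split_dataset(image_paths)
sizes = {name: len(items) for name, items in splits.items()}
print(sizes)
```

Under these assumptions the split yields 267, 89 and 89 images. Each list could then be copied into train, validation and test folders with one subfolder per class, which is the layout read by `ImageDataGenerator.flow_from_directory` in Keras. That method assigns class indices in alphanumeric order of the subfolder names, so subfolders named `HR` and `NBM` would map to the labels 0 and 1 used here.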
Proposed convolutional neural network architecture
In the present study, the CNN, a popular DL algorithm for image classification, was used. A 2D convolution (conv2D) model with 12 layers (three conv2D layers, one max pooling 2D layer, three batch normalisation layers, one dropout layer, one flatten layer, two dense layers and one sigmoid activation layer) was created using the Keras application programming interface (API) to classify histopathological slides as HR or NBM [Figure 2].
|Figure 2: Proposed two-dimensional convolutional neural network model using Keras application programming interface.|
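One way to lay out a 12-layer model with the layer counts listed above is sketched below. The text gives only the layer types and counts, the kernel size (3) and the number of dense neurons (64); the layer ordering, filter counts, strides, pool size, dropout rate, optimiser and the (height, width, channels) reading of the 600 × 400 × 3 input are all assumptions made for illustration, so this should not be read as the exact architecture of [Figure 2].

```python
from collections import Counter

# Layer plan matching the counts in the text. Ordering, filters, strides,
# pool size and dropout rate are assumptions, not values from the paper.
LAYER_PLAN = [
    ("Conv2D", {"filters": 16, "kernel_size": 3, "strides": 2, "activation": "relu"}),
    ("BatchNormalization", {}),
    ("Conv2D", {"filters": 32, "kernel_size": 3, "strides": 2, "activation": "relu"}),
    ("BatchNormalization", {}),
    ("Conv2D", {"filters": 64, "kernel_size": 3, "strides": 2, "activation": "relu"}),
    ("BatchNormalization", {}),
    ("MaxPooling2D", {"pool_size": 2}),
    ("Dropout", {"rate": 0.5}),
    ("Flatten", {}),
    ("Dense", {"units": 64, "activation": "relu"}),
    ("Dense", {"units": 1}),
    ("Activation", {"activation": "sigmoid"}),  # output: probability of label 1 (NBM)
]


def build_model(input_shape=(400, 600, 3)):
    """Instantiate the plan with Keras (imported lazily; needs TensorFlow)."""
    from tensorflow.keras import Input, Sequential, layers

    model = Sequential([Input(shape=input_shape)])
    for name, kwargs in LAYER_PLAN:
        model.add(getattr(layers, name)(**kwargs))
    # Binary cross-entropy suits the two-class HR (0) versus NBM (1) labelling.
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


layer_counts = Counter(name for name, _ in LAYER_PLAN)
print(len(LAYER_PLAN), dict(layer_counts))
```

Keeping the plan as data makes it easy to check that the layer counts agree with the description before any training is attempted; calling `build_model()` then creates the Keras model when TensorFlow is installed.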
| Results|| |
The 2DCNN model was trained with 60% of the whole data, 20% of the data were used for validation and the performance of the model was verified with the remaining 20% as test data. To reduce overfitting, a dropout layer was added. Hyperparameter tuning was done on the proposed model to identify the ideal set of parameters, with accuracy as the performance metric. Various combinations of kernel size, batch size, number of epochs and number of neurons in the dense layer, as shown in [Table 1], were used for hyperparameter tuning. The final hyperparameters chosen were kernel size (3), batch size (10), number of epochs (10) and number of neurons in the dense layer (64).
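The tuning procedure described above amounts to scoring every combination of the four hyperparameters and keeping the best one. A minimal sketch is given below; the candidate values in `SEARCH_SPACE` are hypothetical (the combinations actually tried are those of [Table 1]), and `dummy_evaluate` is a stand-in scorer so the example runs without training a network. In practice the scorer would train the model with the given configuration and return its validation accuracy.

```python
from itertools import product

# Hypothetical candidate values; the paper's own candidates are in its Table 1.
SEARCH_SPACE = {
    "kernel_size": [3, 5],
    "batch_size": [10, 20],
    "epochs": [10, 20],
    "dense_units": [32, 64],
}


def grid_search(search_space, evaluate):
    """Score every combination and return the best configuration and score."""
    names = list(search_space)
    best_config, best_score = None, float("-inf")
    for values in product(*(search_space[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)  # e.g. validation accuracy after training
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def dummy_evaluate(config):
    """Stand-in scorer (an assumption): fraction of settings equal to the reported final values."""
    reported = {"kernel_size": 3, "batch_size": 10, "epochs": 10, "dense_units": 64}
    return sum(config[key] == value for key, value in reported.items()) / len(reported)


best_config, best_score = grid_search(SEARCH_SPACE, dummy_evaluate)
print(best_config, best_score)
```

An exhaustive grid is affordable here because the search space is small (16 combinations in this sketch); larger spaces would usually call for random or adaptive search.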
Our model achieved a training accuracy of 97.21%, a validation accuracy of 90% and a test accuracy of 91.30%. The confusion matrix for the classification of the test data is shown in [Figure 3]. For each of the classes, HR and NBM, the precision, recall and F1 scores were noted [Table 2].
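For readers less familiar with these metrics, the sketch below shows how per-class precision, recall and F1 score follow from a 2 × 2 confusion matrix, with label 0 (HR) and label 1 (NBM) each treated as the positive class in turn. The counts in `example` are hypothetical and are not the values of [Figure 3] or [Table 2]; the function names are likewise illustrative.

```python
def class_metrics(confusion, positive):
    """Precision, recall and F1 for one class of a 2x2 confusion matrix.

    confusion[true_label][predicted_label] holds image counts.
    """
    negative = 1 - positive
    tp = confusion[positive][positive]  # correctly predicted as this class
    fn = confusion[positive][negative]  # this class predicted as the other
    fp = confusion[negative][positive]  # the other class predicted as this one
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def accuracy(confusion):
    """Fraction of all images on the diagonal of the confusion matrix."""
    correct = confusion[0][0] + confusion[1][1]
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical counts for illustration only: rows are true labels
# (0 = HR, 1 = NBM), columns are predicted labels.
example = [[45, 5],
           [3, 47]]

hr_metrics = class_metrics(example, positive=0)
nbm_metrics = class_metrics(example, positive=1)
print(hr_metrics, nbm_metrics, accuracy(example))
```

Because label 0 (HR) is the positive label in this study, the recall for HR corresponds to the sensitivity of the model for high-risk dysplasia.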
| Discussion|| |
Oral squamous cell carcinoma is one of the most common malignant neoplasms arising in the oral cavity and is most often preceded by clinically detectable lesions and histopathologically diagnosable epithelial dysplasia. Researchers have made continuous efforts to improve the objectivity and reproducibility of epithelial dysplasia grading systems to make them more reliable. In recent years, computer-assisted techniques have been applied extensively to the identification and grading of epithelial dysplasia. In the present study, a 2D convolution neural network model was developed to identify the presence of HR epithelial dysplasia in histopathological slides and distinguish it from normal epithelium of the buccal mucosa. Haematoxylin- and eosin-stained histopathological slides diagnosed as HR epithelial dysplasia according to the binary system of grading proposed by Kujan et al. were selected from the archives, along with slides of normal epithelium. Only two categories, normal epithelium and HR epithelial dysplasia, were selected (avoiding cases of low-risk dysplasia), as they are distinctly different in their histological appearance, which would facilitate the training and validation of the CNN model at a preliminary level. All the cases pertained to buccal mucosa, as Ragavendra et al. proposed that site-wise differences in cell and nuclear measurements exist in normal oral epithelium; hence, morphometric comparison of any oral lesion with the normal should be site-specific. The high-resolution photomicrographs were captured at ×400 magnification as it is the most commonly used magnification in routine diagnostic pathology practice.
A DL algorithm was preferred for the current study over a basic ML algorithm as the former is more efficient and comprehensive in image processing and classification. This was in accordance with Gupta et al., who also used a CNN model for grading of OED, and in contrast with Sami et al., Aurchana et al. and Adel et al., who used basic ML techniques for classification of OED.
The whole dataset of 445 photomicrographs was randomly divided into a training dataset, validation dataset and testing dataset. The training dataset comprised 60% of the whole dataset where the images were labelled as HR or NBM and the algorithm was trained to identify and distinguish between the two groups. The validation dataset and testing dataset comprised 20% each of the dataset and were used to test the accuracy of the algorithm with unlabelled images.
The present model achieved a training accuracy of 97.21%, a validation accuracy of 90% and a test accuracy of 91.30%. The dataset was small, with 445 images, but hyperparameter tuning and dropout were used to limit overfitting. These results suggest that the ability of DL models to extract features from very complex medical images can help reduce interobserver and intraobserver disagreement in the grading of OED.
| Conclusion|| |
It was encouraging to see that the 2D convolution neural network model developed during the study differentiated between HR epithelial dysplasia and normal epithelium of the buccal mucosa with high test accuracy. Taking this into consideration, the model would be further developed for the classification and segregation of cases of OED into low-risk and HR dysplasia. Attempts would also be made to identify the more reliable and objective dysplastic features out of all the cytological and architectural features that actually contribute to HR dysplasia and hence malignant transformation. Furthermore, the potential of DL models to extract features from very complex medical images should be further explored to enhance the accuracy and efficiency of medical diagnostics.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
| References|| |
Raju Ragavendra T, Rammanohar M, Sowmya K. Morphometric computer-assisted image analysis of oral epithelial cells in normal epithelium and leukoplakia. J Oral Pathol Med 2010;39:149-54.
Warnakulasuriya S, Johnson NW, van der Waal I. Nomenclature and classification of potentially malignant disorders of the oral mucosa. J Oral Pathol Med 2007;36:575-80.
Fischer DJ, Epstein JB, Morton TH, Schwartz SM. Interobserver reliability in the histopathologic diagnosis of oral pre-malignant and malignant lesions. J Oral Pathol Med 2004;33:65-70.
Gupta RK, Kaur M, Manhas J. Tissue level based deep learning framework for early detection of dysplasia in oral squamous epithelium. J Multimed Inf Syst 2019;6:81-6.
Kujan O, Oliver RJ, Khattab A, Roberts SA, Thakker N, Sloan P. Evaluation of a new binary system of grading oral epithelial dysplasia for prediction of malignant transformation. Oral Oncol 2006;42:987-93.
Müller S. Oral epithelial dysplasia, atypical verrucous lesions and oral potentially malignant disorders: Focus on histopathology. Oral Surg Oral Med Oral Pathol Oral Radiol 2018;125:591-602.
Tilakaratne WM, Sherriff M, Morgan PR, Odell EW. Grading oral epithelial dysplasia: Analysis of individual features. J Oral Pathol Med 2011;40:533-40.
Nahid AA, Mehrabi MA, Kong Y. Histopathological breast cancer image classification by deep neural network techniques guided by local clustering. Biomed Res Int 2018;2018:2362108.
Fu Q, Chen Y, Li Z, Jing Q, Hu C, Liu H, et al. A deep learning algorithm for detection of oral cavity squamous cell carcinoma from photographic images: A retrospective study. EClinicalMedicine 2020;27:100558.
Xu Y, Jia Z, Wang LB, Ai Y, Zhang F, Lai M, et al. Large scale tissue histopathology image classification, segmentation, and visualization via deep convolutional activation features. BMC Bioinformatics 2017;18:281.
Wang D, Khosla A, Gargeya R, Irshad H, Beck AH. Deep learning for identifying metastatic breast cancer. arXiv preprint arXiv:1606.05718; 2016.
Kumar R, Srivastava R, Srivastava S. Detection and classification of cancer from microscopic biopsy images using clinically significant and biologically interpretable features. J Med Eng 2015;2015:457906.
Dey A. Machine learning algorithms: A review. Int J Comput Sci Inf Technol 2016;7:1174-9.
Smitha T, Sharada P, Girish H. Morphometry of the basal cell layer of oral leukoplakia and oral squamous cell carcinoma using computer-aided image analysis. J Oral Maxillofac Pathol 2011;15:26-33.
Hou L, Samaras D, Kurc TM, Gao Y, Davis JE, Saltz JH. Patch-based convolutional neural network for whole slide tissue image classification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016;2016:2424-33. doi: 10.1109/CVPR.2016.266.
Aurchana P, Dhanalakshmi P, Chidambaram C. SVM based classification of epithelial dysplasia using surf and sift features. Int J Pure Appl Math 2017;117:1163-75.