Transcript
Page 1: Machine Learning Classification of Articular …...including joints from separate non-operated control (CNTRL) animals (n = 8), were used. After sacrifice, NIR spectra (1000–2500

Original Article

Machine Learning Classification of Articular Cartilage Integrity Using

Near Infrared Spectroscopy

ISAAC O. AFARA ,1 JAAKKO K. SARIN,1,2 SIMO OJANEN,1,3 MIKKO A. J. FINNILA,1,3 WALTER HERZOG,4

SIMO SAARAKKALA,3,5 RAMI K. KORHONEN,1 and JUHA TOYRAS1,2,6

1Department of Applied Physics, University of Eastern Finland, Kuopio, Finland; 2Diagnostic Imaging Centre, KuopioUniversity Hospital, Kuopio, Finland; 3Research Unit of Medical Imaging, Physics and Technology, Faculty of Medicine,

University of Oulu, Oulu, Finland; 4Human Performance Laboratory, Faculty of Kinesiology, University of Calgary, Calgary,Canada; 5Department of Diagnostic Radiology, Oulu University Hospital, Oulu, Finland; and 6School of Information

Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia

(Received 13 July 2019; accepted 18 February 2020; published online 9 March 2020)

Associate Editor Cheng Dong oversaw the review of this article.

Abstract

Introduction—Assessment of cartilage integrity during ar-throscopy is limited by the subjective visual nature of thetechnique. To address this shortcoming in diagnostic evalu-ation of articular cartilage, near infrared spectroscopy(NIRS) has been proposed. In this study, we evaluated thecapacity of NIRS, combined with machine learning tech-niques, to classify cartilage integrity.Methods—Rabbit (n = 14) knee joints with artificial injury,induced via unilateral anterior cruciate ligament transection(ACLT), and the corresponding contra-lateral (CL) joints,including joints from separate non-operated control(CNTRL) animals (n = 8), were used. After sacrifice, NIRspectra (1000–2500 nm) were acquired from differentanatomical locations of the joints (nTOTAL = 313:nCNTRL = 111, nCL = 97, nACLT = 105). Machine and deeplearning methods (support vector machines–SVM, logisticregression–LR, and deep neural networks–DNN) were thenused to develop models for classifying the samples basedsolely on their NIR spectra.Results—The results show that the model based on SVM isoptimal of distinguishing between ACLT and CNTRLsamples (ROC_AUC = 0.93, kappa = 0.86), LR is capableof distinguishing between CL and CNTRL samples (RO-C_AUC = 0.91, kappa = 0.81), while DNN is optimal fordiscriminating between the different classes (multi-classclassification, kappa = 0.48).Conclusion—We show that NIR spectroscopy, when com-bined with machine learning techniques, is capable of holisticassessment of cartilage integrity, with potential for accuratelydistinguishing between healthy and diseased cartilage.

Keywords—Osteoarthritis, Cartilage, Near infrared spec-

troscopy, Machine learning, Deep learning, Classification.

INTRODUCTION

Osteoarthritis (OA) is a disabling musculoskeletalcondition affecting a significant proportion of theworld’s population. OA is mainly associated withdamage and erosion of articular cartilage, the func-tional soft tissue surrounding the ends of articulatingbones. OA could be triggered by aging (idiopathic OA)or injury to cartilage and other joint connective tissues,leading to cartilage lesions, which without interventionmay degenerate to post-traumatic OA (PTOA). Sur-gical intervention in the repair of cartilage injury iscurrently conducted via arthroscopy, which allows vi-sual inspection of the intra-articular joint space forassessment of lesion size, depth and severity. However,diagnosis of the extent of cartilage injuries during ar-throscopy is limited by the subjective visual nature ofthe technique,31 decreasing the chances of successfulintervention. Thus, methods with potential for sensi-tive and reliable arthroscopic assessment of cartilageintegrity would be highly desirable.

There has been growing interest in utilizing fastoptical methods, particularly near infrared spec-troscopy (NIRS), for the evaluation of cartilage con-ditions.1,2,5–7,19,22,27,28,33 NIRS is capable ofcharacterizing cartilage physical5,27 and functional6,27

properties, and composition.1,22,28 Furthermore, NIRShas been proposed for assessing4,23,32 cartilage integrityand monitoring the progression of OA using an animalmodel,3 which provides in vivo and ex vivo data duringmethods development and optimization. Thus, themethod has the potential to overcome the limitationsof current conventional arthroscopy and enable

Address correspondence to Isaac O. Afara, Department of

Applied Physics, University of Eastern Finland, Kuopio, Finland.

Electronic mail: [email protected]

Cellular and Molecular Bioengineering, Vol. 13, No. 3, June 2020 (� 2020) pp. 219–228

https://doi.org/10.1007/s12195-020-00612-5

BIOMEDICALENGINEERING SOCIETY

1865-5025/20/0600-0219/0 � 2020 The Author(s). This article is an open access publication

219

Page 2: Machine Learning Classification of Articular …...including joints from separate non-operated control (CNTRL) animals (n = 8), were used. After sacrifice, NIR spectra (1000–2500

quantitative diagnosis of cartilage condition duringrepair surgery.

During degeneration, complex and often simulta-neous micro- and macroscopic changes occur withinthe matrix of articular cartilage, which consists mainlyof collagen, proteoglycans (PG) and water. Forexample, the early stages of OA are characterized byincreased water content, loss of PG, and degenerationof the superficial collagen network, resulting in alteredmechanical responses. Since NIRS is sensitive tospecific molecular species containing CH, NH, OH andSH bonds, which constitute the fundamental structureof cartilage composition, it can detect changes in thetissue during degeneration.3 Thus, the NIR spectrumof cartilage contains information on its physico-chemical, structural, and functional characteristics.

Analysis of NIR spectra for assessment of cartilageintegrity has so far relied on traditional multivariatelinear methods, particularly partial least squaresregression (PLSR), e.g., for prediction of specific tissueproperties from cartilage NIR spectrum. Only fewstudies have applied machine learning techniques, suchas support vector machines (SVM)3 and neural net-works (ANN),28,30 for analysis of cartilage NIR spec-tral data, specifically regression analysis. However, nostudy has applied machine learning techniques forclassification of connective tissue integrity based onNIRS. Furthermore, machine learning algorithms,particularly deep learning methods such as convolu-tional neural networks, are gaining interests in thediagnosis of musculoskeletal disorders, with recentstudies,8,35 demonstrating their capacity for diagnosisof OA from radiographic images with unprecedentedaccuracy.

Classification of cartilage integrity based on NIRSharnesses the capacity of the spectrum, which encodeslatent and inherent properties of the cartilage matrix,to provide a holistic assessment of the tissue. Forexample, features in the spectrum, such as spectralpeaks and shapes and shapes due to absorption andscattering associated with the composition and struc-ture of cartilage, encode information on tissue thick-ness5 and biomechanical27 properties. In this study, wecompared the performance of traditional machinelearning techniques, including support vector machines(SVM) and logistic regression (LR), with deep learningmethods, particularly deep neural networks (DNN),for classification of cartilage integrity based on NIRS.SVM is a supervised learning algorithm that uses themaximum margin principle to find a hyperplane thatbest separates two or multiple classes, while LR usesthe logit (sigmoid) function to model the relationshipbetween independent (predictors) and response (clas-

ses) variables based on maximum likelihood estima-tion. Unlike traditional analytical approach that isproblem-defined, neural networks are universalapproximators. They can discover features in largedatasets by using the backpropagation algorithm toindicate how a model should change its internalparameters that are used to compute the representationin each layer from the representation in the previouslayer.15 We hypothesized that machine learning tech-niques, which encompass both traditional (e.g., SVMand LR) and state-of-the-art (e.g., DNN) artificialintelligence techniques, can harness sample-relatedinformation embedded in the NIR spectrum for clas-sification and holistic assessment of the tissue integrity.

MATERIALS AND METHODS

Sample Preparation and Experimental Protocol

Unilateral anterior cruciate ligament transection(ACLT) surgery, to induce degenerative changes con-sistent with early stage osteoarthritis,18 was performedon skeletally mature female New Zealand white rabbits(n = 14, age = 12 months at the time of operation)under general anaesthesia. Anesthesia was induced byfirst delivering a pre-med sedative (SQ, Acepromazinemaleate, 1 mg/kg body weight, AceVet�, VetoquinolInc., Lavaltrie, QC, Canada). After 30 minutes, ani-mals were placed under deep surgical anesthesia using5% Isoflurane (Fresenius Kabi Inc., Richmond Hill,ON, Canada) in medical oxygen (1 l/min). Anesthesiawas maintained with 1 to 2% Isoflurane in medicaloxygen, and monitored continuously.

Animals were sacrificed under general anesthesia attwo time points post-injury: 2 weeks (n= 8 animals, 16knee joints) and 8 weeks (n = 6 animals, 12 kneejoints). Subsequently, the experimental (ACLT) andcontra-lateral (CL) knee joints were harvested from theanimals after sacrifice. Control (CNTRL) knee joints(n = 16) were collected from age and gender-matchedanimals (n = 8) not subjected to any surgical proce-dure. Subsequently, osteochondral (cartilage-bone)samples were collected from patella, lateral femoralgroove, the medial and lateral tibial plateaus, and thefemoral condyles (Figs. 1a–1d) from all joints. Spectralmeasurements and histological analysis were per-formed on all samples, resulting in a total of 313(nCNTRL = 111; nCL = 97; nACL = 105) measurementlocations. All experiments were carried out in accor-dance with guidelines of the Canadian Council onAnimal Care and were approved by the committee onAnimal Ethics at the University of Calgary [#AC11-0035].

BIOMEDICALENGINEERING SOCIETY

AFARA et al.220

Page 3: Machine Learning Classification of Articular …...including joints from separate non-operated control (CNTRL) animals (n = 8), were used. After sacrifice, NIR spectra (1000–2500

NIR Spectroscopy

Diffuse reflectance NIRS of the samples was per-formed using an Avantes dual VIS-NIR spectrometer(AvaSpec-ULS2048L, wavelength 350–1100 nm andAvaSpec-NIR256-2.5-HSC, wavelength 1000–2500nm, Avantes BV, Apeldoorn, The Netherlands) andlight source (AvaLight HAL-(S)-mini, wavelength360–2500 nm, Avantes BV, Netherlands). Spectraldata were acquired using a custom-designed arthro-scopic fibre optic probe (Avantes BV). The reusablestainless-steel fibre probe (diameter = 3.25 mm) isautoclave sterilisable (121 �C) and has a tip shaped likea traditional arthroscopic hook. The probe (windowdiameter = 2 mm) contains 114 optical fibres (diam-eter = 100 lm), with 100 emitting fibres and 14 col-lecting fibres (7 fibers collecting light from the sampleback to each spectrometer). To ensure sample preser-vation and physiological conditions, each sample’ssurface was covered with 0.15 M phosphate- bufferedsaline (PBS) solution at room temperature duringspectral measurements. Although the entire probe tip(diameter = 3.2 mm) did not always cover the samplesurface as a result of the curvature, the central regionconsisting of the receiving fibres (diameter = 1 mm) ofthe probe window (diameter = 2 mm)25 was always infull contact with the sample surface during spectralacquisition. Contact was made between the probe tipand sample surface before spectral acquisition to avoidspectral absorption from the surrounding PBS. Spec-

tral acquisition was performed using the AvaSoftsoftware (ver. 8.7.0, Avantes BV).

Three spectral measurements were acquired persample (1.8 seconds/measurement), with proberealignment prior to each measurement. Each spec-trum consisted of 100 co-added (averaged) scans, withthe final spectrum for each sample obtained as theaverage of the three spectral measurements, resultingin a total of 313 spectra (nCNTRL = 111, nCL = 97,nACLT = 105). Due to spectral saturation, i.e., nearcomplete absorption of the NIR photons, in thecombination region of the NIR spectral range (1900–2500 nm), data used in the analyses were restricted to1000–1900 nm (140 variables).

Histological Analysis and Tissue Grading

Following NIRS measurements, the samples weresubjected to histological analysis for assessment oftissue integrity. Briefly, the samples were fixed in for-malin, decalcified with ethylenediaminetetraacetic acid(EDTA), dehydrated in a series of graded alcohols andembedded in paraffin according to the standard pro-tocol.20,21 Subsequently, 3 lm thick sections were cutfrom the samples and stained with Safranin-O(Figs. 1e–1g), a cationic stain that binds stoichiomet-rically to negatively charged glycosaminoglycans inPG. Following staining, the sections were subjected tostandard digital densitometry protocol20 to quantifythe depth-wise PG distribution (Fig. 1h).

BIOMEDICALENGINEERING SOCIETY

FIGURE 1. Rabbit knee joint showing anatomical locations (a–d) where spectral measurements were collected, andrepresentative Safranin-O stained sections obtained from the medial femoral condyle of control (CNTRL, e), contra-lateral (CL,f) and anterior cruciate ligament transected (ACLT, g) joints. (h) shows the average (thick line) and 95% CI (dashed line) ofproteoglycan (PG) content profile of samples from the different groups. [M medial, L lateral].

NIRS Classification of Cartilage Health 221

Page 4: Machine Learning Classification of Articular …...including joints from separate non-operated control (CNTRL) animals (n = 8), were used. After sacrifice, NIR spectra (1000–2500

Spectral Pre-processing

Since NIRS measurements are prone to noise due toambient environment and stray light, pre-processing isoften required to minimise the influence of noise on theoutcome of spectral analysis. In this study, spectralpre-processing was performed using Savitzky–Golayfilter with different combinations of parameters. Filterwindow sizes of 9 (� 57 nm) and 11 (� 69.7 nm),polynomial order (2 and 3), and three differentderivative orders: 0 (spectral smoothing), 1 (1stderivative) and 2 (2nd derivative, Fig. 2a), wereapplied to generate different pre-processed spectraused in machine learning analyses for classification ofcartilage integrity.

Classification Models and Parameter Tuning

For assessment of cartilage integrity, classificationmodels (Fig. 2b) were developed using traditionalmachine learning methods, including support vectormachines (SVM) and logistic regression (LR), anddeep learning methods, specifically deep neural net-works (DNN). Three different classifiers were devel-oped, including two binary classifiers: ‘classifier 1’ fordistinguishing between CNTRL and ACLT samples,and ‘classifier 2’ for differentiating between CNTRLand CL samples from their NIR spectra. The thirdclassifier (‘classifier 3’) was aimed at multi-class clas-sification of cartilage integrity, i.e., differentiatingbetween CNTRL, CL and ACLT, akin to classifyingbetween healthy, moderate and advanced cartilagedegeneration, respectively. Prior to analysis, the jointswere split by animals into training (� 80%) and test (�20%) sets to avoid biased estimators. Furthermore, inorder to account for site-specific variations in cartilageproperties, the sample splitting was done based onanimals so that the training and test set data includedsamples from the different anatomical locations.Sample number distribution for the different classifi-cation problems are: nclassifier_1_training = 173 (nACLT =

84, nCNTRL = 89), nclassifier_2_training = 165 (nCL = 76,nCNTRL = 89), nclassifier_3_training = 249 (nACLT = 84,nCL = 76, nCNTRL = 89), nclassifier_1_test =nclassifier_2_test = 43, nclassifier_3_test = 64.

Since SVM and LR consist of multiple hyperpa-rameters that alter the decision boundary for classifi-cation, and affect a model’s performance onindependent data, it is essential to find the combinationof hyperparameters that optimizes model performanceand generalization. To achieve optimal performance, a‘fit’ and ‘score’ method was employed using theGridSearchCV algorithm.9 In this algorithm, a modelis first fitted on the training data using a classifier (e.g.,based on SVM or LR) with a set of hyperparametervalues. The resulting model is then validated andscored using a specified metric, e.g., mean absoluteerror (MAE) was used in this study. The process isrepeated with different combinations of hyperparam-eter values, allowing an exhaustive cross-validatedsearch over a user-defined grid of classifier hyperpa-rameters, where the best classifier hyperparameters arethose that yield the lowest error score.

Two different DNN architectures were developedfor the deep learning-based classifiers, with the optimalarchitecture determined after preliminary tests. TheDNN architecture for the binary classification models(classifier 1 and classifier 2) consisted of 4 fully-con-nected (dense) layers, consisting of 70, 70, 35 and 14neurons. The architecture for multi-class classification(classifier 3) consisted of 5 fully-connected layers,consisting of 140, 140, 70, 35 and 14 neurons. Adropout layer34 (rate = 0.4) was inserted between thelast hidden layer and the output layer to minimizeoverfitting. Based on preliminary model assessment,hyperbolic tangent (‘tanh’) function was optimal forneuron activation in all layers, except the output layer,where ‘softmax’ activation was used because the net-works are designed for classification. During training,the training set was further (randomly) divided intotraining (90%) and validation (10%) sets. During each

BIOMEDICALENGINEERING SOCIETY

FIGURE 2. Schematic illustration of the analysis protocol showing (a) representative first derivative pre-processed NIR spectraand (b) protocol for training, validation, and testing of classifiers performance.

AFARA et al.222

Page 5: Machine Learning Classification of Articular …...including joints from separate non-operated control (CNTRL) animals (n = 8), were used. After sacrifice, NIR spectra (1000–2500

training step, validation metrics (accuracy and loss)were calculated and used to tune the model’s hyper-parameters (weights and biases). Optimal networkhyperparameters were achieved using the Adam opti-mization algorithm in all networks, with‘binary_crossentropy’ and ‘categorical_crossentropy’loss functions in the binary and multi-class networks,respectively. Model training was halted when noimprovement in validation metrics are observed after20 consecutive epochs.

Classifier performance was assessed in terms ofprecision, recall, f1-score, kappa and area under re-ceiver operator characteristic (ROC_AUC). It is worthnoting that in binary classification, recall of the posi-tive class (CNTRL) is the ‘sensitivity’ of the classifier;while recall of the negative class (ACLT and CL) is the‘specificity’ of the classifier. Machine learning analysiswas performed using Scikit-Learn24 (ver. 0.20.1)package in Python (ver. 3.6). Deep learning analysiswas performed using Keras deep learning library12

(ver. 2.2.4), with TensorFlow13 (ver. 1.12.0) backend,in Python (ver. 3.6).

RESULTS

Histological analysis shows significant decrease indepth-wise PG content at the superficial and middlezones of the ACLT group compared to the CNTRLgroup, and throughout the cartilage depth comparedto the CL group (Fig. 1h). Spectral pre-processing witha filter window of 9, polynomial order of 2, and noderivative (i.e., only spectral smoothing) of the NIRspectra yielded the best performance for both binaryand multi-class cases, possibly due to noise and loss ofspectral fidelity after derivative pre-processing. Fordistinguishing between ACLT and CNTRL samples(classifier 1), SVM (optimal parameters: C = 10, ker-nel = linear) and LR (optimal parameters: C = 1000,regularization = l2) were optimal, with similar levelsof performance (Table 1, Figs. 3a and 3b). However,in differentiating between CL and CNTRL samples(classifier 2), the LR model (optimal parameters: C =1000, regularization = l1) was optimal (Fig. 3e), withSVM yielding the worst result (Fig. 3d). AlthoughDNN performed slightly poorer than the traditionalmethods in binary classification, it was better at clas-sifying CNTRL samples (Figs. 3c and 3f).

For multi-class classification, the model based onDNN outperformed SVM and LR (Fig. 4c), albeitwith poorer performance compared to the binaryclassifiers. The results also show that more sampleswere misclassified by the SVM and LR multi-classmodels (Figs. 4a and 4b), with most of the misclassi-fications occurring with the ACLT samples being

classified as CL, suggesting that these classifiers havepoor sensitivity in distinguishing between the NIRspectra of ACLT and CL samples.

DISCUSSION

We demonstrate the capacity of NIRS, combinedwith machine learning methods, to support a holisticassessment of cartilage integrity via classificationanalysis. We quantified the potential of NIRS to dis-tinguish between cartilage of control and experimentaljoints, simulating differences between healthy anddegenerated cartilage. Although NIRS has beenextensively applied for predicting cartilage biochemi-cal1,22,30 and structural28 properties, which are thebasis of its function, these properties would need to bemeasured for each location independently because ofthe site-specific variation in cartilage properties. Fur-thermore, the functional properties of articular carti-lage, alone, are not reliable indicators of tissueintegrity. This is supported by the observation ofBrown et al.,10 who showed significant overlap in thestiffness of normal and degenerated cartilage due tosite-specific variations in articular cartilage structure,functional and material properties. More so, besidesregression analysis for prediction of cartilage proper-ties,28,30 it is worth noting that no study has appliedmachine learning techniques for classification of car-tilage integrity based on non-destructive spectroscopicmethods like NIR spectroscopy, which has immensediagnostic potential.

While the exact mechanism of cartilage degradationand pathophysiological process in OA remains un-clear, a complex interplay between factors includinggenetics, environment, metabolism and biochemistryhas been proposed.16 During degeneration, matrix-degrading enzymes are overexpressed, resulting in

BIOMEDICALENGINEERING SOCIETY

TABLE 1. Performance metrics of the best classifiers forassessing cartilage integrity based on NIRS: best classifier 1for differentiating between ACLT and CNTRL is based onSVM; best classifier 2 for differentiating between CL andCNTRL is based on LR; and best classifier 3 for multi-class

classification is based on DNN.

Precision Recall f1-score ROC_AUC kappa

Classifier 1

ACLT 0.95 0.90 0.93 0.93 0.86

CNTRL 0.91 0.95 0.93

Classifier 2

CL 0.90 0.90 0.90 0.91 0.81

CNTRL 0.91 0.91 0.91

Classifier 3

ACLT 0.58 0.52 0.55 – 0.48

CL 0.63 0.57 0.60

CNTRL 0.73 0.86 0.79

NIRS Classification of Cartilage Health 223

Page 6: Machine Learning Classification of Articular …...including joints from separate non-operated control (CNTRL) animals (n = 8), were used. After sacrifice, NIR spectra (1000–2500

degradation of the matrix network on a molecular le-vel. This leads to decrease in the size of matrix mole-cules and eventual loss of cartilage extra-cellularmatrix (collagen and PGs), together with a concomi-tant increase in tissue water content. This degenerative

process initiates a proliferative chondrocyte responsewith synthesis of increased quantities of extra-cellularmatrix macromolecules. Even in samples with markedhistological alterations, some chondrocytes can beobserved to display strong anabolic activities, often

BIOMEDICALENGINEERING SOCIETY

FIGURE 3. Confusion matrix showing prediction performance of the binary classification models. Performance of classifier 1 andclassifier 2 models based on SVM (a: kernel = linear and C=10; d: C = 1000, gamma = 0.001, kernel = rbf), LR (b: regularizationpenalty = l2, C = 1000; e: regularization penalty = l1, C = 1000) and DNN (c and f) for predicting cartilage integrity in the independenttest set, respectively. Optimal spectral preprocessing was based on Savitzky–Golay filtering with window size of 9, polynomialorder of 2 and no derivative.

FIGURE 4. Confusion matrix of the best multi-class classification models based on SVM (a, kernel = linear, and C=100), LR (b:regularization penalty = l1, C = 1000) and DNN (c) The optimal spectral preprocessing was based on Savitzky–Golay filtering withwindow size of 9, polynomial order of 2 and no derivative.

AFARA et al.224

Page 7: Machine Learning Classification of Articular …...including joints from separate non-operated control (CNTRL) animals (n = 8), were used. After sacrifice, NIR spectra (1000–2500

characterized by cell clusters (cell cloning) surroundedby newly synthesized matrix molecules. However, asthe disease progresses, the chondrocytes are not able tokeep up their repair activity and the reparative at-tempts are outpaced by cartilage degradation26

resulting in complete loss of the cartilage tissue.Probing these changes by taking advantage of theinteraction of light with the structure and compositionof cartilage enables identification of integrity-relatedfeatures in the spectra, e.g., increased OH peaks(~ 1450 and ~ 1900 nm)22 from increased water con-tent, for tissue assessment. Thus, adaptation of ma-chine learning-based approaches takes advantage ofthe relevant integrity-related features in the NIRspectrum for classification of tissue health, which maybe difficult to capture via single traditional macro-scopic parameters, such as instantaneous and dynamicmoduli.

Depending on the anatomical location, samplesfrom the contra-lateral (CL) joints may exhibit slightlyhigher PG content compared to CNTRL samples, asopposed to the lower PG content observed in theACLT samples (Fig. 1h). This may be due to exposureof the CL joints to abnormal loading as a result ofaltered gait following injury of the ACLT joint,14,18

triggering increased synthesis of PG by chondrocytes, aphysiological response observed in the early stages ofcartilage degeneration, and a precursor to gradual PGloss and matrix disruption.17 Thus, CL and ACLTjoints present different manifestation of early stagecartilage matrix degeneration. As such, the mainchanges observed in samples with early matrix alter-ation are related to changes in PG content, whichinfluences the water content and potentially thesuperficial collagen structure, all of which aredetectable using NIRS.1,2,22,30 This result suggests thatcartilage degeneration, and thus changes in its spectralfeatures, may not follow a linear trend, and thereforemay benefit from a non-linear modelling approachsuch as neural networks, including shallow and deepnetworks.

Differentiating between CNTRL and ACLT sam-ples was trivial for all classifiers, even at this early stageof cartilage degeneration, possibly due to significantdifferences between the spectra of samples from bothclasses. Inspection of the coefficients of the optimalclassifiers (via the feature importance plot, Fig. 5),obtained as part of the standard output from the ma-chine learning analysis, provides insight into the vari-ables, and thus the specific matrix components, thatcontribute substantially to the classification betweensample classes. The features in classifier 1 (Fig. 5a)show that spectral variables related to the water peaks

(1370–1550 nm and 1850–1900 nm) are the mostdominant. This is indicative of the increased watercontent in the ACLT samples during degenera-tion—one of the significant and noticeable changesduring cartilage degeneration.11

In the case of classifier 2, which is aimed at distin-guishing between CNTRL and CL samples, it is un-clear why LR outperformed the other methods.However, the difference in performance is not thatsubstantial (Figs. 3d–3f). The dominant features in theLR classifier (Fig. 5b) are spread across the spectrum,suggesting that differences between the classes (CL andCNTRL) might be due to overall changes within thetissue matrix. In multi-class classification, aimed atevaluating the capacity of NIRS to differentiatebetween CNTRL, CL and ACLT samples, the deeplearning-based model was optimal, possibly due to thecapacity of neural networks to model non-linear rela-tionships. The performance observed with the DNNmodel suggests that the relationship between thespectra and cartilage integrity consists of both linearand non-linear changes in cartilage components, whichis consistent with the progression of cartilage degen-eration.

The high misclassifications rate of CL observed inthe multi-class case, particularly in the SVM and LRmulti-class models, might be due to pooling of samplesfrom the animals sacrificed at 2 and 8 weeks, poten-tially resulting in samples with inconsistent (location-specific) trends of tissue degeneration within the sameclass, since certain locations exhibit more pronouncedchanges at 2 weeks than at 8 weeks, and vice versa.Nevertheless, the deep learning model was able toclassify the samples with somewhat reasonable per-formance (Fig. 4c). Since deep learning models requirea large amount of data for training, the DNN modelperformance is likely to improve in all classificationcases when more data are available. Furthermore, theapproaches adopted in this study would need to befurther validated using large animal models andhuman joints ex vivo. In a recent study, Sarin et al.28

applied shallow neural networks for predicting carti-lage properties from its NIR spectrum, thus supportingthe outcomes and underlying hypothesis of this studythat machine learning methods may offer betterapproaches for classification of cartilage integrity fromNIR spectra than traditional methods.

In conclusion, NIRS combined with machinelearning techniques, could provide a powerful tool forclassification of cartilage integrity, with the potentialfor accurately distinguishing between normal and earlyosteoarthritic cartilage. This finding, combined withrecent application of NIR for estimating cartilage

BIOMEDICALENGINEERING SOCIETY

NIRS Classification of Cartilage Health 225

Page 8: Machine Learning Classification of Articular …...including joints from separate non-operated control (CNTRL) animals (n = 8), were used. After sacrifice, NIR spectra (1000–2500

biomechanical properties in human cadaver kneejoints,25,29 is insofar significant as it suggests thatNIRS can be adapted for rapid diagnosis of cartilageintegrity during knee arthroscopy, where it may becritical to correctly discriminate between healthy anddegenerated cartilage prior to removal of the degen-erated tissue. However, our approach needs furthervalidation with human samples prior to clinicalapplications in cartilage/joint repair surgery.

ACKNOWLEDGMENTS

Open access funding provided by University ofEastern Finland (UEF) including Kuopio UniversityHospital. Dr. Afara acknowledges funding supportfrom the Finnish Cultural Foundation (SuomenKulttuurirahasto: 00160079 and 00171194) and Acad-emy of Finland (Project 315820). This study was alsosupported by Academy of Finland projects of Profes-sor Toyras (267551), Professor Korhonen (286526,324529), and Professor Saarakkala (303786). Stateresearch funding (Kuopio University Hospital VTRProjects 5041750 and 5041744: Professor Toyras) andSigrid Juselius Foundation (Professor Korhonen) arealso acknowledged. Professor Herzog acknowledgesthe Canadian Institutes of Health Research, the KillamFoundation and the Canada Research Chair Program.Dr Finnila acknowledges strategic funding from theUniversity of Eastern Finland. Mr Ojanen acknowl-edges funding from Saastamoinen Foundation, Pai-vikki and Sakari Sohlberg Foundation, and FinnishCultural Foundation (North Savo Regional Fund).

CONFLICT OF INTEREST

Afara IO, Sarin JK, Ojanen S, Finnila M, HerzogW, Saarakkala S, Korhonen RK and Toyras J declarethat they have no conflicts of interest.

ETHICAL STANDARDS

No human studies were carried out by the authorsfor this article. All animal experiments were carried outin accordance with guidelines of the Canadian Councilon Animal Care and were approved by the committeeon Animal Ethics at the University of Calgary [#AC11-0035].

OPEN ACCESS

This article is licensed under a Creative CommonsAttribution 4.0 International License, which permitsuse, sharing, adaptation, distribution and reproductionin any medium or format, as long as you give appro-priate credit to the original author(s) and the source,provide a link to the Creative Commons licence, andindicate if changes were made. The images or otherthird party material in this article are included in thearticle’s Creative Commons licence, unless indicatedotherwise in a credit line to the material. If material isnot included in the article’s Creative Commons licenceand your intended use is not permitted by statutoryregulation or exceeds the permitted use, you will needto obtain permission directly from the copyrightholder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

BIOMEDICALENGINEERING SOCIETY

FIGURE 5. Normalized feature importance of SVM classifier 1 (a) and LR classifier 2 (b) models.

AFARA et al.226

Page 9: Machine Learning Classification of Articular …...including joints from separate non-operated control (CNTRL) animals (n = 8), were used. After sacrifice, NIR spectra (1000–2500

REFERENCES

1Afara, I. O., M. Hauta-Kasari, J. S. Jurvelin, A. Oloyede,and J. Toyras. Optical absorption spectra of humanarticular cartilage correlate with biomechanical properties,histological score and biochemical composition. Physiol.Meas. 36:1913–1928, 2015.2Afara, I. O., H. Moody, S. Singh, I. Prasadam, and A.Oloyede. Spatial mapping of proteoglycan content inarticular cartilage using near-infrared (NIR) spectroscopy.Biomed. Opt. Express 6:144–154, 2015.3Afara, I. O., I. Prasadam, Z. Arabshahi, Y. Xiao, and A.Oloyede. Monitoring osteoarthritis progression using nearinfrared (NIR) spectroscopy. Sci. Rep. 7:11463, 2017.4Afara, I., I. Prasadam, R. Crawford, Y. Xiao, and A.Oloyede. Non-destructive evaluation of articular cartilagedefects using near-infrared (NIR) spectroscopy in os-teoarthritic rat models and its direct relation to Mankinscore. Osteoarthr. Cartil. 20:1367–1373, 2012.5Afara, I., S. Singh, and A. Oloyede. Application of nearinfrared (NIR) spectroscopy for determining the thicknessof articular cartilage. Med. Eng. Phys. 35:88–95, 2012.6Afara, I. O., S. Singh, and A. Oloyede. Load-unloadingresponse of intact and artificially degraded articular carti-lage correlated with near infrared (NIR) absorption spec-tra. J. Mech. Behav. Biomed. Mater. 20:249–258, 2013.7Afara, I. O., et al. Near infrared spectroscopy for rapiddetermination of Mankin score components: a potentialtool for quantitative characterization of articular cartilageat surgery. Arthroscopy 30:1146–1155, 2014.8Antony, J., K. McGuinness, N.E. O’Connor, and K.Moran. Quantifying radiographic knee osteoarthritisseverity using deep convolutional neural networks. In:Proceedings—International Conference on PatternRecognition, 2017. https://doi.org/10.1109/ICPR.2016.7899799.9Bergstra, J., and Y. Bengio. Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13:281–305,2012.

10Brown, C. P., R. W. Crawford, and A. Oloyede. Indenta-tion stiffness does not discriminate between normal anddegraded articular cartilage. Clin. Biomech. (Bristol, Avon)22:843–848, 2007.

11Buckwalter, J. A., H. J. Mankin, and A. Grodzinsky.Articular Cartilage and Osteoarthritis. Instr. Course Lect.54:466–480, 2005.

12Chollet, F. Keras: deep learning library for theano andtensorflow. GitHub Repos. 2015. https://doi.org/10.1111/j.1439-0310.1985.tb00118.x.

13GoogleResearch. TensorFlow: large-scale machine learningon heterogeneous systems. Google Res 2015. https://doi.org/10.1207/s15326985ep4001.

14Han, S. K., R. Seerattan, and W. Herzog. Mechanicalloading of in situ chondrocytes in lapine retropatellar car-tilage after anterior cruciate ligament transection. J. R. Soc.Interface 2010. https://doi.org/10.1098/rsif.2009.0458.

15LeCun, Y. A., Y. Bengio, and G. E. Hinton. Deep learning.Nature 521:436–444, 2015.

16Lorenz, H., and W. Richter. Osteoarthritis: cellular andmolecular changes in degenerating cartilage. Prog. His-tochem. Cytochem. 40:135–163, 2006.

17Lorenzo, P., M. T. Bayliss, and D. Heinegard. Alteredpatterns and synthesis of extracellular matrix macro-molecules in early osteoarthritis. Matrix Biol. 2004. https://doi.org/10.1016/j.matbio.2004.07.007.

18Makela, J. T. A., et al. Site-dependent changes in structureand function of lapine articular cartilage 4 weeks afteranterior cruciate ligament transection. Osteoarthr. Cartil.2014. https://doi.org/10.1016/j.joca.2014.04.010.

19Marticke, J. K., et al. How do visual, spectroscopic andbiomechanical changes of cartilage correlate in osteoar-thritic knee joints? Clin. Biomech. (Bristol, Avon) 25:332–340, 2010.

20Ojanen, S. P., et al. Site-specific glycosaminoglycan contentis better maintained in the pericellular matrix than theextracellular matrix in early post-traumatic osteoarthritis.PLoS ONE 2018. https://doi.org/10.1371/journal.pone.0196203.

21Ojanen, S. P., et al. Anterior cruciate ligament transection ofrabbits alters composition, structure and biomechanics ofarticular cartilage and chondrocyte deformation 2 weekspost-surgery in a site-specificmanner. J. Biomech. 98:109450,2020. https://doi.org/10.1016/j.jbiomech.2019.109450.

22Padalkar, M. V., R. G. Spencer, and N. Pleshko. Nearinfrared spectroscopic evaluation of water in hyaline car-tilage. Ann. Biomed. Eng. 41:2426–2436, 2013.

23Palukuru, U. P., C. M. McGoverin, and N. Pleshko.Assessment of hyaline cartilage matrix composition usingnear infrared spectroscopy. Matrix Biol. 38:3–11, 2014.

24Pedregosa, F., et al. Scikitlearn: machine learning in pythonGael Varoquaux. J. Mach. Learn. Res. 2011. https://doi.org/10.1007/s13398-014-0173-7.2.

25Prakash, M., et al. Near-infrared spectroscopy enablesquantitative evaluation of human cartilage biomechanicalproperties during arthroscopy. Osteoarthr. Cartil. 27:1235–1243, 2019.

26Sandell, L. J., and T. Aigner. Articular cartilage andchanges in arthritis an introduction: cell biology ofosteoarthritis. Arthritis Res. 3:107–113, 2001.

27Sarin, J. K., et al. Near infrared spectroscopic mapping offunctional properties of equine articular cartilage. Ann.Biomed. Eng. 44:3335–3345, 2016.

28Sarin, J. K., et al. Combination of optical coherencetomography and near infrared spectroscopy enhancesdetermination of articular cartilage composition andstructure. Sci. Rep. 7:10586, 2017.

29Sarin, J. K., et al. Arthroscopic near infrared spectroscopyenables simultaneous quantitative evaluation of articularcartilage and subchondral bone in vivo. Sci. Rep. 8:13409,2018.

30Sarin, J. K., et al. Arthroscopic determination of cartilageproteoglycan content and collagen network structure withnear-infrared spectroscopy. Ann. Biomed. Eng. 2019. https://doi.org/10.1007/s10439-019-02280-7.

31Spahn, G., H. M. Klinger, and G. O. Hofmann. How validis the arthroscopic diagnosis of cartilage lesions? Results ofan opinion survey among highly experienced arthroscopicsurgeons. Arch. Orthop. Trauma Surg. 129:1117–1121,2009.

32Spahn, G., et al. Evaluation of cartilage defects with near-infrared spectroscopy (NIR): an ex vivo study. Med. Eng.Phys. 30:285–292, 2008.

33Spahn, G., et al. Near-infrared spectroscopy for arthro-scopic evaluation of cartilage lesions: results of a blinded,prospective, interobserver study. Am. J. Sports Med.38:2516–2521, 2010.

34Srivastava, N., G. Hinton, A. Krizhevsky, I. Sutskever, andR. Salakhutdinov. Dropout: a simple way to prevent neuralnetworks from overfitting. J. Mach. Learn. Res. 15:1929–1958, 2014.

BIOMEDICALENGINEERING SOCIETY

NIRS Classification of Cartilage Health 227

Page 10: Machine Learning Classification of Articular …...including joints from separate non-operated control (CNTRL) animals (n = 8), were used. After sacrifice, NIR spectra (1000–2500

35Tiulpin, A., J. Thevenot, E. Rahtu, P. Lehenkari, and S.Saarakkala. Automatic knee osteoarthritis diagnosis fromplain radiographs: a deep learning-based approach. Sci.Rep. 2018. https://doi.org/10.1038/s41598-018-20132-7.

Publisher’s Note Springer Nature remains neutral with re-gard to jurisdictional claims in published maps and institu-tional affiliations.

BIOMEDICALENGINEERING SOCIETY

AFARA et al.228


Recommended