Background: Knee arthroscopy is one of the most complex minimally invasive surgeries, and it is routinely performed to treat a range of ailments and injuries to the knee joint. Its complex ergonomic design imposes visualization and navigation constraints, which lead to unintended tissue damage and a steep learning curve before surgeons gain proficiency. The lack of robust visual texture and landmark features in video frames further limits the success of image-guided approaches to knee arthroscopy. Feature- and texture-less tissue structures of the knee anatomy, lighting conditions, noise, blur, debris, the lack of accurate ground-truth labels, tissue degeneration, and injury make semantic segmentation an extremely challenging task. To address this complex research problem, this study reports the utility of reconstructed surface reflectance as a viable source of information that can be combined with a cutting-edge deep learning technique to achieve highly accurate segmented scenes.

Methods: We proposed an intraoperative, two-tier deep learning method that makes full use of the tissue reflectance information present within an RGB frame to segment texture-less knee arthroscopy video frames into multiple tissue types. This study included several cadaver knee experiments at the Medical and Engineering Research Facility, located within the Prince Charles Hospital campus, Brisbane, Queensland. Data were collected from a total of five cadaver knees; three were from male donors and one was from a female donor. The age range of the donors was 56–93 years. Aging-related tissue degeneration and some anterior cruciate ligament injuries were observed in most cadaver knees. An arthroscopic image dataset was created and subsequently labeled by clinical experts. This study also included validation of a prototype stereo arthroscope, alongside a conventional arthroscope, to attain a larger field of view and stereo vision. We reconstructed surface reflectance from camera responses that exhibited distinct spatial features at different wavelengths ranging from 380 to 730 nm in the RGB spectrum. To segment texture-less tissue types, these data were used within a two-stage deep learning model.

Results: The accuracy of the network was measured using the Dice coefficient score. The average segmentation accuracy for the anterior cruciate ligament (ACL) was 0.6625, for bone it was 0.84, and for the meniscus it was 0.565. For this analysis, we excluded extremely poor-quality frames; a frame is considered extremely poor quality when more than 50% of any tissue structure is over- or underexposed due to nonuniform light exposure. Additionally, when only high-quality frames were considered during the training and validation stages, the average bone segmentation accuracy improved to 0.92 and the average ACL segmentation accuracy reached 0.73. These two tissue types, namely the femur bone and the ACL, are of high importance in arthroscopy for tissue tracking. Comparatively, previous work based on RGB data achieved much lower average accuracies for the femur, tibia, ACL, and meniscus of 0.78, 0.50, 0.41, and 0.43 using U-Net and 0.79, 0.50, 0.51, and 0.48 using U-Net++. From this analysis, it is clear that our multispectral method outperforms the previously proposed methods and delivers a much better solution for achieving automatic arthroscopic scene segmentation.
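As an illustration of the evaluation metric and the frame-exclusion criterion described above, the sketch below computes a per-class Dice score and the 50% over-/underexposure rule. This is a minimal sketch: the function names, the use of NumPy, and the intensity thresholds (0.05 and 0.95 on normalized intensities) are illustrative assumptions and are not taken from the study.

```python
import numpy as np

def dice_score(pred_mask: np.ndarray, gt_mask: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient between a predicted and a ground-truth binary mask."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)

def structure_poorly_exposed(gray: np.ndarray, structure_mask: np.ndarray,
                             low: float = 0.05, high: float = 0.95) -> bool:
    """True when more than 50% of the pixels inside one labeled tissue structure
    are under- or overexposed (intensity thresholds are illustrative assumptions)."""
    region = gray[structure_mask.astype(bool)]
    if region.size == 0:
        return False
    bad_fraction = np.logical_or(region < low, region > high).mean()
    return bad_fraction > 0.5
```

Under this sketch, a frame would be dropped from training and validation whenever structure_poorly_exposed returns True for any labeled tissue structure in that frame.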
Conclusion: The method is based on a deep learning model and requires reconstructed surface reflectance. It could provide tissue awareness in an intraoperative manner and has a high potential to improve surgical precision. It could also be applied to other minimally invasive surgeries as an online segmentation tool for training, aiding, and guiding surgeons, as well as for image-guided surgery.
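The abstract does not state how surface reflectance is recovered from the camera responses. As a point of reference only, one common baseline is a linear least-squares mapping from RGB responses to reflectance sampled over 380–730 nm, sketched below; the 10 nm band spacing, the function names, and the linear model itself are assumptions rather than the study's method.

```python
import numpy as np

# Reflectance is reconstructed over 380-730 nm; a 10 nm sampling step is assumed here.
WAVELENGTHS = np.arange(380, 731, 10)

def fit_rgb_to_reflectance(rgb_train: np.ndarray, refl_train: np.ndarray) -> np.ndarray:
    """Least-squares matrix M such that reflectance ~= rgb @ M.

    rgb_train:  (N, 3) camera responses for N training pixels or patches.
    refl_train: (N, len(WAVELENGTHS)) reference reflectance spectra.
    """
    M, *_ = np.linalg.lstsq(rgb_train, refl_train, rcond=None)
    return M  # shape (3, len(WAVELENGTHS))

def reconstruct_reflectance(rgb_pixels: np.ndarray, M: np.ndarray) -> np.ndarray:
    """Map (num_pixels, 3) RGB responses to per-pixel reflectance spectra."""
    return np.clip(rgb_pixels @ M, 0.0, 1.0)
```

In a pipeline of this kind, the reconstructed per-pixel spectra (or a subset of bands) would be stacked with, or substituted for, the RGB channels as input to the segmentation network.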