
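DIOR-ViT: Differential Ordinal Learning Vision Transformer for Cancer Classification in Pathology Images | Literature Express: Medical Imaging Algorithm Paper Sharing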

news 2025/7/25 11:47:07

Title

DIOR-ViT: Differential ordinal learning Vision Transformer for cancer classification in pathology images

Introduction

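Cancer is one of the leading causes of death worldwide (Bray et al., 2021). In 2020, there were 19.3 million new cancer cases and 10 million cancer deaths worldwide (Sung et al., 2021). When cancer is suspected, tissue specimens are obtained by biopsy or surgery, stained with hematoxylin and eosin (H&E), and evaluated by pathologists under a microscope. Despite many technological advances in medicine, manual histological assessment of such tissue specimens remains the basis for a definitive diagnosis, and it is often the foundation of cancer treatment and patient management. Because current pathology practice is largely manual, it has low throughput and suffers from substantial inter- and intra-observer variability (Sooriakumaran et al., 2005). This limits the speed and accuracy of cancer diagnosis and, in turn, lowers the quality of care. There is therefore an urgent need for a high-throughput, objective, and reliable method for cancer diagnosis.

Computational pathology, which integrates artificial intelligence, whole-slide imaging (WSI), and clinical informatics, is an emerging discipline in the clinical and scientific communities (Cui and Zhang, 2021). Computational pathology tools have shown great potential for improving and reshaping current pathology practice (Rakha et al., 2021). Many of these tools are built on deep learning algorithms, in particular convolutional neural networks (CNNs). CNN-based methods have been applied successfully to several tasks, including tissue segmentation (Vu and Kwak, 2019), mitosis detection (Sohail et al., 2021), treatment response prediction (Hildebrand et al., 2021), nuclei segmentation and classification (Graham et al., 2019; Doan et al., 2022), and cancer grading (Le Vuong et al., 2021; Vuong et al., 2021). Meanwhile, the Vision Transformer (ViT) has attracted much attention in recent years for its strong performance on computer vision tasks, including object detection (Carion et al., 2020; Zhu et al., 2020), semantic segmentation (Zheng et al., 2021), and scene understanding (Zhou et al., 2018). It has also been applied to pathology image analysis, for example cancer subtype classification (Chen et al., 2022), metastasis detection (Gul et al., 2022), and survival analysis (Shen et al., 2022).

Each cancer grade has its own distinctive histological and morphological patterns, and tissue samples of the same grade share these patterns. The role of a pathologist or a computational pathology tool is to recognize these patterns and assign the corresponding grade or class label. In this sense, cancer grades or class labels are usually treated as distinct, independent categories, that is, as a categorical classification problem. This, however, ignores the relationship among cancer grades. There is a natural ordering among cancer grades (Le Vuong et al., 2021): the more aggressive the tumor cells, the higher the grade. To incorporate this ordering relationship, cancer grading can be treated as an ordinal classification or regression problem in which the grade is predicted on an ordinal scale. For example, Le Vuong et al. (2021) formulated cancer grading as both a categorical classification problem and an ordinal classification problem and performed the two tasks simultaneously. Alternatively, the ordering relationship can be learned through pairwise comparisons between tissue samples of the same and/or different grades. This approach to learning ordering relationships is called order learning (Lim et al., 2020), and it has been shown to be effective for age estimation (Lim et al., 2020). To the best of our knowledge, however, order learning and ordinal classification have not been fully exploited for cancer grading in computational pathology.

Here, we propose a differential ordinal classification learning framework for the Vision Transformer (DIOR-ViT) that grades cancer in pathology images in an accurate and reliable manner (Fig. 1). The proposed DIOR-ViT builds on recent developments in network architecture (i.e., ViT) and in two learning paradigms, multi-task learning and order learning. In DIOR-ViT, a ViT maps an input tissue sample into a high-dimensional feature space to obtain an efficient and effective representation of the sample. Following the principle of multi-task learning, DIOR-ViT simultaneously performs categorical cancer classification and differential ordinal cancer classification. In categorical classification, DIOR-ViT predicts the class label of a given input sample. In differential ordinal classification, it receives another sample, designated as the minuend sample, and predicts the difference between the class labels of the two samples. For categorical cancer classification, DIOR-ViT aims to capture the specific histological patterns of a tissue sample and determine its class, i.e., its cancer grade. For differential ordinal cancer classification, DIOR-ViT learns the ordering relationship among tissue samples through differential pairwise comparisons, i.e., the differences between two samples in the feature space and in the ground truth labels. To optimize DIOR-ViT efficiently and effectively, we introduce a new loss function, the negative absolute difference log-likelihood (NAD) loss, which is tailored to differential ordinal classification. To evaluate DIOR-ViT, we use cancer datasets from multiple tissue types, including colorectal, prostate, and gastric tissues, and compare it with several state-of-the-art models from computer vision and computational pathology. The proposed DIOR-ViT not only produces accurate and robust classification results on multiple cancer datasets but also substantially outperforms all competing models.

The main contributions of our work are summarized as follows:

- We propose a Vision Transformer-based differential ordinal classification learning framework for improved cancer grading in computational pathology.
- We introduce differential ordinal classification learning, in which we define differential relationships among tissue samples in the feature space and in the ground truth labels, learn these relationships through pairwise comparisons, and perform ordinal classification to help improve cancer grading.
- We propose the negative absolute difference log-likelihood loss, which is tailored to differential ordinal classification and improves the optimization of the proposed model.
- We introduce a large gastric cancer dataset of more than 100,000 tissue images in four histopathological classes: benign, tubular well-differentiated tumor, tubular moderately differentiated tumor, and tubular poorly differentiated tumor.
- We evaluate the proposed method on three types of cancer datasets. It achieves better classification performance than the other competing models on all three.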

Abstract

In computational pathology, cancer grading has been mainly studied as a categorical classification problem, which does not utilize the ordering nature of cancer grades: the higher the grade is, the worse the cancer is. To incorporate the ordering relationship among cancer grades, we introduce a differential ordinal learning problem in which we define and learn the degree of difference in the categorical class labels between pairs of samples by using their differences in the feature space. To this end, we propose a transformer-based neural network that simultaneously conducts both categorical classification and differential ordinal classification for cancer grading. We also propose a tailored loss function for differential ordinal learning. Evaluating the proposed method on three different types of cancer datasets, we demonstrate that the adoption of differential ordinal learning can improve the accuracy and reliability of cancer grading, outperforming conventional cancer grading approaches. The proposed approach should be applicable to other diseases and problems, as they also involve ordinal relationships among class labels.

Method

3.1. Problem formulation

Let \(\{x_i, y_i\}_{i=1}^{N}\) be a set of \(N\) pathology image-ground truth pairs, where \(x_i \in \mathbb{R}^{3}\) is the \(i\)th pathology image and \(y_i \in \mathcal{C} = \{c_1, c_2, \ldots, c_{N_c}\}\) is the corresponding ground truth class label. The labels \(c_1, c_2, \ldots, c_{N_c}\) are defined on an ordinal scale, where \(N_c\) is the cardinality of the class labels; for instance, \(c_1 = 1\), \(c_2 = 2\), \(\ldots\), \(c_{N_c} = N_c\) (\(N_c \in \mathbb{N}\)). The objective of our study is to learn a transformer-based classifier \(\mathcal{T}\) that maps an input sample \(x_i\) into a high-dimensional feature space \(\Omega\), in which the ordering relationship among input samples is retained, and that conducts cancer classification that is both accurate and reliable. It can be formulated as follows:

\[ \underset{\theta}{\arg\min} \sum_{i=1}^{N} \mathcal{L}\left(\mathcal{T}(x_i), y_i; \theta\right) \tag{1} \]

where \(\mathcal{L}\) is the loss function and \(\theta\) is the set of learnable parameters of \(\mathcal{T}\).
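Since the class labels live on an ordinal scale, one way to picture the targets of differential ordinal learning is as pairwise differences of those labels. The sketch below is illustrative only: the function name `differential_ordinal_labels`, the sign convention \(y_i - y_j\), and the use of every ordered pair of samples are assumptions made for this example, not details stated in the text above.

```python
from itertools import product


def differential_ordinal_labels(labels):
    """Map each ordered pair of sample indices (i, j), i != j, to y_i - y_j.

    `labels` holds ordinal class labels such as 1, 2, ..., N_c.
    The sign convention (first minus second) is an assumption.
    """
    return {
        (i, j): labels[i] - labels[j]
        for i, j in product(range(len(labels)), repeat=2)
        if i != j
    }


# Hypothetical grades for four tissue samples.
grades = [1, 3, 2, 3]
pairs = differential_ordinal_labels(grades)

for (i, j), diff in sorted(pairs.items()):
    print(f"sample {i} vs sample {j}: label difference {diff}")
```

With \(N_c\) ordinal classes, such differences fall in the range \(-(N_c - 1)\) to \(N_c - 1\), and a difference of zero marks a pair of samples with the same grade.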

Conclusion

In this work, we propose a transformer-based neural network, DIOR-ViT, that simultaneously conducts both categorical classification and differential ordinal classification for cancer grading in computational pathology. By introducing differential ordinal learning, DIOR-ViT bridges the gap between conventional categorical classification and the practice of cancer grading in pathology. The experimental results on three types of cancer datasets confirm the validity of DIOR-ViT and differential ordinal learning, which not only identifies cancer tissues of differing grades with high accuracy but also adapts well to unseen data. These results also demonstrate the importance of pathology knowledge and its integration into computational pathology tools for improved diagnosis. Cancer grading is not the only problem that can be formulated as a differential ordinal learning problem. Other problems, diseases, and datasets in computational pathology, and in medical imaging in general, can benefit from it when their class labels possess ordering relationships. Future work will involve further development of the proposed method and its application to other problems and domains.

Results

5.1. Colorectal cancer classification

Table 2 shows the classification results of the proposed DIOR-ViT and several other competing models on two independent colorectal cancer datasets (\(C_{TestI}\) and \(C_{TestII}\)). The results clearly demonstrate that the proposed DIOR-ViT was superior to the other competing models. On \(C_{TestI}\), DIOR-ViT achieved the best performance in Acc (87.78%) and \(k_w\) (0.942) and was the runner-up in F1-macro, 0.002 behind one of the multi-task learning models, \(MAE\text{-}CE_o\). Among the competing models, the two multi-task learning models outperformed the others in all evaluation metrics. Moreover, on \(C_{TestII}\), DIOR-ViT outperformed all the competing models by a large margin: ≥5.26% in Acc, ≥0.030 in F1-macro, and ≥0.027 in \(k_w\) (Fig. 4a). It is noteworthy that \(C_{TestI}\) and \(C_{TestII}\) were acquired in different time periods and with different digital slide scanners. The superior performance of DIOR-ViT on \(C_{TestII}\) indicates that the model generalizes better across acquisition settings. Unlike the results on \(C_{TestI}\), two transformer-based models (ViT and Swin) performed best among the competing models; however, another transformer-based model (DeiT III) was the worst.
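The three metrics reported above can all be derived from a confusion matrix. The sketch below is a minimal pure-Python illustration, not the evaluation code used in the study. It assumes that \(k_w\) denotes a quadratic weighted Cohen's kappa, which the text does not state explicitly, and that class labels are encoded as integers starting at 0; all function names and the example labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class (labels 0..n_classes-1)."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m


def accuracy(m):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(map(sum, m))
    return sum(m[k][k] for k in range(len(m))) / total


def f1_macro(m):
    """Unweighted mean of the per-class F1 scores."""
    n = len(m)
    scores = []
    for k in range(n):
        tp = m[k][k]
        fp = sum(m[r][k] for r in range(n)) - tp
        fn = sum(m[k]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


def quadratic_weighted_kappa(m):
    """Cohen's kappa with quadratic disagreement weights (assumed meaning of k_w).

    Undefined (division by zero) if every sample has the same true and predicted class.
    """
    n = len(m)
    total = sum(map(sum, m))
    row = [sum(m[r]) for r in range(n)]
    col = [sum(m[r][c] for r in range(n)) for c in range(n)]
    observed = expected = 0.0
    for r in range(n):
        for c in range(n):
            w = (r - c) ** 2 / (n - 1) ** 2
            observed += w * m[r][c]
            expected += w * row[r] * col[c] / total
    return 1.0 - observed / expected


# Hypothetical labels for six samples and three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
acc, f1, kw = accuracy(cm), f1_macro(cm), quadratic_weighted_kappa(cm)
print(f"Acc={acc:.4f}  F1-macro={f1:.4f}  k_w={kw:.4f}")
```

The quadratic weights penalize a prediction more the further it lies from the true grade, which is why a weighted kappa is a natural companion to accuracy when the classes are ordered.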

Figure

Fig. 1. The overview of DIOR-ViT. DIOR-ViT consists of a feature extractor, a categorical classifier, and a differential ordinal classifier.

Fig. 2. Generation of the ground truth differential ordinal labels.

Fig. 3. Loss curves for (a) mean squared error \(MSE\), (b) mean absolute error \(MAE\), (c) ordinal cross entropy loss \(CE_o\), and (d) negative absolute difference log-likelihood loss \(NAD\). \(d\) denotes the difference between the ground truth label and the predicted label in the differential ordinal classification.

Fig. 4. US images and their corresponding US layouts. The top row displays the original US images while the bottom row presents their corresponding US layouts. Each column refers to a specific case.

Fig. 5. Example images and GradCAM results for each class from the \(C_{TestII}\) dataset.

Fig. 6. Differential ordinal classification results on (a) colorectal, (b) prostate, and (c) gastric tissue datasets. The number below each image represents the output of the differential ordinal classifier.

Table

Table 1. Details of colorectal, prostate, and gastric tissue datasets.

Table 2. Results of colorectal cancer classification. Bold indicates best performance.

Table 3. Results of prostate cancer classification. Bold indicates best performance.

Table 4. Results of gastric cancer classification. Bold indicates best performance.

Table 5. Results of cancer classification with differing loss functions. Bold indicates best performance.

Table 6. Results of cancer classification with differing feature extractors. Bold indicates best performance.

Table 7. Results of cancer classification with differing \(NAD\) weights. Bold indicates best performance.

Table 8. Results of cancer classification with differing learning rates. Bold indicates best performance.

Table 9. Results of cancer classification with differing input sizes. Bold indicates best performance.

Table 10. Results of differential ordinal classification. Numbers in parentheses represent the nominal values assigned to categorical class labels.