
[Paper Reading] Multi-Channel Graph Neural Network for Entity Alignment

Paper: Multi-Channel Graph Neural Network for Entity Alignment (aclanthology.org)

Code: https://github.com/thunlp/MuGNN

The English is typed entirely by hand, summarizing and paraphrasing the original paper. Some spelling and grammar mistakes are unavoidable; if you spot any, corrections in the comments are welcome! This post is written as notes, so read with that in mind.

Contents

1. Thoughts

2. Section-by-Section Reading

2.1. Abstract

2.2. Introduction

2.3. Preliminaries and Framework

2.3.1. Preliminaries

2.3.2. Framework

2.4. KG Completion

2.4.1. Rule Inference and Transfer

2.4.2. Rule Grounding

2.5. Multi-Channel Graph Neural Network

2.5.1. Relation Weighting

2.5.2. Multi-Channel GNN Encoder

2.5.3. Align Model

2.6. Experiment

2.6.1. Experiment Settings

2.6.2. Overall Performance

2.6.3. Impact of Two Channels and Rule Transfer

2.6.4. Impact of Seed Alignments

2.6.5. Qualitative Analysis

2.7. Related Work

2.8. Conclusions

3. Supplementary Knowledge

3.1. Adagrad Optimizer

4. Reference


1. Thoughts

(1) This paper is relatively easy to follow

2. Section-by-Section Reading

2.1. Abstract

        ①Challenges for entity alignment: structural heterogeneity between KGs and limited seed alignments

        ②They proposed Multi-channel Graph Neural Network model (MuGNN)

2.2. Introduction

        ①A knowledge graph (KG) stores information as a directed graph, where the nodes are entities and the edges denote relations

        ②A KG in the native language usually stores richer information:

(The authors assume that Jilin in KG1 will be aligned to Jilin City in KG2 because they share a similar dialect and are both connected to Changchun. I am not sure this necessarily holds; doesn't it depend on the specific model? These two entities still feel quite different, and structurally they are not very similar either)

        ③To solve this problem, missing entities need to be completed and unnecessary (exclusive) ones pruned

2.3. Preliminaries and Framework

2.3.1. Preliminaries

(1)KG

        ①A KG is defined as a directed graph G=\left ( E,R,T \right ), which contains the entity set E, relation set R, and triplet set T
        ②Triplet t=(e_{i},r_{ij},e_{j})\in T

(2)Rule knowledge

        ①For a rule k=(r_{c}|r_{s1},\cdots,r_{sp}) in the rule set \mathcal{K}=\{k\}, the premise relations r_{s1},\cdots,r_{sp} imply the conclusion relation r_{c}; e.g., for a single premise, \forall x,y\in E:(x,r_{s},y)\Rightarrow (x,r_{c},y)

(3)Rule Grounding

        ①By instantiating the rule above with concrete entities, further relations between entities can be inferred

(4)Entity alignment

        ①Entity alignments between the two KGs: \mathcal{A}_{e}=\{(e,e^{\prime}) \in E\times E^{\prime}|e \leftrightarrow e^{\prime}\}

        ②Relation alignments: \mathcal{A}_{r}^{s}=\{(r,r^{\prime})\in R\times R'|r\leftrightarrow r'\}

2.3.2. Framework

        ①Workflow of MuGNN:

(1)KG completion

        ①They adopt the rule mining system AMIE+

(2)Multi-channel Graph Neural Network

        ①Encode the KGs in different channels

2.4. KG Completion

2.4.1. Rule Inference and Transfer

        

2.4.2. Rule Grounding

        ①For example, if the rule province(x,y) \wedge dialect(y,z) \Rightarrow dialect(x,z) is mined from KG2, it can be transferred to KG1 to complete it
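The grounding step can be sketched in a few lines of Python (the mini-KG and entity names here are made up for illustration; this is not the authors' code):

```python
# Minimal sketch of rule grounding for the rule:
#   province(x, y) AND dialect(y, z) => dialect(x, z)
# Triples are stored as (head, relation, tail) tuples.

def ground_rule(triples):
    """Infer new dialect triples from province + dialect chains."""
    province = {(h, t) for (h, r, t) in triples if r == "province"}
    dialect = {(h, t) for (h, r, t) in triples if r == "dialect"}
    inferred = set()
    for (x, y) in province:
        for (y2, z) in dialect:
            # Premises matched on the shared variable y; add the conclusion
            if y == y2 and (x, z) not in dialect:
                inferred.add((x, "dialect", z))
    return inferred

triples = {
    ("Jilin_City", "province", "Jilin"),
    ("Jilin", "dialect", "Northeastern_Mandarin"),
}
print(ground_rule(triples))
# {('Jilin_City', 'dialect', 'Northeastern_Mandarin')}
```

Real grounding in the paper operates over mined rules and large KGs, but the idea is the same: match the rule premises against existing triples and add the implied conclusion triples.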

2.5. Multi-Channel Graph Neural Network

2.5.1. Relation Weighting

        ①They generate weighted relation (adjacency) matrices

        ②They construct a self-attention adjacency matrix and a cross-KG attention adjacency matrix, one for each channel

(1)KG Self-Attention (this channel serves completion)

        ①Normalized connection weights:

a_{ij}=\mathrm{softmax}(c_{ij})=\frac{\exp(c_{ij})}{\sum_{e_{k}\in N_{e_{i}}\cup\{e_{i}\}}\exp(c_{ik})}

where e_{k}\in N_{e_{i}}\cup\{e_{i}\} ranges over the neighbors of e_i together with e_i itself (a self-loop is added)

        ②c_{ij} denotes the attention coefficient between two entities:

\begin{aligned} c_{ij}& =attn(\mathbf{We_{i}},\mathbf{We_{j}}) \\ &=LeakyReLU(\mathbf{p[We_{i}\|We_{j}]}) \end{aligned}

where \mathbf{W} and \mathbf{p} are trainable parameters
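A minimal NumPy sketch of this self-attention, assuming random (untrained) stand-ins for \mathbf{W}, \mathbf{p}, and the entity embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 4, 3                      # embedding size, number of entities
E = rng.normal(size=(n, d))      # entity embeddings (illustrative)
W = rng.normal(size=(d, d))      # trainable projection (random here)
p = rng.normal(size=2 * d)       # trainable attention vector (random here)

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def self_attention_row(i, neighbors):
    """Attention weights a_ij over {e_i} ∪ N_{e_i} (self-loop included)."""
    targets = [i] + neighbors
    # c_ij = LeakyReLU(p · [W e_i || W e_j])
    c = np.array([
        leaky_relu(p @ np.concatenate([W @ E[i], W @ E[j]]))
        for j in targets
    ])
    a = np.exp(c - c.max())      # numerically stable softmax
    return a / a.sum()

a = self_attention_row(0, neighbors=[1, 2])
print(a)                         # three positive weights summing to 1
```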

(2)Cross-KG Attention (this channel serves pruning; it is the other adjacency matrix)

        ①Pruning operation:

a_{ij}=\max\limits_{r\in R,r'\in R'}\mathbf{1}((e_i,r,e_j)\in T)sim(r,r')

where \mathbf{1}((e_i,r,e_j)\in T) equals 1 if (e_i,r,e_j)\in T and 0 otherwise, and sim\left ( \cdot \right ) is the inner-product similarity measure sim(r,r')=\mathbf{r}^{T}\mathbf{r}^{\prime}
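A toy sketch of this pruning weight, assuming hypothetical relation embeddings for the two KGs (names and vectors are made up for illustration):

```python
import numpy as np

# Hypothetical relation embeddings for KG1 and KG2 (illustrative values).
rel_emb_kg1 = {"capital": np.array([1.0, 0.0]), "dialect": np.array([0.0, 1.0])}
rel_emb_kg2 = {"capital_of": np.array([0.9, 0.1]), "speaks": np.array([0.2, 0.8])}

triples = {("Jilin", "capital", "Changchun")}  # triplet set T of KG1

def cross_kg_weight(ei, ej):
    """a_ij = max_{r,r'} 1((ei,r,ej) in T) * sim(r,r'), sim = inner product."""
    best = 0.0
    for r, r_vec in rel_emb_kg1.items():
        if (ei, r, ej) not in triples:
            continue                       # indicator is 0 for this r
        for r2_vec in rel_emb_kg2.values():
            best = max(best, float(r_vec @ r2_vec))
    return best

print(cross_kg_weight("Jilin", "Changchun"))   # 0.9 (best match: capital_of)
print(cross_kg_weight("Jilin", "Jilin"))       # 0.0 (no such triple in T)
```

Entities connected only by relations with no good counterpart in the other KG thus get low weights, which is exactly the pruning effect described above.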

2.5.2. Multi-Channel GNN Encoder

        ①Propagation of one GNN layer:

\mathrm{GNN}(A,H,W)=\sigma(\mathbf{AHW})

where \sigma \left ( \cdot \right ) is chosen as ReLU

        ②Multi GNN encoder:

\mathrm{MultiGNN}(H^{l};A_{1},\cdots,A_{c})=\mathrm{Pooling}(H_{1}^{l+1},\cdots,H_{c}^{l+1})

where c denotes the number of channels

        ③Updating function:

\mathbf{H}_i^{l+1}=\mathrm{GNN}(A_i,H^l,W_i)

        ④Pooling strategy: mean pooling
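The multi-channel encoder with mean pooling can be sketched as follows (random matrices stand in for the attention-based adjacencies and the trained weights; this is a sketch, not the released implementation):

```python
import numpy as np

def gnn(A, H, W):
    """One GNN layer: sigma(A H W) with sigma = ReLU."""
    return np.maximum(A @ H @ W, 0.0)

def multi_gnn(H, adjs, weights):
    """Run one GNN per channel, then mean-pool the channel outputs."""
    outs = [gnn(A, H, W) for A, W in zip(adjs, weights)]
    return np.mean(outs, axis=0)

rng = np.random.default_rng(0)
n, d = 3, 4
H = rng.normal(size=(n, d))          # layer-l entity representations
A1 = np.eye(n)                       # self-attention channel (illustrative)
A2 = np.ones((n, n)) / n             # cross-KG channel (illustrative)
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))

H_next = multi_gnn(H, [A1, A2], [W1, W2])
print(H_next.shape)                  # (3, 4): one pooled representation per entity
```

Stacking this update twice gives the 2-layer encoder used in the experiments.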

2.5.3. Align Model

        ①The two KGs are embedded into the same vector space, and distance is measured to judge equivalence:

\mathcal{L}_{a}=\sum_{(e,e^{'})\in\mathcal{A}_{e}^{s}}\sum_{(e_{-},e_{-}^{'})\in\mathcal{A}_{e}^{s-}}[d(e,e^{'})+\gamma_{1}-d(e_{-},e_{-}^{'})]_{+}+\\\sum_{(r,r^{'})\in\mathcal{A}_{r}^{s}}\sum_{(r_{-},r_{-}^{'})\in\mathcal{A}_{r}^{s-}}[d(r,r^{'})+\gamma_{2}-d(r_{-},r_{-}^{'})]_{+}

where [\cdot]_{+}=\max\{0,\cdot\}, d(\cdot)=\|\cdot\|_{2}, \mathcal{A}_e^{s-} and \mathcal{A}_r^{s-} are negative pairs sampled from the original sets, and \gamma _1> 0, \gamma _2> 0 are margin hyper-parameters separating positive and negative entity and relation alignments
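A toy version of the margin-based alignment loss (entity part only; the embeddings are made up so that the seed pair is close and the negative pair is far):

```python
import numpy as np

def hinge(x):
    return np.maximum(x, 0.0)          # [.]_+ = max{0, .}

def align_loss(pos_pairs, neg_pairs, emb, gamma=1.0):
    """Margin loss: pull seed pairs together, push negatives apart (L2 distance)."""
    loss = 0.0
    for (e, e2) in pos_pairs:
        d_pos = np.linalg.norm(emb[e] - emb[e2])
        for (n1, n2) in neg_pairs:
            d_neg = np.linalg.norm(emb[n1] - emb[n2])
            loss += hinge(d_pos + gamma - d_neg)
    return loss

emb = {
    "Jilin": np.array([0.0, 0.0]),
    "Jilin'": np.array([0.1, 0.0]),     # aligned counterpart: close
    "Changchun": np.array([3.0, 0.0]),  # negative sample: far
}
print(align_loss([("Jilin", "Jilin'")], [("Jilin", "Changchun")], emb))
# 0.0: d_pos + gamma - d_neg = 0.1 + 1.0 - 3.0 < 0, so the hinge is inactive
```

The relation-alignment term in the formula above has exactly the same shape, just over relation embeddings with margin \gamma_2.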

        ②Triplet loss:

\begin{gathered} \mathcal{L}_{r} =\sum_{g^{+}\in\mathcal{G}(\mathcal{K})}\sum_{g^{-}\in\mathcal{G}^{-}(\mathcal{K})}[\gamma_{r}-I(g^{+})+I(g^{-})]_{+} \\ +\sum_{t^{+}\in T}\sum_{t^{-}\in T^{-}}[\gamma_{r}-I(t^{+})+I(t^{-})]_{+} \end{gathered}

        ③I\left ( \cdot \right ) denotes the truth value function for a triplet t:

I(t)=1-\frac{1}{3\sqrt{d}}\|\mathbf{e}_{i}+\mathbf{r}_{ij}-\mathbf{e}_{j}\|_{2}

then it can be recursively transformed into:

I(t_{s})=I(t_{s1}\wedge t_{s2})=I(t_{s1})\cdot I(t_{s2})\\I(t_{s}\Rightarrow t_{c})=I(t_{s})\cdot I(t_{c})-I(t_{s})+1

where d is the embedding size
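The truth value and its recursive composition can be implemented directly from the formulas above (the embeddings here are illustrative):

```python
import numpy as np

d = 2  # embedding size

def truth(e_i, r, e_j):
    """I(t) = 1 - ||e_i + r - e_j||_2 / (3 * sqrt(d))  (TransE-style score)."""
    return 1.0 - np.linalg.norm(e_i + r - e_j) / (3.0 * np.sqrt(d))

def truth_and(i1, i2):
    """I(t_s1 ∧ t_s2) = I(t_s1) * I(t_s2)"""
    return i1 * i2

def truth_implies(i_s, i_c):
    """I(t_s => t_c) = I(t_s) * I(t_c) - I(t_s) + 1"""
    return i_s * i_c - i_s + 1.0

e_i = np.array([0.0, 0.0]); r = np.array([1.0, 0.0]); e_j = np.array([1.0, 0.0])
print(truth(e_i, r, e_j))        # 1.0: e_i + r = e_j, triple perfectly satisfied
print(truth_implies(1.0, 0.5))   # 0.5: true premise, value follows the conclusion
print(truth_implies(0.0, 0.5))   # 1.0: false premise, the implication always holds
```

Note that truth_implies mirrors material implication: a grounding is penalized only when the premise scores high while the conclusion scores low.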

        ④The overall loss:

\mathcal{L}=\mathcal{L}_a+\mathcal{L}_r'+\mathcal{L}_r

2.6. Experiment

2.6.1. Experiment Settings

(1)Datasets

        ①Datasets: DBP15K (containing DBPZH-EN (Chinese to English), DBPJA-EN (Japanese to English), and DBPFR-EN (French to English)) and DWY100K (containing DWY-WD (DBpedia to Wikidata) and DWY-YG (DBpedia to YAGO3))

        ②Statistics of the datasets:

        ③Statistics of the KGs in the datasets:

(2)Baselines

        ①MTransE

        ②JAPE

        ③GCN-Align

        ④AlignEA

(3)Training Details

        ①Data split: 30% of seed alignments for training and 70% for testing

        ②Embedding size: 128 for all embeddings

        ③GNN layers: 2 for all GNN encoders

        ④Optimizer: Adagrad

        ⑤Hyper-parameters: \gamma _1=1.0,\gamma _2=1.0,\gamma _r=0.12

        ⑥Grid search over learning rate in {0.1, 0.01, 0.001}, L2 regularization in {0.01, 0.001, 0.0001}, and dropout rate in {0.1, 0.2, 0.5}; the optimal values were 0.001, 0.01, and 0.2 respectively

2.6.2. Overall Performance

2.6.3. Impact of Two Channels and Rule Transfer

        ①Module ablation:

2.6.4. Impact of Seed Alignments

        ①Ratio of seeds:

2.6.5. Qualitative Analysis

        ①Two examples of how the rule works:

2.7. Related Work

        This section reviews related work

2.8. Conclusions

        ①As future work, they aim to study word ambiguity

3. Supplementary Knowledge

3.1. Adagrad Optimizer

(1)Supplementary reading: Deep Learning 最优化方法之AdaGrad - 知乎 (zhihu.com)
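As a quick reminder of what Adagrad does, here is a minimal sketch of its update rule (the textbook formulation, not any specific library's implementation): each parameter gets its own effective learning rate that shrinks as its squared gradients accumulate.

```python
import numpy as np

def adagrad_step(w, grad, cache, lr=0.1, eps=1e-8):
    """Adagrad: w -= lr * grad / (sqrt(sum of squared grads) + eps)."""
    cache += grad ** 2                       # accumulate squared gradients
    w -= lr * grad / (np.sqrt(cache) + eps)  # per-parameter scaled step
    return w, cache

# Minimize f(w) = w^2 (gradient 2w) starting from w = 5.0
w = np.array([5.0]); cache = np.zeros(1)
for _ in range(200):
    w, cache = adagrad_step(w, 2 * w, cache)
print(float(w[0]))  # decreases monotonically from the starting value 5.0
```

Frequently-updated parameters thus take smaller and smaller steps, which is why Adagrad suits sparse settings like KG embeddings where most entities are touched rarely.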

4. Reference

Cao, Y. et al. (2019) 'Multi-Channel Graph Neural Network for Entity Alignment', Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, doi: 10.18653/v1/P19-1140
