Definition

The heterogeneous graph transformer (HGT) is a graph neural network, in the same family as the GCN, designed to handle heterogeneous graphs, that is, graphs with multiple types of nodes and edges.

Architecture

Meta-relation triplets

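HGT uses meta-relation triplets (source node type, edge type, target node type) to describe the relationships between different entity types in a graph. For an edge $e = (s, t)$ from a source node $s$ to a target node $t$, the meta-relation is $\langle \tau(s), \phi(e), \tau(t) \rangle$, where $\tau$ maps a node to its type and $\phi$ maps an edge to its type.

For example, if $t$ is the target node, $s_1$ and $s_2$ are the source nodes, and $e_1$ and $e_2$ are the edges, the corresponding meta-relations are $\langle \tau(s_1), \phi(e_1), \tau(t) \rangle$ and $\langle \tau(s_2), \phi(e_2), \tau(t) \rangle$.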
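To make the triplet idea concrete, the following minimal sketch derives meta-relations from a small typed graph. The node names, type names, and the helper `meta_relation` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy academic graph: node id -> node type (the role of tau).
node_type = {"p1": "paper", "a1": "author", "a2": "author", "v1": "venue"}

# Directed edges stored as (source node, edge type, target node);
# the edge type plays the role of phi.
edges = [
    ("a1", "writes", "p1"),
    ("a2", "writes", "p1"),
    ("v1", "publishes", "p1"),
]

def meta_relation(source, edge_type, target):
    """Return the triplet (source node type, edge type, target node type)."""
    return (node_type[source], edge_type, node_type[target])

# One meta-relation per edge; edges of the same kind share a triplet.
meta_relations = [meta_relation(s, e, t) for s, e, t in edges]
counts = Counter(meta_relations)
```

Here both `writes` edges map to the same triplet, so any parameters attached to that meta-relation would be shared between them.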

Heterogeneous Mutual Attention

HGT uses an attention mechanism that considers node types and edge types. The attention weights are computed based on the meta-relation triplets, allowing for type-specific information propagation.

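For a meta-relation $\langle \tau(s), \phi(e), \tau(t) \rangle$, the attention is calculated by

$$\mathrm{Attention}(s, e, t) = \underset{\forall s \in N(t)}{\mathrm{Softmax}} \left( \Big\Vert_{i \in [1, h]} \text{ATT-head}^{i}(s, e, t) \right)$$

where $N(t)$ is the set of neighbors of the node $t$, and $h$ is the number of attention heads.

The $i$-th attention head is calculated by

$$\text{ATT-head}^{i}(s, e, t) = \left( K^{i}(s) \, W^{ATT}_{\phi(e)} \, Q^{i}(t)^{\top} \right) \cdot \frac{\mu_{\langle \tau(s), \phi(e), \tau(t) \rangle}}{\sqrt{d}}$$

$$Q^{i}(t) = H^{(l-1)}[t] \, W^{Q,i}_{\tau(t)}, \qquad K^{i}(s) = H^{(l-1)}[s] \, W^{K,i}_{\tau(s)}$$

where:

  • $Q^{i}(t)$ is a query vector
  • $K^{i}(s)$ is a key vector
  • $H^{(l-1)}[s]$ and $H^{(l-1)}[t]$ are the feature vectors of the source and target nodes at layer $l-1$.
  • $W^{Q,i}_{\tau(t)}$, $W^{K,i}_{\tau(s)}$, and $W^{ATT}_{\phi(e)}$ are learnable parameter matrices specific to the components of the meta-relation: the target node type, the source node type, and the edge type, respectively.
  • $\mu_{\langle \tau(s), \phi(e), \tau(t) \rangle}$ is a learnable prior weight for the meta-relation.
  • $d$ is the dimension of the node feature vectors; each head works with vectors of dimension $d/h$.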
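The attention step above can be sketched in plain Python for a single head. This is an illustrative sketch under stated assumptions, not a reference implementation: the names `hetero_attention`, `W_Q`, `W_K`, `W_ATT`, and `mu` are hypothetical, vectors are treated as row vectors, and the scaling by a per-meta-relation prior `mu` and by the square root of the dimension follows the original HGT formulation.

```python
import math

def vecmat(x, W):
    """Row vector x (length n) times matrix W (n rows, m columns)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hetero_attention(h_t, t_type, sources, W_Q, W_K, W_ATT, mu):
    """Single-head attention weights over the neighbors of one target node.

    sources: list of (h_s, source_type, edge_type), one entry per neighbor.
    W_Q, W_K: dicts keyed by node type; W_ATT: dict keyed by edge type.
    mu: dict keyed by (source_type, edge_type, target_type).
    """
    d = len(h_t)
    q = vecmat(h_t, W_Q[t_type])            # query from the target node
    scores = []
    for h_s, s_type, e_type in sources:
        k = vecmat(h_s, W_K[s_type])        # key from the source node
        kw = vecmat(k, W_ATT[e_type])       # edge-type-specific transform
        raw = sum(a * b for a, b in zip(kw, q))
        scores.append(raw * mu[(s_type, e_type, t_type)] / math.sqrt(d))
    return softmax(scores)                  # normalize over all neighbors

# Toy example with identity matrices (illustrative values only).
I2 = [[1.0, 0.0], [0.0, 1.0]]
W_Q = {"paper": I2}
W_K = {"author": I2, "venue": I2}
W_ATT = {"writes": I2, "publishes": I2}
mu = {("author", "writes", "paper"): 1.0, ("venue", "publishes", "paper"): 1.0}

weights = hetero_attention(
    h_t=[1.0, 0.0],
    t_type="paper",
    sources=[([1.0, 0.0], "author", "writes"), ([0.0, 1.0], "venue", "publishes")],
    W_Q=W_Q, W_K=W_K, W_ATT=W_ATT, mu=mu,
)
```

Because the parameter dictionaries are keyed by node type and edge type, two neighbors with identical features can still receive different weights when they are connected through different meta-relations.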

Heterogeneous Message Passing

The information from the source nodes is passed to the target node. As in the attention step, the meta-relations of the edges are incorporated into the message passing process.

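For a meta-relation $\langle \tau(s), \phi(e), \tau(t) \rangle$, the multi-head message is calculated by

$$\mathrm{Message}(s, e, t) = \Big\Vert_{i \in [1, h]} \text{MSG-head}^{i}(s, e, t)$$

where $h$ is the number of message heads.

The $i$-th message head is calculated by

$$\text{MSG-head}^{i}(s, e, t) = V^{i}(s) \, W^{MSG}_{\phi(e)}, \qquad V^{i}(s) = H^{(l-1)}[s] \, W^{V,i}_{\tau(s)}$$

where:

  • $V^{i}(s)$ is a value vector
  • $W^{V,i}_{\tau(s)}$ and $W^{MSG}_{\phi(e)}$ are learnable parameter matrices specific to the source node type and the edge type of the meta-relation, respectively.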
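The message computation can be sketched as follows. This is a minimal illustration that assumes row-vector conventions; `message_head`, `message`, `W_V`, and `W_MSG` are hypothetical names, and the tiny matrices are made-up values that project a 2-dimensional feature vector to one dimension per head.

```python
def vecmat(x, W):
    """Row vector x (length n) times matrix W (n rows, m columns)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def message_head(h_s, s_type, e_type, W_V, W_MSG):
    """One message head: project by source node type, then by edge type."""
    v = vecmat(h_s, W_V[s_type])        # value vector of the source node
    return vecmat(v, W_MSG[e_type])     # edge-type-specific transform

def message(h_s, s_type, e_type, heads):
    """Concatenate the outputs of all message heads.

    heads: list of (W_V, W_MSG) pairs, one pair of dicts per head.
    """
    out = []
    for W_V, W_MSG in heads:
        out.extend(message_head(h_s, s_type, e_type, W_V, W_MSG))
    return out

# Two heads, each mapping a 2-dimensional input to 1 dimension.
heads = [
    ({"author": [[1.0], [0.0]]}, {"writes": [[2.0]]}),  # head 1: first feature, times 2
    ({"author": [[0.0], [1.0]]}, {"writes": [[3.0]]}),  # head 2: second feature, times 3
]
msg = message([4.0, 5.0], "author", "writes", heads)
```

With these values the two heads produce 8.0 and 15.0, so the concatenated message has the same dimension as the input feature vector.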

Target-Specific Aggregation

The heterogeneous multi-head attention and messages are aggregated from the source nodes to the target node. The attention values are used as weights to average the corresponding messages from the source nodes.

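The updated vector is calculated by

$$\tilde{H}^{(l)}[t] = \underset{\forall s \in N(t)}{\oplus} \left( \mathrm{Attention}(s, e, t) \cdot \mathrm{Message}(s, e, t) \right)$$

The final output, the contextualized representation of the node, is calculated from the updated vector:

$$H^{(l)}[t] = \sigma\left( \tilde{H}^{(l)}[t] \right) W^{A}_{\tau(t)} + H^{(l-1)}[t]$$

where

  • $H^{(l-1)}[t]$ is the feature vector of the target node at layer $l-1$, and works as a residual connection.
  • $W^{A}_{\tau(t)}$ is a learnable parameter matrix specific to the target node type $\tau(t)$.
  • $\sigma$ is a non-linear activation function.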
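The aggregation and update steps can be sketched as below for a single head. This is a simplified illustration under several assumptions: attention weights are given as one scalar per source node, ReLU stands in for the non-linear activation, and the names `aggregate`, `update`, and `W_A` are hypothetical.

```python
def vecmat(x, W):
    """Row vector x (length n) times matrix W (n rows, m columns)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def aggregate(attn, messages):
    """Attention-weighted sum of the messages from all source nodes."""
    dim = len(messages[0])
    return [sum(a * m[j] for a, m in zip(attn, messages)) for j in range(dim)]

def update(h_prev, attn, messages, W_A):
    """Compute the new target representation with a residual connection."""
    h_tilde = aggregate(attn, messages)           # updated vector
    activated = [max(0.0, x) for x in h_tilde]    # ReLU as a stand-in activation
    projected = vecmat(activated, W_A)            # target-type-specific projection
    return [p + r for p, r in zip(projected, h_prev)]  # add previous-layer features

# Toy values: two source nodes with attention weights 0.75 and 0.25.
h_new = update(
    h_prev=[1.0, 1.0],
    attn=[0.75, 0.25],
    messages=[[4.0, -8.0], [0.0, 8.0]],
    W_A=[[1.0, 0.0], [0.0, 1.0]],
)
```

In this example the weighted sum is [3.0, -4.0], the activation clips the negative entry to zero, and the residual connection adds the previous-layer features back, giving [4.0, 1.0].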