metapath2vec: Scalable Representation Learning for Heterogeneous Networks

8/10/2021, 11:11:00 AM
Injung Hwang
Post type
(SIGKDD Conference on Knowledge Discovery and Data Mining)


word2vec → node2vec (homogeneous) → metapath2vec (heterogeneous)
Metapath-based word2vec method
In a nutshell: node2vec applied to heterogeneous graphs

Background: Heterogeneous Graph

Mining Heterogeneous Information Networks: A Structural Analysis Approach (KDD'12)

Heterogeneous Graph?

A graph that has more than one type of node or more than one type of relation (edge)
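
The definition can be checked mechanically. Below is a minimal sketch, assuming the common formalization in which a network counts as heterogeneous when the number of node types plus the number of relation types exceeds 2. The toy bibliographic data and the function name `is_heterogeneous` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy bibliographic network: node id -> node type.
node_types: Dict[str, str] = {
    "a1": "author", "a2": "author",
    "p1": "paper", "p2": "paper",
    "v1": "venue",
}

# Typed edges: (source, target, relation type).
edges: List[Tuple[str, str, str]] = [
    ("a1", "p1", "writes"),
    ("a2", "p1", "writes"),
    ("a2", "p2", "writes"),
    ("p1", "v1", "published_in"),
    ("p2", "v1", "published_in"),
]


def is_heterogeneous(node_types: Dict[str, str],
                     edges: List[Tuple[str, str, str]]) -> bool:
    """Return True when (#node types + #relation types) > 2."""
    n_node_types = len(set(node_types.values()))
    n_edge_types = len({rel for _, _, rel in edges})
    return n_node_types + n_edge_types > 2
```

A graph with a single node type and a single relation type gives 1 + 1 = 2, so it fails the check and is treated as homogeneous.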

Schema & Network


Different meta-paths produce different outputs

Strength for link types (weighted Meta-path)

Different types of links can be assigned different weights to suit the graph mining task


Applies the word2vec approach (latent-space representation learning) to heterogeneous graphs using meta-path-based random walks
Extends the skip-gram model to facilitate the modeling of geographically and semantically close nodes
Develops a heterogeneous negative sampling-based method

Metapath2vec & Metapath2vec++


Homogeneous graph embedding
Heterogeneous graph embedding
Negative sampling
Build the node frequency distribution by viewing different types of nodes homogeneously
Draw negative nodes regardless of node types
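
A minimal sketch of this type-agnostic negative sampling follows. It assumes the frequency distribution is built from raw visit counts over the random walks; implementations often smooth the counts (for example with a power below 1), which is not shown here. The walks and the function names are hypothetical.

```python
import random
from collections import Counter

# Hypothetical walks over a toy author/paper/venue network.
walks = [
    ["a1", "p1", "v1", "p2", "a2"],
    ["a2", "p1", "v1", "p1", "a1"],
]


def build_frequency_table(walks):
    """Count node occurrences over all walks, ignoring node types."""
    counts = Counter(node for walk in walks for node in walk)
    nodes = list(counts)
    weights = [counts[n] for n in nodes]
    return nodes, weights


def draw_negatives(nodes, weights, positive, m, rng):
    """Draw m negative nodes of any type, skipping the positive context node.

    Assumes at least one node other than `positive` has non-zero weight.
    """
    negatives = []
    while len(negatives) < m:
        candidate = rng.choices(nodes, weights=weights, k=1)[0]
        if candidate != positive:
            negatives.append(candidate)
    return negatives


nodes, weights = build_frequency_table(walks)
negatives = draw_negatives(nodes, weights, "p1", 5, random.Random(0))
```

Because the table mixes all node types, an author context node can receive paper or venue negatives, which is the behavior described above.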


Neighborhood construction
Random walks that ignore node types are biased toward highly visible types of nodes
Nodes with a dominant number of paths
With a governing percentage of paths pointing to a small set of nodes
The flow of the walker is conditioned on the pre-defined meta-path ρ
Meta-paths are commonly used in a symmetric way (the first and last node types are the same)
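
The meta-path-guided walk can be sketched as below. This is an illustrative sketch, not the authors' implementation: it assumes single-character node types ("A", "P", "V"), a symmetric meta-path given as a string such as "APVPA", and a uniform choice among neighbors of the required next type. The toy graph and the function name `metapath_walk` are hypothetical.

```python
import random

# Hypothetical toy graph: adjacency lists and single-letter node types.
adj = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "p1": ["a1", "a2", "v1"],
    "p2": ["a2", "v1"],
    "v1": ["p1", "p2"],
}
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}


def metapath_walk(adj, node_type, start, metapath, walk_length, rng):
    """Walk from `start`, stepping only to neighbors whose type matches
    the next symbol of the symmetric meta-path."""
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first meta-path symbol")
    # A symmetric meta-path repeats, so one cycle is the path minus its last symbol.
    cycle = metapath[:-1]
    walk = [start]
    while len(walk) < walk_length:
        next_type = cycle[len(walk) % len(cycle)]
        candidates = [n for n in adj[walk[-1]] if node_type[n] == next_type]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk


walk = metapath_walk(adj, node_type, "a1", "APVPA", 9, random.Random(0))
walk_types = "".join(node_type[n] for n in walk)
```

Since the first and last types match, the walker can keep cycling through the pattern until the walk length is reached.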


Softmax within same types of nodes
Metapath2vec++ specifies one set of multinomial distributions for each type of neighborhood in the output layer of the skip-gram model
Metapath2vec vs. Metapath2vec++
Softmax with same type nodes
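
To make the type-specific softmax concrete, here is a minimal sketch in which the probability of a context node is normalized only over nodes of the same type as that context node. The two-dimensional vectors, the node names, and the function name `type_restricted_softmax` are hypothetical; real models learn the vectors and approximate the sum with negative sampling.

```python
import math

# Hypothetical context-side embeddings and node types.
node_type = {"a1": "A", "a2": "A", "p1": "P", "v1": "V", "v2": "V"}
context_vecs = {
    "a1": [0.1, 0.2],
    "a2": [0.3, -0.1],
    "p1": [0.0, 0.5],
    "v1": [0.2, 0.2],
    "v2": [-0.4, 0.1],
}
center_vec = [0.5, -0.2]


def type_restricted_softmax(center_vec, context_vecs, node_type, context):
    """p(context | center), normalized only over nodes of the context's type."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    target_type = node_type[context]
    scores = {
        n: dot(vec, center_vec)
        for n, vec in context_vecs.items()
        if node_type[n] == target_type
    }
    top = max(scores.values())  # subtract the max for numerical stability
    denom = sum(math.exp(s - top) for s in scores.values())
    return math.exp(scores[context] - top) / denom


p_v1 = type_restricted_softmax(center_vec, context_vecs, node_type, "v1")
p_v2 = type_restricted_softmax(center_vec, context_vecs, node_type, "v2")
```

Each node type gets its own distribution, so the venue probabilities sum to 1 independently of the author and paper probabilities.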


AMiner Computer Science (CS) dataset
9,323,739 computer scientists and 3,194,405 papers
from 3,883 computer science venues
The Database and Information Systems (DBIS) dataset
464 venues, their top-5000 authors, and corresponding 72,902 publications
The number of walks per node w: 1000;
The walk length l : 100;
The vector dimension d: 128 (LINE: 128 for each order);
The neighborhood size k : 7;
The size of negative samples: 5.
“APA” → the coauthor semantic
“APVPA” → heterogeneous semantic of authors publishing papers at the same venues
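
As a rough illustration of how the neighborhood size k is used, the sketch below extracts (center, context) training pairs from one walk. It assumes k acts as a skip-gram window of k positions on each side of the center node; the walk and the function name `skipgram_pairs` are hypothetical.

```python
def skipgram_pairs(walk, k):
    """Return (center, context) pairs for nodes within k positions of each other."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - k)
        hi = min(len(walk), i + k + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# Hypothetical walk following the "APVPA" meta-path.
example_walk = ["a1", "p1", "v1", "p2", "a2"]
pairs_k1 = skipgram_pairs(example_walk, 1)
pairs_k7 = skipgram_pairs(example_walk, 7)
```

With k = 7 and a walk shorter than the window, every node is paired with every other node in the walk.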

Multi-class classification

Parameter Sensitivity

Node Clustering

Parameter Sensitivity

Case Study

Similarity Search

In most cases, the top three results cover venues with similar prestige to the query one
STOC to FOCS in theory, OSDI to SOSP in system, HPCA to ISCA in architecture, CCS to S&P in security, CSCW to CHI in human-computer interaction, EMNLP to ACL in NLP, ICML to NIPS in machine learning, WSDM to WWW in Web, AAAI to IJCAI in artificial intelligence, PVLDB to SIGMOD in database, etc.
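
A similarity search of this kind can be sketched as a nearest-neighbor lookup over the learned embeddings. The sketch assumes cosine similarity as the ranking measure; the placeholder venue names and two-dimensional vectors are made up for illustration and are not results from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_similar(query, embeddings, k=3):
    """Return the k node names most similar to `query`, best first."""
    q = embeddings[query]
    scored = [
        (cosine(q, vec), name)
        for name, vec in embeddings.items()
        if name != query
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical embeddings with placeholder venue names.
embeddings = {
    "venue_a": [1.0, 0.0],
    "venue_b": [0.9, 0.1],
    "venue_c": [0.0, 1.0],
    "venue_d": [-1.0, 0.0],
}
nearest = top_k_similar("venue_a", embeddings, k=2)
```

In this toy setup `venue_b` ranks first because its vector points in nearly the same direction as the query vector.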


Refer to Figure 1
Instead of separating the two types of nodes into two columns, metapath2vec++ groups each venue closely with its corresponding author
R. E. Tarjan and FOCS, H. Jensen and SIGGRAPH, H. Ishii and CHI, R. Agrawal and SIGMOD, etc.
Together, both models arrange nodes from similar fields close to each other and dissimilar ones distant from each other
such as the “Core CS” cluster of systems (OSDI), networking (SIGCOMM), security (S&P), and architecture (ISCA), as well as the “Big AI” cluster of data mining (KDD), information retrieval (SIGIR), artificial intelligence (AI), machine learning (NIPS), NLP (ACL), and vision (CVPR).
Notice that the heterogeneous embeddings are able to unveil the similarities across different domains
including the “Core CS” sub-field cluster at the bottom right and the “Big AI” sub-field cluster at the top right
Demonstrate metapath2vec++’s novel capability to discover, model, and capture the underlying structural and semantic relationships between multiple types of nodes in heterogeneous networks.


Implemented in C and C++
Run on quad 12-core (48 cores in total) 2.3 GHz Intel Xeon E7-4850 CPUs