Community Detection Algorithms

Community detection algorithms evaluate clusters of nodes and determine whether they have a tendency to strengthen or break apart:

  • Connected Components: Identifies the connected nodes in an undirected graph.
  • Label Propagation: Detects structures in a graph by propagating labels throughout the graph and forming groups based on the label propagation.
  • Triangle Enumeration: Identifies each triangle in a graph.
  • Triangle Count: Determines the number of triangles that a graph includes and calculates the average clustering coefficient for the resulting network of nodes.
  • Vertex Triangle Count: Determines the number of triangles that a vertex is a member of and computes the clustering coefficient for the vertex.

Connected Components

The Connected Components algorithm identifies the connected vertices in an undirected graph. The algorithm returns a unique connected component ID for each vertex in the graph.

Syntax

To incorporate the Connected Components algorithm in a query, include the following SERVICE call in the WHERE clause.

SERVICE <csi:connected_components>
  {
    [] <csi:binding-vertex>  ?binding_vertex_variable ;
       <csi:binding-id>      ?component_id_variable ;
       <csi:edge-label>      <edge_uri> ;
     [ <csi:graph>           <graph_uri> ] .  
}
Argument Range Description
binding-vertex variable Required argument that defines the name to use for the result column that lists the source nodes (vertices).
binding-id variable Required argument that defines the name to use for the result column that lists the assigned component identifiers.
edge-label URI Required argument that lists the edge URI that connects the nodes or vertices.
graph URI Optional argument that specifies the graph to query.

Label Propagation

The Label Propagation algorithm detects structures in a graph by propagating labels throughout the graph and forming groups based on the label propagation.

Syntax

To incorporate the Label Propagation algorithm in a query, include the following SERVICE call in the WHERE clause.

SERVICE <csi:label_propagation>
  {
    [] <csi:binding-vertex>  ?vertex_variable ;
       <csi:binding-label>   ?label_variable ;
       <csi:edge-label>      <edge_uri> ;
     [ <csi:max-iterations>  number_of_iterations ] .
}
Argument Range Description
binding-vertex variable Required argument that defines the name to use for the result column that lists the source nodes (vertices).
binding-label variable Required argument that defines the name to use for the result column that lists the assigned label values.
edge-label URI Required argument that lists the edge URI that defines the graph to operate on. The graph is the set of vertices that are connected by this URI.
max-iterations 1 - 100 Optional argument that specifies the maximum number of times to iterate through the graph. The default value is 5.

Triangle Enumeration

The Triangle Enumeration algorithm identifies each of the triangles that exist in the specified graph. A triangle is defined as three nodes that are connected by three edges (a-b, b-c, c-a).

Syntax

To incorporate the Triangle Enumeration algorithm in a query, include the following SERVICE call in the WHERE clause.

SERVICE <csi:triangles>
  {
    [] <csi:binding-vertex1>  ?vertex1_variable ;
       <csi:binding-vertex2>  ?vertex2_variable ;
       <csi:binding-vertex3>  ?vertex3_variable ;
       <csi:edge-label>       <edge_uri> .
}
Argument Range Description
binding-vertex1 variable Required argument that defines the name to use for the result column that lists the first node in the triangle.
binding-vertex2 variable Required argument that defines the name to use for the result column that lists the second node in the triangle.
binding-vertex3 variable Required argument that defines the name to use for the result column that lists the third node in the triangle.
edge-label URI Required argument that lists the edge URI that defines the graph to operate on. The graph is the set of vertices that are connected by this URI.

Triangle Count

The Triangle Count algorithm determines the number of triangles that a graph includes and calculates the average clustering coefficient for the resulting network of nodes. A triangle is defined as three nodes that are connected by three edges (a-b, b-c, c-a).

Syntax

To incorporate the Triangle Count algorithm in a query, include the following SERVICE call in the WHERE clause.

SERVICE <csi:triangles>
  {
    [] <csi:binding-average-clustering-coefficient>  ?binding_avg_cc_variable ;
       <csi:binding-triangle-count>                  ?triangle_count_variable ;
       <csi:edge-label>                              <edge_uri> .
}
Argument Range Description
binding-average-clustering-coefficient variable Required argument that defines the name to use for the result column that lists the average clustering coefficient value. The algorithm returns a double value that indicates the average degree to which the nodes in the network tend to cluster.
binding-triangle-count variable Required argument that defines the name to use for the result column that lists the number of triangles in the graph.
edge-label URI Required argument that lists the edge URI that defines the graph to operate on. The graph is the set of vertices that are connected by this URI.

Vertex Triangle Count

The Vertex Triangle Count algorithm counts number of triangles that a vertex is a member of and computes the clustering coefficient for the vertex. A triangle is defined as three nodes that are connected by three edges (a-b, b-c, c-a).

Syntax

To incorporate the Vertex Triangle Count algorithm in a query, include the following SERVICE call in the WHERE clause.

SERVICE <csi:triangles>
  {
    [] <csi:binding-vertex>                  ?vertex_variable ;
       <csi:binding-vertex-triangle-count>   ?triangle_count_variable ;
       <csi:binding-clustering-coefficient>  ?binding_cc_variable ;
       <csi:edge-label>                      <edge_uri> .
}
Argument Range Description
binding-vertex variable Required argument that defines the name to use for the result column that lists each vertex.
binding-vertex-triangle-count variable Required argument that defines the name to use for the result column that lists the number of triangles that a vertex belongs to.
binding-clustering-coefficient variable Required argument that defines the name to use for the result column that lists the clustering coefficient value for each node. The algorithm returns a double value that indicates the degree to which the node tends to cluster with other nodes.
edge-label URI Required argument that lists the edge URI that defines the graph to operate on. The graph is the set of vertices that are connected by this URI.