Installation

pip install sparsegraph

Classes

class sparsegraph.SparseGraph(adjacency: csr_matrix, labels: List[T])[source]

get_label(index: int) → T[source]

get_largest_component(directed: bool = True, connection: Literal['strong', 'weak'] = 'strong') → SparseGraph[source]

Parameters:

directed – If True the graph is treated as directed.
connection – If "strong", the connected components will all be strongly connected together. If "weak" the connected components will be weakly connected. If directed == False the parameter is ignored and the graph will be treated as strongly connected.

in_degree(index: int) → int[source]

incoming_neighbors(index: int) → ndarray[source]

out_degree(index: int) → int[source]

outgoing_neighbors(index: int) → ndarray[source]

remove_indices(indices: List[int]) → SparseGraph[source]

Algorithms

sparsegraph.alg.distance_from(graph: SparseGraph, start_node_idx: int, *, verbose=False) → ndarray[Any, dtype[int64]]

Breadth first search to find the distance from a start node to all other nodes in the graph.

Parameters:

graph – A SparseGraph graph.
start_node_idx – The index of the node to start the search from.
verbose – If True, display a progress bar.

sparsegraph.alg.katz_centrality(graph: SparseGraph, *, alpha: float = 0.1, beta: float = 1, max_iter: int = 10000, tol: float = 1e-06, normalized: bool = True, verbose: bool = False)

Find Katz centrality defined by \(C_{\textrm{Katz}}(i) = \sum_j (I_{ij} + \alpha A_{ij} + \alpha^2 A_{ij}^2 + \alpha^3 A_{ij}^3 + \dots)\)

Parameters:

graph – A SparseGraph graph.
alpha – The attenuation factor. The attenuation factor must satisfy \(\alpha < \frac{1}{\max(\lambda_1, \dots, \lambda_n)}\) or the algorithm will not converge.
beta – The constant factor. Defaults to 1.
max_iter – The maximum number of iterations to perform before raising a RuntimeError. Defaults to 10,000.
tol – The tolerance for convergence. Defaults to 1.0e-6.
normalized – If True the centrality scores are normalized.
verbose – If True a progress bar is displayed during the power iteration.

sparsegraph.alg.betweenness_centrality(graph: SparseGraph, *, normalized=False, verbose=False)

Finds the betweenness centrality defined by \(C_{\textrm{betweenness}}(i) = \sum_{s \neq i \neq t} \frac{\sigma_{st}(i)}{\sigma_{st}}\) where \(\sigma_{st}\) is the number of shortest paths from \(s\) to \(t\) and \(\sigma_{st}(i)\) is the number of shortest paths from \(s\) to \(t\) that pass through \(i\).

Parameters:

graph – A SparseGraph graph.
normalized – If True the centrality scores are normalized.
verbose – If True a progress bar is displayed during the power iteration.

References

Brandes (2000). A faster algorithm for betweenness centrality, https://doi.org/10.1080/0022250X.2001.9990249.

sparsegraph.alg.estimate_closeness_centrality(graph: SparseGraph, *, k: int = 10000) → ndarray[Any, dtype[float64]]

Finds an estimate of closeness centrality defined by \(C_{\textrm{closeness}}(v) = \frac{n-1}{\sum_{u}d(u,v)}.\) where \(d(u,v)\) is the shortest path distance between \(u\) and \(v\).

Parameters:

graph – A SparseGraph graph.
k – The number of random starting points to use. A larger value will produce a more accurate estimate.

References

Eppstein and Wang (2000). Fast approximation of centrality. https://doi.org/10.48550/arXiv.cs/0009005

sparsegraph.alg.estimate_radius_and_diameter(graph: SparseGraph, *, k: int = 1000, verbose: bool = False) → tuple[int, int]

Estimates the radius and diameter of a graph by iteratively sampling nodes and running a breadth first search from the furthest node.

Parameters:

graph – A SparseGraph graph.
k – The number of nodes to sample.

Returns:

(radius, diameter)

References

Boitmanis et. al (2006). Fast and Simple Approximation of the Diameter and Radius of a Graph. https://doi.org/10.1007/11764298_9