Graph#
- class Graph(num_shards=None)#
Bases:
GraphView
A temporal graph with event semantics.
- Parameters:
num_shards (int, optional) – The number of locks to use in the storage to allow for multithreaded updates.
Methods:
add_constant_properties
(properties)Adds static properties to the graph.
add_edge
(timestamp, src, dst[, properties, ...])Adds a new edge with the given source and destination nodes and properties to the graph.
add_node
(timestamp, id[, properties, ...])Adds a new node with the given id and properties to the graph.
add_properties
(timestamp, properties[, ...])Adds properties to the graph.
cache
(path)Write Graph to cache file and initialise the cache.
create_node
(timestamp, id[, properties, ...])Creates a new node with the given id and properties in the graph.
deserialise
(bytes)Load Graph from serialised bytes.
edge
(src, dst)Gets the edge with the specified source and destination nodes
event_graph
()View graph with event semantics
from_parquet
(graph_dir)Read graph from parquet files
get_all_node_types
()Returns all the node types in the graph.
import_edge
(edge[, merge])Import a single edge into the graph.
import_edge_as
(edge, new_id[, merge])Import a single edge into the graph with new id.
import_edges
(edges[, merge])Import multiple edges into the graph.
import_edges_as
(edges, new_ids[, merge])Import multiple edges into the graph with new ids.
import_node
(node[, merge])Import a single node into the graph.
import_node_as
(node, new_id[, merge])Import a single node into the graph with new id.
import_nodes
(nodes[, merge])Import multiple nodes into the graph.
import_nodes_as
(nodes, new_ids[, merge])Import multiple nodes into the graph with new ids.
largest_connected_component
()Gives the largest connected component of a graph.
load_cached
(path)Load Graph from a file and initialise it as a cache file.
load_edge_props_from_pandas
(df, src, dst[, ...])Load edge properties from a Pandas DataFrame.
load_edge_props_from_parquet
(parquet_path, ...)Load edge properties from a Parquet file.
load_edges_from_pandas
(df, time, src, dst[, ...])Load edges from a Pandas DataFrame into the graph.
load_edges_from_parquet
(parquet_path, time, ...)Load edges from a Parquet file into the graph.
load_from_file
(path)Load Graph from a file.
load_node_props_from_pandas
(df, id[, ...])Load node properties from a Pandas DataFrame.
load_node_props_from_parquet
(parquet_path, id)Load node properties from a parquet file.
load_nodes_from_pandas
(df, time, id[, ...])Load nodes from a Pandas DataFrame into the graph.
load_nodes_from_parquet
(parquet_path, time, id)Load nodes from a Parquet file into the graph.
node
(id)Gets the node with the specified id
persistent_graph
()View graph with persistent semantics
save_to_file
(path)Saves the Graph to the given path.
save_to_zip
(path)Saves the Graph to the given path.
Serialise Graph to bytes.
to_parquet
(graph_dir)Persist graph to parquet files
update_constant_properties
(properties)Updates static properties of the graph.
write_updates
()Persist the new updates by appending them to the cache file.
- add_constant_properties(properties)#
Adds static properties to the graph.
- add_edge(timestamp, src, dst, properties=None, layer=None, secondary_index=None)#
Adds a new edge with the given source and destination nodes and properties to the graph.
- Parameters:
- Returns:
The added edge.
- Return type:
- Raises:
GraphError – If the operation fails.
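A minimal sketch of bulk insertion with add_edge, using only the signature documented above. The helper name add_edges_from_records and the record layout (timestamp, src, dst, properties) are assumptions made for illustration; they are not part of the library.

```python
def add_edges_from_records(graph, records, layer=None):
    """Add (timestamp, src, dst, properties) records to `graph` one by one.

    `graph` is any object exposing add_edge(timestamp, src, dst,
    properties=None, layer=None) as documented above. The helper and the
    record layout are hypothetical, not library API.
    Returns the list of values returned by add_edge, in input order.
    """
    added = []
    for timestamp, src, dst, properties in records:
        edge = graph.add_edge(timestamp, src, dst, properties=properties, layer=layer)
        added.append(edge)
    return added


# Typical call, assuming raphtory is installed:
#   from raphtory import Graph
#   g = Graph()
#   add_edges_from_records(g, [(1, "alice", "bob", {"weight": 0.5})], layer="friends")
```

Passing properties and layer by keyword keeps the call readable and leaves secondary_index at its default.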
- add_node(timestamp, id, properties=None, node_type=None, secondary_index=None)#
Adds a new node with the given id and properties to the graph.
- Parameters:
- Returns:
The added node.
- Return type:
- Raises:
GraphError – If the operation fails.
- add_properties(timestamp, properties, secondary_index=None)#
Adds properties to the graph.
- Parameters:
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
- cache(path)#
Write Graph to cache file and initialise the cache.
Future updates are tracked. Use write_updates to persist them to the cache file. If the file already exists its contents are overwritten.
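The caching workflow described above could be sketched as follows. The helper name cache_and_record is hypothetical, and the sketch assumes write_updates takes no arguments, since this page does not show its signature.

```python
def cache_and_record(graph, cache_path, edges):
    """Start caching `graph`, add some edges, then persist them.

    Assumption: write_updates() is called without arguments; its
    signature is not shown on this page.
    """
    # Write the current graph to the cache file and start tracking updates.
    graph.cache(cache_path)
    # These updates are tracked but not yet persisted.
    for timestamp, src, dst in edges:
        graph.add_edge(timestamp, src, dst)
    # Persist the tracked updates to the cache file.
    graph.write_updates()


# Typical call, assuming raphtory is installed:
#   from raphtory import Graph
#   cache_and_record(Graph(), "graph.cache", [(1, "a", "b"), (2, "b", "c")])
```

Because cache overwrites an existing file, point cache_path at a location that is safe to replace.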
- create_node(timestamp, id, properties=None, node_type=None, secondary_index=None)#
Creates a new node with the given id and properties in the graph. It fails if the node already exists.
- Parameters:
- Returns:
The created node.
- Return type:
- Raises:
GraphError – If the operation fails.
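Since create_node fails when the node already exists, a caller may want to check with node first. The following is a sketch of that pattern under the signatures documented on this page; get_or_create_node is a hypothetical helper name.

```python
def get_or_create_node(graph, timestamp, node_id, properties=None, node_type=None):
    """Return (node, created) without triggering create_node's failure.

    Relies on node(id) returning None when the node does not exist,
    as documented for Graph.node. The helper itself is hypothetical.
    """
    existing = graph.node(node_id)
    if existing is not None:
        return existing, False
    created = graph.create_node(
        timestamp, node_id, properties=properties, node_type=node_type
    )
    return created, True


# Typical call, assuming raphtory is installed:
#   from raphtory import Graph
#   node, created = get_or_create_node(Graph(), 1, "alice", node_type="person")
```

When overwriting is acceptable, add_node is the simpler choice because it does not fail on an existing id.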
- static deserialise(bytes)#
Load Graph from serialised bytes.
- edge(src, dst)#
Gets the edge with the specified source and destination nodes
- Parameters:
- Returns:
the edge with the specified source and destination nodes, or None if the edge does not exist
- Return type:
- event_graph()#
View graph with event semantics
- Returns:
the graph with event semantics applied
- Return type:
- static from_parquet(graph_dir)#
Read graph from parquet files
- get_all_node_types()#
Returns all the node types in the graph.
- import_edge(edge, merge=False)#
Import a single edge into the graph.
- Parameters:
edge (Edge) – An Edge object representing the edge to be imported.
merge (bool) – An optional boolean flag. Defaults to False. If merge is False, the function will return an error if the imported edge already exists in the graph. If merge is True, the function merges the histories of the imported edge and the existing edge (in the graph).
- Returns:
An Edge object if the edge was successfully imported.
- Return type:
- Raises:
GraphError – If the operation fails.
- import_edge_as(edge, new_id, merge=False)#
Import a single edge into the graph with new id.
- Parameters:
edge (Edge) – An Edge object representing the edge to be imported.
new_id (tuple) – The ID of the new edge. It’s a tuple of the source and destination node ids.
merge (bool) – An optional boolean flag. Defaults to False. If merge is False, the function will return an error if the imported edge already exists in the graph. If merge is True, the function merges the histories of the imported edge and the existing edge (in the graph).
- Returns:
An Edge object if the edge was successfully imported.
- Return type:
- Raises:
GraphError – If the operation fails.
- import_edges(edges, merge=False)#
Import multiple edges into the graph.
- Parameters:
edges (List[Edge]) – A list of Edge objects representing the edges to be imported.
merge (bool) – An optional boolean flag. Defaults to False. If merge is False, the function will return an error if any of the imported edges already exists in the graph. If merge is True, the function merges the histories of the imported edges and the existing edges (in the graph).
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
- import_edges_as(edges, new_ids, merge=False)#
Import multiple edges into the graph with new ids.
- Parameters:
edges (List[Edge]) – A list of Edge objects representing the edges to be imported.
new_ids (List[Tuple[int, int]]) – The IDs of the new edges. It’s a list of tuples of the source and destination node ids.
merge (bool) – An optional boolean flag. Defaults to False. If merge is False, the function will return an error if any of the imported edges already exists in the graph. If merge is True, the function merges the histories of the imported edges and the existing edges (in the graph).
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
- import_node(node, merge=False)#
Import a single node into the graph.
- Parameters:
node (Node) – A Node object representing the node to be imported.
merge (bool) – An optional boolean flag. Defaults to False. If merge is False, the function will return an error if the imported node already exists in the graph. If merge is True, the function merges the histories of the imported node and the existing node (in the graph).
- Returns:
A Node object if the node was successfully imported.
- Return type:
- Raises:
GraphError – If the operation fails.
- import_node_as(node, new_id, merge=False)#
Import a single node into the graph with new id.
- Parameters:
node (Node) – A Node object representing the node to be imported.
new_id (str|int) – The ID to use for the imported node.
merge (bool) – An optional boolean flag. Defaults to False. If merge is False, the function will return an error if the imported node already exists in the graph. If merge is True, the function merges the histories of the imported node and the existing node (in the graph).
- Returns:
A Node object if the node was successfully imported.
- Return type:
- Raises:
GraphError – If the operation fails.
- import_nodes(nodes, merge=False)#
Import multiple nodes into the graph.
- Parameters:
nodes (List[Node]) – A list of Node objects representing the nodes to be imported.
merge (bool) – An optional boolean flag. Defaults to False. If merge is False, the function will return an error if any of the imported nodes already exists in the graph. If merge is True, the function merges the histories of the imported nodes and the existing nodes (in the graph).
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
- import_nodes_as(nodes, new_ids, merge=False)#
Import multiple nodes into the graph with new ids.
- Parameters:
nodes (List[Node]) – A list of Node objects representing the nodes to be imported.
new_ids (List[str|int]) – A list of node IDs to use for the imported nodes.
merge (bool) – An optional boolean flag. Defaults to False. If merge is False, the function will return an error if any of the imported nodes already exists in the graph. If merge is True, the function merges the histories of the imported nodes and the existing nodes (in the graph).
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
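With merge=False the import functions return an error when an imported item already exists, so one option is to filter the input first. The sketch below does this for nodes. It assumes each Node exposes its id as an `id` attribute, which this page does not document, and import_missing_nodes is a hypothetical helper name.

```python
def import_missing_nodes(target, nodes):
    """Import only the nodes that `target` does not already contain.

    Assumption: each node object has an `id` attribute. The helper is
    hypothetical; it only calls node() and import_nodes() as documented.
    Returns the number of nodes that were imported.
    """
    missing = [n for n in nodes if target.node(n.id) is None]
    if missing:
        # merge=False is safe here because existing ids were filtered out.
        target.import_nodes(missing, merge=False)
    return len(missing)


# Alternative when histories should be combined rather than skipped:
#   target.import_nodes(nodes, merge=True)
```

Filtering keeps the existing nodes untouched, whereas merge=True merges the imported histories into them.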
- largest_connected_component()#
Gives the largest connected component of a graph.
Example usage: g.largest_connected_component()
- Returns:
sub-graph of the graph g containing the largest connected component
- Return type:
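One use of the returned sub-graph is to measure how much of the graph it covers. The sketch below assumes the graph and the returned view both expose a count_nodes() method inherited from GraphView; that method is not documented on this page, and lcc_node_fraction is a hypothetical helper name.

```python
def lcc_node_fraction(graph):
    """Fraction of the graph's nodes that lie in its largest connected component.

    Assumption: both `graph` and the view returned by
    largest_connected_component() provide count_nodes().
    Returns 0.0 for an empty graph.
    """
    total = graph.count_nodes()
    if total == 0:
        return 0.0
    lcc = graph.largest_connected_component()
    return lcc.count_nodes() / total
```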
- static load_cached(path)#
Load Graph from a file and initialise it as a cache file.
Future updates are tracked. Use write_updates to persist them to the cache file.
- load_edge_props_from_pandas(df, src, dst, constant_properties=None, shared_constant_properties=None, layer=None, layer_col=None)#
Load edge properties from a Pandas DataFrame.
- Parameters:
df (DataFrame) – The Pandas DataFrame containing edge information.
src (str) – The column name for the source node.
dst (str) – The column name for the destination node.
constant_properties (List[str], optional) – List of constant edge property column names. Defaults to None.
shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every edge. Defaults to None.
layer (str, optional) – The edge layer name. Defaults to None.
layer_col (str, optional) – The name of the DataFrame column containing the edge layer. Defaults to None.
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
- load_edge_props_from_parquet(parquet_path, src, dst, constant_properties=None, shared_constant_properties=None, layer=None, layer_col=None)#
Load edge properties from a Parquet file.
- Parameters:
parquet_path (str) – Path to a Parquet file, or to a directory of Parquet files, containing edge information.
src (str) – The column name for the source node.
dst (str) – The column name for the destination node.
constant_properties (List[str], optional) – List of constant edge property column names. Defaults to None.
shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every edge. Defaults to None.
layer (str, optional) – The edge layer name. Defaults to None.
layer_col (str, optional) – The name of the column containing the edge layer. Defaults to None.
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
- load_edges_from_pandas(df, time, src, dst, properties=None, constant_properties=None, shared_constant_properties=None, layer=None, layer_col=None)#
Load edges from a Pandas DataFrame into the graph.
- Parameters:
df (DataFrame) – The Pandas DataFrame containing the edges.
time (str) – The column name for the update timestamps.
src (str) – The column name for the source node ids.
dst (str) – The column name for the destination node ids.
properties (List[str], optional) – List of edge property column names. Defaults to None.
constant_properties (List[str], optional) – List of constant edge property column names. Defaults to None.
shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every edge. Defaults to None.
layer (str, optional) – A constant value to use as the layer for all edges. Defaults to None. (cannot be used in combination with layer_col)
layer_col (str, optional) – The name of the DataFrame column containing the edge layer. Defaults to None. (cannot be used in combination with layer)
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
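Because layer and layer_col cannot be combined, a thin wrapper can reject the conflict before the load starts. This is a sketch only: load_edges_checked is a hypothetical helper, and the column names in the usage comment are made up for illustration.

```python
def load_edges_checked(graph, df, time, src, dst,
                       properties=None, layer=None, layer_col=None):
    """Call load_edges_from_pandas after checking the layer arguments.

    Hypothetical helper: raises ValueError when both `layer` (one constant
    layer for all edges) and `layer_col` (per-row layer column) are given,
    mirroring the restriction stated in the parameter list above.
    """
    if layer is not None and layer_col is not None:
        raise ValueError("pass either layer or layer_col, not both")
    graph.load_edges_from_pandas(
        df, time, src, dst,
        properties=properties, layer=layer, layer_col=layer_col,
    )


# Typical call, assuming pandas and raphtory are installed:
#   import pandas as pd
#   from raphtory import Graph
#   df = pd.DataFrame({"t": [1, 2], "from": ["a", "b"], "to": ["b", "c"],
#                      "amount": [10.0, 2.5], "kind": ["card", "cash"]})
#   load_edges_checked(Graph(), df, "t", "from", "to",
#                      properties=["amount"], layer_col="kind")
```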
- load_edges_from_parquet(parquet_path, time, src, dst, properties=None, constant_properties=None, shared_constant_properties=None, layer=None, layer_col=None)#
Load edges from a Parquet file into the graph.
- Parameters:
parquet_path (str) – Path to a Parquet file, or to a directory of Parquet files, containing the edges.
time (str) – The column name for the update timestamps.
src (str) – The column name for the source node ids.
dst (str) – The column name for the destination node ids.
properties (List[str], optional) – List of edge property column names. Defaults to None.
constant_properties (List[str], optional) – List of constant edge property column names. Defaults to None.
shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every edge. Defaults to None.
layer (str, optional) – A constant value to use as the layer for all edges. Defaults to None. (cannot be used in combination with layer_col)
layer_col (str, optional) – The name of the column containing the edge layer. Defaults to None. (cannot be used in combination with layer)
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
- static load_from_file(path)#
Load Graph from a file.
- load_node_props_from_pandas(df, id, node_type=None, node_type_col=None, constant_properties=None, shared_constant_properties=None)#
Load node properties from a Pandas DataFrame.
- Parameters:
df (DataFrame) – The Pandas DataFrame containing node information.
id (str) – The column name for the node IDs.
node_type (str, optional) – A constant value to use as the node type for all nodes. Defaults to None. (cannot be used in combination with node_type_col)
node_type_col (str, optional) – The name of the DataFrame column containing the node type. Defaults to None. (cannot be used in combination with node_type)
constant_properties (List[str], optional) – List of constant node property column names. Defaults to None.
shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every node. Defaults to None.
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
- load_node_props_from_parquet(parquet_path, id, node_type=None, node_type_col=None, constant_properties=None, shared_constant_properties=None)#
Load node properties from a parquet file.
- Parameters:
parquet_path (str) – Path to a Parquet file, or to a directory of Parquet files, containing node information.
id (str) – The column name for the node IDs.
node_type (str, optional) – A constant value to use as the node type for all nodes. Defaults to None. (cannot be used in combination with node_type_col)
node_type_col (str, optional) – The name of the column containing the node type. Defaults to None. (cannot be used in combination with node_type)
constant_properties (List[str], optional) – List of constant node property column names. Defaults to None.
shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every node. Defaults to None.
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
- load_nodes_from_pandas(df, time, id, node_type=None, node_type_col=None, properties=None, constant_properties=None, shared_constant_properties=None)#
Load nodes from a Pandas DataFrame into the graph.
- Parameters:
df (DataFrame) – The Pandas DataFrame containing the nodes.
time (str) – The column name for the timestamps.
id (str) – The column name for the node IDs.
node_type (str, optional) – A constant value to use as the node type for all nodes. Defaults to None. (cannot be used in combination with node_type_col)
node_type_col (str, optional) – The name of the DataFrame column containing the node type. Defaults to None. (cannot be used in combination with node_type)
properties (List[str], optional) – List of node property column names. Defaults to None.
constant_properties (List[str], optional) – List of constant node property column names. Defaults to None.
shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every node. Defaults to None.
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
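Timestamped node updates and constant node properties often live in separate tables. The sketch below loads them in two steps with load_nodes_from_pandas and load_node_props_from_pandas. The helper name and the idea of splitting the data into two DataFrames are assumptions for illustration; nothing on this page prescribes that layout.

```python
def load_nodes_then_props(graph, events_df, attrs_df, time_col, id_col,
                          properties=None, constant_properties=None):
    """Load timestamped node rows first, then constant properties.

    Hypothetical helper. `events_df` holds one row per node update and
    `attrs_df` holds one row per node with its constant properties; both
    must share the id column named by `id_col`.
    """
    # Step 1: timestamped updates (creates the nodes and temporal properties).
    graph.load_nodes_from_pandas(events_df, time_col, id_col, properties=properties)
    # Step 2: constant properties for the nodes loaded above.
    graph.load_node_props_from_pandas(
        attrs_df, id_col, constant_properties=constant_properties
    )
```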
- load_nodes_from_parquet(parquet_path, time, id, node_type=None, node_type_col=None, properties=None, constant_properties=None, shared_constant_properties=None)#
Load nodes from a Parquet file into the graph.
- Parameters:
parquet_path (str) – Parquet file or directory of Parquet files containing the nodes
time (str) – The column name for the timestamps.
id (str) – The column name for the node IDs.
node_type (str, optional) – A constant value to use as the node type for all nodes. Defaults to None. (cannot be used in combination with node_type_col)
node_type_col (str, optional) – The name of the column containing the node type. Defaults to None. (cannot be used in combination with node_type)
properties (List[str], optional) – List of node property column names. Defaults to None.
constant_properties (List[str], optional) – List of constant node property column names. Defaults to None.
shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every node. Defaults to None.
- Returns:
This function does not return a value if the operation is successful.
- Return type:
- Raises:
GraphError – If the operation fails.
- node(id)#
Gets the node with the specified id
- Parameters:
- Returns:
The node object with the specified id, or None if the node does not exist
- Return type:
- persistent_graph()#
View graph with persistent semantics
- Returns:
the graph with persistent semantics applied
- Return type:
- save_to_file(path)#
Saves the Graph to the given path.
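A save followed by a reload could be sketched as below, pairing save_to_file with the static load_from_file documented earlier on this page. The helper name save_and_reload is hypothetical, and the sketch makes no assumption about what happens when the path already exists.

```python
def save_and_reload(graph, path):
    """Save `graph` to `path` and return a fresh copy loaded from it.

    Hypothetical helper. load_from_file is static, so it is looked up on
    the graph's own class rather than on the instance.
    """
    graph.save_to_file(path)
    return type(graph).load_from_file(path)


# Typical call, assuming raphtory is installed:
#   from raphtory import Graph
#   g = Graph()
#   g.add_edge(1, "a", "b")
#   g2 = save_and_reload(g, "my_graph")
```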
- save_to_zip(path)#
Saves the Graph to the given path.
- to_parquet(graph_dir)#
Persist graph to parquet files
- Parameters:
graph_dir (str | PathLike) – the folder where the graph will be persisted as parquet
- update_constant_properties(properties)#
Updates static properties of the graph.