Graph#

class raphtory.Graph(num_shards=None)#

Bases: GraphView

A temporal graph with event semantics.

Parameters:

num_shards (int, optional) – The number of locks to use in the storage to allow for multithreaded updates.

Methods:

add_constant_properties(properties)

Adds static properties to the graph.

add_edge(timestamp, src, dst[, properties, ...])

Adds a new edge with the given source and destination nodes and properties to the graph.

add_node(timestamp, id[, properties, ...])

Adds a new node with the given id and properties to the graph.

add_properties(timestamp, properties[, ...])

Adds properties to the graph.

after(start)

Create a view of the GraphView including all events after start (exclusive).

at(time)

Create a view of the GraphView including all events at time.

before(end)

Create a view of the GraphView including all events before end (exclusive).

cache(path)

Write Graph to cache file and initialise the cache.

cache_view()

Applies the filters to the graph and retains the ids of the nodes and edges that satisfy them, creating bitsets per layer for nodes and edges.

count_edges()

Number of edges in the graph

count_nodes()

Number of nodes in the graph

count_temporal_edges()

Number of temporal edges in the graph

create_node(timestamp, id[, properties, ...])

Creates a new node with the given id and properties in the graph.

default_layer()

Return a view of GraphView containing only the default edge layer.

deserialise(bytes)

Load Graph from serialised bytes.

edge(src, dst)

Gets the edge with the specified source and destination nodes

event_graph()

exclude_layer(name)

Return a view of GraphView containing all layers except the excluded name. Errors if the layer does not exist.

exclude_layers(names)

Return a view of GraphView containing all layers except the excluded names. Errors if any of the layers do not exist.

exclude_nodes(nodes)

Returns a subgraph given a set of nodes that are excluded from the subgraph

exclude_valid_layer(name)

Return a view of GraphView containing all layers except the excluded name.

exclude_valid_layers(names)

Return a view of GraphView containing all layers except the excluded names.

expanding(step)

Creates a WindowSet with the given step size using an expanding window.

filter_edges(filter)

Return a filtered view that only includes edges that satisfy the filter

filter_exploded_edges(filter)

Return a filtered view that only includes exploded edges that satisfy the filter

filter_nodes(filter)

Return a filtered view that only includes nodes that satisfy the filter

find_edges(properties_dict)

Get the edges that match the given property names and values.

find_nodes(properties_dict)

Get the nodes that match the given property names and values.

get_all_node_types()

Returns all the node types in the graph.

has_edge(src, dst)

Returns true if the graph contains the specified edge

has_layer(name)

Check if GraphView has the layer "name"

has_node(id)

Returns true if the graph contains the specified node

import_edge(edge[, merge])

Import a single edge into the graph.

import_edge_as(edge, new_id[, merge])

Import a single edge into the graph with new id.

import_edges(edges[, merge])

Import multiple edges into the graph.

import_edges_as(edges, new_ids[, merge])

Import multiple edges into the graph with new ids.

import_node(node[, merge])

Import a single node into the graph.

import_node_as(node, new_id[, merge])

Import a single node into the graph with new id.

import_nodes(nodes[, merge])

Import multiple nodes into the graph.

import_nodes_as(nodes, new_ids[, merge])

Import multiple nodes into the graph with new ids.

index()

Indexes all node and edge properties.

largest_connected_component()

Gives the largest connected component of a graph.

latest()

Create a view of the GraphView including all events at the latest time.

layer(name)

Return a view of GraphView containing the layer "name". Errors if the layer does not exist.

layers(names)

Return a view of GraphView containing all layers in names. Errors if any of the layers do not exist.

load_cached(path)

Load Graph from a file and initialise it as a cache file.

load_edge_props_from_pandas(df, src, dst[, ...])

Load edge properties from a Pandas DataFrame.

load_edge_props_from_parquet(parquet_path, ...)

Load edge properties from a Parquet file.

load_edges_from_pandas(df, time, src, dst[, ...])

Load edges from a Pandas DataFrame into the graph.

load_edges_from_parquet(parquet_path, time, ...)

Load edges from a Parquet file into the graph.

load_from_file(path)

Load Graph from a file.

load_node_props_from_pandas(df, id[, ...])

Load node properties from a Pandas DataFrame.

load_node_props_from_parquet(parquet_path, id)

Load node properties from a parquet file.

load_nodes_from_pandas(df, time, id[, ...])

Load nodes from a Pandas DataFrame into the graph.

load_nodes_from_parquet(parquet_path, time, id)

Load nodes from a Parquet file into the graph.

materialize()

Returns a 'materialized' clone of the graph view - i.e. a new graph with a copy of the data seen within the view instead of just a mask over the original graph.

node(id)

Gets the node with the specified id

persistent_graph()

Get persistent graph

rolling(window[, step])

Creates a WindowSet with the given window size and optional step using a rolling window.

save_to_file(path)

Saves the Graph to the given path.

save_to_zip(path)

Saves the Graph to the given path as a zip archive.

serialise()

Serialise Graph to bytes.

shrink_end(end)

Set the end of the window to the smaller of end and self.end()

shrink_start(start)

Set the start of the window to the larger of start and self.start()

shrink_window(start, end)

Shrink both the start and end of the window (same as calling shrink_start followed by shrink_end but more efficient)

snapshot_at(time)

Create a view of the GraphView including all events that have not been explicitly deleted at time.

snapshot_latest()

Create a view of the GraphView including all events that have not been explicitly deleted at the latest time.

subgraph(nodes)

Returns a subgraph given a set of nodes

subgraph_node_types(node_types)

Returns a subgraph filtered by node types given a set of node types

to_networkx([explode_edges, ...])

Returns the graph as a NetworkX graph.

to_pyvis([explode_edges, edge_color, shape, ...])

Draw a graph with PyVis.

update_constant_properties(properties)

Updates static properties of the graph.

valid_layers(names)

Return a view of GraphView containing all layers in names. Any layers that do not exist are ignored.

vectorise(embedding[, cache, ...])

Create a VectorisedGraph from the current graph

window(start, end)

Create a view of the GraphView including all events between start (inclusive) and end (exclusive)

write_updates()

Persist the new updates by appending them to the cache file.

Attributes:

earliest_date_time

DateTime of earliest activity in the graph

earliest_time

Timestamp of earliest activity in the graph

edges

Gets all edges in the graph

end

Gets the latest time that this GraphView is valid.

end_date_time

Gets the latest datetime that this GraphView is valid

latest_date_time

DateTime of latest activity in the graph

latest_time

Timestamp of latest activity in the graph

nodes

Gets the nodes in the graph

properties

Get all graph properties

start

Gets the start time for rolling and expanding windows for this GraphView

start_date_time

Gets the earliest datetime that this GraphView is valid

unique_layers

Return all the layer ids in the graph

window_size

Get the window size (difference between start and end) for this GraphView

add_constant_properties(properties)#

Adds static properties to the graph.

Parameters:

properties (PropInput) – The static properties of the graph.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

add_edge(timestamp, src, dst, properties=None, layer=None, secondary_index=None)#

Adds a new edge with the given source and destination nodes and properties to the graph.

Parameters:
  • timestamp (TimeInput) – The timestamp of the edge.

  • src (str|int) – The id of the source node.

  • dst (str|int) – The id of the destination node.

  • properties (PropInput, optional) – The properties of the edge, as a dict mapping property names to values.

  • layer (str, optional) – The layer of the edge.

  • secondary_index (int, optional) – The optional integer which will be used as a secondary index

Returns:

The added edge.

Return type:

MutableEdge

Raises:

GraphError – If the operation fails.

add_node(timestamp, id, properties=None, node_type=None, secondary_index=None)#

Adds a new node with the given id and properties to the graph.

Parameters:
  • timestamp (TimeInput) – The timestamp of the node.

  • id (str|int) – The id of the node.

  • properties (PropInput, optional) – The properties of the node.

  • node_type (str, optional) – The optional string which will be used as a node type

  • secondary_index (int, optional) – The optional integer which will be used as a secondary index

Returns:

The added node.

Return type:

MutableNode

Raises:

GraphError – If the operation fails.

add_properties(timestamp, properties, secondary_index=None)#

Adds properties to the graph.

Parameters:
  • timestamp (TimeInput) – The timestamp of the temporal property.

  • properties (PropInput) – The temporal properties of the graph.

  • secondary_index (int, optional) – The optional integer which will be used as a secondary index

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

after(start)#

Create a view of the GraphView including all events after start (exclusive).

Parameters:

start (TimeInput) – The start time of the window.

Returns:

GraphView

at(time)#

Create a view of the GraphView including all events at time.

Parameters:

time (TimeInput) – The time of the window.

Returns:

GraphView

before(end)#

Create a view of the GraphView including all events before end (exclusive).

Parameters:

end (TimeInput) – The end time of the window.

Returns:

GraphView
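The boundary rules stated for after, at and before can be illustrated with a small pure-Python model. This is only a sketch of the documented inclusive/exclusive semantics, not the library's implementation: the `Event` tuple and the three free functions are hypothetical stand-ins that filter a plain list of timestamped events instead of returning a GraphView.

```python
from typing import List, Tuple

# (timestamp, label): a hypothetical stand-in for a single graph update.
Event = Tuple[int, str]


def after(events: List[Event], start: int) -> List[Event]:
    """Keep events strictly later than start (start itself is excluded)."""
    return [e for e in events if e[0] > start]


def at(events: List[Event], time: int) -> List[Event]:
    """Keep only the events that happen exactly at time."""
    return [e for e in events if e[0] == time]


def before(events: List[Event], end: int) -> List[Event]:
    """Keep events strictly earlier than end (end itself is excluded)."""
    return [e for e in events if e[0] < end]


events = [(1, "a"), (2, "b"), (3, "c")]
print(after(events, 2))
print(at(events, 2))
print(before(events, 2))
```

In this model an event at timestamp 2 appears only in the `at` result, because both `after(2)` and `before(2)` exclude their boundary.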

cache(path)#

Write Graph to cache file and initialise the cache.

Future updates are tracked. Use write_updates to persist them to the cache file. If the file already exists its contents are overwritten.

Parameters:

path (str) – The path to the cache file

cache_view()#

Applies the filters to the graph and retains the ids of the nodes and edges that satisfy them, creating bitsets per layer for nodes and edges.

Returns:

Returns the masked graph

Return type:

MaskedGraph

count_edges()#

Number of edges in the graph

Returns:

the number of edges in the graph

Return type:

int

count_nodes()#

Number of nodes in the graph

Returns:

the number of nodes in the graph

Return type:

int

count_temporal_edges()#

Number of temporal edges in the graph

Returns:

the number of temporal edges in the graph

Return type:

int

create_node(timestamp, id, properties=None, node_type=None, secondary_index=None)#

Creates a new node with the given id and properties in the graph. It fails if the node already exists.

Parameters:
  • timestamp (TimeInput) – The timestamp of the node.

  • id (str|int) – The id of the node.

  • properties (PropInput, optional) – The properties of the node.

  • node_type (str, optional) – The optional string which will be used as a node type

  • secondary_index (int, optional) – The optional integer which will be used as a secondary index

Returns:

The created node.

Return type:

MutableNode

Raises:

GraphError – If the operation fails.

default_layer()#

Return a view of GraphView containing only the default edge layer

Returns:

The layered view

Return type:

GraphView

static deserialise(bytes)#

Load Graph from serialised bytes.

Parameters:

bytes (bytes) – The serialised bytes to decode

Returns:

Graph

earliest_date_time#

DateTime of earliest activity in the graph

Returns:

the datetime of the earliest activity in the graph

Return type:

Optional[Datetime]

earliest_time#

Timestamp of earliest activity in the graph

Returns:

the timestamp of the earliest activity in the graph

Return type:

Optional[int]

edge(src, dst)#

Gets the edge with the specified source and destination nodes

Parameters:
  • src (str|int) – the source node id

  • dst (str|int) – the destination node id

Returns:

the edge with the specified source and destination nodes, or None if the edge does not exist

Return type:

Optional[Edge]

edges#

Gets all edges in the graph

Returns:

the edges in the graph

Return type:

Edges

end#

Gets the latest time that this GraphView is valid.

Returns:

The latest time that this GraphView is valid or None if the GraphView is valid for all times.

Return type:

Optional[int]

end_date_time#

Gets the latest datetime that this GraphView is valid

Returns:

The latest datetime that this GraphView is valid or None if the GraphView is valid for all times.

Return type:

Optional[Datetime]

event_graph()#

exclude_layer(name)#

Return a view of GraphView containing all layers except the excluded name. Errors if the layer does not exist.

Parameters:

name (str) – layer name that is excluded for the new view

Returns:

The layered view

Return type:

GraphView

exclude_layers(names)#

Return a view of GraphView containing all layers except the excluded names. Errors if any of the layers do not exist.

Parameters:

names (list[str]) – list of layer names that are excluded for the new view

Returns:

The layered view

Return type:

GraphView

exclude_nodes(nodes)#

Returns a subgraph given a set of nodes that are excluded from the subgraph

Parameters:

nodes (list[InputNode]) – set of nodes

Returns:

Returns the subgraph

Return type:

GraphView

exclude_valid_layer(name)#

Return a view of GraphView containing all layers except the excluded name

Parameters:

name (str) – layer name that is excluded for the new view

Returns:

The layered view

Return type:

GraphView

exclude_valid_layers(names)#

Return a view of GraphView containing all layers except the excluded names

Parameters:

names (list[str]) – list of layer names that are excluded for the new view

Returns:

The layered view

Return type:

GraphView

expanding(step)#

Creates a WindowSet with the given step size using an expanding window.

An expanding window is a window that grows by step size at each iteration.

Parameters:

step (int | str) – The step size of the window.

Returns:

A WindowSet object.

Return type:

WindowSet
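The idea of a window that grows by the step size at each iteration can be sketched in plain Python. The function below is a hypothetical model, not the WindowSet implementation: it assumes integer timestamps and steps (string intervals are not modelled), a fixed left boundary, an exclusive right boundary, and that iteration stops once the right boundary has reached or passed the end of the data. The exact alignment used by the library may differ.

```python
from typing import Iterator, Tuple


def expanding_windows(start: int, end: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield (window_start, window_end) pairs with a growing right boundary.

    Assumption: the left boundary stays at start and the right boundary
    advances by step until it reaches or passes end.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    right = start + step
    while right < end + step:
        yield (start, right)
        right += step


for window in expanding_windows(start=0, end=10, step=4):
    print(window)
```

Each successive pair keeps the same start and a later end, so every window contains all the events of the previous one.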

filter_edges(filter)#

Return a filtered view that only includes edges that satisfy the filter

Parameters:

filter (PropertyFilter) – The filter to apply to the edge properties. Construct a filter using Prop.

Returns:

The filtered view

Return type:

GraphView

filter_exploded_edges(filter)#

Return a filtered view that only includes exploded edges that satisfy the filter

Parameters:

filter (PropertyFilter) – The filter to apply to the exploded edge properties. Construct a filter using Prop.

Returns:

The filtered view

Return type:

GraphView

filter_nodes(filter)#

Return a filtered view that only includes nodes that satisfy the filter

Parameters:

filter (PropertyFilter) – The filter to apply to the node properties. Construct a filter using Prop.

Returns:

The filtered view

Return type:

GraphView

find_edges(properties_dict)#

Get the edges that match the given property names and values

Parameters:

properties_dict (dict[str, Prop]) – the property names and values to match

Returns:

the edges that match the properties name and value

Return type:

list[Edge]

find_nodes(properties_dict)#

Get the nodes that match the given property names and values

Parameters:

properties_dict (dict[str, Prop]) – the property names and values to match

Returns:

the nodes that match the properties name and value

Return type:

list[Node]

get_all_node_types()#

Returns all the node types in the graph.

Returns:

List[str]

has_edge(src, dst)#

Returns true if the graph contains the specified edge

Parameters:
  • src (str|int) – the source node id

  • dst (str|int) – the destination node id

Returns:

true if the graph contains the specified edge, false otherwise

Return type:

bool

has_layer(name)#

Check if GraphView has the layer “name”

Parameters:

name (str) – the name of the layer to check

Returns:

bool

has_node(id)#

Returns true if the graph contains the specified node

Parameters:

id (str|int) – the node id

Returns:

true if the graph contains the specified node, false otherwise

Return type:

bool

import_edge(edge, merge=False)#

Import a single edge into the graph.

Parameters:
  • edge (Edge) – An Edge object representing the edge to be imported.

  • merge (bool) – An optional boolean flag. If merge is false, the function will return an error if the imported edge already exists in the graph. If merge is true, the function merges the histories of the imported edge and the existing edge (in the graph).

Returns:

An EdgeView object if the edge was successfully imported.

Return type:

EdgeView

Raises:

GraphError – If the operation fails.

import_edge_as(edge, new_id, merge=False)#

Import a single edge into the graph with new id.

Parameters:
  • edge (Edge) – An Edge object representing the edge to be imported.

  • new_id (tuple) – The ID of the new edge. It’s a tuple of the source and destination node ids.

  • merge (bool) – An optional boolean flag. If merge is false, the function will return an error if the imported edge already exists in the graph. If merge is true, the function merges the histories of the imported edge and the existing edge (in the graph).

Returns:

An EdgeView object if the edge was successfully imported.

Return type:

EdgeView

Raises:

GraphError – If the operation fails.

import_edges(edges, merge=False)#

Import multiple edges into the graph.

Parameters:
  • edges (List[Edge]) – A list of Edge objects representing the edges to be imported.

  • merge (bool) – An optional boolean flag. If merge is false, the function will return an error if any of the imported edges already exists in the graph. If merge is true, the function merges the histories of the imported edges and the existing edges (in the graph).

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

import_edges_as(edges, new_ids, merge=False)#

Import multiple edges into the graph with new ids.

Parameters:
  • edges (List[Edge]) – A list of Edge objects representing the edges to be imported.

  • new_ids (List[tuple]) – A list of new edge IDs, each a tuple of the source and destination node ids.

  • merge (bool) – An optional boolean flag. If merge is false, the function will return an error if any of the imported edges already exists in the graph. If merge is true, the function merges the histories of the imported edges and the existing edges (in the graph).

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

import_node(node, merge=False)#

Import a single node into the graph.

Parameters:
  • node (Node) – A Node object representing the node to be imported.

  • merge (bool) – An optional boolean flag. If merge is false, the function will return an error if the imported node already exists in the graph. If merge is true, the function merges the histories of the imported node and the existing node (in the graph).

Returns:

A node object if the node was successfully imported.

Return type:

Node

Raises:

GraphError – If the operation fails.
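The effect of the merge flag can be sketched with a toy store that maps node ids to lists of update timestamps. Everything here is hypothetical and for illustration only: `DuplicateNodeError` stands in for GraphError, the dict stands in for the graph, and the free function `import_node` is not the method documented above. How the library treats identical timestamps when merging histories is not stated in this reference, so the model simply concatenates and sorts.

```python
from typing import Dict, List


class DuplicateNodeError(Exception):
    """Hypothetical stand-in for the GraphError raised by the library."""


def import_node(store: Dict[str, List[int]], node_id: str,
                history: List[int], merge: bool = False) -> List[int]:
    """Add a node history to store, following the documented merge rule."""
    if node_id in store and not merge:
        # merge=False: importing an already existing node is an error.
        raise DuplicateNodeError(f"node {node_id!r} already exists")
    # merge=True (or a new node): combine existing and imported histories.
    combined = sorted(store.get(node_id, []) + history)
    store[node_id] = combined
    return combined


store = {"a": [1, 3]}
print(import_node(store, "a", [2, 3], merge=True))
print(import_node(store, "b", [7]))
```

Calling `import_node(store, "a", [5])` after this would raise `DuplicateNodeError`, mirroring the default `merge=False` behaviour described in the parameter list.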

import_node_as(node, new_id, merge=False)#

Import a single node into the graph with new id.

Parameters:
  • node (Node) – A Node object representing the node to be imported.

  • new_id (str|int) – The new node id.

  • merge (bool) – An optional boolean flag. If merge is false, the function will return an error if the imported node already exists in the graph. If merge is true, the function merges the histories of the imported node and the existing node (in the graph).

Returns:

A node object if the node was successfully imported.

Return type:

Node

Raises:

GraphError – If the operation fails.

import_nodes(nodes, merge=False)#

Import multiple nodes into the graph.

Parameters:
  • nodes (List[Node]) – A list of Node objects representing the nodes to be imported.

  • merge (bool) – An optional boolean flag. If merge is false, the function will return an error if any of the imported nodes already exists in the graph. If merge is true, the function merges the histories of the imported nodes and the existing nodes (in the graph).

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

import_nodes_as(nodes, new_ids, merge=False)#

Import multiple nodes into the graph with new ids.

Parameters:
  • nodes (List[Node]) – A list of Node objects representing the nodes to be imported.

  • new_ids (List[str|int]) – A list of node IDs to use for the imported nodes.

  • merge (bool) – An optional boolean flag. If merge is false, the function will return an error if any of the imported nodes already exists in the graph. If merge is true, the function merges the histories of the imported nodes and the existing nodes (in the graph).

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

index()#

Indexes all node and edge properties. Returns a GraphIndex which allows the user to search the edges and nodes of the graph via Tantivy fuzzy matching queries. Note that the index is currently immutable and will not update if the graph changes. This is to be improved in a future release.

Returns:

A GraphIndex

Return type:

GraphIndex

largest_connected_component()#

Gives the largest connected component of a graph.

Example usage: g.largest_connected_component()

Returns:

Sub-graph of the graph g containing the largest connected component

Return type:

Graph

latest()#

Create a view of the GraphView including all events at the latest time.

Returns:

GraphView

latest_date_time#

DateTime of latest activity in the graph

Returns:

the datetime of the latest activity in the graph

Return type:

Optional[Datetime]

latest_time#

Timestamp of latest activity in the graph

Returns:

the timestamp of the latest activity in the graph

Return type:

Optional[int]

layer(name)#

Return a view of GraphView containing the layer “name”. Errors if the layer does not exist.

Parameters:

name (str) – the name of the layer.

Returns:

The layered view

Return type:

GraphView

layers(names)#

Return a view of GraphView containing all layers in names. Errors if any of the layers do not exist.

Parameters:

names (list[str]) – list of layer names for the new view

Returns:

The layered view

Return type:

GraphView
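The difference between layers, which errors on unknown names, and valid_layers, which the method summary describes as ignoring layers that do not exist, can be modelled on plain lists of layer names. This is a hypothetical sketch: the two free functions and `MissingLayerError` are illustrative names, and the real methods return a GraphView, not a list.

```python
from typing import List


class MissingLayerError(Exception):
    """Hypothetical stand-in for the error raised on an unknown layer."""


def layers(existing: List[str], names: List[str]) -> List[str]:
    """Strict selection: every requested layer must exist."""
    missing = [name for name in names if name not in existing]
    if missing:
        raise MissingLayerError(f"unknown layers: {missing}")
    return [name for name in existing if name in names]


def valid_layers(existing: List[str], names: List[str]) -> List[str]:
    """Lenient selection: requested layers that do not exist are ignored."""
    return [name for name in existing if name in names]


existing = ["calls", "emails"]
print(layers(existing, ["emails"]))
print(valid_layers(existing, ["emails", "sms"]))
```

In this model `layers(existing, ["sms"])` raises, while `valid_layers(existing, ["sms"])` returns an empty selection.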

static load_cached(path)#

Load Graph from a file and initialise it as a cache file.

Future updates are tracked. Use write_updates to persist them to the cache file.

Parameters:

path (str) – The path to the cache file

Returns:

Graph

load_edge_props_from_pandas(df, src, dst, constant_properties=None, shared_constant_properties=None, layer=None, layer_col=None)#

Load edge properties from a Pandas DataFrame.

Parameters:
  • df (DataFrame) – The Pandas DataFrame containing edge information.

  • src (str) – The column name for the source node.

  • dst (str) – The column name for the destination node.

  • constant_properties (List[str], optional) – List of constant edge property column names. Defaults to None.

  • shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every edge. Defaults to None.

  • layer (str, optional) – The edge layer name. Defaults to None.

  • layer_col (str, optional) – The name of the DataFrame column containing the edge layer. Defaults to None.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

load_edge_props_from_parquet(parquet_path, src, dst, constant_properties=None, shared_constant_properties=None, layer=None, layer_col=None)#

Load edge properties from a Parquet file.

Parameters:
  • parquet_path (str) – Path to a Parquet file or a directory of Parquet files containing edge information.

  • src (str) – The column name for the source node.

  • dst (str) – The column name for the destination node.

  • constant_properties (List[str], optional) – List of constant edge property column names. Defaults to None.

  • shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every edge. Defaults to None.

  • layer (str, optional) – The edge layer name. Defaults to None.

  • layer_col (str, optional) – The name of the column containing the edge layer. Defaults to None.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

load_edges_from_pandas(df, time, src, dst, properties=None, constant_properties=None, shared_constant_properties=None, layer=None, layer_col=None)#

Load edges from a Pandas DataFrame into the graph.

Parameters:
  • df (DataFrame) – The Pandas DataFrame containing the edges.

  • time (str) – The column name for the update timestamps.

  • src (str) – The column name for the source node ids.

  • dst (str) – The column name for the destination node ids.

  • properties (List[str], optional) – List of edge property column names. Defaults to None.

  • constant_properties (List[str], optional) – List of constant edge property column names. Defaults to None.

  • shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every edge. Defaults to None.

  • layer (str, optional) – A constant value to use as the layer for all edges. Defaults to None. Cannot be used in combination with layer_col.

  • layer_col (str, optional) – The name of the DataFrame column containing the edge layer. Defaults to None. Cannot be used in combination with layer.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.
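How the column-name parameters above map rows to edge updates can be sketched without pandas, using a list of dicts as a stand-in for the DataFrame. This is a hypothetical model and not the loader's implementation: the helper `rows_to_edge_updates` and the shape of the returned dicts are assumptions made for illustration. The only rule taken directly from the parameter list is that layer and layer_col cannot be combined.

```python
from typing import Any, Dict, List, Optional


def rows_to_edge_updates(rows: List[Dict[str, Any]], time: str, src: str, dst: str,
                         properties: Optional[List[str]] = None,
                         layer: Optional[str] = None,
                         layer_col: Optional[str] = None) -> List[Dict[str, Any]]:
    """Turn table rows into edge-update records using column names."""
    if layer is not None and layer_col is not None:
        raise ValueError("layer and layer_col cannot be used together")
    updates = []
    for row in rows:
        updates.append({
            "time": row[time],
            "src": row[src],
            "dst": row[dst],
            # Only the listed columns become edge properties.
            "properties": {name: row[name] for name in (properties or [])},
            # Per-row layer from a column, otherwise the constant layer (or None).
            "layer": row[layer_col] if layer_col is not None else layer,
        })
    return updates


rows = [{"t": 1, "a": "x", "b": "y", "w": 0.5, "kind": "call"}]
print(rows_to_edge_updates(rows, "t", "a", "b", properties=["w"], layer_col="kind"))
```

Passing `layer="call"` instead of `layer_col="kind"` would assign the same constant layer to every row in this model.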

load_edges_from_parquet(parquet_path, time, src, dst, properties=None, constant_properties=None, shared_constant_properties=None, layer=None, layer_col=None)#

Load edges from a Parquet file into the graph.

Parameters:
  • parquet_path (str) – Path to a Parquet file or a directory of Parquet files containing the edges.

  • time (str) – The column name for the update timestamps.

  • src (str) – The column name for the source node ids.

  • dst (str) – The column name for the destination node ids.

  • properties (List[str], optional) – List of edge property column names. Defaults to None.

  • constant_properties (List[str], optional) – List of constant edge property column names. Defaults to None.

  • shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every edge. Defaults to None.

  • layer (str, optional) – A constant value to use as the layer for all edges. Defaults to None. Cannot be used in combination with layer_col.

  • layer_col (str, optional) – The name of the column containing the edge layer. Defaults to None. Cannot be used in combination with layer.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

static load_from_file(path)#

Load Graph from a file.

Parameters:

path (str) – The path to the file.

Returns:

Graph

load_node_props_from_pandas(df, id, node_type=None, node_type_col=None, constant_properties=None, shared_constant_properties=None)#

Load node properties from a Pandas DataFrame.

Parameters:
  • df (DataFrame) – The Pandas DataFrame containing node information.

  • id (str) – The column name for the node IDs.

  • node_type (str, optional) – A constant value to use as the node type for all nodes. Defaults to None. Cannot be used in combination with node_type_col.

  • node_type_col (str, optional) – The name of the DataFrame column containing the node type. Defaults to None. Cannot be used in combination with node_type.

  • constant_properties (List[str], optional) – List of constant node property column names. Defaults to None.

  • shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every node. Defaults to None.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

load_node_props_from_parquet(parquet_path, id, node_type=None, node_type_col=None, constant_properties=None, shared_constant_properties=None)#

Load node properties from a Parquet file.

Parameters:
  • parquet_path (str) – Path to a Parquet file or a directory of Parquet files containing node information.

  • id (str) – The column name for the node IDs.

  • node_type (str, optional) – A constant value to use as the node type for all nodes. Defaults to None. Cannot be used in combination with node_type_col.

  • node_type_col (str, optional) – The name of the column containing the node type. Defaults to None. Cannot be used in combination with node_type.

  • constant_properties (List[str], optional) – List of constant node property column names. Defaults to None.

  • shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every node. Defaults to None.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

load_nodes_from_pandas(df, time, id, node_type=None, node_type_col=None, properties=None, constant_properties=None, shared_constant_properties=None)#

Load nodes from a Pandas DataFrame into the graph.

Parameters:
  • df (DataFrame) – The Pandas DataFrame containing the nodes.

  • time (str) – The column name for the timestamps.

  • id (str) – The column name for the node IDs.

  • node_type (str, optional) – A constant value to use as the node type for all nodes. Defaults to None. Cannot be used in combination with node_type_col.

  • node_type_col (str, optional) – The name of the DataFrame column containing the node type. Defaults to None. Cannot be used in combination with node_type.

  • properties (List[str], optional) – List of node property column names. Defaults to None.

  • constant_properties (List[str], optional) – List of constant node property column names. Defaults to None.

  • shared_constant_properties (PropInput, optional) – A dictionary of constant properties that will be added to every node. Defaults to None.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

load_nodes_from_parquet(parquet_path, time, id, node_type=None, node_type_col=None, properties=None, constant_properties=None, shared_constant_properties=None)#

Load nodes from a Parquet file into the graph.

Parameters:
  • parquet_path (str) – Parquet file or directory of Parquet files containing the nodes

  • time (str) – The column name for the timestamps.

  • id (str) – The column name for the node IDs.

  • node_type (str) – A constant value to use as the node type for all nodes. Defaults to None. (optional; cannot be used in combination with node_type_col)

  • node_type_col (str) – The name of the column containing the node types. Defaults to None. (optional; cannot be used in combination with node_type)

  • properties (List[str]) – List of node property column names. Defaults to None. (optional)

  • constant_properties (List[str]) – List of constant node property column names. Defaults to None. (optional)

  • shared_constant_properties (PropInput) – A dictionary of constant properties that will be added to every node. Defaults to None. (optional)

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

materialize()#

Returns a ‘materialized’ clone of the graph view, i.e. a new graph with a copy of the data seen within the view instead of just a mask over the original graph.

Returns:

A clone of the graph

Return type:

GraphView

node(id)#

Gets the node with the specified id

Parameters:

id (str|int) – the node id

Returns:

The node object with the specified id, or None if the node does not exist

Return type:

Node

nodes#

Gets the nodes in the graph

Returns:

the nodes in the graph

Return type:

Nodes

persistent_graph()#

Get persistent graph

properties#

Get all graph properties

Returns:

Properties paired with their names

Return type:

Properties

rolling(window, step=None)#

Creates a WindowSet with the given window size and optional step using a rolling window.

A rolling window is a window that moves forward by step size at each iteration.

Parameters:
  • window (int | str) – The size of the window.

  • step (int | str | None) – The step size of the window. step defaults to window.

Returns:

A WindowSet object.

Return type:

WindowSet
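As a rough illustration of the rolling-window idea described above, the sketch below generates integer window bounds in plain Python. It does not call Raphtory. The helper name rolling_bounds, the half-open [start, end) convention, and the alignment of the first window at the start time are assumptions for illustration; the windows produced by a real WindowSet may be aligned differently.

```python
from typing import Iterator, Optional, Tuple


def rolling_bounds(
    start: int, end: int, window: int, step: Optional[int] = None
) -> Iterator[Tuple[int, int]]:
    """Yield half-open (window_start, window_end) pairs of fixed size."""
    if step is None:
        step = window  # mirrors the documented default: step defaults to window
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    cursor = start
    while cursor < end:
        # Each window has the same size; only its position moves by `step`.
        yield (cursor, cursor + window)
        cursor += step


tumbling = list(rolling_bounds(0, 10, window=5))
overlapping = list(rolling_bounds(0, 10, window=5, step=2))
```

With step equal to window the windows tile the time range without overlap; a smaller step makes consecutive windows overlap.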

save_to_file(path)#

Saves the Graph to the given path.

Parameters:

path (str) – The path to the file.

save_to_zip(path)#

Saves the Graph to the given path.

Parameters:

path (str) – The path to the file.

serialise()#

Serialise Graph to bytes.

Returns:

bytes

shrink_end(end)#

Set the end of the window to the smaller of end and self.end()

Parameters:

end (TimeInput) – the new end time of the window

Returns:

GraphView

shrink_start(start)#

Set the start of the window to the larger of start and self.start()

Parameters:

start (TimeInput) – the new start time of the window

Returns:

GraphView

shrink_window(start, end)#

Shrink both the start and end of the window (same as calling shrink_start followed by shrink_end but more efficient)

Parameters:
  • start (TimeInput) – the new start time for the window

  • end (TimeInput) – the new end time for the window
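The shrinking rule described above (the larger of the two starts, the smaller of the two ends) can be sketched in plain Python as follows. This is not Raphtory code: the function operates on a (start, end) tuple rather than a GraphView, and treating None as "unbounded" is an assumption based on the start and end attributes returning None for views valid at all times.

```python
from typing import Optional, Tuple

Bound = Optional[int]  # None stands for "unbounded" in this sketch


def shrink_window(
    current: Tuple[Bound, Bound], start: int, end: int
) -> Tuple[int, int]:
    """Return bounds that are never wider than `current`."""
    cur_start, cur_end = current
    # shrink_start: keep the larger of the requested and the existing start.
    new_start = start if cur_start is None else max(start, cur_start)
    # shrink_end: keep the smaller of the requested and the existing end.
    new_end = end if cur_end is None else min(end, cur_end)
    return (new_start, new_end)


narrowed = shrink_window((10, 20), start=5, end=15)
unbounded = shrink_window((None, None), start=5, end=15)
```

In the first call the requested start of 5 lies outside the existing window, so the existing start of 10 is kept; the window can only get smaller.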

snapshot_at(time)#

Create a view of the GraphView including all events that have not been explicitly deleted at time.

This is equivalent to before(time + 1) for EventGraphs and at(time) for PersistentGraphs.

Parameters:

time (TimeInput) – The time of the window.

Returns:

GraphView

snapshot_latest()#

Create a view of the GraphView including all events that have not been explicitly deleted at the latest time.

This is equivalent to a no-op for EventGraphs and latest() for PersistentGraphs.

Returns:

GraphView

start#

Gets the start time for rolling and expanding windows for this GraphView

Returns:

The earliest time that this GraphView is valid or None if the GraphView is valid for all times.

Return type:

Optional[int]

start_date_time#

Gets the earliest datetime that this GraphView is valid

Returns:

The earliest datetime that this GraphView is valid or None if the GraphView is valid for all times.

Return type:

Optional[Datetime]

subgraph(nodes)#

Returns a subgraph given a set of nodes

Parameters:

nodes (list[InputNode]) – set of nodes

Returns:

Returns the subgraph

Return type:

GraphView

subgraph_node_types(node_types)#

Returns a subgraph filtered by node types given a set of node types

Parameters:

node_types (list[str]) – set of node types

Returns:

Returns the subgraph

Return type:

GraphView

to_networkx(explode_edges=False, include_node_properties=True, include_edge_properties=True, include_update_history=True, include_property_history=True)#

Returns the graph as a NetworkX graph.

NetworkX is a required dependency. If you intend to use this function, make sure that you install NetworkX with pip install networkx.

Parameters:
  • explode_edges (bool) – A boolean that is set to True if you want to explode the edges in the graph. By default this is set to False.

  • include_node_properties (bool) – A boolean that is set to True if you want to include the node properties in the graph. By default this is set to True.

  • include_edge_properties (bool) – A boolean that is set to True if you want to include the edge properties in the graph. By default this is set to True.

  • include_update_history (bool) – A boolean that is set to True if you want to include the update histories in the graph. By default this is set to True.

  • include_property_history (bool) – A boolean that is set to True if you want to include the property histories in the graph. By default this is set to True.

Returns:

A NetworkX MultiDiGraph.

to_pyvis(explode_edges=False, edge_color='#000000', shape=None, node_image=None, edge_weight=None, edge_label=None, colour_nodes_by_type=False, notebook=False, **kwargs)#

Draw a graph with PyVis.

PyVis is a required dependency. If you intend to use this function, make sure that you install PyVis with pip install pyvis.

Parameters:
  • explode_edges (bool) – A boolean that is set to True if you want to explode the edges in the graph. Defaults to False.

  • edge_color (str) – A string defining the colour of the edges in the graph. Defaults to “#000000”.

  • shape (str) – An optional string defining what the node looks like. Defaults to “dot”. There are two types of nodes: one type has the label inside it and the other has the label underneath it. The shapes with the label inside are: ellipse, circle, database, box, text. The shapes with the label outside are: image, circularImage, diamond, dot, star, triangle, triangleDown, square and icon.

  • node_image (str, optional) – An optional string defining the url of a custom node image.

  • edge_weight (str, optional) – An optional string defining the name of the property where the edge weight is set on your Raphtory graph. If not provided, the edge weight is set to 1.0 for all edges.

  • edge_label (str) – An optional string defining the name of the property where the edge label is set on your Raphtory graph. By default, an empty string is set as the label.

  • notebook (bool) – A boolean that is set to True if using a Jupyter notebook. Defaults to False.

  • kwargs – Additional keyword arguments that are passed to the pyvis Network class.

Returns:

A pyvis network

unique_layers#

Return all the layer ids in the graph

Returns:

list[str]

update_constant_properties(properties)#

Updates static properties of the graph.

Parameters:

properties (PropInput) – The static properties of the graph.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

valid_layers(names)#

Return a view of GraphView containing all layers in names. Any layers that do not exist are ignored.

Parameters:

names (list[str]) – list of layer names for the new view

Returns:

The layered view

Return type:

GraphView
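The difference between this lenient selection and the strict behaviour documented for layers(names), which errors if any layer is missing, can be sketched in plain Python. This is an illustration of the two rules only, not Raphtory code; the helper names and the use of KeyError are assumptions.

```python
from typing import Iterable, List, Set


def select_layers_strict(existing: Set[str], names: Iterable[str]) -> List[str]:
    """Mimic layers(names): fail if any requested layer is missing."""
    requested = list(names)
    missing = [name for name in requested if name not in existing]
    if missing:
        # Hypothetical error type standing in for the library's own error.
        raise KeyError(f"unknown layers: {missing}")
    return requested


def select_layers_lenient(existing: Set[str], names: Iterable[str]) -> List[str]:
    """Mimic valid_layers(names): silently drop layers that do not exist."""
    return [name for name in names if name in existing]


existing = {"follows", "likes"}
lenient = select_layers_lenient(existing, ["follows", "blocks"])
try:
    select_layers_strict(existing, ["follows", "blocks"])
    strict_error = None
except KeyError as exc:
    strict_error = exc
```

The lenient variant keeps only "follows", while the strict variant raises because "blocks" is not a known layer.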

vectorise(embedding, cache=None, overwrite_cache=False, graph_template=None, node_template=None, edge_template=None, graph_name=None, verbose=False)#

Create a VectorisedGraph from the current graph

Parameters:
  • embedding (Callable[[list], list]) – the embedding function to translate documents to embeddings

  • cache (str) – the file to be used as a cache to avoid calling the embedding function (optional)

  • overwrite_cache (bool) – whether or not to overwrite the cache if there are new embeddings (optional)

  • graph_template (str) – the document template for the graphs (optional)

  • node_template (str) – the document template for the nodes (optional)

  • edge_template (str) – the document template for the edges (optional)

  • verbose (bool) – whether or not to print logs reporting the progress

Returns:

A VectorisedGraph with all the documents/embeddings computed and with an initial empty selection
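The embedding parameter is typed as Callable[[list], list]. The sketch below shows one possible shape for such a callable, assuming that the input is a list of document strings and the output is a list of equal-length float vectors; the text above does not confirm either detail. The function toy_embedding is hypothetical and hash-based, so it only demonstrates the signature and is not a meaningful embedding model.

```python
import hashlib
from typing import List


def toy_embedding(documents: List[str], dim: int = 8) -> List[List[float]]:
    """Map each document to a deterministic pseudo-embedding of length `dim`."""
    if not 0 < dim <= 32:
        raise ValueError("dim must be between 1 and 32 (SHA-256 yields 32 bytes)")
    vectors = []
    for text in documents:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Scale the first `dim` bytes into [0, 1]; real models return learned vectors.
        vectors.append([byte / 255.0 for byte in digest[:dim]])
    return vectors


vectors = toy_embedding(["node a", "node b"])
```

Under these assumptions, a callable of this shape could be passed as the embedding argument in place of a real model while testing.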

window(start, end)#

Create a view of the GraphView including all events between start (inclusive) and end (exclusive)

Parameters:
  • start (TimeInput | None) – The start time of the window (unbounded if None).

  • end (TimeInput | None) – The end time of the window (unbounded if None).

Returns:

GraphView
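The inclusive start and exclusive end of window(start, end) can be illustrated with a small plain-Python filter over timestamped events. This sketch does not use Raphtory; the in_window helper and the (timestamp, label) event tuples are assumptions chosen to make the boundary behaviour visible.

```python
from typing import List, Optional, Tuple

Event = Tuple[int, str]  # (timestamp, label)


def in_window(
    events: List[Event], start: Optional[int], end: Optional[int]
) -> List[Event]:
    """Keep events with start <= timestamp < end; None means unbounded."""
    return [
        event
        for event in events
        if (start is None or event[0] >= start)
        and (end is None or event[0] < end)
    ]


events = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
windowed = in_window(events, 2, 4)      # timestamp 2 is kept, 4 is dropped
open_ended = in_window(events, 3, None)  # no upper bound
```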

window_size#

Get the window size (difference between start and end) for this GraphView

Returns:

Optional[int]

write_updates()#

Persist the new updates by appending them to the cache file.

class raphtory.PersistentGraph#

Bases: GraphView

A temporal graph that allows edges and nodes to be deleted.

Methods:

add_constant_properties(properties)

Adds static properties to the graph.

add_edge(timestamp, src, dst[, properties, ...])

Adds a new edge with the given source and destination nodes and properties to the graph.

add_node(timestamp, id[, properties, ...])

Adds a new node with the given id and properties to the graph.

add_properties(timestamp, properties[, ...])

Adds properties to the graph.

after(start)

Create a view of the GraphView including all events after start (exclusive).

at(time)

Create a view of the GraphView including all events at time.

before(end)

Create a view of the GraphView including all events before end (exclusive).

cache(path)

Write PersistentGraph to cache file and initialise the cache.

cache_view()

Applies the filters to the graph and retains the node ids and edge ids that satisfy the filters. Creates bitsets per layer for nodes and edges.

count_edges()

Number of edges in the graph

count_nodes()

Number of nodes in the graph

count_temporal_edges()

Number of temporal edges in the graph

create_node(timestamp, id[, properties, ...])

Creates a new node with the given id and properties in the graph.

default_layer()

Return a view of GraphView containing only the default edge layer.

delete_edge(timestamp, src, dst[, layer, ...])

Deletes an edge given the timestamp, src and dst nodes and layer (optional)

deserialise(bytes)

Load PersistentGraph from serialised bytes.

edge(src, dst)

Gets the edge with the specified source and destination nodes

event_graph()

Get event graph

exclude_layer(name)

Return a view of GraphView containing all layers except the excluded name. Errors if any of the layers do not exist.

exclude_layers(names)

Return a view of GraphView containing all layers except the excluded names. Errors if any of the layers do not exist.

exclude_nodes(nodes)

Returns a subgraph given a set of nodes that are excluded from the subgraph

exclude_valid_layer(name)

Return a view of GraphView containing all layers except the excluded name.

exclude_valid_layers(names)

Return a view of GraphView containing all layers except the excluded names.

expanding(step)

Creates a WindowSet with the given step size using an expanding window.

filter_edges(filter)

Return a filtered view that only includes edges that satisfy the filter

filter_exploded_edges(filter)

Return a filtered view that only includes exploded edges that satisfy the filter

filter_nodes(filter)

Return a filtered view that only includes nodes that satisfy the filter

find_edges(properties_dict)

Get the edges that match the given property names and values.

find_nodes(properties_dict)

Get the nodes that match the given property names and values.

get_all_node_types()

Returns all the node types in the graph.

has_edge(src, dst)

Returns true if the graph contains the specified edge

has_layer(name)

Check if GraphView has the layer "name"

has_node(id)

Returns true if the graph contains the specified node

import_edge(edge[, merge])

Import a single edge into the graph.

import_edge_as(edge, new_id[, merge])

Import a single edge into the graph with new id.

import_edges(edges[, merge])

Import multiple edges into the graph.

import_edges_as(edges, new_ids[, merge])

Import multiple edges into the graph with new ids.

import_node(node[, merge])

Import a single node into the graph.

import_node_as(node, new_id[, merge])

Import a single node into the graph with new id.

import_nodes(nodes[, merge])

Import multiple nodes into the graph.

import_nodes_as(nodes, new_ids[, merge])

Import multiple nodes into the graph with new ids.

index()

Indexes all node and edge properties.

latest()

Create a view of the GraphView including all events at the latest time.

layer(name)

Return a view of GraphView containing the layer "name". Errors if the layer does not exist.

layers(names)

Return a view of GraphView containing all layers in names. Errors if any of the layers do not exist.

load_cached(path)

Load PersistentGraph from a file and initialise it as a cache file.

load_edge_deletions_from_pandas(df, time, ...)

Load edge deletions from a Pandas DataFrame into the graph.

load_edge_deletions_from_parquet(...[, ...])

Load edge deletions from a Parquet file into the graph.

load_edge_props_from_pandas(df, src, dst[, ...])

Load edge properties from a Pandas DataFrame.

load_edge_props_from_parquet(parquet_path, ...)

Load edge properties from parquet file

load_edges_from_pandas(df, time, src, dst[, ...])

Load edges from a Pandas DataFrame into the graph.

load_edges_from_parquet(parquet_path, time, ...)

Load edges from a Parquet file into the graph.

load_from_file(path)

Load PersistentGraph from a file.

load_node_props_from_pandas(df, id[, ...])

Load node properties from a Pandas DataFrame.

load_node_props_from_parquet(parquet_path, id)

Load node properties from a parquet file.

load_nodes_from_pandas(df, time, id[, ...])

Load nodes from a Pandas DataFrame into the graph.

load_nodes_from_parquet(parquet_path, time, id)

Load nodes from a Parquet file into the graph.

materialize()

Returns a 'materialized' clone of the graph view - i.e. a new graph with a copy of the data seen within the view instead of just a mask over the original graph.

node(id)

Gets the node with the specified id

persistent_graph()

rolling(window[, step])

Creates a WindowSet with the given window size and optional step using a rolling window.

save_to_file(path)

Saves the PersistentGraph to the given path.

save_to_zip(path)

Saves the PersistentGraph to the given path.

serialise()

Serialise PersistentGraph to bytes.

shrink_end(end)

Set the end of the window to the smaller of end and self.end()

shrink_start(start)

Set the start of the window to the larger of start and self.start()

shrink_window(start, end)

Shrink both the start and end of the window (same as calling shrink_start followed by shrink_end but more efficient)

snapshot_at(time)

Create a view of the GraphView including all events that have not been explicitly deleted at time.

snapshot_latest()

Create a view of the GraphView including all events that have not been explicitly deleted at the latest time.

subgraph(nodes)

Returns a subgraph given a set of nodes

subgraph_node_types(node_types)

Returns a subgraph filtered by node types given a set of node types

to_networkx([explode_edges, ...])

Returns a graph with NetworkX.

to_pyvis([explode_edges, edge_color, shape, ...])

Draw a graph with PyVis.

update_constant_properties(properties)

Updates static properties of the graph.

valid_layers(names)

Return a view of GraphView containing all layers in names. Any layers that do not exist are ignored.

vectorise(embedding[, cache, ...])

Create a VectorisedGraph from the current graph

window(start, end)

Create a view of the GraphView including all events between start (inclusive) and end (exclusive)

write_updates()

Persist the new updates by appending them to the cache file.

Attributes:

earliest_date_time

DateTime of earliest activity in the graph

earliest_time

Timestamp of earliest activity in the graph

edges

Gets all edges in the graph

end

Gets the latest time that this GraphView is valid.

end_date_time

Gets the latest datetime that this GraphView is valid

latest_date_time

DateTime of latest activity in the graph

latest_time

Timestamp of latest activity in the graph

nodes

Gets the nodes in the graph

properties

Get all graph properties

start

Gets the start time for rolling and expanding windows for this GraphView

start_date_time

Gets the earliest datetime that this GraphView is valid

unique_layers

Return all the layer ids in the graph

window_size

Get the window size (difference between start and end) for this GraphView

add_constant_properties(properties)#

Adds static properties to the graph.

Parameters:

properties (dict) – The static properties of the graph.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

add_edge(timestamp, src, dst, properties=None, layer=None, secondary_index=None)#

Adds a new edge with the given source and destination nodes and properties to the graph.

Parameters:
  • timestamp (int) – The timestamp of the edge.

  • src (str | int) – The id of the source node.

  • dst (str | int) – The id of the destination node.

  • properties (dict) – The properties of the edge, as a dict of string and properties

  • layer (str) – The layer of the edge.

  • secondary_index (int, optional) – The optional integer which will be used as a secondary index

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

add_node(timestamp, id, properties=None, node_type=None, secondary_index=None)#

Adds a new node with the given id and properties to the graph.

Parameters:
  • timestamp (TimeInput) – The timestamp of the node.

  • id (str | int) – The id of the node.

  • properties (dict) – The properties of the node.

  • node_type (str) – The optional string which will be used as a node type

  • secondary_index (int, optional) – The optional integer which will be used as a secondary index

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

add_properties(timestamp, properties, secondary_index=None)#

Adds properties to the graph.

Parameters:
  • timestamp (TimeInput) – The timestamp of the temporal property.

  • properties (dict) – The temporal properties of the graph.

  • secondary_index (int, optional) – The optional integer which will be used as a secondary index

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

after(start)#

Create a view of the GraphView including all events after start (exclusive).

Parameters:

start (TimeInput) – The start time of the window.

Returns:

GraphView

at(time)#

Create a view of the GraphView including all events at time.

Parameters:

time (TimeInput) – The time of the window.

Returns:

GraphView

before(end)#

Create a view of the GraphView including all events before end (exclusive).

Parameters:

end (TimeInput) – The end time of the window.

Returns:

GraphView

cache(path)#

Write PersistentGraph to cache file and initialise the cache.

Future updates are tracked. Use write_updates to persist them to the cache file. If the file already exists its contents are overwritten.

Parameters:

path (str) – The path to the cache file

cache_view()#

Applies the filters to the graph and retains the node ids and edge ids that satisfy the filters. Creates bitsets per layer for nodes and edges.

Returns:

Returns the masked graph

Return type:

MaskedGraph

count_edges()#

Number of edges in the graph

Returns:

the number of edges in the graph

Return type:

int

count_nodes()#

Number of nodes in the graph

Returns:

the number of nodes in the graph

Return type:

int

count_temporal_edges()#

Number of temporal edges in the graph

Returns:

the number of temporal edges in the graph

Return type:

int

create_node(timestamp, id, properties=None, node_type=None, secondary_index=None)#

Creates a new node with the given id and properties in the graph. It fails if the node already exists.

Parameters:
  • timestamp (TimeInput) – The timestamp of the node.

  • id (str | int) – The id of the node.

  • properties (dict) – The properties of the node.

  • node_type (str) – The optional string which will be used as a node type

  • secondary_index (int, optional) – The optional integer which will be used as a secondary index

Returns:

MutableNode

Raises:

GraphError – If the operation fails.

default_layer()#

Return a view of GraphView containing only the default edge layer.

Returns:

The layered view

Return type:

GraphView

delete_edge(timestamp, src, dst, layer=None, secondary_index=None)#

Deletes an edge given the timestamp, src and dst nodes and layer (optional)

Parameters:
  • timestamp (int) – The timestamp of the edge.

  • src (str | int) – The id of the source node.

  • dst (str | int) – The id of the destination node.

  • layer (str) – The layer of the edge. (optional)

  • secondary_index (int, optional) – The optional integer which will be used as a secondary index

Returns:

The deleted edge

Raises:

GraphError – If the operation fails.

static deserialise(bytes)#

Load PersistentGraph from serialised bytes.

Parameters:

bytes (bytes) – The serialised bytes to decode

Returns:

PersistentGraph

earliest_date_time#

DateTime of earliest activity in the graph

Returns:

the datetime of the earliest activity in the graph

Return type:

Optional[Datetime]

earliest_time#

Timestamp of earliest activity in the graph

Returns:

the timestamp of the earliest activity in the graph

Return type:

Optional[int]

edge(src, dst)#

Gets the edge with the specified source and destination nodes

Parameters:
  • src (str | int) – the source node id

  • dst (str | int) – the destination node id

Returns:

The edge with the specified source and destination nodes, or None if the edge does not exist

edges#

Gets all edges in the graph

Returns:

the edges in the graph

Return type:

Edges

end#

Gets the latest time that this GraphView is valid.

Returns:

The latest time that this GraphView is valid or None if the GraphView is valid for all times.

Return type:

Optional[int]

end_date_time#

Gets the latest datetime that this GraphView is valid

Returns:

The latest datetime that this GraphView is valid or None if the GraphView is valid for all times.

Return type:

Optional[Datetime]

event_graph()#

Get event graph

exclude_layer(name)#

Return a view of GraphView containing all layers except the excluded name. Errors if any of the layers do not exist.

Parameters:

name (str) – layer name that is excluded for the new view

Returns:

The layered view

Return type:

GraphView

exclude_layers(names)#

Return a view of GraphView containing all layers except the excluded names. Errors if any of the layers do not exist.

Parameters:

names (list[str]) – list of layer names that are excluded for the new view

Returns:

The layered view

Return type:

GraphView

exclude_nodes(nodes)#

Returns a subgraph given a set of nodes that are excluded from the subgraph

Parameters:

nodes (list[InputNode]) – set of nodes

Returns:

Returns the subgraph

Return type:

GraphView

exclude_valid_layer(name)#

Return a view of GraphView containing all layers except the excluded name.

Parameters:

name (str) – layer name that is excluded for the new view

Returns:

The layered view

Return type:

GraphView

exclude_valid_layers(names)#

Return a view of GraphView containing all layers except the excluded names.

Parameters:

names (list[str]) – list of layer names that are excluded for the new view

Returns:

The layered view

Return type:

GraphView

expanding(step)#

Creates a WindowSet with the given step size using an expanding window.

An expanding window is a window that grows by step size at each iteration.

Parameters:

step (int | str) – The step size of the window.

Returns:

A WindowSet object.

Return type:

WindowSet
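The expanding-window idea described above can be sketched in plain Python as a sequence of windows that share a start and grow by step. This does not call Raphtory; the helper name expanding_bounds, the half-open [start, end) convention, and letting the final window overshoot the end time so that every event is covered are assumptions for illustration.

```python
from typing import Iterator, Tuple


def expanding_bounds(start: int, end: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open windows that share a start and grow by `step`."""
    if step <= 0:
        raise ValueError("step must be positive")
    window_end = start + step
    # Continue until a window has reached or passed `end`.
    while window_end < end + step:
        yield (start, window_end)
        window_end += step


growing = list(expanding_bounds(0, 10, step=4))
```

Unlike a rolling window, the start never moves, so each window contains everything the previous one did.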

filter_edges(filter)#

Return a filtered view that only includes edges that satisfy the filter

Parameters:

filter (PropertyFilter) – The filter to apply to the edge properties. Construct a filter using Prop.

Returns:

The filtered view

Return type:

GraphView

filter_exploded_edges(filter)#

Return a filtered view that only includes exploded edges that satisfy the filter

Parameters:

filter (PropertyFilter) – The filter to apply to the exploded edge properties. Construct a filter using Prop.

Returns:

The filtered view

Return type:

GraphView

filter_nodes(filter)#

Return a filtered view that only includes nodes that satisfy the filter

Parameters:

filter (PropertyFilter) – The filter to apply to the node properties. Construct a filter using Prop.

Returns:

The filtered view

Return type:

GraphView

find_edges(properties_dict)#

Get the edges that match the given property names and values.

Parameters:

properties_dict (dict[str, Prop]) – the property names and values to match

Returns:

the edges that match the properties name and value

Return type:

list[Edge]

find_nodes(properties_dict)#

Get the nodes that match the given property names and values.

Parameters:

properties_dict (dict[str, Prop]) – the property names and values to match

Returns:

the nodes that match the properties name and value

Return type:

list[Node]

get_all_node_types()#

Returns all the node types in the graph.

Returns:

A list of node types

has_edge(src, dst)#

Returns true if the graph contains the specified edge

Parameters:
  • src (str or int) – the source node id

  • dst (str or int) – the destination node id

Returns:

true if the graph contains the specified edge, false otherwise

Return type:

bool

has_layer(name)#

Check if GraphView has the layer “name”

Parameters:

name (str) – the name of the layer to check

Returns:

bool

has_node(id)#

Returns true if the graph contains the specified node

Parameters:

id (str or int) – the node id

Returns:

true if the graph contains the specified node, false otherwise

Return type:

bool

import_edge(edge, merge=False)#

Import a single edge into the graph.

This function takes an edge object and an optional boolean flag. If the flag is set to true, the function will merge the import of the edge even if it already exists in the graph.

Parameters:
  • edge (Edge) – An edge object representing the edge to be imported.

  • merge (bool) – An optional boolean flag indicating whether to merge the import of the edge. Defaults to False.

Returns:

The imported edge.

Return type:

Edge

Raises:

GraphError – If the operation fails.

import_edge_as(edge, new_id, merge=False)#

Import a single edge into the graph with new id.

This function takes an edge object, a new edge id and an optional boolean flag. If the flag is set to true, the function will merge the import of the edge even if it already exists in the graph.

Parameters:
  • edge (Edge) – An edge object representing the edge to be imported.

  • new_id (tuple) – The ID of the new edge. It’s a tuple of the source and destination node ids.

  • merge (bool) – An optional boolean flag indicating whether to merge the import of the edge. Defaults to False.

Returns:

The imported edge.

Return type:

Edge

Raises:

GraphError – If the operation fails.

import_edges(edges, merge=False)#

Import multiple edges into the graph.

This function takes a vector of edge objects and an optional boolean flag. If the flag is set to true, the function will merge the import of the edges even if they already exist in the graph.

Parameters:
  • edges (List[Edge]) – A vector of edge objects representing the edges to be imported.

  • merge (bool) – An optional boolean flag indicating whether to merge the import of the edges. Defaults to False.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

import_edges_as(edges, new_ids, merge=False)#

Import multiple edges into the graph with new ids.

This function takes a vector of edge objects, a list of new edge ids and an optional boolean flag. If the flag is set to true, the function will merge the import of the edges even if they already exist in the graph.

Parameters:
  • edges (List[Edge]) – A vector of edge objects representing the edges to be imported.

  • merge (bool) – An optional boolean flag indicating whether to merge the import of the edges. Defaults to False.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

import_node(node, merge=False)#

Import a single node into the graph.

This function takes a node object and an optional boolean flag. If the flag is set to true, the function will merge the import of the node even if it already exists in the graph.

Parameters:
  • node (Node) – A node object representing the node to be imported.

  • merge (bool) – An optional boolean flag indicating whether to merge the import of the node. Defaults to False.

Returns:

A NodeView object if the node was successfully imported; otherwise a GraphError is raised.

Return type:

NodeView

Raises:

GraphError – If the operation fails.
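The effect of the merge flag can be sketched with a dictionary standing in for the graph. This is an illustration under assumptions, not Raphtory's implementation: it assumes that importing an existing node with merge=False fails, and that merging means folding the incoming properties into the existing node. NodeExistsError is a hypothetical stand-in for GraphError.

```python
from typing import Dict


class NodeExistsError(Exception):
    """Hypothetical stand-in for the GraphError documented above."""


def import_node(
    store: Dict[str, Dict[str, object]],
    node_id: str,
    properties: Dict[str, object],
    merge: bool = False,
) -> Dict[str, object]:
    """Copy a node into `store`, refusing duplicates unless `merge` is set."""
    if node_id in store and not merge:
        raise NodeExistsError(f"node {node_id!r} already exists")
    # With merge=True (or a new node) the incoming properties are folded in.
    store.setdefault(node_id, {}).update(properties)
    return store[node_id]


store: Dict[str, Dict[str, object]] = {"alice": {"age": 30}}
merged = import_node(store, "alice", {"city": "Paris"}, merge=True)
try:
    import_node(store, "alice", {"city": "Rome"})
    duplicate_rejected = False
except NodeExistsError:
    duplicate_rejected = True
```

In this sketch the second import is rejected and leaves the stored node unchanged, because merge keeps its default of False.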

import_node_as(node, new_id, merge=False)#

Import a single node into the graph with new id.

This function takes a node object, a new node id and an optional boolean flag. If the flag is set to true, the function will merge the import of the node even if it already exists in the graph.

Parameters:
  • node (Node) – A node object representing the node to be imported.

  • new_id (str|int) – The new node id.

  • merge (bool) – An optional boolean flag indicating whether to merge the import of the node. Defaults to False.

Returns:

A NodeView object if the node was successfully imported; otherwise a GraphError is raised.

Return type:

NodeView

Raises:

GraphError – If the operation fails.

import_nodes(nodes, merge=False)#

Import multiple nodes into the graph.

This function takes a vector of node objects and an optional boolean flag. If the flag is set to true, the function will merge the import of the nodes even if they already exist in the graph.

Parameters:
  • nodes (List[Node]) – A vector of node objects representing the nodes to be imported.

  • merge (bool) – An optional boolean flag indicating whether to merge the import of the nodes. Defaults to False.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

import_nodes_as(nodes, new_ids, merge=False)#

Import multiple nodes into the graph with new ids.

This function takes a vector of node objects, a list of new node ids and an optional boolean flag. If the flag is set to true, the function will merge the import of the nodes even if they already exist in the graph.

Parameters:
  • nodes (List[Node]) – A vector of node objects representing the nodes to be imported.

  • new_ids (List[str|int]) – A list of node IDs to use for the imported nodes.

  • merge (bool) – An optional boolean flag indicating whether to merge the import of the nodes. Defaults to False.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

index()#

Indexes all node and edge properties. Returns a GraphIndex which allows the user to search the edges and nodes of the graph via tantivity fuzzy matching queries. Note this is currently immutable and will not update if the graph changes. This is to be improved in a future release.

Returns:

GraphIndex

latest()#

Create a view of the GraphView including all events at the latest time.

Returns:

GraphView

latest_date_time#

DateTime of latest activity in the graph

Returns:

the datetime of the latest activity in the graph

Return type:

Optional[Datetime]

latest_time#

Timestamp of latest activity in the graph

Returns:

the timestamp of the latest activity in the graph

Return type:

Optional[int]

layer(name)#

Return a view of GraphView containing the layer “name”. Errors if the layer does not exist.

Parameters:

name (str) – the name of the layer.

Returns:

The layered view

Return type:

GraphView

layers(names)#

Return a view of GraphView containing all layers in names. Errors if any of the layers do not exist.

Parameters:

names (list[str]) – list of layer names for the new view

Returns:

The layered view

Return type:

GraphView

static load_cached(path)#

Load PersistentGraph from a file and initialise it as a cache file.

Future updates are tracked. Use write_updates to persist them to the cache file.

Parameters:

path (str) – The path to the cache file

Returns:

PersistentGraph

load_edge_deletions_from_pandas(df, time, src, dst, layer=None, layer_col=None)#

Load edge deletions from a Pandas DataFrame into the graph.

Parameters:
  • df (DataFrame) – The Pandas DataFrame containing the edge deletions.

  • time (str) – The column name for the update timestamps.

  • src (str) – The column name for the source node ids.

  • dst (str) – The column name for the destination node ids.

  • layer (str) – A constant value to use as the layer for all edges (optional) Defaults to None. (cannot be used in combination with layer_col)

  • layer_col (str) – The edge layer col name in dataframe (optional) Defaults to None. (cannot be used in combination with layer)

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

load_edge_deletions_from_parquet(parquet_path, time, src, dst, layer=None, layer_col=None)#

Load edge deletions from a Parquet file into the graph.

Parameters:
  • parquet_path (str) – Path to a Parquet file or directory of Parquet files containing the edge deletions.

  • time (str) – The column name for the update timestamps.

  • src (str) – The column name for the source node ids.

  • dst (str) – The column name for the destination node ids.

  • layer (str) – A constant value to use as the layer for all edges (optional) Defaults to None. (cannot be used in combination with layer_col)

  • layer_col (str) – The edge layer col name in dataframe (optional) Defaults to None. (cannot be used in combination with layer)

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

load_edge_props_from_pandas(df, src, dst, constant_properties=None, shared_constant_properties=None, layer=None, layer_col=None)#

Load edge properties from a Pandas DataFrame.

Parameters:
  • df (DataFrame) – The Pandas DataFrame containing edge information.

  • src (str) – The column name for the source node.

  • dst (str) – The column name for the destination node.

  • constant_properties (List[str]) – List of constant edge property column names. Defaults to None. (optional)

  • shared_constant_properties (dict) – A dictionary of constant properties that will be added to every edge. Defaults to None. (optional)

  • layer (str) – The edge layer name (optional) Defaults to None.

  • layer_col (str) – The edge layer col name in dataframe (optional) Defaults to None.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

load_edge_props_from_parquet(parquet_path, src, dst, constant_properties=None, shared_constant_properties=None, layer=None, layer_col=None)#

Load edge properties from a Parquet file.

Parameters:
  • parquet_path (str) – Path to a Parquet file or directory of Parquet files containing edge information.

  • src (str) – The column name for the source node.

  • dst (str) – The column name for the destination node.

  • constant_properties (List[str]) – List of constant edge property column names. Defaults to None. (optional)

  • shared_constant_properties (dict) – A dictionary of constant properties that will be added to every edge. Defaults to None. (optional)

  • layer (str) – The edge layer name (optional) Defaults to None.

  • layer_col (str) – The edge layer col name in dataframe (optional) Defaults to None.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

load_edges_from_pandas(df, time, src, dst, properties=None, constant_properties=None, shared_constant_properties=None, layer=None, layer_col=None)#

Load edges from a Pandas DataFrame into the graph.

Parameters:
  • df (DataFrame) – The Pandas DataFrame containing the edges.

  • time (str) – The column name for the update timestamps.

  • src (str) – The column name for the source node ids.

  • dst (str) – The column name for the destination node ids.

  • properties (List[str]) – List of edge property column names. Defaults to None. (optional)

  • constant_properties (List[str]) – List of constant edge property column names. Defaults to None. (optional)

  • shared_constant_properties (dict) – A dictionary of constant properties that will be added to every edge. Defaults to None. (optional)

  • layer (str) – A constant value to use as the layer for all edges (optional) Defaults to None. (cannot be used in combination with layer_col)

  • layer_col (str) – The edge layer col name in dataframe (optional) Defaults to None. (cannot be used in combination with layer)

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.
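
To make the roles of the column-name arguments concrete, the following is a small pure-Python sketch of how each row could be mapped to an edge update. It does not call Raphtory: `rows_to_edge_updates`, the list-of-dicts input and the output layout are all hypothetical and only illustrate the parameters described above, including the rule that layer and layer_col cannot be combined.

```python
def rows_to_edge_updates(rows, time, src, dst, properties=None,
                         shared_constant_properties=None,
                         layer=None, layer_col=None):
    """Hypothetical helper: map tabular rows to edge-update dicts."""
    # The documentation states that layer and layer_col are mutually exclusive.
    if layer is not None and layer_col is not None:
        raise ValueError("layer and layer_col cannot be used together")
    properties = properties or []
    shared = dict(shared_constant_properties or {})
    updates = []
    for row in rows:
        updates.append({
            "time": row[time],
            "src": row[src],
            "dst": row[dst],
            # Per-row values are read from the named property columns.
            "properties": {name: row[name] for name in properties},
            # Shared constant properties are copied onto every edge.
            "constant_properties": dict(shared),
            # Either a per-row layer column or one constant layer (or None).
            "layer": row[layer_col] if layer_col is not None else layer,
        })
    return updates


rows = [
    {"t": 1, "from": "a", "to": "b", "weight": 0.5, "kind": "friend"},
    {"t": 2, "from": "b", "to": "c", "weight": 1.5, "kind": "work"},
]
edge_updates = rows_to_edge_updates(
    rows, time="t", src="from", dst="to",
    properties=["weight"],
    shared_constant_properties={"source": "demo"},
    layer_col="kind",
)
```

With a real DataFrame, the same strings ("t", "from", "to", and so on) would be passed as the column names to load_edges_from_pandas.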

load_edges_from_parquet(parquet_path, time, src, dst, properties=None, constant_properties=None, shared_constant_properties=None, layer=None, layer_col=None)#

Load edges from a Parquet file into the graph.

Parameters:
  • parquet_path (str) – Path to a Parquet file or directory of Parquet files containing the edges.

  • time (str) – The column name for the update timestamps.

  • src (str) – The column name for the source node ids.

  • dst (str) – The column name for the destination node ids.

  • properties (List[str]) – List of edge property column names. Defaults to None. (optional)

  • constant_properties (List[str]) – List of constant edge property column names. Defaults to None. (optional)

  • shared_constant_properties (dict) – A dictionary of constant properties that will be added to every edge. Defaults to None. (optional)

  • layer (str) – A constant value to use as the layer for all edges (optional) Defaults to None. (cannot be used in combination with layer_col)

  • layer_col (str) – The edge layer col name in dataframe (optional) Defaults to None. (cannot be used in combination with layer)

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

static load_from_file(path)#

Load PersistentGraph from a file.

Parameters:

path (str) – The path to the file.

Returns:

PersistentGraph

load_node_props_from_pandas(df, id, node_type=None, node_type_col=None, constant_properties=None, shared_constant_properties=None)#

Load node properties from a Pandas DataFrame.

Parameters:
  • df (DataFrame) – The Pandas DataFrame containing node information.

  • id (str) – The column name for the node IDs.

  • node_type (str) – A constant value to use as the node type for all nodes (optional). Defaults to None. (cannot be used in combination with node_type_col)

  • node_type_col (str) – The node type col name in dataframe (optional) Defaults to None. (cannot be used in combination with node_type)

  • constant_properties (List[str]) – List of constant node property column names. Defaults to None. (optional)

  • shared_constant_properties (dict) – A dictionary of constant properties that will be added to every node. Defaults to None. (optional)

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

load_node_props_from_parquet(parquet_path, id, node_type=None, node_type_col=None, constant_properties=None, shared_constant_properties=None)#

Load node properties from a Parquet file.

Parameters:
  • parquet_path (str) – Path to a Parquet file or directory of Parquet files containing node information.

  • id (str) – The column name for the node IDs.

  • node_type (str) – A constant value to use as the node type for all nodes (optional). Defaults to None. (cannot be used in combination with node_type_col)

  • node_type_col (str) – The node type col name in dataframe (optional) Defaults to None. (cannot be used in combination with node_type)

  • constant_properties (List[str]) – List of constant node property column names. Defaults to None. (optional)

  • shared_constant_properties (dict) – A dictionary of constant properties that will be added to every node. Defaults to None. (optional)

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

load_nodes_from_pandas(df, time, id, node_type=None, node_type_col=None, properties=None, constant_properties=None, shared_constant_properties=None)#

Load nodes from a Pandas DataFrame into the graph.

Parameters:
  • df (DataFrame) – The Pandas DataFrame containing the nodes.

  • time (str) – The column name for the timestamps.

  • id (str) – The column name for the node IDs.

  • node_type (str) – A constant value to use as the node type for all nodes (optional). Defaults to None. (cannot be used in combination with node_type_col)

  • node_type_col (str) – The node type col name in dataframe (optional) Defaults to None. (cannot be used in combination with node_type)

  • properties (List[str]) – List of node property column names. Defaults to None. (optional)

  • constant_properties (List[str]) – List of constant node property column names. Defaults to None. (optional)

  • shared_constant_properties (dict) – A dictionary of constant properties that will be added to every node. Defaults to None. (optional)

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

load_nodes_from_parquet(parquet_path, time, id, node_type=None, node_type_col=None, properties=None, constant_properties=None, shared_constant_properties=None)#

Load nodes from a Parquet file into the graph.

Parameters:
  • parquet_path (str) – Parquet file or directory of Parquet files containing the nodes

  • time (str) – The column name for the timestamps.

  • id (str) – The column name for the node IDs.

  • node_type (str) – A constant value to use as the node type for all nodes (optional). Defaults to None. (cannot be used in combination with node_type_col)

  • node_type_col (str) – The node type col name in dataframe (optional) Defaults to None. (cannot be used in combination with node_type)

  • properties (List[str]) – List of node property column names. Defaults to None. (optional)

  • constant_properties (List[str]) – List of constant node property column names. Defaults to None. (optional)

  • shared_constant_properties (dict) – A dictionary of constant properties that will be added to every node. Defaults to None. (optional)

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

materialize()#

Returns a ‘materialized’ clone of the graph view, i.e. a new graph with a copy of the data seen within the view instead of just a mask over the original graph.

Returns:

Returns a graph clone

Return type:

GraphView

node(id)#

Gets the node with the specified id

Parameters:

id (str | int) – the node id

Returns:

The node with the specified id, or None if the node does not exist

nodes#

Gets the nodes in the graph

Returns:

the nodes in the graph

Return type:

Nodes

persistent_graph()#

properties#

Get all graph properties

Returns:

Properties paired with their names

Return type:

Properties

rolling(window, step=None)#

Creates a WindowSet with the given window size and optional step using a rolling window.

A rolling window is a window that moves forward by step size at each iteration.

Parameters:
  • window (int | str) – The size of the window.

  • step (int | str | None) – The step size of the window. step defaults to window.

Returns:

A WindowSet object.

Return type:

WindowSet
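
As a rough illustration of a window that moves forward by step at each iteration, the sketch below generates (start, end) pairs over integer timestamps in plain Python. It is not Raphtory's implementation: `rolling_windows` is a hypothetical helper, only integer sizes are handled (not interval strings), and the alignment of the first and last windows is an assumption that may differ from what WindowSet produces.

```python
def rolling_windows(start, end, window, step=None):
    """Hypothetical helper: yield (window_start, window_end) pairs."""
    # As documented for rolling(), step defaults to the window size.
    if step is None:
        step = window
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    current = start
    while current < end:
        # Each window is treated as start-inclusive, end-exclusive.
        yield (current, current + window)
        current += step


# Non-overlapping windows: step defaults to window.
windows = list(rolling_windows(0, 10, window=4))
# Overlapping windows: step smaller than window.
overlapping = list(rolling_windows(0, 10, window=4, step=2))
```

When step is smaller than window, consecutive windows overlap; when it equals window, they tile the time range.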

save_to_file(path)#

Saves the PersistentGraph to the given path.

Parameters:

path (str) – The path to the file.

save_to_zip(path)#

Saves the PersistentGraph to the given path as a zip file.

Parameters:

path (str) – The path to the file.

serialise()#

Serialise PersistentGraph to bytes.

Returns:

bytes

shrink_end(end)#

Set the end of the window to the smaller of end and self.end()

Parameters:

end (TimeInput) – the new end time of the window

Returns:

GraphView

shrink_start(start)#

Set the start of the window to the larger of start and self.start()

Parameters:

start (TimeInput) – the new start time of the window

Returns:

GraphView

shrink_window(start, end)#

Shrink both the start and end of the window (same as calling shrink_start followed by shrink_end but more efficient)

Parameters:
  • start (TimeInput) – the new start time for the window

  • end (TimeInput) – the new end time for the window
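
The shrinking rules described for shrink_start, shrink_end and shrink_window can be sketched on plain integers as follows. This is a pure-Python illustration, not Raphtory code: the free function `shrink_window` and its tuple return value are hypothetical, and None is assumed to mean an unbounded side, mirroring how start is described as None when the view is valid for all times.

```python
def shrink_window(current_start, current_end, start, end):
    """Hypothetical helper: return the shrunken (start, end) bounds."""
    # New start is the larger of the requested and the current start.
    new_start = start if current_start is None else max(start, current_start)
    # New end is the smaller of the requested and the current end.
    new_end = end if current_end is None else min(end, current_end)
    return new_start, new_end


# The requested window starts earlier than the view, so the start is kept.
clipped = shrink_window(10, 20, 5, 15)
# An unbounded view simply takes the requested bounds.
unbounded = shrink_window(None, None, 5, 15)
```

The window can only get smaller: a requested bound that lies outside the current window leaves that side unchanged.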

snapshot_at(time)#

Create a view of the GraphView including all events that have not been explicitly deleted at time.

This is equivalent to before(time + 1) for EventGraphs and at(time) for PersistentGraphs.

Parameters:

time (TimeInput) – The time of the window.

Returns:

GraphView
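
For the event-graph case, the stated equivalence with before(time + 1) can be checked on a plain list of integer timestamps. The sketch below is pure Python and does not use Raphtory; `before` and `snapshot_at_event` are hypothetical stand-ins that filter timestamps, and the persistent-graph case (which depends on deletions) is not modelled.

```python
def before(timestamps, end):
    """Keep events strictly before end (end is exclusive)."""
    return [t for t in timestamps if t < end]


def snapshot_at_event(timestamps, time):
    """Event-graph snapshot: everything up to and including time."""
    return before(timestamps, time + 1)


events = [1, 3, 5, 7]
snapshot = snapshot_at_event(events, 5)
```

Because before is exclusive of its end, adding 1 to an integer timestamp makes the event at exactly that time part of the snapshot.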

snapshot_latest()#

Create a view of the GraphView including all events that have not been explicitly deleted at the latest time.

This is equivalent to a no-op for EventGraphs and latest() for PersistentGraphs.

Returns:

GraphView

start#

Gets the start time for rolling and expanding windows for this GraphView

Returns:

The earliest time that this GraphView is valid or None if the GraphView is valid for all times.

Return type:

Optional[int]

start_date_time#

Gets the earliest datetime that this GraphView is valid

Returns:

The earliest datetime that this GraphView is valid or None if the GraphView is valid for all times.

Return type:

Optional[Datetime]

subgraph(nodes)#

Returns a subgraph given a set of nodes

Parameters:

nodes (list[InputNode]) – set of nodes

Returns:

Returns the subgraph

Return type:

GraphView

subgraph_node_types(node_types)#

Returns a subgraph filtered by node types given a set of node types

Parameters:

node_types (list[str]) – set of node types

Returns:

Returns the subgraph

Return type:

GraphView

to_networkx(explode_edges=False, include_node_properties=True, include_edge_properties=True, include_update_history=True, include_property_history=True)#

Returns the graph as a NetworkX graph.

NetworkX is a required dependency. If you intend to use this function, make sure that you install NetworkX with pip install networkx.

Parameters:
  • explode_edges (bool) – A boolean that is set to True if you want to explode the edges in the graph. By default this is set to False.

  • include_node_properties (bool) – A boolean that is set to True if you want to include the node properties in the graph. By default this is set to True.

  • include_edge_properties (bool) – A boolean that is set to True if you want to include the edge properties in the graph. By default this is set to True.

  • include_update_history (bool) – A boolean that is set to True if you want to include the update histories in the graph. By default this is set to True.

  • include_property_history (bool) – A boolean that is set to True if you want to include the property histories in the graph. By default this is set to True.

Returns:

A NetworkX MultiDiGraph.

to_pyvis(explode_edges=False, edge_color='#000000', shape=None, node_image=None, edge_weight=None, edge_label=None, colour_nodes_by_type=False, notebook=False, **kwargs)#

Draw a graph with PyVis.

Pyvis is a required dependency. If you intend to use this function, make sure that you install Pyvis with pip install pyvis.

Parameters:
  • explode_edges (bool) – A boolean that is set to True if you want to explode the edges in the graph. Defaults to False.

  • edge_color (str) – A string defining the colour of the edges in the graph. Defaults to “#000000”.

  • shape (str) – An optional string defining what the node looks like. Defaults to “dot”. There are two types of nodes. One type has the label inside of it and the other type has the label underneath it. The types with the label inside of it are: ellipse, circle, database, box, text. The ones with the label outside of it are: image, circularImage, diamond, dot, star, triangle, triangleDown, square and icon.

  • node_image (str, optional) – An optional string defining the url of a custom node image.

  • edge_weight (str, optional) – An optional string defining the name of the property where edge weight is set on your Raphtory graph. If not provided, the edge weight is set to 1.0 for all edges.

  • edge_label (str) – An optional string defining the name of the property where edge label is set on your Raphtory graph. By default, an empty string is set as the label.

  • notebook (bool) – A boolean that is set to True if using a Jupyter notebook. Defaults to False.

  • kwargs – Additional keyword arguments that are passed to the pyvis Network class.

Returns:

A pyvis network

unique_layers#

Return all the layer ids in the graph

Returns:

list[str]

update_constant_properties(properties)#

Updates static properties of the graph.

Parameters:

properties (dict) – The static properties of the graph.

Returns:

This function does not return a value if the operation is successful.

Return type:

None

Raises:

GraphError – If the operation fails.

valid_layers(names)#

Return a view of GraphView containing all layers in names. Any layers that do not exist are ignored.

Parameters:

names (list[str]) – list of layer names for the new view

Returns:

The layered view

Return type:

GraphView
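
The difference between layers (errors on a missing layer) and valid_layers (ignores missing layers) can be sketched with plain lists of names. This is a pure-Python illustration only: both functions are hypothetical stand-ins that return the selected names instead of a GraphView, and the use of KeyError is an assumption, not the error type Raphtory raises.

```python
def layers(existing, names):
    """Strict selection: fail if any requested layer is missing."""
    missing = [name for name in names if name not in existing]
    if missing:
        raise KeyError(f"layers do not exist: {missing}")
    return list(names)


def valid_layers(existing, names):
    """Lenient selection: silently skip layers that do not exist."""
    return [name for name in names if name in existing]


existing_layers = ["friend", "work"]
lenient = valid_layers(existing_layers, ["friend", "family"])

try:
    layers(existing_layers, ["friend", "family"])
    strict_failed = False
except KeyError:
    strict_failed = True
```

The lenient form is convenient when the same layer list is applied to several graphs that do not all share the same layers.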

vectorise(embedding, cache=None, overwrite_cache=False, graph_template=None, node_template=None, edge_template=None, graph_name=None, verbose=False)#

Create a VectorisedGraph from the current graph

Parameters:
  • embedding (Callable[[list], list]) – the embedding function to translate documents to embeddings

  • cache (str) – the file to be used as a cache to avoid calling the embedding function (optional)

  • overwrite_cache (bool) – whether or not to overwrite the cache if there are new embeddings (optional)

  • graph_template (str) – the document template for the graphs (optional)

  • node_template (str) – the document template for the nodes (optional)

  • edge_template (str) – the document template for the edges (optional)

  • verbose (bool) – whether or not to print logs reporting the progress

Returns:

A VectorisedGraph with all the documents/embeddings computed and with an initial empty selection
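
The embedding parameter is typed as Callable[[list], list]. A toy function with that shape is sketched below, under the assumption that each input document is a string and each output embedding is a list of floats; the documentation above does not spell out the element types. `toy_embedding` is hypothetical and produces crude character-count features, so it is only useful for checking the plumbing, not for real similarity search.

```python
def toy_embedding(documents):
    """Hypothetical embedding: one small float vector per document."""
    vectors = []
    for doc in documents:
        text = str(doc)
        letters = sum(ch.isalpha() for ch in text)
        digits = sum(ch.isdigit() for ch in text)
        # Features: total length, letter count, digit count.
        vectors.append([float(len(text)), float(letters), float(digits)])
    return vectors


embeddings = toy_embedding(["node a", "edge 12"])
```

A function like this takes a batch of documents and returns one vector per document in the same order, which is the contract the type hint suggests.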

window(start, end)#

Create a view of the GraphView including all events between start (inclusive) and end (exclusive)

Parameters:
  • start (TimeInput | None) – The start time of the window (unbounded if None).

  • end (TimeInput | None) – The end time of the window (unbounded if None).

Returns:

GraphView
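
The bounds of window are start-inclusive and end-exclusive, with None meaning unbounded on that side. A pure-Python sketch of that membership test on integer timestamps is shown here; `in_window` is a hypothetical helper and does not call Raphtory.

```python
def in_window(t, start=None, end=None):
    """Return True if timestamp t falls in [start, end)."""
    after_start = start is None or t >= start
    before_end = end is None or t < end
    return after_start and before_end


timestamps = [1, 2, 3, 4, 5]
# Start is included, end is excluded.
selected = [t for t in timestamps if in_window(t, start=2, end=4)]
# None leaves that side of the window unbounded.
open_ended = [t for t in timestamps if in_window(t, start=4)]
```

An event at exactly end is excluded, so adjacent windows such as [0, 5) and [5, 10) never count the same event twice.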

window_size#

Get the window size (difference between start and end) for this GraphView

Returns:

Optional[int]

write_updates()#

Persist the new updates by appending them to the cache file.