graphi.graph_io.csv module

Utilities for loading and storing Graphs as csv

CSV Graph Format

The CSV graph format uses an optional header to define nodes, and a body storing the adjacency matrix. By default, a graph with n nodes is stored as a matrix literal of n columns and n+1 rows (one header row naming the nodes, followed by n rows of edge values):

a  b  c  d
0  2  1  0
2  0  3  2
1  4  0  0
0  1  3  0

Separators and formatting are handled by the csv Dialect. Value conversion and interpretation are handled by the appropriate reader/writer.

Reading Graphs

Graphs can be read using graph_reader() from iterables of CSV lines, such as files or str.splitlines. The csv itself is parsed using a csv.reader(), which allows setting the CSV dialect.

from graphi.graph_io import csv

literal = """\
a, b, c
0, 2, 3
2, 0, 2
1, 2, 0
"""

graph = csv.graph_reader(
    literal.splitlines(),
    skipinitialspace=True,
)
# iterate over the nodes of the graph and show what is stored for each
for node in graph:
    print(repr(node), "=>", graph[node])
class graphi.graph_io.csv.DistanceMatrixLiteral

Bases: csv.Dialect

CSV dialect for a Graph Matrix Literal, suitable for numeric data and string literals

A graph with alphabetic node names and numeric values would look like this:

 a   b   c
 0   2 1.3
 2   0  .5
16  .5   1
delimiter = ' '
doublequote = False
escapechar = '\\'
lineterminator = '\n'
quotechar = "'"
quoting = 0
skipinitialspace = True
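
To illustrate what these attributes do, the following sketch rebuilds an equivalent dialect using only the standard library csv module and parses the literal shown above. The class name WhitespaceMatrixDialect is made up for this example and is not part of graphi; it only mirrors the listed attribute values.

```python
import csv


class WhitespaceMatrixDialect(csv.Dialect):
    """Hypothetical stand-in mirroring the attributes listed above."""
    delimiter = ' '
    doublequote = False
    escapechar = '\\'
    lineterminator = '\n'
    quotechar = "'"
    quoting = csv.QUOTE_MINIMAL  # numeric value 0
    skipinitialspace = True


literal = """\
 a   b   c
 0   2 1.3
 2   0  .5
16  .5   1
"""

# Spaces at the start of a field are skipped, so runs of spaces used for
# column alignment collapse into a single separator.
rows = list(csv.reader(literal.splitlines(), dialect=WhitespaceMatrixDialect))
for row in rows:
    print(row)
```

All fields are still plain strings at this point; converting them to numbers is left to the reader's literal_type. Trailing spaces on a line would produce an extra empty field, so the sketch assumes lines without trailing whitespace.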
exception graphi.graph_io.csv.ParserError(error, row, column=None)

Bases: exceptions.Exception

Error during parsing of a graph from a csv

graphi.graph_io.csv.graph_reader(iterable, nodes_header=True, literal_type=<function stripped_literal>, valid_edge=<type 'bool'>, undirected=False, value_bound=None, *args, **kwargs)

Load a graph from files or iterables

Parameters:
  • iterable – an iterable yielding lines of CSV, such as an open file
  • nodes_header – whether and how to interpret a header specifying nodes
  • literal_type – type callable to evaluate literals
  • valid_edge – callable to test whether an edge should be inserted
  • undirected – whether to mirror the underlying matrix
  • value_bound – whether and how much the underlying edge values are bounded

The iterable argument can be any object that returns a line of input for each iteration step, such as a file object or a list of strings.

Nodes are created depending on the value of nodes_header:

False
Nodes are numbered 1 to len(iterable[0]). Elements in the first line of iterable are not consumed by this.
iterable
Nodes are read from nodes_header.
True
Nodes are taken as the elements of the first line of iterable. The first line is consumed by this, and not considered as containing graph edges. Nodes are read plainly as type str, not using literal_type.
callable
Like True, but nodes are not taken as plain str but individually interpreted via nodes_header(element).
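
The four cases above can be sketched in plain Python. The helper resolve_nodes below is hypothetical and only restates the description; it is not the code graph_reader uses, and it works on rows that have already been split into fields.

```python
def resolve_nodes(rows, nodes_header=True):
    """Hypothetical helper: return (nodes, remaining_rows) for parsed CSV rows."""
    rows = list(rows)
    if nodes_header is False:
        # No header: number the nodes 1..n and keep every row as edge data.
        return list(range(1, len(rows[0]) + 1)), rows
    if nodes_header is True:
        # The first row holds the node names as plain strings.
        return list(rows[0]), rows[1:]
    if callable(nodes_header):
        # The first row holds the node names, each converted individually.
        return [nodes_header(element) for element in rows[0]], rows[1:]
    # Any other iterable supplies the nodes itself; no row is consumed.
    return list(nodes_header), rows


rows = [["a", "b"], ["0", "2"], ["2", "0"]]
print(resolve_nodes(rows, True))
print(resolve_nodes(rows, str.upper))
print(resolve_nodes(rows[1:], False))
print(resolve_nodes(rows[1:], ["x", "y"]))
```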

The CSV is interpreted as a matrix, where the row marks the origin of an edge and the column marks the destination. Thus, loops are specified on the diagonal, while an asymmetric matrix creates different edge values for opposite directions. For an undirected graph, the matrix is automatically treated as symmetric. Trailing empty lines may be removed.

In the following example, the edges a:b and a:c are symmetric and there are no self-loops a:a or b:b. In contrast, b:c is 3 whereas c:b is 4, and there is a self-loop c:c. The node d only has an incoming edge b:d, but no outgoing edges:

a  b  c  d
0  2  1  0
2  0  3  2
1  4  1  0
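
As a rough illustration of this row/column convention, the sketch below turns the example matrix into a dictionary of directed edges. It is a standalone approximation using plain lists and dicts, not graphi's own data structures, and it assumes the values have already been converted to integers.

```python
header = ["a", "b", "c", "d"]
matrix = [
    [0, 2, 1, 0],
    [2, 0, 3, 2],
    [1, 4, 1, 0],
]

edges = {}
# The row selects the origin of an edge, the column selects its destination.
for origin, row in zip(header, matrix):
    for destination, value in zip(header, row):
        if value:  # falsy values such as 0 mean "no edge"
            edges[origin, destination] = value

for (origin, destination), value in sorted(edges.items()):
    print(f"{origin}:{destination} = {value}")
```

Because there is no fourth row, zip() never produces an edge with origin d, which matches the statement that d has no outgoing edges.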

If undirected evaluates to True, the upper right corner is mirrored to the lower left. Note that the diagonal must be provided. The following matrices give the same output if undirected is True:

a  b  c    a  b  c    a  b  c
0  2  1    0  2  1    0  2  1
2  0  3       0  3    5  0  3
1  4  1          1    7     1
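
The mirroring can be pictured with a small standalone sketch. The function mirror_upper is a hypothetical name, and None stands in for the blank cells of the matrices above; how the actual reader represents omitted cells is not specified here.

```python
def mirror_upper(matrix):
    """Copy each value above the diagonal to its mirrored position below it."""
    size = len(matrix)
    result = [list(row) for row in matrix]
    for i in range(size):
        for j in range(i + 1, size):
            # The lower left value is overwritten, whatever it was before.
            result[j][i] = matrix[i][j]
    return result


full = [[0, 2, 1], [2, 0, 3], [1, 4, 1]]
upper_only = [[0, 2, 1], [None, 0, 3], [None, None, 1]]
different_lower = [[0, 2, 1], [5, 0, 3], [7, None, 1]]

for candidate in (full, upper_only, different_lower):
    print(mirror_upper(candidate))
```

All three inputs yield the same symmetric matrix, since only the diagonal and the values above it are consulted.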

Each value is evaluated and filtered by literal_type and valid_edge:

graphi.graph_io.csv.literal_type(literal) → object

Fields read from the csv are passed to literal_type directly as the sole argument. The return value is considered as final, and inserted into the graph without further conversions.

graphi.graph_io.csv.valid_edge(object) → bool

Similarly, valid_edge is called on the result of literal_type. The default is bool(), which should work for most data types.

The default for literal_type is capable of handling regular Python literals, e.g. int, float and str. In combination with the default valid_edge, any literal that evaluates to a falsy value signifies a missing edge: None, False, 0 etc.
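
The order of the two callbacks, conversion first and filtering second, can be sketched as follows. The helper convert_row is hypothetical and marks rejected values with None purely for display; the reader itself is described as simply not inserting such edges.

```python
def convert_row(fields, literal_type=float, valid_edge=bool):
    """Hypothetical helper: map raw CSV fields to edge values, or None if rejected."""
    values = []
    for field in fields:
        value = literal_type(field)  # convert the raw string first
        values.append(value if valid_edge(value) else None)  # then filter
    return values


# With bool as the filter, a value of 0 counts as "no edge".
print(convert_row(["0", "2", "1.5"]))
# A custom filter can keep 0 as a real edge value and reject negatives instead.
print(convert_row(["0", "2", "-1"], valid_edge=lambda value: value >= 0))
```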

All *args and **kwargs are passed on directly to csv.reader for extracting lines.
graphi.graph_io.csv.stripped_literal(literal)

Evaluate literals, ignoring leading/trailing whitespace

This function is capable of handling all literals supported by ast.literal_eval(), even if they are surrounded by whitespace.
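
A minimal sketch of the described behaviour, assuming it amounts to stripping the field and passing it to ast.literal_eval(). The name stripped_literal_sketch is chosen to make clear that this is an approximation, not the library's implementation.

```python
import ast


def stripped_literal_sketch(literal):
    """Evaluate a Python literal after removing surrounding whitespace."""
    return ast.literal_eval(literal.strip())


for field in ["  16 ", " .5", " 'a' ", "None"]:
    print(repr(field), "->", repr(stripped_literal_sketch(field)))
```

Note that a bare word such as a is not a Python literal, so under this assumption string values need to be quoted in the CSV body.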