graphi.graph_io.csv module
Utilities for loading and storing graphs as CSV
CSV Graph Format
The CSV graph format uses an optional header to define nodes, and a body storing the adjacency matrix.
By default, a graph with n nodes is stored as a matrix literal of n columns and n+1 rows:
a b c d
0 2 1 0
2 0 3 2
1 4 0 0
0 1 3 0
Separators and formatting are handled by the csv Dialect.
Value conversion and interpretation are handled by the appropriate reader/writer.
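The default layout described above (one header row plus n rows of n columns) can be checked with the standard library alone. This is a minimal sketch, not graphi's implementation: the helper name split_matrix_literal is hypothetical, and the strict shape check is an assumption that the real reader may not enforce.

```python
import csv


def split_matrix_literal(lines, delimiter=" "):
    """Split CSV lines into (header, body) and check the default shape.

    Hypothetical helper for illustration; not part of graphi.
    """
    # drop completely empty lines before splitting header and body
    rows = [row for row in csv.reader(lines, delimiter=delimiter) if row]
    header, body = rows[0], rows[1:]
    n = len(header)
    # n nodes => n body rows of n columns each (n+1 rows including the header)
    if len(body) != n or any(len(row) != n for row in body):
        raise ValueError("expected %d rows of %d columns" % (n, n))
    return header, body


example = """\
a b c d
0 2 1 0
2 0 3 2
1 4 0 0
0 1 3 0
"""

header, body = split_matrix_literal(example.splitlines())
print(header)
print(body)
```

The fields come back as strings; converting them to edge values is a separate step.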
Reading Graphs
Graphs can be read using graph_reader() from iterables of CSV lines, such as files or str.splitlines.
The csv itself is parsed using a csv.reader(), which allows setting the CSV dialect.
from graphi.graph_io import csv

literal = """\
a, b, c
0, 2, 3
2, 0, 2
1, 2, 0
"""

# skipinitialspace is forwarded to csv.reader and drops the blank after each comma
graph = csv.graph_reader(
    literal.splitlines(),
    skipinitialspace=True,
)

for node in graph:
    print(repr(node), "=>", graph[node])
class graphi.graph_io.csv.DistanceMatrixLiteral
Bases: csv.Dialect
CSV dialect for a Graph Matrix Literal, suitable for numeric data and string literals.
A graph with alphabetic node names and numeric values would look like this:
a b c
0 2 1.3
2 0 .5
16 .5 1
delimiter = ' '
doublequote = False
escapechar = '\\'
lineterminator = '\n'
quotechar = "'"
quoting = 0
skipinitialspace = True
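To see what these attribute values do in practice, the sketch below declares a standard-library csv.Dialect subclass with the same settings and parses the matrix literal shown above. The class name MatrixLiteralSketch is hypothetical; it only mirrors the documented attributes and is not the library's own class.

```python
import csv


class MatrixLiteralSketch(csv.Dialect):
    """Stand-in dialect using the attribute values documented above."""
    delimiter = " "
    doublequote = False
    escapechar = "\\"
    lineterminator = "\n"
    quotechar = "'"
    quoting = csv.QUOTE_MINIMAL  # the numeric value 0
    skipinitialspace = True


literal = "a b c\n0 2 1.3\n2 0 .5\n16 .5 1\n"

# csv.reader accepts a Dialect subclass through the dialect keyword
rows = list(csv.reader(literal.splitlines(), dialect=MatrixLiteralSketch))
for row in rows:
    print(row)
```

The dialect only splits each line into string fields; turning ".5" into a number is left to the reader's literal_type.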
exception graphi.graph_io.csv.ParserError(error, row, column=None)
Bases: exceptions.Exception
Error during parsing of a graph from a csv
graphi.graph_io.csv.graph_reader(iterable, nodes_header=True, literal_type=<function stripped_literal>, valid_edge=<type 'bool'>, undirected=False, value_bound=None, *args, **kwargs)
Load a graph from files or iterables
Parameters:
- iterable – an iterable yielding lines of CSV, such as an open file
- nodes_header – whether and how to interpret a header specifying nodes
- literal_type – type callable to evaluate literals
- valid_edge – callable to test whether an edge should be inserted
- undirected – whether to mirror the underlying matrix
- value_bound – whether and how much the underlying edge values are bounded
The iterable argument can be any object that returns a line of input for each iteration step, such as a file object or a list of strings.
Nodes are created depending on the value of nodes_header:
- False: Nodes are numbered 1 to len(iterable[0]). Elements in the first line of iterable are not consumed by this.
- iterable: Nodes are read from nodes_header.
- True: Nodes are taken as the elements of the first line of iterable. The first line is consumed by this, and not considered as containing graph edges. Nodes are read plainly as type str, not using literal_type.
- callable: Like True, but nodes are not taken as plain str but individually interpreted via nodes_header(element).
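The four nodes_header modes can be summarised in a few lines of plain Python. This is a sketch of the documented behaviour, not the library's code: the helper resolve_nodes is hypothetical, it works on rows that are already split into fields, and it assumes that an explicit iterable of nodes leaves the first row unconsumed.

```python
def resolve_nodes(rows, nodes_header=True):
    """Return (nodes, remaining_rows) for already-split CSV rows.

    Hypothetical helper for illustration; not part of graphi.
    """
    rows = list(rows)
    if nodes_header is False:
        # no header: number the nodes 1..n, keep every row as matrix data
        return list(range(1, len(rows[0]) + 1)), rows
    if nodes_header is True:
        # header row: plain strings, consumed from the data
        return [str(element) for element in rows[0]], rows[1:]
    if callable(nodes_header):
        # header row: each element converted by the callable, row consumed
        return [nodes_header(element) for element in rows[0]], rows[1:]
    # anything else is taken as an explicit iterable of nodes
    # (assumption: the first row is not consumed in this case)
    return list(nodes_header), rows


rows = [["a", "b"], ["0", "1"], ["1", "0"]]
print(resolve_nodes(rows, True))
print(resolve_nodes(rows[1:], False))
print(resolve_nodes(rows[1:], ["x", "y"]))
print(resolve_nodes(rows, str.upper))
```

The checks for True and False come before the callable check so that the boolean flags are never mistaken for a node list.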
The CSV is interpreted as a matrix, where the row marks the origin of an edge and the column marks the destination. Thus, loops are specified on the diagonal, while an asymmetric matrix creates different edge values for opposite directions. For an undirected graph, the matrix is automatically treated as symmetric. Trailing empty lines may be removed.
In the following example, the edges a:b and a:c are symmetric and there are no edges or self-loops a:a or b:b. In contrast, b:c is 3 whereas c:b is 4, and there is a self-loop c:c. The node d only has an ingoing edge b:d, but no outgoing edges:
a b c d
0 2 1 0
2 0 3 2
1 4 1 0
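The row-is-origin, column-is-destination rule can be traced by hand with the standard library. The following is an illustrative sketch only: it stores edges in a plain dict keyed by (origin, destination) rather than in a graphi graph, converts fields with int, and treats 0 as "no edge", which matches the default bool() filter for integers.

```python
import csv

literal = """\
a b c d
0 2 1 0
2 0 3 2
1 4 1 0
"""

reader = csv.reader(literal.splitlines(), delimiter=" ")
nodes = next(reader)  # header row: a, b, c, d

edges = {}
# each body row belongs to one origin node, in header order
for origin, row in zip(nodes, reader):
    # each column belongs to one destination node, in header order
    for destination, field in zip(nodes, row):
        value = int(field)
        if value:  # 0 is falsy and therefore not stored as an edge
            edges[origin, destination] = value

for (origin, destination), value in sorted(edges.items()):
    print("%s:%s = %d" % (origin, destination, value))
```

Since the example has no row for d, no edge starting at d is created.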
If undirected evaluates to True, the upper right corner is mirrored to the lower left. Note that the diagonal must be provided. The following matrices give the same output if undirected is True:
a b c    a b c    a b c
0 2 1    0 2 1    0 2 1
2 0 3      0 3    5 0 3
1 4 1        1      7 1
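Mirroring the upper right corner onto the lower left can be sketched on a list-of-lists matrix. The function mirror_upper is a hypothetical illustration, not graphi's code, and None stands in here for cells that were left out of the input.

```python
def mirror_upper(matrix):
    """Return a copy whose lower-left cells repeat the upper-right ones.

    Hypothetical helper for illustration; the diagonal is kept as given.
    """
    size = len(matrix)
    result = [list(row) for row in matrix]
    for row in range(size):
        for column in range(row + 1, size):
            # the value at (row, column) overwrites its twin at (column, row)
            result[column][row] = matrix[row][column]
    return result


full = [
    [0, 2, 1],
    [2, 0, 3],
    [1, 4, 1],
]
upper_only = [
    [0, 2, 1],
    [None, 0, 3],
    [None, None, 1],
]

print(mirror_upper(full))
print(mirror_upper(upper_only))
```

Both inputs produce the same symmetric matrix, because whatever sits below the diagonal is overwritten.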
Each value is evaluated and filtered by literal_type and valid_edge:
graphi.graph_io.csv.literal_type(literal) → object
Fields read from the csv are passed to literal_type directly as the sole argument. The return value is considered as final, and inserted into the graph without further conversions.
graphi.graph_io.csv.valid_edge(object) → bool
Similarly, valid_edge is called on the result of literal_type. The default is bool(), which should work for most data types.
The default for literal_type is capable of handling regular python literals, e.g. int, float and str. In combination with valid_edge, any literal that evaluates to a false value signifies a missing edge: None, False, 0 etc.
See: All *args and **kwargs are passed on directly to csv.reader for extracting lines.
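The convert-then-filter order described above can be sketched as a small generator. The name convert_fields is hypothetical, and float is used as literal_type only to keep the example short; the library's default is stripped_literal.

```python
def convert_fields(fields, literal_type=float, valid_edge=bool):
    """Yield (column, value) for every field that counts as an edge.

    Hypothetical helper for illustration; not part of graphi.
    """
    for column, field in enumerate(fields):
        value = literal_type(field)  # the raw field is the sole argument
        if valid_edge(value):        # the converted value decides inclusion
            yield column, value


# default filter: falsy values such as 0 and 0.0 mean "no edge"
print(list(convert_fields(["0", "2", "1.3", "0.0"])))

# custom filter: keep zero-valued edges, treat negative values as missing
print(list(convert_fields(["0", "-1", "3"], valid_edge=lambda v: v >= 0)))
```

A custom valid_edge like the second call is useful when 0 is a meaningful edge value and another marker signals a missing edge.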
graphi.graph_io.csv.stripped_literal(literal)
Evaluate literals, ignoring leading/trailing whitespace
This function is capable of handling all literals supported by ast.literal_eval(), even if they are surrounded by whitespace.
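A function with the behaviour described here can be approximated by stripping the field before handing it to ast.literal_eval(). The name stripped_literal_sketch is hypothetical; this is an assumption about how such a helper could be written, not the library's source.

```python
import ast


def stripped_literal_sketch(literal):
    """Evaluate a Python literal after removing surrounding whitespace."""
    return ast.literal_eval(literal.strip())


print(repr(stripped_literal_sketch("  16 ")))
print(repr(stripped_literal_sketch(" .5")))
print(repr(stripped_literal_sketch(" 'a' ")))
```

Note that ast.literal_eval() only accepts quoted strings, so a bare field such as a raises ValueError with this sketch.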