graphi.graph_io.csv module
Utilities for loading and storing Graphs as csv
CSV Graph Format
The CSV graph format uses an optional header to define nodes, and a body storing the adjacency matrix.
By default, a graph with n nodes is stored as a matrix literal of n columns and n+1 rows (one header row followed by n matrix rows):
a b c d
0 2 1 0
2 0 3 2
1 4 0 0
0 1 3 0
Separators and formatting are handled by the csv Dialect.
Value conversion and interpretation are handled by the appropriate reader/writer.
Reading Graphs
Graphs can be read using graph_reader() from iterables of CSV lines, such as files or str.splitlines.
The csv itself is parsed using a csv.reader(), which allows setting the CSV dialect.
from graphi.graph_io import csv

literal = """\
a, b, c
0, 2, 3
2, 0, 2
1, 2, 0
"""

graph = csv.graph_reader(
    literal.splitlines(),
    skipinitialspace=True,
)
# iterate over the nodes and print what the graph stores for each one
for node in graph:
    print(repr(node), "=>", graph[node])
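To see what the reader works with before any graph is built, the following sketch tokenizes the same literal with the standard library alone. It does not call graphi; it only illustrates the effect of a keyword such as skipinitialspace once it reaches csv.reader().

```python
import csv

literal = """\
a, b, c
0, 2, 3
2, 0, 2
1, 2, 0
"""

# Without skipinitialspace, the blank after each comma stays in the field.
raw_rows = list(csv.reader(literal.splitlines()))

# With skipinitialspace=True, blanks directly after a delimiter are dropped.
clean_rows = list(csv.reader(literal.splitlines(), skipinitialspace=True))

print(raw_rows[0])    # ['a', ' b', ' c']
print(clean_rows[0])  # ['a', 'b', 'c']
```

Every field is still a plain string at this point; turning fields into edge values is a separate step.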
class graphi.graph_io.csv.DistanceMatrixLiteral
Bases: csv.Dialect
CSV dialect for a Graph Matrix Literal, suitable for numeric data and string literals
A graph with alphabetic node names and numeric values would look like this:
a   b   c
0   2   1.3
2   0   .5
16  .5  1
delimiter = ' '
    no explicit delimiter between fields
doublequote = False
escapechar = '\\'
    use regular escaping
lineterminator = '\n'
quotechar = "'"
    string values are written as “foo”, multi-values as ‘1,2,3’
quoting = 0
skipinitialspace = True
    allow for alignment with arbitrary whitespace
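The attributes listed for this dialect can be tried out with a stand-in built on the standard library. MatrixLiteralSketch below is a hypothetical class that copies the documented values; it is not the graphi class itself, and the sample literal is only an illustration.

```python
import csv


class MatrixLiteralSketch(csv.Dialect):
    """Hypothetical stand-in that copies the documented attribute values."""
    delimiter = ' '
    doublequote = False
    escapechar = '\\'
    lineterminator = '\n'
    quotechar = "'"
    quoting = csv.QUOTE_MINIMAL  # the numeric value of this constant is 0
    skipinitialspace = True


literal = """\
a   b   c
0   2   1.3
2   0   .5
16  .5  1
"""

# Runs of blanks between columns collapse because skipinitialspace is set.
rows = list(csv.reader(literal.splitlines(), dialect=MatrixLiteralSketch))
for row in rows:
    print(row)
```

The combination of a blank delimiter and skipinitialspace is what lets columns be padded for alignment without producing empty fields.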
exception graphi.graph_io.csv.ParserError(error, row, column=None)
Bases: exceptions.Exception
Error during parsing of a graph from a csv
graphi.graph_io.csv.graph_reader(iterable, nodes_header=True, literal_type=<function stripped_literal>, valid_edge=<type 'bool'>, undirected=False, *args, **kwargs)
Load a graph from files or iterables
Parameters:
- iterable – an iterable yielding lines of CSV, such as an open file
- nodes_header – whether and how to interpret a header specifying nodes
- literal_type – type callable to evaluate literals
- valid_edge – callable to test whether an edge should be inserted
- undirected – whether to mirror the underlying matrix
The iterable argument can be any object that returns a line of input for each iteration step, such as a file object or a list of strings.
Nodes are created depending on the value of nodes_header:
- False: Nodes are numbered 1 to len(iterable[0]). Elements in the first line of iterable are not consumed by this.
- iterable: Nodes are read from nodes_header.
- True: Nodes are taken as the elements of the first line of iterable. The first line is consumed by this, and not considered as containing graph edges. Nodes are read plainly as type str, not using literal_type.
- callable: Like True, but nodes are not taken as plain str() but individually interpreted via nodes_header(element).
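The four cases for nodes_header can be sketched as a small dispatch function. resolve_nodes is a hypothetical helper written for this illustration, not part of graphi; it returns the nodes together with a flag telling whether the first line was used up as a header. That the iterable case leaves the first line untouched is an assumption.

```python
def resolve_nodes(first_row, nodes_header=True):
    """Hypothetical helper: return (nodes, header_consumed)."""
    if nodes_header is False:
        # Number the nodes 1..n; the first row still holds edge values.
        return list(range(1, len(first_row) + 1)), False
    if nodes_header is True:
        # Take the header fields as plain strings.
        return [str(field) for field in first_row], True
    if callable(nodes_header):
        # Convert every header field individually.
        return [nodes_header(field) for field in first_row], True
    # Otherwise treat nodes_header as an iterable of ready-made nodes
    # (assumption: the first row is then left for the matrix body).
    return list(nodes_header), False


print(resolve_nodes(['a', 'b', 'c']))         # (['a', 'b', 'c'], True)
print(resolve_nodes(['0', '2', '1'], False))  # ([1, 2, 3], False)
print(resolve_nodes(['1', '2', '3'], int))    # ([1, 2, 3], True)
```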
The CSV is interpreted as a matrix, where the row marks the origin of an edge and the column marks the destination. Thus, loops are specified on the diagonal, while an asymmetric matrix creates different edge values for opposite directions. For an undirected graph, the matrix is automatically treated as symmetric. Trailing empty lines may be removed.
In the following example, the edges a:b and a:c are symmetric and there are no self-loops a:a or b:b. In contrast, b:c is 3 whereas c:b is 4, and there is a self-loop c:c. The node d only has an incoming edge b:d, but no outgoing edges:
a b c d
0 2 1 0
2 0 3 2
1 4 1 0
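The row-to-column reading of the matrix can be sketched with plain dictionaries. matrix_to_edges is a hypothetical helper, not the graphi implementation; it assumes the values are already converted to numbers, and it simply stops at the last row given, so a node without a row ends up with no outgoing edges.

```python
def matrix_to_edges(nodes, rows, valid_edge=bool):
    """Hypothetical helper: map (origin, destination) to the edge value."""
    edges = {}
    for origin, row in zip(nodes, rows):
        for destination, value in zip(nodes, row):
            # Values rejected by valid_edge mean "no edge here".
            if valid_edge(value):
                edges[origin, destination] = value
    return edges


nodes = ['a', 'b', 'c', 'd']
rows = [
    [0, 2, 1, 0],  # edges leaving a
    [2, 0, 3, 2],  # edges leaving b
    [1, 4, 1, 0],  # edges leaving c
]
edges = matrix_to_edges(nodes, rows)

print(edges['b', 'c'], edges['c', 'b'])  # 3 4
print(('a', 'a') in edges)               # False
```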
If undirected evaluates to True, the upper right corner is mirrored to the lower left. Note that the diagonal must be provided. The following matrices give the same output if undirected is True:
a b c    a b c    a b c
0 2 1    0 2 1    0 2 1
2 0 3      0 3    5 0 3
1 4 1        1    7 1
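Mirroring the upper right corner can be sketched as follows. mirror_upper is a hypothetical helper, not graphi code; it assumes that a shortened row is right-aligned, meaning its last field belongs to the last column, which is one plausible reading of the triangular form.

```python
def mirror_upper(rows):
    """Hypothetical helper: copy the upper-right triangle to the lower left."""
    size = len(rows)
    full = [[None] * size for _ in range(size)]
    for i, row in enumerate(rows):
        # A row may be given in full or may start later; align it to the right.
        offset = size - len(row)
        for j in range(i, size):
            full[i][j] = full[j][i] = row[j - offset]
    return full


square = [[0, 2, 1], [2, 0, 3], [1, 4, 1]]
triangle = [[0, 2, 1], [0, 3], [1]]

print(mirror_upper(square))    # [[0, 2, 1], [2, 0, 3], [1, 3, 1]]
print(mirror_upper(triangle))  # [[0, 2, 1], [2, 0, 3], [1, 3, 1]]
```

Values below the diagonal, such as the 4 in the square matrix, are overwritten by their mirrored counterparts.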
Each value is evaluated and filtered by literal_type and valid_edge:
graphi.graph_io.csv.literal_type(literal) → object
Fields read from the csv are passed to literal_type directly as the sole argument. The return value is considered as final, and inserted into the graph without further conversions.
graphi.graph_io.csv.valid_edge(object) → bool
Similarly, valid_edge is called on the result of literal_type. The default is bool(), which should work for most data types.
The default for literal_type is capable of handling regular Python literals, e.g. int, float and str. In combination with valid_edge, any literal that evaluates to a false value signifies a missing edge: None, False, 0 etc.
See also: All *args and **kwargs are passed on directly to csv.reader for extracting lines.
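The two-step treatment of each field, conversion by literal_type followed by the valid_edge test, can be sketched like this. keep_edges and is_positive are hypothetical names, and float stands in for the library default so that the example needs no graphi import.

```python
def keep_edges(fields, literal_type=float, valid_edge=bool):
    """Hypothetical helper: convert each field, then keep the valid values."""
    values = (literal_type(field) for field in fields)
    return [value for value in values if valid_edge(value)]


def is_positive(value):
    """Stricter edge test: negative weights count as missing edges too."""
    return value > 0


fields = ['0', '2', '1.3', '-1']

print(keep_edges(fields))                          # [2.0, 1.3, -1.0]
print(keep_edges(fields, valid_edge=is_positive))  # [2.0, 1.3]
```

With the bool() test a zero is dropped but a negative value is kept, which is why a custom valid_edge can be worthwhile for data where other values also mean "no edge".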
graphi.graph_io.csv.stripped_literal(literal)
evaluate literals, ignoring leading/trailing whitespace
This function is capable of handling all literals supported by ast.literal_eval(), even if they are surrounded by whitespace.
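A function with the described behaviour can be approximated in two lines. stripped_literal_sketch is a hypothetical re-creation based only on the description above, not the library source.

```python
import ast


def stripped_literal_sketch(literal):
    """Hypothetical re-creation: strip whitespace, then evaluate the literal."""
    return ast.literal_eval(literal.strip())


samples = [' 2 ', '.5', "'foo'", ' None', '(1, 2, 3)']
results = [stripped_literal_sketch(sample) for sample in samples]
print(results)  # [2, 0.5, 'foo', None, (1, 2, 3)]

# A bare word is not a Python literal, so an unquoted name is rejected.
try:
    stripped_literal_sketch('a')
except ValueError:
    print("'a' is not a literal")
```

Under this reading, unquoted node names such as a or b could not pass through literal evaluation, which fits with header fields being read as plain str by default.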