Saturday, 27 June 2015

Python graph-tool: loading a CSV file

I'm loading a directed, weighted graph from a CSV file into a graph-tool graph in Python. The input CSV file is organized as follows:

1,2,300

2,4,432

3,89,1.24

...

The first two entries of each line identify the source and target of an edge, and the third number is the weight of the edge.

Currently I'm using:

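import os
import graph_tool.all as gt

g = gt.Graph()  # graph-tool graphs are directed by default
e_weight = g.new_edge_property("float")
# in_file_directory, network_input and delimiter are defined elsewhere
with open(os.path.join(in_file_directory, network_input), 'r') as csv_network:
    for line in csv_network:
        # drop the trailing newline, then split into source, target, weight
        edge = line.strip().split(delimiter)
        # convert the endpoint IDs to integer vertex indices
        e = g.add_edge(int(edge[0]), int(edge[1]))
        e_weight[e] = float(edge[2])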

However, loading the data takes quite long: my network has 10 million nodes and loading takes about 45 minutes. I tried to speed it up with g.add_edge_list, but that works only for unweighted graphs. Any suggestions for making this faster?
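One direction that might help, offered as a minimal sketch rather than a confirmed solution: later graph-tool releases document an eprops argument to add_edge_list, which fills edge property maps from the extra columns of each row, so the per-edge Python loop is no longer needed. Whether that argument is available depends on the installed version, so treat it as an assumption. The helper names read_edges and load_weighted_graph are hypothetical and only serve the illustration.

```python
import csv


def read_edges(path, delimiter=","):
    """Yield (source, target, weight) tuples from a CSV edge list."""
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter=delimiter):
            if not row:  # skip blank lines
                continue
            yield int(row[0]), int(row[1]), float(row[2])


def load_weighted_graph(path, delimiter=","):
    """Build a directed graph with a floating-point edge weight property."""
    # Imported here so read_edges can be used without graph-tool installed.
    import graph_tool.all as gt

    g = gt.Graph()
    e_weight = g.new_edge_property("double")
    # Assumption: the installed graph-tool supports the eprops argument,
    # which reads each property value from the columns after source/target.
    g.add_edge_list(read_edges(path, delimiter), eprops=[e_weight])
    return g, e_weight
```

Calling load_weighted_graph with the path of the CSV file would return the graph together with its weight property map. If the file fits in memory, passing a NumPy array of the rows to add_edge_list instead of a generator may be faster still, but that has not been measured here.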
