I thought it would be a good idea to explain how the images of the C. elegans connectome in the first chapter were produced. The data comes from the Worm Atlas Project, and the graph was drawn in Gephi 0.9.2 using the Fruchterman-Reingold layout.
The data includes a spreadsheet named NeuronConnect.xls, which lists 6418 links between 279 distinct neurons. Once duplicate edges are removed, 2993 distinct directed links remain. Of the 302 neurons in the C. elegans nervous system, 20 are part of the pharyngeal system and are not included, two have no connections, and one connects only to muscle tissue, leaving 279. The cited data source does not include data on the pharyngeal system; that data can be obtained from WormWeb.
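To make the duplicate-removal step concrete, here is a minimal sketch using only the Python standard library. It assumes that each link is a (source, target, type) row and that a "duplicate" means the same ordered pair of neurons listed again under a different connection type; the actual notebook may treat connection types or direction differently. The rows and type labels below are toy values, not taken from NeuronConnect.xls.

```python
def distinct_directed_edges(rows):
    """Keep the first occurrence of each ordered (source, target) pair."""
    seen = set()
    edges = []
    for source, target, _kind in rows:
        if (source, target) not in seen:
            seen.add((source, target))
            edges.append((source, target))
    return edges


# Toy rows in an assumed (source, target, type) layout; not real data.
rows = [
    ("ADAL", "AIBL", "type_a"),
    ("ADAL", "AIBL", "type_b"),  # same ordered pair, different type
    ("AIBL", "ADAL", "type_a"),  # opposite direction, so a separate edge
    ("ADAL", "AVBL", "type_a"),
]

edges = distinct_directed_edges(rows)
print(len(rows), "links ->", len(edges), "distinct directed edges")
```

Because the pairs are ordered, a link from one neuron to another and the link back are counted separately, which is what one would expect for a directed count.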
More details can be found in this GitHub repository, including links to the specific input data files used, Jupyter notebooks for processing the data to make it ready to plot, and the Gephi files used to create the plot itself.
The process starts with four input files obtained from two different sources.
- The first two files are called NeuronConnect.txt and name_neurons.txt, and they both come from wormweb.org. Links to both files and more information can be found in their details page. The file NeuronConnect.txt contains a list of the connections in the connectome – each line lists two neuron names and a connection type. The file name_neurons.txt has one line for each neuron, listing its name, group and type.
- The third file is called synapse_AF.txt, and contains a list of connections within the worm’s pharyngeal system. These connections are typically not included in the connectome, so this file helps to identify them so that they can be removed.
- The last file is called CElegansNeuronTables-connectome.csv and was obtained from the CElegansNeuroML repository maintained by OpenWorm, which is “An open-source project dedicated to creating a virtual C. elegans nematode in a computer.” The original file there is called CElegansNeuronTables.xls and has been updated since we first downloaded it. The file in the networks repository has been converted from XLS (Excel) format to CSV (comma-separated values) to make it easier to read from a Jupyter notebook.
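As an illustration of how a pharyngeal connection list could be used to filter the connectome, here is a small sketch. The tab-separated layout, the column order, and the rule "drop any connection that touches a neuron named in the pharyngeal file" are all assumptions made for this example; the notebook in the repository may read the files and apply the filter differently. The strings below are toy stand-ins for NeuronConnect.txt and synapse_AF.txt, not their real contents.

```python
import csv
import io

# Toy stand-ins for the two files (assumed tab-separated; not real data).
connect_txt = "ADAL\tAIBL\ttype_a\nM1\tI1L\ttype_a\nADAL\tM1\ttype_a\n"
pharyngeal_txt = "M1\tI1L\n"


def read_rows(text):
    """Parse tab-separated text into a list of non-empty rows."""
    return [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]


# Every neuron named in the pharyngeal file is treated as pharyngeal.
pharyngeal_neurons = {
    name for row in read_rows(pharyngeal_txt) for name in row[:2]
}

# Keep only connections whose endpoints are both non-pharyngeal.
non_pharyngeal = [
    row
    for row in read_rows(connect_txt)
    if row[0] not in pharyngeal_neurons and row[1] not in pharyngeal_neurons
]

print(non_pharyngeal)
```

With real files, the `io.StringIO` wrappers would be replaced by `open(path, newline="")`, with the paths left to the reader.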
In the Jupyter notebook, we do a fair amount of bookkeeping to resolve the actual number of neurons (302) and connections (2290 if we regard the graph as undirected, 2993 if we regard it as directed) in the worm’s non-pharyngeal connectome. We also generate four output files:
- non-pharyngeal-edges.csv, which lists the edges in the non-pharyngeal system, assuming that the graph is undirected;
- non-pharyngeal-directed-edges.csv, which lists the edges in the non-pharyngeal system, assuming that the graph is directed;
- in-degree-dist.csv, which contains the in-degree distribution for the graph assuming that it is directed; and
- out-degree-dist.csv, which contains the out-degree distribution for the graph assuming that it is directed.
The first two files were used to create the Gephi files c-elegans-undirected.gephi and c-elegans-new-directed.gephi, the second of which was used to create the plot of the connectome graph in the book. The last two files were used to create the plots of the in-degree and out-degree distributions in the book; the actual Jupyter notebook used to generate the plots is also included in the repository.
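The relationship between the directed and undirected edge counts, and the meaning of a degree distribution, can be sketched in a few lines. This is an illustration on a toy edge list with made-up node names, not the notebook's code: it assumes the undirected view counts each unordered pair of neurons once, and that a degree distribution records how many nodes have each degree value (including zero).

```python
from collections import Counter

# Toy directed edge list with hypothetical node names.
directed = [("A", "B"), ("B", "A"), ("A", "C"), ("D", "C")]

# Undirected view: A->B and B->A collapse into one unordered pair.
undirected = sorted({tuple(sorted(edge)) for edge in directed})

nodes = sorted({name for edge in directed for name in edge})
in_degree = Counter(target for _source, target in directed)
out_degree = Counter(source for source, _target in directed)

# Degree distribution: how many nodes have each degree value.
in_dist = Counter(in_degree.get(name, 0) for name in nodes)
out_dist = Counter(out_degree.get(name, 0) for name in nodes)

print(len(directed), "directed edges,", len(undirected), "undirected edges")
print("in-degree distribution:", sorted(in_dist.items()))
print("out-degree distribution:", sorted(out_dist.items()))
```

Reciprocal pairs are the reason a directed count is larger than the corresponding undirected count: each such pair contributes two directed edges but only one undirected edge.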