Architecture and performance

Alberto Cottica edited this page Feb 23, 2016 · 3 revisions

There are three main stages to EdgeSense.

  1. Drupal runs a database query and saves the results as JSON files. This is done through Views Datasource.
  2. These JSON files are crunched to build the network and compute network analytics. This is performed by a Python script using the NetworkX library. The script builds a network, and computes its metrics, for each week of the community's life. The computed networks are saved as a new set of JSON files. This runs asynchronously as a cron job.
  3. On page load, a JavaScript script displays the result. A force layout algorithm (ForceAtlas2) is applied to the network for visualization purposes.
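
As a rough illustration of step 2, the sketch below builds one cumulative network per week from a list of interaction records and computes a few metrics with NetworkX. It is not the actual EdgeSense script: the record fields (`author`, `recipient`, `created`), the choice of a directed graph, the cumulative weekly snapshots and the selected metrics are all assumptions made for the example.

```python
import json
from datetime import datetime, timedelta

import networkx as nx

# Hypothetical input: one record per comment, roughly as a JSON export might look.
# The field names "author", "recipient" and "created" are assumptions.
RAW_JSON = """
[
  {"author": "ann", "recipient": "bob", "created": "2016-01-04"},
  {"author": "bob", "recipient": "ann", "created": "2016-01-06"},
  {"author": "cat", "recipient": "ann", "created": "2016-01-12"},
  {"author": "dan", "recipient": "cat", "created": "2016-01-20"}
]
"""


def weekly_networks(records):
    """Return one cumulative metrics snapshot per week of community life."""
    events = sorted(
        (datetime.strptime(r["created"], "%Y-%m-%d"), r["author"], r["recipient"])
        for r in records
    )
    if not events:
        return []

    graph = nx.DiGraph()
    snapshots = []
    week_start = events[0][0]
    last_event = events[-1][0]
    i = 0
    while True:
        week_end = week_start + timedelta(weeks=1)
        # Add every interaction that happened before the end of this week.
        while i < len(events) and events[i][0] < week_end:
            _, author, recipient = events[i]
            graph.add_edge(author, recipient)
            i += 1
        snapshots.append({
            "week_start": week_start.date().isoformat(),
            "nodes": graph.number_of_nodes(),
            "edges": graph.number_of_edges(),
            "density": round(nx.density(graph), 3),
        })
        if week_end > last_event:
            break
        week_start = week_end
    return snapshots


snapshots = weekly_networks(json.loads(RAW_JSON))
# The real pipeline saves its results as JSON files; here we only serialize to a string.
print(json.dumps(snapshots, indent=2))
```

Because each snapshot is built on top of the previous week's graph, the cost of a run grows with both the number of interactions and the number of weeks to cover.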

In our demo (1,200 users, 500 posts, 4,200 comments, 260 network nodes, 1,600 network edges), step 2 takes about 90 seconds. For comparison, a larger real-life community with about 400 nodes and over 2,100 edges takes about 2 minutes to process, while a smaller community with just over 100 nodes and 300 edges takes just 20 seconds to produce the metrics from the raw JSON files.
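
To compare these three runs, it can help to express them as edges processed per second. The short calculation below uses the approximate figures quoted above ("over 2,100 edges" is taken as 2,100 and "about 2 minutes" as 120 seconds), so the results are only indicative.

```python
# Approximate figures from the text: (label, network edges, processing time in seconds).
RUNS = [
    ("demo", 1600, 90),
    ("larger community", 2100, 120),
    ("smaller community", 300, 20),
]


def edges_per_second(edges, seconds):
    """Rough throughput of step 2 for a single run."""
    return edges / seconds


rates = {label: round(edges_per_second(edges, secs), 1) for label, edges, secs in RUNS}
for label, rate in rates.items():
    print(f"{label}: about {rate} edges per second")
```

On these three data points the rate stays between roughly 15 and 18 edges per second, which is too small a sample to draw firm conclusions about how processing time scales.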

N.B. Step 3 takes about 10 seconds, and this is fixed: it is a time limit chosen as a good trade-off for the force layout algorithm. It should possibly be extracted into a configuration parameter so that it can be adapted to each installation.
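
The idea of a configurable time limit can be sketched as a loop that keeps running layout iterations until a time budget is used up. The real layout runs in JavaScript in the browser; this Python version only illustrates the pattern, and the names `LAYOUT_TIME_BUDGET_SECONDS`, `run_with_time_budget` and `layout_step` are hypothetical.

```python
import time

# Hypothetical configuration value; the text reports a fixed limit of about 10 seconds.
LAYOUT_TIME_BUDGET_SECONDS = 10.0


def run_with_time_budget(step, budget_seconds, clock=time.monotonic):
    """Call step() repeatedly until the time budget is used up.

    Returns the number of iterations that were completed.
    """
    deadline = clock() + budget_seconds
    iterations = 0
    while clock() < deadline:
        step()
        iterations += 1
    return iterations


def layout_step():
    """Placeholder for a single iteration of a force layout algorithm."""
    pass


# Demo with a very short budget so that the example finishes quickly.
completed = run_with_time_budget(layout_step, budget_seconds=0.05)
print(f"completed {completed} layout iterations within the budget")
```

With the limit exposed as a setting, an installation with a small network could lower it to show the page sooner, while a larger one could raise it to give the layout more time to settle.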