Respawn

Respawn is a distributed time-series datastore which enables multi-resolution browsing and querying in Mortar.io. More information about Respawn and its performance can be found in the following paper: Respawn: A Distributed Multi-Resolution Time-Series Datastore.

Dependencies

Respawn requires the SleekXMPP Python library to be installed: http://github.com/fritzy/SleekXMPP/zipball/1.3.1

wget -O sleekxmpp.zip http://github.com/fritzy/SleekXMPP/zipball/1.3.1
unzip sleekxmpp.zip
cd fritzy-SleekXMPP
sudo python setup.py install

Running Respawn with a Virtual Sensor

This section explains how to start a Respawn instance and have it store a time-series stream from a virtual sensor. The code necessary for building and running Respawn can be found in the mio repository under the ~/mio/services/datastore directory. To deploy Respawn in a distributed storage configuration, simply repeat these instructions on multiple machines and use a shared XMPP server.

Building Respawn

Note: if you are not building with GCC on Linux, you may need to change the platform from linux-gcc in the last line of json-cpp-build.sh.

cd ~/mio/services/datastore/bt_datastore
./json-cpp-build.sh
make
cd ../

Running Respawn

Respawn requires access to an account on an XMPP server. Edit start-store.sh (uncomment the third line and add your XMPP login information). Then start the storage daemon.

./start-store.sh

Running Virtual Sensor

Create an event node named fakenode and subscribe to it.

cd ~/mio/tools/py-tools/
./mio.py -j [USER]@[HOST] -p [PASSWORD] create fakenode
./mio.py -j [USER]@[HOST] -p [PASSWORD] subscribe fakenode

Start the dummy publisher, which will publish simulated sensor values to fakenode.

./dummy_publisher.py -j [USER]@[HOST] -p [PASSWORD] -m data fakenode

Browsing Data

Using a web browser (Firefox is recommended for Respawn), navigate to port 4720 on the machine running Respawn and click the Go button. (For example: http://localhost:4720)

Querying Respawn Meta-Information

Distributed storage in Respawn makes full use of the flexibility offered by the XMPP event node model. Multiple Respawn daemons, running on one or more machines, are managed through the storage items of event nodes. The storage items can be queried directly for the addresses of active Respawn instances.

cd ~/mio/tools/py-tools/
./mio.py -j [USER]@[HOST] -p [PASSWORD] get fakenode storage
<item id="storage">
   <addresses>
      <address link="http://localhost:4720"/>
      <address link="http://backup.campus.edu:4720"/>
   </addresses>
</item>
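
As a rough sketch of how a client might consume this output, the Python snippet below extracts the instance addresses from a storage item. It assumes the item arrives exactly as printed above, with no XML namespace; if the real payload carries a namespace, the tag lookup would need to be adjusted. The function name respawn_addresses is hypothetical and not part of the mio tools.

```python
import xml.etree.ElementTree as ET

# Storage item copied from the example output above.
STORAGE_ITEM = """<item id="storage">
   <addresses>
      <address link="http://localhost:4720"/>
      <address link="http://backup.campus.edu:4720"/>
   </addresses>
</item>"""


def respawn_addresses(item_xml):
    """Return the link attribute of every <address> element in the item.

    Hypothetical helper: assumes un-namespaced tags, as in the example.
    """
    root = ET.fromstring(item_xml)
    return [a.get("link") for a in root.iter("address") if a.get("link")]


addresses = respawn_addresses(STORAGE_ITEM)
for address in addresses:
    print(address)
```

Each returned address can then be used as the base URL for the HTTP requests described in the rest of this page.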

The addresses in the storage item can be used to retrieve datastore meta-information from each instance's info.json file.

curl http://localhost:4720/info.json
{"channel_specs":{
   "fakenode.sine":{
      "channel_bounds":{
         "max_time":1415828037.216853,
         "min_time":1415827864.369920,
         "max_value":0.9945220,
         "min_value":-0.9945220
      }
   }
}}
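
The following Python sketch shows one way to read the channel bounds out of an info.json document. It parses the example response above from a string so that it runs without a live Respawn instance; in practice the text would come from an HTTP request to the instance. The function name channel_bounds is hypothetical, and the code assumes every channel entry contains a channel_bounds object as in the example.

```python
import json

# Example response copied from the curl output above.
INFO_JSON = """{"channel_specs":{
   "fakenode.sine":{
      "channel_bounds":{
         "max_time":1415828037.216853,
         "min_time":1415827864.369920,
         "max_value":0.9945220,
         "min_value":-0.9945220
      }
   }
}}"""


def channel_bounds(info_text):
    """Map each channel name to its channel_bounds dictionary.

    Hypothetical helper: assumes the layout shown in the example.
    """
    info = json.loads(info_text)
    return {name: spec["channel_bounds"]
            for name, spec in info["channel_specs"].items()}


bounds = channel_bounds(INFO_JSON)
for name, b in bounds.items():
    # Length of the stored time range, in seconds.
    span = b["max_time"] - b["min_time"]
    print("%s: %.1f seconds of data, values in [%g, %g]"
          % (name, span, b["min_value"], b["max_value"]))
```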

Querying Data

Using channel bounds as a guide, data can be retrieved from a channel by requesting data tiles. A tile is uniquely addressed by the tuple (level, offset). The two values can be calculated from a desired Unix timestamp and a desired down-sampling period (in seconds).

level = log2(period)
offset = timestamp / (512 * period)

The level and offset must be integers. To retrieve the tile starting at timestamp=1415823360 and down-sampling-period=16 seconds, for example, one must request the tile with level=4 and offset=172830. This request is shown below.

curl http://localhost:4720/tiles/1/fakenode.sine/4.172830.json
{"data":[
   [1415827864.369920,0.0,0.0,1.0],
   [1415827865.590476,0.4067370,0.0,1.0],
   [1415827866.609964,0.7431449,0.0,1.0],
   ...
]}
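
The tile arithmetic above can be sketched in Python as follows. This is an illustration under stated assumptions, not Respawn's own client code: the helper names tile_address and tile_url are hypothetical, the period is assumed to be a whole number of seconds that is a power of two, the offset is rounded down so that a timestamp inside a tile maps to that tile, and the URL layout (including the literal 1 after /tiles/) is copied from the example request without knowing what that path segment means.

```python
# Divisor taken from the offset formula above.
TILE_FACTOR = 512


def tile_address(timestamp, period):
    """Return (level, offset) for a Unix timestamp and down-sampling period.

    Assumes period is an integer power of two (in seconds), so that
    level = log2(period) is an integer.
    """
    if not isinstance(period, int) or period < 1 or period & (period - 1):
        raise ValueError("period must be a positive integer power of two")
    level = period.bit_length() - 1                 # exact log2 for powers of two
    offset = int(timestamp // (TILE_FACTOR * period))  # rounded down (assumption)
    return level, offset


def tile_url(base, channel, timestamp, period):
    """Build a tile URL following the pattern of the example request."""
    level, offset = tile_address(timestamp, period)
    return "%s/tiles/1/%s/%d.%d.json" % (base, channel, level, offset)


print(tile_address(1415823360, 16))
print(tile_url("http://localhost:4720", "fakenode.sine", 1415823360, 16))
```

With the values from the example (timestamp 1415823360, period 16), this reproduces level 4, offset 172830, and the URL requested above. The min_time reported in info.json (1415827864.369920) falls in the same tile at this level, which is consistent with the first row of the example response.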
