Plumb Installation

Introduction

To create a working Plumb installation you’ll need a YARN/HDFS cluster (provided by Apache Hadoop; version 3.1 or above is required) and a MySQL server (MariaDB is recommended). Please refer to the Hadoop and MariaDB documentation for installation instructions.
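As a quick sanity check of the version requirement, the sketch below compares the version reported by the hadoop version command against 3.1. The helper name version_at_least is ours (it is not part of Plumb or Hadoop), and the comparison relies on sort -V, which is available in GNU coreutils but may be missing on other systems.

```shell
# Hypothetical helper (not part of Plumb): succeeds if version $1 >= version $2.
# Relies on "sort -V" (version sort), a GNU coreutils extension.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# "hadoop version" prints a first line such as "Hadoop 3.3.6".
if command -v hadoop >/dev/null 2>&1; then
    have=$(hadoop version 2>/dev/null | awk 'NR == 1 { print $2 }')
    if version_at_least "$have" 3.1; then
        echo "Hadoop $have is recent enough"
    else
        echo "Hadoop $have is older than 3.1" >&2
    fi
else
    echo "hadoop not found on PATH"
fi
```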

In order to run Plumb you’ll need to set up two daemons:

  1. HLE (Hard-Link Emulation) daemon. Written in Python. We recommend using a Python virtual environment to install dependencies and run this daemon. In this guide we’ll refer to it as hled.

  2. Plumb Application Server. Written in Java. We’ll refer to it as plumbd. It can run on the same system as hled or on a different one.

For simplicity, we’ll run both daemons under the same user ID that is configured to run Hadoop. In this document, we’ll install the code under /opt/plumb:

mkdir /opt/plumb
cd /opt/plumb
wget https://ant.isi.edu/software/plumb/plumb-1.5.2.tar.gz
tar zxvf plumb-1.5.2.tar.gz

The above will create the /opt/plumb/lander_hard_link_emulation and /opt/plumb/BigDataProcessing directories, containing the HLE and Plumb codebases, respectively.
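To confirm that the archive unpacked as expected, a small check along these lines can be run before continuing. The function name check_plumb_tree is hypothetical; only the two directory names come from this guide.

```shell
# Hypothetical helper (not part of Plumb): verify that both codebase
# directories exist under the given install prefix (default: /opt/plumb).
check_plumb_tree() {
    prefix="${1:-/opt/plumb}"
    status=0
    for dir in lander_hard_link_emulation BigDataProcessing; do
        if [ -d "$prefix/$dir" ]; then
            echo "ok: $prefix/$dir"
        else
            echo "missing: $prefix/$dir"
            status=1
        fi
    done
    return "$status"
}
```

Calling check_plumb_tree /opt/plumb prints one line per directory and returns a non-zero status if either is missing.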

Installing and Running HLED

Create a MySQL Database

You must create a dedicated MySQL database to hold all HLE records. Run the following script on the database server, passing three parameters:

lander_hard_link_emulation/scripts/create_database.sh <dbname> <username> <password>

When prompted, enter the server’s MySQL root password.

Create Python Virtual Environment

We recommend installing HLED in a Python 3 virtual environment. To set one up, run:

lander_hard_link_emulation/scripts/create_venv.sh [VENV_PATH]

This creates a virtual environment at VENV_PATH (default: ./venv).

If you’re installing under the existing directory /opt/plumb, run:

cd /opt/plumb
lander_hard_link_emulation/scripts/create_venv.sh

Configure HLED System

Copy and edit the HLED system configuration:

cp lander_hard_link_emulation/wip/conf/sysConfInfo.py-template lander_hard_link_emulation/wip/conf/sysConfInfo.py
#edit lander_hard_link_emulation/wip/conf/sysConfInfo.py

Edit the values marked xxx to record the details of your installation. In particular, set <dbname>, <username>, and <password> to the values you used when you created the database above.
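Because a forgotten placeholder is easy to miss, it can help to search the edited file for leftover xxx markers. The sketch below does that with grep; check_no_placeholders is a hypothetical helper name, and the check is deliberately crude (it flags any line containing the string xxx, even a legitimate value).

```shell
# Hypothetical helper (not part of Plumb): fail if the given configuration
# file is missing or still contains an "xxx" placeholder.
check_no_placeholders() {
    conf_file="$1"
    if [ ! -f "$conf_file" ]; then
        echo "not found: $conf_file" >&2
        return 2
    fi
    # grep -n prints each offending line with its line number.
    if grep -n 'xxx' "$conf_file"; then
        echo "unedited placeholder(s) remain in $conf_file" >&2
        return 1
    fi
    echo "no xxx placeholders left in $conf_file"
}
```

For example, check_no_placeholders lander_hard_link_emulation/wip/conf/sysConfInfo.py could be run from /opt/plumb; the same check applies to the other configuration files edited in this guide.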

Configure HLED User Details

Copy and edit the HLED (admin) user configuration:

cp lander_hard_link_emulation/wip/conf/usrConfInfo.py-template lander_hard_link_emulation/wip/conf/usrConfInfo.py
#edit lander_hard_link_emulation/wip/conf/usrConfInfo.py

Edit the values marked xxx to record the details of your installation. (TODO: add more details here)

Initialize the Database

Run:

cd lander_hard_link_emulation/wip/conf/
VENV_PATH/bin/python3 sysInit.py

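Replace VENV_PATH with the path you used when creating the virtual environment. Because the cd command above changes the working directory, give it as an absolute path (for example, /opt/plumb/venv if you kept the default while installing under /opt/plumb).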
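One way to avoid path mistakes here is to resolve the virtual environment’s interpreter to an absolute path before changing directory. The following is a sketch under that assumption; resolve_venv_python is a hypothetical helper, not something shipped with HLE.

```shell
# Hypothetical helper (not part of Plumb): print the absolute path of the
# python3 interpreter inside a virtual environment (default: ./venv).
resolve_venv_python() {
    venv_dir="${1:-./venv}"
    # The cd happens in a subshell, so the caller's directory is unchanged.
    abs_dir=$(cd "$venv_dir" 2>/dev/null && pwd) || {
        echo "no such directory: $venv_dir" >&2
        return 1
    }
    if [ ! -x "$abs_dir/bin/python3" ]; then
        echo "no python3 interpreter under $abs_dir/bin" >&2
        return 1
    fi
    printf '%s\n' "$abs_dir/bin/python3"
}

# Example, run from /opt/plumb:
#   PY=$(resolve_venv_python ./venv) &&
#       (cd lander_hard_link_emulation/wip/conf/ && "$PY" sysInit.py)
```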

Generate PLUMB Keys

Both the system and its users need keys to use the service: every individual PLUMB user, as well as the admin user, must have keys. Key generation is described in a separate key generation document.

Configure PLUMB System

Copy and edit the plumbd configuration:

cp BigDataProcessing/JobServer/ApplicationServer/config-template BigDataProcessing/JobServer/ApplicationServer/config
#edit BigDataProcessing/JobServer/ApplicationServer/config

Edit the values marked xxx to record the details of your installation.

Compile PLUMB Java Code

These commands build the JARs needed to run plumbd:

cd BigDataProcessing/JobServer/ApplicationServer
make

(This assumes you have a Java JDK and Maven installed. TODO: confirm any other build dependencies, or decide whether to distribute pre-built code.)
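Before running make, it may be worth checking that the build tools are on the PATH. The sketch below uses the standard command -v lookup; require_tools is a hypothetical helper, and the tool names in the example (javac for the JDK, mvn for Maven, make) are our assumption about what the build invokes.

```shell
# Hypothetical helper (not part of Plumb): report which of the named
# commands are available on PATH; returns non-zero if any is missing.
require_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example:
#   require_tools javac mvn make && make
```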

Systemd Daemon Setup

We recommend using systemd to start/stop hled and plumbd. Templates for these daemons are included in the distribution.

First, edit the systemd/hled.service and systemd/plumbd.service files and set the user and group names to those used to run Plumb (we recommend the same user and group as the YARN/HDFS superuser). When this is done:

  1. Put them into place:

    sudo cp systemd/*.service /etc/systemd/system/

  2. Reload systemd’s list of daemons:

    sudo systemctl daemon-reload

  3. Start the HLE and Plumb daemons using systemd:

    sudo systemctl start hled
    sudo systemctl start plumbd
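Once both daemons are started, their state can be checked with systemctl is-active. The sketch below wraps that in a small loop; report_units is a hypothetical helper, and it prints "unknown" when systemctl is unavailable or returns no state.

```shell
# Hypothetical helper (not part of Plumb): print the systemd state of each
# named unit, e.g. "hled: active" or "plumbd: failed".
report_units() {
    for unit in "$@"; do
        # is-active exits non-zero for anything but "active", so ignore
        # the exit status and keep only the printed state.
        state=$(systemctl is-active "$unit" 2>/dev/null) || true
        echo "$unit: ${state:-unknown}"
    done
}

report_units hled plumbd
```

If a unit is not reported as active, sudo systemctl status hled or journalctl -u hled (and likewise for plumbd) shows the most recent log output for that daemon.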