Creating a Beacon Server in the NeCTAR computer cloud

 

About Beacon

 

A local Beacon Server can be hosted on the NeCTAR Australian Research Computer Cloud. To set up a Beacon Server on NeCTAR, researchers need an allocation which can be requested here. Trial allocations are simple to obtain and are available to most Australian researchers for limited time periods.

 

Once you have a NeCTAR allocation, you can launch a Virtual Machine (VM) running the latest Ubuntu version and deploy a Beacon server on this instance. You will be able to upload variant data to the server and convert the files into a Beacon database. For security and privacy reasons, do not store the original VCF files with variants on the Beacon server.

 

Once set up, the Beacon server can be queried through the web interface. In addition, it can be connected to the Beacon Network. Note that the Beacon queries use 0-based positions while VCF files contain 1-based positions.

 

A detailed step-by-step protocol for deploying a Beacon server on an Ubuntu VM is provided below. We use a Mac computer to communicate with NeCTAR; the steps on Linux/Unix are similar.

 

The Beacon installation procedure has four stages, covered in the six numbered sections below. First, you set up security credentials for accessing your NeCTAR VM, then you set up and run a NeCTAR VM. Once those steps are complete, you install a Beacon Server on the VM and test it. Finally, you add your data to the Beacon Server for public querying.

 

1. Add your personal key to your NeCTAR project

You need SSH keys to communicate with the VM. The procedure described below is for Mac users. Alternatively, you can create a new Key Pair in the NeCTAR dashboard under Access & Security > Key Pairs. The private key will be downloaded to your computer. It can then be used to communicate with any VM deployed with that Key Pair, either via ssh (Mac, Unix) or via an ssh client such as PuTTY (Windows).

 

In the procedure described below we first create an SSH key pair on the local computer and then upload the public key to NeCTAR.

 

a) Open a terminal on your computer. You will be in your home directory (e.g. /Users/john). Type:

 

cd .ssh

ls

 

If there is no id_rsa.pub file in the .ssh directory, generate one by typing:

 

ssh-keygen -t rsa

 

Display the content of the file by typing:

 

more id_rsa.pub

 

Select the content of the id_rsa.pub file and copy it (e.g. with a right mouse click).
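Step a can also be sketched as a single shell function. This is a minimal sketch, not part of the original protocol: the function name ensure_ssh_key is hypothetical, and the default key path ~/.ssh/id_rsa is assumed to match the file names used above.

```shell
# Hypothetical helper: print the public key, generating the key pair first if it is missing.
# The optional argument is the private key path (assumed default: ~/.ssh/id_rsa).
ensure_ssh_key() {
    key="${1:-$HOME/.ssh/id_rsa}"
    if [ ! -f "$key.pub" ]; then
        # ssh-keygen needs the target directory to exist
        mkdir -p "$(dirname "$key")"
        chmod 700 "$(dirname "$key")"
        # Prompts interactively for an optional passphrase
        ssh-keygen -t rsa -f "$key"
    fi
    cat "$key.pub"
}

# Example:
# ensure_ssh_key
# On a Mac, the key can be copied to the clipboard instead of selecting it by hand:
# pbcopy < ~/.ssh/id_rsa.pub
```

An existing key pair is left untouched; the function only prints its public half.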

 

b) Log in to the NeCTAR dashboard using your AAF credentials (your Australian university username, such as uqjcitiz, and password).

 


Click Log In on the first screen.

 


Select your organisation (e.g. The University of Queensland). Tick the first box only.

 

 


 

On the AAF Authentication page, type in your username and password, then click the LOGIN button.

 

You should see your NeCTAR project at this stage.

 

 

c) Import your public key as a new key pair.

In your project, go to the Access & Security section and click the Key Pairs tab.

 


 

 

Click Import Key Pair.

Type in a name for the key pair, e.g. john_mac.


Paste in the content of the id_rsa.pub file from step a.

Click the 'Import Key Pair' button.


 

You should see the new key pair in the Key Pairs table.

The new key pair will simplify communication between the Ubuntu VM and your Mac computer.

 

 

2. Start a VM with the latest Ubuntu operating system

In your NeCTAR project, select the Instances tab.


 

Click Launch Instance.


 

Name the instance, e.g. john_beacon

Flavor: m1.medium (2 CPUs, 8 GB RAM, 60 GB ephemeral disk)

Image: Ubuntu Server 15.04 (Vivid Vervet) x86_64 (273.1 MB). Use the latest available Ubuntu image.

 


 

Access & Security

- choose the Key Pair you created in section 1.

- add web access by selecting Web-Services.

 

Availability Zone

- select a NeCTAR node (tested in QRIScloud).

 

Click Launch.

 

Record the IP address of the deployed Ubuntu server.

 

NOTE: it can take time for the server to start; wait a few minutes before attempting to access it.
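The waiting can be scripted rather than done by hand. The sketch below defines a hypothetical helper, retry, that reruns a command until it succeeds or a maximum number of attempts is reached. The helper name, the attempt count and the suggested nc port check are illustrative assumptions, not part of NeCTAR or Beacon.

```shell
# Hypothetical helper: retry <attempts> <delay_seconds> <command...>
# Runs the command until it succeeds; gives up after <attempts> tries.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while ! "$@"; do
        if [ "$i" -ge "$attempts" ]; then
            return 1
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 0
}

# Example: check every 30 seconds, up to 10 times, whether the SSH port answers.
# Assumes nc is installed and that port 22 is open in the instance's security groups.
# retry 10 30 nc -z '<IP address>' 22 && echo "server is reachable"
```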

 

 

 

 

 

 

3. Install Beacon server

For details see: https://github.com/maximilianh/ucscBeacon

 

Start a terminal window on your Mac and connect to the deployed Ubuntu VM as follows (where <IP address> is the IP address you recorded in the previous step).

 

ssh ubuntu@<IP address>

# no password required because the private key was used

 

Then run the following commands.

 

sudo apt-get update

sudo apt-get install apache2 git

sudo a2enmod cgi

sudo service apache2 restart

cd /usr/lib/cgi-bin

sudo git clone https://github.com/maximilianh/ucscBeacon.git

 

 

4. Test the Beacon server

Start a new Terminal window on your Mac.

Paste in the test commands below, replacing <IP address> with the IP address of your server.

 

curl 'http://<IP address>/cgi-bin/ucscBeacon/query?chromosome=1&position=10150&alternateBases=A&format=text'

 

curl 'http://<IP address>/cgi-bin/ucscBeacon/query?chromosome=1&position=4772339&alternateBases=T&format=text'

 

Both tests should return True.

 

NOTE: straight single quote marks ( ' ) must be used. If you edit the commands before pasting, use a plain text editor such as Sublime. TextEdit may cause problems because it can replace straight quotes with curly quotes ( ’ ).
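To avoid retyping the long query string, the test URLs can be assembled by a small shell function. This is a sketch only: the function name beacon_query_url is hypothetical, and the URL layout is copied from the two test commands above.

```shell
# Hypothetical helper: build a ucscBeacon query URL.
# Arguments: host, chromosome, 0-based position, alternate base(s)
beacon_query_url() {
    host="$1"
    chrom="$2"
    pos="$3"
    alt="$4"
    printf 'http://%s/cgi-bin/ucscBeacon/query?chromosome=%s&position=%s&alternateBases=%s&format=text\n' \
        "$host" "$chrom" "$pos" "$alt"
}

# Example (needs a running Beacon server):
# curl "$(beacon_query_url '<IP address>' 1 10150 A)"
# curl "$(beacon_query_url '<IP address>' 1 4772339 T)"
```

Because the URL is passed to curl inside double quotes, the & characters are not interpreted by the shell.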

 

 

5. Create web interface for the Beacon server

Access your server again from a Mac command line using:

ssh ubuntu@<IP address>

 

Then

cd /var/www/html/

 

Rename the existing index.html file using:

sudo mv index.html index.html.original

 

Copy the Beacon index.html file to /var/www/html/:

sudo cp /usr/lib/cgi-bin/ucscBeacon/index.html index.html

 

You can modify index.html, for instance to add some text describing your Beacon server.

 

Now the Beacon server can be queried through the web interface at:

http://<IP address>/index.html

 

Check the service using the provided test data in section 4.

 

 

6. Adding your own data

For details see:  https://github.com/maximilianh/ucscBeacon

 

You can upload a VCF file to the Beacon server and convert it into a Beacon database as follows. VCF files can be deleted after conversion.

 

ssh to your Beacon server:

ssh ubuntu@<IP address>

 

Navigate to the Beacon server directory:

cd /usr/lib/cgi-bin/ucscBeacon

 

Rename the test Beacon database:

sudo mv beaconData.sqlite beaconData.sqlite.old

 

Create a directory for your data, e.g. myData:

sudo mkdir myData

 

Change the ownership of the new directory to the ubuntu user:

sudo chown ubuntu myData

 

Open a new terminal window on your computer, navigate to the directory with your variant data and upload your VCF file:

scp myFile.vcf ubuntu@<IP address>:/usr/lib/cgi-bin/ucscBeacon/myData/

 

Convert the VCF data into Beacon database format as follows.

In the terminal connected to the Beacon server, change to the Beacon Server directory if you are not already there:

cd /usr/lib/cgi-bin/ucscBeacon/

then

sudo ./query myvcf myData/myFile.vcf

(for details see: https://github.com/maximilianh/ucscBeacon)

 

Query the Beacon with a variant from the VCF file as above.

 

Note that VCF uses 1-based positions, while Beacon uses 0-based positions. The conversion also strips the 'chr' prefix from chromosome names.
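The offset and the 'chr' stripping can be illustrated with a few lines of shell. This is a sketch under stated assumptions: the function names are hypothetical, the arithmetic assumes a simple single-nucleotide variant (Beacon position = VCF POS - 1), and first_variant_query assumes a tab-separated VCF whose first record has a single ALT allele.

```shell
# Convert a 1-based VCF POS to the 0-based position used in Beacon queries.
vcf_to_beacon_pos() {
    echo $(( $1 - 1 ))
}

# Remove a leading "chr" from a chromosome name, if present.
strip_chr() {
    echo "${1#chr}"
}

# Print query parameters for the first variant record in a VCF file.
first_variant_query() {
    awk -F'\t' '!/^#/ {
        sub(/^chr/, "", $1)    # chr1 -> 1
        printf "chromosome=%s&position=%d&alternateBases=%s\n", $1, $2 - 1, $5
        exit
    }' "$1"
}

# Example:
# first_variant_query myData/myFile.vcf
```

For a record on chr1 with POS 10151 and ALT A (made-up values), the function prints chromosome=1&position=10150&alternateBases=A.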

 

Delete your VCF file after the conversion.

 

###########################

 

Comments

 

1. In the default configuration, Beacon files are located on the NeCTAR VM root disk, which has limited space (~10 GB in total). VMs are generally also deployed with transient storage, whose size depends on the VM flavor; e.g. VMs with 2 CPUs come with ~60 GB of transient storage mounted at /mnt/. The downside of transient storage is clear from its name: it is transient, and you may lose Beacon data if your VM is powered off.

 

For big datasets we recommend using the transient storage at /mnt/ because it has more disk space. Move the Beacon directory into /mnt and create a symlink in the default location.

 

Log into Beacon server:

ssh ubuntu@<IP address>

cd /usr/lib/cgi-bin/

sudo mv ucscBeacon /mnt

sudo ln -s /mnt/ucscBeacon ucscBeacon

 

Upload your data to /mnt/ucscBeacon/
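Before moving anything, it may be worth comparing the free space on the root disk and on /mnt. The sketch below is illustrative only; free_kb is a hypothetical helper name, and it relies on the POSIX output format of df.

```shell
# Hypothetical helper: print the free space, in kilobytes, of the filesystem holding a path.
free_kb() {
    # -P gives the portable one-line-per-filesystem format; column 4 is "Available"
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Example, run on the VM:
# echo "root: $(free_kb /) KB free, /mnt: $(free_kb /mnt) KB free"
# du -sk /usr/lib/cgi-bin/ucscBeacon    # size of the Beacon directory, in KB
```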

 

 

2. Conversion of VCF files into a Beacon database is memory intensive, and it may fail for big datasets. Splitting big datasets into several subsets may help.
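One possible way to split a dataset is by chromosome. The sketch below is an assumption about how this could be done, not a procedure from the ucscBeacon documentation: the function name split_vcf_by_chrom is hypothetical, and it expects an uncompressed, tab-separated VCF.

```shell
# Hypothetical helper: split a VCF into one file per chromosome,
# repeating the header lines at the top of every output file.
# Arguments: input VCF, output directory
split_vcf_by_chrom() {
    vcf="$1"
    outdir="$2"
    mkdir -p "$outdir"
    awk -F'\t' -v dir="$outdir" '
        /^#/ { header = header $0 "\n"; next }    # collect header lines
        {
            out = dir "/" $1 ".vcf"
            if (!(out in seen)) {                 # first record for this chromosome
                printf "%s", header > out
                seen[out] = 1
            }
            print > out                           # later writes append within one awk run
        }' "$vcf"
}

# Example:
# split_vcf_by_chrom myData/myFile.vcf myData/split
```

Each output file stays open until awk exits, so files with very many contigs may hit the open-file limit. Each subset can then be converted with the same command as in section 6; whether repeated conversions add to an existing database should be checked in the ucscBeacon README.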

 

#########

 

Questions and comments are welcome. If you do use this tutorial to set up a Beacon server please let us know by emailing Igor at: i.makunin@uq.edu.au