Google Cloud Setup

General Information

Any pipeline we use will likely rely on three products within the Google Cloud ecosystem: Compute Engine for genomics and microbiome (link TBA), Cloud Storage for genomics and microbiome (link TBA), and the Container Registry. As the names imply, the pipeline performs computations on Compute Engine, stores data in Cloud Storage, and uses Docker images stored in the Container Registry. If you are at the Broad Institute, you should be able to follow this link and use your Broad Gmail account to sign in to the platform. The Xavier Lab has projects called genomics-xavier and microbiome-xavier, and all activities take place under these projects. Billing is also organized at the project level.

Web Access

Google Cloud can be controlled through the web interface. Use the links above and below.

  • Create instance
  • Create bucket
  • Please follow the naming convention.
  • Cost: No need to worry unless you are using more than 1 TB of bucket or hard disk storage, or running more than 16 cores or more than 64 GB of memory on average per month. If so, contact Ariel.
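
The cost guideline above can be expressed as a quick check before requesting resources. This is only a sketch: the function name exceeds_cost_guideline is hypothetical, the inputs are whole numbers you supply yourself, and 1 TB is assumed to mean 1024 GB.

```shell
# Thresholds taken from the cost guideline above.
MAX_STORAGE_GB=1024   # 1 TB of bucket or disk, assuming 1 TB = 1024 GB
MAX_CORES=16
MAX_MEMORY_GB=64

# Hypothetical helper: returns success (exit 0) when any threshold is exceeded.
# Arguments: storage in GB, average cores, average memory in GB (integers).
exceeds_cost_guideline() {
  local storage_gb="$1" cores="$2" memory_gb="$3"
  [ "$storage_gb" -gt "$MAX_STORAGE_GB" ] ||
    [ "$cores" -gt "$MAX_CORES" ] ||
    [ "$memory_gb" -gt "$MAX_MEMORY_GB" ]
}

# Example: 500 GB of storage, 32 cores, 64 GB of memory (cores exceed the limit).
if exceeds_cost_guideline 500 32 64; then
  echo "Above guideline: contact Ariel before proceeding."
fi
```

Values exactly at a threshold (for example 16 cores) are treated as within the guideline, matching the ">" wording above.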

Command Line Access

Authentication

In order to access Google Cloud from the command line, you will need to authenticate and configure your Google Cloud account. You can do that through the following steps:

  1. Log onto the Broad cluster

  2. Carry out the following shell commands and follow the instructions:

    # load google cloud software development kit
    use .google-cloud-sdk
    
    # authenticate (this should only be necessary the first time you use the google cloud sdk)
    gcloud auth login
    
    # set the google cloud project (pick one of the two lab projects)
    gcloud config set project <genomics-xavier or microbiome-xavier>
    

Congratulations! At this point, you should be able to interact with Google Cloud. It may be useful to check out the documentation for gsutil, the command line tool for Cloud Storage, here.
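
To confirm that authentication and project selection worked, something like the following may help. It is a sketch under assumptions: the helper name check_gcloud_setup is hypothetical, and the project shown is just one of the two lab projects mentioned above.

```shell
# Assumption: one of the two lab projects named in this document.
PROJECT="genomics-xavier"

# Hypothetical helper: prints the credentialed accounts and the active
# project, then lists the buckets that belong to the given project.
check_gcloud_setup() {
  local project="$1"
  gcloud auth list
  gcloud config get-value project
  gsutil ls -p "$project"
}

# Only run the check where the SDK has actually been loaded.
if command -v gcloud >/dev/null 2>&1 && command -v gsutil >/dev/null 2>&1; then
  check_gcloud_setup "$PROJECT"
fi
```

If the project printed by gcloud config get-value is not the one you expect, rerun the gcloud config set project step.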

Q&A

  • Root access. On the official image:

    sudo bash
    
  • Disk expansion:
    1. Stop your instance
    2. In the Google Compute Engine web interface, go to Disks (left).
    3. Click the disk you want to expand
    4. Click Edit
    5. Change the size and save
  • Other non-urgent questions. Please raise an issue in this documentation or in the relevant repo.
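
The disk expansion steps above can probably also be done from the command line. The sketch below wraps them in a hypothetical helper, resize_disk; the instance name, disk name, size, and zone are all placeholders supplied by the caller.

```shell
# Hypothetical helper mirroring the web steps: stop the instance,
# resize the disk, then start the instance again.
# Arguments: instance name, disk name, new size in GB, zone.
resize_disk() {
  local instance="$1" disk="$2" new_size_gb="$3" zone="$4"
  gcloud compute instances stop "$instance" --zone "$zone" &&
    gcloud compute disks resize "$disk" --size "${new_size_gb}GB" --zone "$zone" &&
    gcloud compute instances start "$instance" --zone "$zone"
}
```

A call might look like resize_disk example-instance example-disk 200 us-east1-b. Disks can only be grown, not shrunk, and depending on the image the file system inside the instance may still need to be extended after the resize.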

Notes

  • Christian has noticed that wildcard matching within gsutil does not work when using Z shell (zsh). He knows it works in bash, but can’t speak to any other shells.
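
A likely cause of the Z shell behavior is that zsh tries to expand an unquoted wildcard against local files and stops with "no matches found" before gsutil ever runs, whereas bash passes an unmatched pattern through unchanged. Quoting the URL is a possible workaround in either shell. The bucket and path below are hypothetical.

```shell
# Hypothetical bucket and path, for illustration only.
# Single quotes keep the local shell from touching the *.
PATTERN='gs://example-bucket/reads/*.fastq.gz'

# gsutil receives the pattern intact and expands it against the bucket.
if command -v gsutil >/dev/null 2>&1; then
  gsutil ls "$PATTERN"
fi
```

Typing the quoted URL directly, as in gsutil ls 'gs://example-bucket/reads/*.fastq.gz', should behave the same way.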