Terra/Google Cloud Interaction

As a starting point, it is worth explaining a little about Terra and Google Cloud individually, and then describing how they interact.

Terra

Terra can serve a number of functions (which are probably best learned about through the Terra support pages or from the Terra team itself), but we use it primarily to run WDL+Cromwell pipelines. Pipelines are run within a Terra Workspace, and their output is written to a cryptically named Google Bucket associated with that Workspace. We do not recommend storing any other data in this Google Bucket or using it for any purpose other than passively holding files generated by pipeline runs.

Google Cloud

Google Cloud can also be used for a number of purposes (storage, compute, etc.), but we will focus here on its storage capabilities. The lab has multiple Google Cloud Projects (with names like Genomics-Xavier), each of which allows users to create Google Buckets in which to store data.

Interaction

You can think of the interaction of the two as follows: Terra runs pipelines and by default stores their output in its own Google Bucket, but users can specify pipeline inputs (and, depending on the pipeline, outputs) as paths to files in Google Buckets that are not linked to Terra.
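
To make the idea of "paths to files in Google Buckets" concrete, here is a minimal Python sketch of what such pipeline inputs might look like. Everything named in it is an assumption for illustration: the bucket names, the workflow and input names (my_workflow.reads_fastq and so on), and the helper functions are hypothetical and do not come from Terra or from any specific pipeline. It only shows that an input is a gs:// path, and that the bucket portion of the path tells you whether a file lives in the Workspace bucket or in a lab-managed bucket.

```python
from urllib.parse import urlparse

# Hypothetical bucket names, for illustration only.
WORKSPACE_BUCKET = "fc-00000000-0000-0000-0000-000000000000"  # placeholder Workspace bucket
LAB_BUCKET = "example-lab-data"                               # placeholder lab-managed bucket


def split_gs_uri(uri):
    """Split 'gs://bucket/path/to/file' into (bucket, object_path)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def is_workspace_bucket_path(uri, workspace_bucket=WORKSPACE_BUCKET):
    """Return True if the file lives in the Terra Workspace bucket."""
    bucket, _ = split_gs_uri(uri)
    return bucket == workspace_bucket


# Hypothetical pipeline inputs that point at a lab bucket rather than
# at the Workspace bucket.
pipeline_inputs = {
    "my_workflow.reads_fastq": f"gs://{LAB_BUCKET}/raw/sample1.fastq.gz",
    "my_workflow.reference_fasta": f"gs://{LAB_BUCKET}/reference/genome.fa",
}

for name, uri in pipeline_inputs.items():
    location = "Workspace bucket" if is_workspace_bucket_path(uri) else "lab bucket"
    print(f"{name}: {uri} ({location})")
```

A check like this can be a quick way to confirm that source data stays in lab-managed buckets and that only pipeline-generated files end up in the Workspace bucket.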