All Collections
Data Destinations
Querying Bigtable with BigQuery
Querying Bigtable with BigQuery

Use Google BigQuery to read information from a Bigtable.

B
Written by Ben Mills
Updated over a week ago

We've seen how to connect to a Google Bigtable and send your data to it.

Now let's say you want to run queries against that data with BigQuery. This tutorial will show you how to:

  1. Prepare a Dataset

  2. Create a Table from your Bigtable

  3. Query Your Table Using BigQuery

Important Note Before We Begin

Google's documentation warns that BigQuery is only available if your Bigtable instance exists in the following regions and zones:

  • us-central1-b

  • us-central1-c

  • europe-west1-b

  • europe-west1-c

If you plan to use BigQuery, your Bigtable instance must be set up accordingly.

Now that that's clear, we're ready!

Example Scenario

For this tutorial, I am going to pretend that I need to look up the VIN (Vehicle Identification Number) of the vehicle that my devices are plugged into. This tutorial showed you how to write a rule to capture the device ID and VIN from a Trip Start event.
The output of my rule is being sent to a Google Bigtable, and each event looks like this:

I'm going to use BigQuery to pull this information for all of my devices from my Bigtable.

Prepare a DataSet

Open the BigQuery console window.

On the left, you will see the name of the GCP project that is currently loaded. To switch to a different project, click on the project menu arrow, hover over Switch to project and then select the project where your Bigtable instance is located.

Click on the project menu arrow again and select Create new dataset

In the create form, specify an ID for your dataset. You can also change the location and expiration from their default values if you wish. Click OK to finish.

Create a Table to Run Queries Against

Your dataset will appear under the name of the GCP project on the left. Hover over it to reveal a menu arrow. Click the menu arrow, and select Create new table.

On the create table page, under the Source Data section, select "Google Cloud Bigtable" from the location dropdown menu. The file format will automatically change to "Cloud Bigtable." Then provide the location of your Bigtable in the form of a URL. 

The URL path is as follows:  

https://googleapis.com/bigtable/projects/{{project-id}}/instances/{{instance-id}}/tables/{{table-name}} 

  • {{project-id}}  is the project containing your Cloud Bigtable instance.

  • {{instance-id}}  is the Cloud Bigtable instance ID.

  • {{table-name}} is the name of the table you're querying.

Assuming that you're querying a table that is receiving data from the BitBrew platform, you can retrieve all of this information in the Destinations table in the Dashboard app.

Under the Destination Table section, provide a name for the table you're going to use to query (I just made mine match my Bigtable name). 

Under the Column Families section, enter "bitbrew" in the Column Family and Qualifiers box and ensure that Type is set to "Bytes", and Encoding is set to "Binary". 

This creates a top-level group for the columns we're going to create and assigns a default type and encoding to any columns not specifically assigned. You can choose the value for "Only Read Latest" based on your use case.

Click the + icon next to the Column Family and Qualifiers box to create a new row. Here, you can specify a column to extract from the information in your particular table. 

In this example, I am going to specify the three fields I'm interested in, which are nested inside of the "message" column of my Bigtable.

Select a Type for the values of the data that will be in your column, but the Encoding is always "Binary".

Choose from a few options under the Options section, then Click Create.

Query Your Table Using BigQuery

Click the Query Table button in the top right of the screen, which will open the "New Query" text box. This uses a SQL-like syntax. 

In the text box that appears, create your query statement. 

My query selects the three columns, renames them for ease of reading, and does some flattening since all of the fields are 'repeated.'

The query is typed out below in case you want to copy and paste for testing purposes. 

SELECT
  bitbrew.message_deviceId.cell.value AS DeviceId,
  bitbrew.message_vin.cell.value AS VIN,
  bitbrew.message_timestamp.cell.value AS Timestamp,
FROM (FLATTEN(
         FLATTEN(
           [bitbrew-bigtable-demo:BitBrew_Data.device_id_vin],
           bitbrew.message_vin.cell.value),
         bitbrew.message_timestamp.cell.value
        )
       )

Information about how to form queries can be found in the BigQuery documentation.

The result of my query is a table that contains three columns: DeviceId, VIN, and Timestamp.

Recap

Here's what we've learned:

  • When using BigQuery, you must first create a table from your Bigtable to run queries against.

  • BigQuery uses SQL syntax (kind of like our rule syntax).

Thanks for following along, and feel free reach out with any questions!

Did this answer your question?