This repository contains an example of how to leverage Cloud Dataflow and BigQuery to view Dialogflow interactions.
The Pipeline Steps are as follows:
- Dialogflow Interactions are logged to Google Cloud Logging
- A Cloud Logging sink sends the log messages to Cloud Pub/Sub
- Dataflow processes the textPayload and streams the parsed fields to BigQuery
- The logged interactions are then available for analysis in BigQuery

Note: Dialogflow interaction logs are sent to Cloud Logging as a text payload. This code parses the text payload into the structured format defined by the BigQuery schema in the Dataflow code.
You can change the schema as required in the Dataflow code to include other key:value pairs extracted from Cloud Logging. Here is a reference to the current schema:
| Field name | Type |
|---|---|
| session_id | STRING |
| trace | STRING |
| caller_id | STRING |
| | STRING |
| timestamp | TIMESTAMP |
| receiveTimestamp | TIMESTAMP |
| resolved_query | STRING |
| string_value | STRING |
| speech | STRING |
| is_fallback_intent | STRING |
| webhook_for_slot_filling_used | STRING |
| webhook_used | STRING |
| intent_name | STRING |
| intent_id | STRING |
| action | STRING |
| source | STRING |
| error_type | STRING |
| code | STRING |
| insertId | STRING |
| logName | STRING |
| lang | STRING |
| textPayload | STRING |
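As an illustration of how key:value pairs can be extracted from the text payload, here is a minimal, hypothetical sketch. The field list and regex below are assumptions for illustration only; the actual field list and parsing logic live in the Dataflow code:

```python
import re

# Hypothetical sketch only: assumes the text payload contains quoted
# "key": "value" pairs. The real parsing lives in the Dataflow code.
FIELDS = ["session_id", "resolved_query", "intent_name", "lang"]

def parse_text_payload(text_payload):
    """Extract known schema fields from a raw Dialogflow text payload."""
    row = {"textPayload": text_payload}
    for field in FIELDS:
        # Match entries such as: "resolved_query": "book a flight"
        match = re.search(r'"%s"\s*:\s*"([^"]*)"' % field, text_payload)
        if match:
            row[field] = match.group(1)
    return row

payload = '{"session_id": "abc-123", "resolved_query": "hi", "lang": "en"}'
print(parse_text_payload(payload))
```

Each extracted key becomes a column in the BigQuery row; fields absent from a given payload are simply omitted, which BigQuery accepts for nullable columns.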
- Enable the Dataflow API:

  ```shell
  gcloud services enable dataflow.googleapis.com
  ```
- Create a Cloud Storage bucket for Dataflow staging:

  ```shell
  gsutil mb gs://[BUCKET_NAME]/
  ```
- In the Google Cloud Console Storage Browser, create a folder named `tmp` in the newly created bucket.
- Create a Pub/Sub topic:

  ```shell
  gcloud pubsub topics create [TOPIC_NAME]
  ```
- Create a Cloud Logging sink that routes matching log entries to the topic:

  ```shell
  gcloud logging sinks create [SINK_NAME] \
    pubsub.googleapis.com/projects/[PROJECT_ID]/topics/[TOPIC_NAME] \
    --log-filter="resource.type=global"
  ```

  The command prints the sink's writer identity (a service account); grant it the Pub/Sub Publisher role on the topic so log entries can be delivered.
- Install the Apache Beam GCP library in a virtual environment:

  ```shell
  python3 -m virtualenv tempenv
  source tempenv/bin/activate
  pip install 'apache-beam[gcp]'
  ```
- Create a BigQuery dataset.
- Deploy the Dataflow job:

  ```shell
  python3 stackdriverdataflowbigquery.py --project=[YOUR_PROJECT_ID] \
    --input_topic=projects/[YOUR_PROJECT_ID]/topics/[YOUR_TOPIC_NAME] \
    --runner=DataflowRunner \
    --temp_location=gs://[YOUR_DATAFLOW_STAGING_BUCKET]/tmp \
    --output_bigquery=[YOUR_BIGQUERY_DATASET].[YOUR_BIGQUERY_TABLE] \
    --region=us-central1
  ```
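For orientation, the overall shape of such a streaming pipeline looks roughly like the sketch below. This is a simplified assumption of the structure, not the contents of stackdriverdataflowbigquery.py; the schema string mirrors the table above, and the trivial decode/parse steps stand in for the real text-payload parsing:

```python
# Sketch of the pipeline shape; run() and the trivial parse step are
# illustrative assumptions, not the actual implementation.

# BigQuery schema matching the table above, in WriteToBigQuery's
# "name:TYPE,name:TYPE" string form.
SCHEMA = ",".join([
    "session_id:STRING", "trace:STRING", "caller_id:STRING",
    "timestamp:TIMESTAMP", "receiveTimestamp:TIMESTAMP",
    "resolved_query:STRING", "string_value:STRING", "speech:STRING",
    "is_fallback_intent:STRING", "webhook_for_slot_filling_used:STRING",
    "webhook_used:STRING", "intent_name:STRING", "intent_id:STRING",
    "action:STRING", "source:STRING", "error_type:STRING", "code:STRING",
    "insertId:STRING", "logName:STRING", "lang:STRING",
    "textPayload:STRING",
])

def run(input_topic, output_table, pipeline_args):
    # Imported inside the function so the schema above can be inspected
    # without Apache Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(pipeline_args, streaming=True)
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(topic=input_topic)
            | "Decode" >> beam.Map(lambda msg: msg.decode("utf-8"))
            # Stand-in for the real parsing, which extracts the schema
            # fields from the raw text payload.
            | "Parse" >> beam.Map(lambda text: {"textPayload": text})
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                output_table,
                schema=SCHEMA,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )
```

The job runs as a streaming pipeline, so it keeps consuming from the Pub/Sub topic and appending rows to the BigQuery table until it is cancelled.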
- Enable Dialogflow logging to Cloud Logging:

  Enable "Log interactions to Dialogflow and Google Cloud" in your agent settings (see https://cloud.google.com/dialogflow/docs/history#access_all_logs).

  Once interaction logging is enabled, new Dialogflow interactions will appear in BigQuery.
This is not an officially supported Google product.
