Skip to content

A command-line tool to enrich your database schema with metadata (primarily column comments) and easily manage those comments. It's designed to help you document your database schema directly within the database itself, making it easier to understand and maintain.

License

Notifications You must be signed in to change notification settings

GoogleCloudPlatform/db-context-enrichment

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This is not an officially supported Google product. This project is not eligible for the Google Open Source Software Vulnerability Rewards Program, Google Cloud Platform/SecOps Terms of Service, How Gemini for Google Cloud uses your data. This tool is provided "as is" without warranty of any kind. Users are solely responsible for understanding and managing the tool's interaction with their databases. Use of this tool constitutes acceptance of all risks associated with database access, reading, usage, and modifications.

DB Schema Enricher

A command-line tool to enrich your database schema with metadata (primarily column comments) and easily manage those comments. It's designed to help you document your database schema directly within the database itself, making it easier to understand and maintain. It supports multiple database dialects, including PostgreSQL, MySQL, and SQL Server, with specific support for Google Cloud SQL.

Key Features

  • Add Comments: Generates SQL statements to add descriptive comments to your database columns. These comments can be generated automatically (using the tool's logic and optionally, additional context) or customized.
  • Get Comments: Retrieves existing column comments from your database and outputs them to the console or a file.
  • Delete Comments: Removes comments added by this tool (specifically, comments within <gemini> tags), allowing you to clean up or revert changes.
  • Apply Comments: Executes SQL statements from a file to apply comments to your database. This is useful for applying the SQL generated by the add-comments or delete-comments command.
  • Dry Run Mode: Preview the changes without actually modifying your database. This is enabled by default.
  • Multiple Database Dialects: Supports PostgreSQL, MySQL, SQL Server, and their respective Google Cloud SQL variants.
  • Targeted Table/Column Selection: Control which tables and columns are processed.
  • Customizable Enrichments: Choose which types of enrichments to include (e.g., examples, distinct values, etc.).
  • Contextual Descriptions: Provide additional knowledge/context files to influence the generation of descriptions for tables and columns. Descriptions are only generated for tables/columns present in the provided context.

Installation

From Source (Recommended)

  1. Prerequisites:

    • Go 1.21 or later installed.
    • make utility.
  2. Clone the repository:

    git clone https://github.com/GoogleCloudPlatform/db-context-enrichment.git
    cd db-context-enrichment
  3. Build the binary:

    make build

    This will create the db_schema_enricher executable in the current directory.

Google Cloud Authentication

Before using db_schema_enricher with Cloud SQL, you must authenticate with Google Cloud.

Run the following command and follow the instructions to log in with your Google Cloud account:

gcloud auth application-default login

This command obtains credentials and stores them locally, allowing db_schema_enricher (and other tools using Application Default Credentials) to access your Cloud SQL instances.

Usage

The general command structure is:

./db_schema_enricher <command> [flags]

Global Flags

These flags apply to all commands:

Flag Description Default
--dry-run Preview changes without modifying the database. Enabled by default. true
--dialect string Database dialect. (See supported dialects below)
--host string Database host (for non-Cloud SQL connections).
--port int Database port (for non-Cloud SQL connections).
--username string Database username.
--password string Database password.
--database string Database name.
--cloudsql-instance-connection-name string Cloud SQL instance connection name (required for Cloud SQL).
--cloudsql-use-private-ip Use the private IP address for the Cloud SQL connection. false
--gemini-api-key Gemini API key. Required for generating descriptions using additional context. Can also be set via the GEMINI_API_KEY environment variable.

Supported Dialects:

  • postgres
  • mysql
  • sqlserver
  • cloudsqlpostgres
  • cloudsqlmysql
  • cloudsqlsqlserver

Commands

add-comments

Generates SQL statements to add comments to database columns. By default, it outputs these statements to a file (e.g., your_database_comments.sql). You can then review the generated SQL and apply it using the apply-comments command.

It can use additional context provided via the --context flag to generate descriptions. To generate descriptions based on the provided context, you must set either the --gemini-api-key flag or the GEMINI_API_KEY environment variable.

Command-Specific Flags:

Flag Description Default
--out_file -o Path to the output SQL file. <database_name>_comments.sql
--tables Comma-separated list of tables and columns to include (e.g., 'table1[col1,col2],table2,table3[col4]'). If omitted, all tables and columns are processed.
--enrichments Comma-separated list of enrichments to include (e.g., 'description','examples,distinct_values,null_count'). If omitted, all enrichments are included.
--update_existing string How to handle existing comments: overwrite or append. overwrite
--context Comma-separated list of file paths containing additional context for generating descriptions. Descriptions are only generated for tables/columns mentioned in these files.

Example (Cloud SQL PostgreSQL - Dry Run):

./db_schema_enricher add-comments \
  --dialect=cloudsqlpostgres \
  --username='DB_USER' \
  --password='YOUR_PASSWORD' \
  --database=db_financial \
  --cloudsql-instance-connection-name=your-project:region:your-postgres-instance \
  --out_file=./financial_db_comments.sql \
  --context="context1.txt,context2.json" \
  --dry-run=true
  • context1.txt might contain: "The transactions table stores financial transaction records. The transaction_id is a unique identifier, and amount represents the transaction value."
  • context2.json might contain: { "users": { "user_id": "Unique user identifier", "username": "The user's login name" } }

Review the generated SQL file financial_db_comments.sql. Notice that comments will only be generated for the transactions and users tables and their specified columns, and the descriptions will be based on the content of context1.txt and context2.json, combined with the tool's other enrichments.

Then, apply the changes:

./db_schema_enricher apply-comments \
  --dialect=cloudsqlpostgres \
  --username='DB_USER' \
  --password='YOUR_PASSWORD' \
  --database=db_financial \
  --cloudsql-instance-connection-name=your-project:region:your-postgres-instance \
  --in_file=./financial_db_comments.sql

get-comments

Retrieves existing column comments from the database.

Command-Specific Flags:

Flag Description Default
--out_file -o Path to the output file to save the comments. <database_name>_comments.txt

Example (Cloud SQL Postgres):

db_schema_enricher get-comments \
  --dialect=cloudsqlpostgres \
  --username=db_user \
  --password='YOUR_PASSWORD' \
  --database=inventory_db \
  --cloudsql-instance-connection-name=your-project:region:your-postgres-instance \
  --out_file=./inventory_comments.txt

delete-comments

Generates SQL statements to remove comments added by this tool (specifically, comments within <gemini> tags). This allows you to selectively remove the comments added by the enricher without affecting other comments. The generated SQL is written to a file, and you should review it before applying it with apply-comments.

Command-Specific Flags:

Flag Description Default
--out_file -o Path to the output SQL file. <database_name>_comments.sql
--tables Comma-separated list of tables and columns to include for comment deletion (e.g., 'table1[col1,col2],table2,table3[col4]'). If omitted, affects all tables.

Example (SQL Server - Dry Run):

db_schema_enricher delete-comments \
  --dialect=cloudsqlsqlserver \
  --username=sqlserver_user \
  --password='YOUR_PASSWORD' \
  --database=sales_db \
  --cloudsql-instance-connection-name=your-project:region:your-sqlserver-instance \
  --out_file=./delete_sales_comments.sql \
  --tables="orders,customers[customer_id,email]" \
  --dry-run=true # recommended for delete comment

apply-comments

Executes SQL statements from a file to apply comments (or other SQL) to your database. This is typically used to apply the SQL generated by add-comments or delete-comments.

Command-Specific Flags:

Flag Description Default
--in_file -i Path to the input SQL file. <database_name>_comments.sql

Example (Applying Comments):

# After running add-comments/delete-comments and reviewing the output file:
db_schema_enricher apply-comments \
  --dialect=cloudsqlpostgres \
  --username=db_user \
  --password='YOUR_PASSWORD' \
  --database=financial_db \
  --cloudsql-instance-connection-name=your-project:region:your-postgres-instance \
  --in_file=./financial_db_comments.sql

About

A command-line tool to enrich your database schema with metadata (primarily column comments) and easily manage those comments. It's designed to help you document your database schema directly within the database itself, making it easier to understand and maintain.

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •