Database Connections

Sidemantic supports multiple database engines for executing semantic layer queries. Connection strings are specified in your YAML configuration files or via CLI flags.

Supported Databases

Database      Notes
DuckDB        Default, fully supported
MotherDuck    Cloud DuckDB, fully supported
PostgreSQL    Full feature support
BigQuery      Google Cloud Platform
Snowflake     Cloud data warehouse
ClickHouse    OLAP database
Databricks    Lakehouse platform
Spark SQL     Apache Spark (via Thrift server)

Connection String Format

In YAML Files

Add a connection field to your semantic layer YAML:

# semantic_layer.yml
connection: duckdb:///path/to/database.duckdb

models:
  - name: orders
    table: orders
    primary_key: order_id
    # ...

With CLI

Override the connection via the --connection flag:

# Query command
sidemantic query models/ --connection "postgres://localhost:5432/db" \
  --sql "SELECT revenue FROM orders"

# Workbench
sidemantic workbench models/ --connection "bigquery://project/dataset"

# PostgreSQL server
sidemantic serve models/ --connection "snowflake://account/db/schema"

Database-Specific Configuration

DuckDB (Default)

No installation required.

In-memory (default):

connection: duckdb:///:memory:

Persistent file:

connection: duckdb:///data/analytics.duckdb

Read-only:

connection: duckdb:///data/warehouse.duckdb?access_mode=read_only

CLI example:

sidemantic query models/ --connection "duckdb:///data.duckdb" \
  --sql "SELECT * FROM orders"

MotherDuck

MotherDuck is cloud-hosted DuckDB; no additional installation is required (it uses the DuckDB client).

Format:

connection: duckdb://md:database_name

Example:

connection: duckdb://md:my_analytics_db

Authentication:

Set your MotherDuck service token as an environment variable:

export MOTHERDUCK_TOKEN=your_service_token_here

Get your token from motherduck.com.

CLI example:

export MOTHERDUCK_TOKEN=your_token
sidemantic query models/ --connection "duckdb://md:my_db" \
  --sql "SELECT * FROM orders"

Features:

  • Cloud-native DuckDB with data persistence
  • Share databases across teams
  • Access from anywhere
  • Same DuckDB SQL syntax and features
  • Automatic scaling

See also: MotherDuck example

PostgreSQL

Install: pip install sidemantic[postgres]

Basic:

connection: postgres://username:password@host:port/database

Example:

connection: postgres://analyst:secret@localhost:5432/analytics

With SSL:

connection: postgres://user:pass@db.example.com:5432/analytics?sslmode=require

CLI example:

sidemantic query models/ \
  --connection "postgres://analyst:secret@localhost:5432/analytics" \
  --sql "SELECT revenue FROM orders"

Features:

  • Full semantic layer support
  • Symmetric aggregates for fan-out joins
  • Arrow format for performance

BigQuery

Install: pip install sidemantic[bigquery]

Format:

connection: bigquery://project-id/dataset-id

Example:

connection: bigquery://my-gcp-project/analytics_dataset

CLI example:

sidemantic query models/ \
  --connection "bigquery://my-project/my-dataset" \
  --sql "SELECT revenue FROM orders"

Authentication:

Local development:

gcloud auth application-default login

Service account:

export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"

Features:

  • Full semantic layer support
  • Uses FARM_FINGERPRINT for symmetric aggregates
  • Arrow format via BigQuery Storage API

Snowflake

Install: pip install sidemantic[snowflake]

Format:

connection: snowflake://user:password@account/database/schema?warehouse=wh&role=role

Example:

connection: snowflake://analyst:pass@xy12345.us-east-1/ANALYTICS/PUBLIC?warehouse=COMPUTE_WH&role=ANALYST

CLI example:

sidemantic query models/ \
  --connection "snowflake://user:pass@account/db/schema?warehouse=wh" \
  --sql "SELECT revenue FROM orders"

Parameters:

  • account: Snowflake account (e.g., xy12345.us-east-1)
  • database: Database name
  • schema: Schema name
  • warehouse: Compute warehouse (query parameter)
  • role: Role to use (query parameter)

Features:

  • Full semantic layer support
  • Optimized symmetric aggregates
  • Arrow format for performance

ClickHouse

Install: pip install sidemantic[clickhouse]

Format:

connection: clickhouse://user:password@host:port/database

Example:

connection: clickhouse://default:clickhouse@localhost:8123/default

With SSL:

connection: clickhouse://user:pass@host:8443/db?secure=true

CLI example:

sidemantic query models/ \
  --connection "clickhouse://default:clickhouse@localhost:8123/default" \
  --sql "SELECT revenue FROM orders"

Features:

  • Full semantic layer support
  • Uses halfMD5 for symmetric aggregates
  • Optimized for OLAP workloads

Databricks

Install: pip install sidemantic[databricks]

Format:

connection: databricks://token@server-hostname/http-path?catalog=catalog&schema=schema

Example:

connection: databricks://dapi12345@dbc-a1b2c3d4.cloud.databricks.com/sql/1.0/warehouses/abc123?catalog=main&schema=default

CLI example:

sidemantic query models/ \
  --connection "databricks://dapi123@server/sql/1.0/warehouses/abc?catalog=main" \
  --sql "SELECT revenue FROM orders"

Parameters:

  • token: Personal access token (as username in URL)
  • server-hostname: Databricks workspace URL
  • http-path: SQL warehouse HTTP path
  • catalog: Unity Catalog (query parameter)
  • schema: Schema name (query parameter)

Features:

  • Full semantic layer support
  • Uses xxhash64 for symmetric aggregates
  • Optimized for lakehouse queries

Spark SQL

Install: pip install sidemantic[spark]

Format:

connection: spark://host:port/database

Example:

connection: spark://localhost:10000/default

With authentication:

connection: spark://username:password@host:10000/database

CLI example:

sidemantic query models/ \
  --connection "spark://localhost:10000/default" \
  --sql "SELECT revenue FROM orders"

Requirements:

  • Spark Thrift server must be running
  • Uses HiveServer2 protocol

Features:

  • Full semantic layer support
  • Uses xxhash64 for symmetric aggregates
  • Compatible with Apache Spark 3.x

Environment Variables

Store credentials in environment variables for security. Never commit credentials to version control.

Environment variables can be used in:

  • YAML configuration files, via ${VAR} syntax
  • CLI flags, via shell variables like $VAR

In YAML Files

Sidemantic supports environment variable substitution in YAML files using standard syntax:

Syntax:

  • ${ENV_VAR} - Substituted with the environment variable's value
  • ${ENV_VAR:-default} - Uses the default if the variable is not set
  • $ENV_VAR - Simple form (uppercase variable names only)
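The ${ENV_VAR:-default} form follows standard shell parameter-expansion semantics, which you can verify in any POSIX shell:

```shell
unset SCHEMA_NAME
echo "${SCHEMA_NAME:-public}"   # prints: public

SCHEMA_NAME=prod
echo "${SCHEMA_NAME:-public}"   # prints: prod
```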

Example YAML:

# semantic_layer.yml
connection: postgres://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:5432/${DB_NAME}

models:
  - name: orders
    table: ${SCHEMA_NAME:-public}.orders
    primary_key: order_id

Set environment variables:

export DB_USER=analyst
export DB_PASSWORD=secret
export DB_HOST=localhost
export DB_NAME=analytics
export SCHEMA_NAME=prod

Load configuration:

sidemantic query semantic_layer.yml --sql "SELECT revenue FROM orders"

The environment variables are substituted when the YAML file is loaded.
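To preview what a substituted value will look like before loading the file, the shell's own ${VAR} expansion in a heredoc produces the same result (plain shell here, not Sidemantic):

```shell
export DB_USER=analyst DB_HOST=localhost

# The unquoted EOF delimiter lets the shell expand ${...} inside the heredoc
cat <<EOF
connection: postgres://${DB_USER}@${DB_HOST}:5432/analytics
EOF
# prints: connection: postgres://analyst@localhost:5432/analytics
```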

In CLI Flags

Use shell environment variables in CLI commands:

export DATABASE_URL=postgres://user:pass@localhost:5432/analytics

sidemantic query models/ --connection "$DATABASE_URL" \
  --sql "SELECT revenue FROM orders"

Full Connection Strings

Store complete connection string in an environment variable:

Example .env file:

# PostgreSQL
DATABASE_URL=postgres://user:pass@localhost:5432/analytics

# BigQuery
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
BIGQUERY_URL=bigquery://my-project/my-dataset

# Snowflake
SNOWFLAKE_URL=snowflake://user:pass@account/db/schema?warehouse=wh

# ClickHouse
CLICKHOUSE_URL=clickhouse://user:pass@localhost:8123/default

# Databricks
DATABRICKS_URL=databricks://token@server/http-path?catalog=main

# Spark
SPARK_URL=spark://localhost:10000/default

Use with CLI:

# Load .env file (set -a auto-exports each assignment and tolerates the
# comment lines above; `export $(cat .env | xargs)` does not)
set -a
. .env
set +a

# Use environment variable
sidemantic query models/ --connection "$DATABASE_URL" \
  --sql "SELECT revenue FROM orders"

sidemantic workbench models/ --connection "$SNOWFLAKE_URL"

sidemantic serve models/ --connection "$BIGQUERY_URL"

Component Environment Variables

Build connection strings from individual components:

Example .env file:

# PostgreSQL components
POSTGRES_HOST=db.example.com
POSTGRES_PORT=5432
POSTGRES_USER=analyst
POSTGRES_PASSWORD=secret
POSTGRES_DB=analytics

# Snowflake components
SNOWFLAKE_ACCOUNT=xy12345.us-east-1
SNOWFLAKE_USER=analyst
SNOWFLAKE_PASSWORD=secret
SNOWFLAKE_DATABASE=ANALYTICS
SNOWFLAKE_SCHEMA=PUBLIC
SNOWFLAKE_WAREHOUSE=COMPUTE_WH

# BigQuery components
BIGQUERY_PROJECT=my-gcp-project
BIGQUERY_DATASET=analytics_dataset
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

Build connection string in shell:

# PostgreSQL
CONNECTION="postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT}/${POSTGRES_DB}"
sidemantic query models/ --connection "$CONNECTION" --sql "SELECT revenue FROM orders"

# Snowflake
CONNECTION="snowflake://${SNOWFLAKE_USER}:${SNOWFLAKE_PASSWORD}@${SNOWFLAKE_ACCOUNT}/${SNOWFLAKE_DATABASE}/${SNOWFLAKE_SCHEMA}?warehouse=${SNOWFLAKE_WAREHOUSE}"
sidemantic workbench models/ --connection "$CONNECTION"

# BigQuery
CONNECTION="bigquery://${BIGQUERY_PROJECT}/${BIGQUERY_DATASET}"
sidemantic serve models/ --connection "$CONNECTION"
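One caveat when splicing components into a URL: passwords containing reserved characters (@, :, /) must be percent-encoded first, or the connection string will parse incorrectly. A portable sketch, assuming python3 is on the PATH and using a hypothetical password:

```shell
# Hypothetical password containing URL-reserved characters
POSTGRES_PASSWORD='p@ss:word/1'

# Percent-encode it with Python's urllib (safe='' encodes every reserved char)
ENCODED=$(python3 -c "import urllib.parse, sys; print(urllib.parse.quote(sys.argv[1], safe=''))" "$POSTGRES_PASSWORD")
echo "$ENCODED"   # prints: p%40ss%3Aword%2F1

CONNECTION="postgres://analyst:${ENCODED}@localhost:5432/analytics"
```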

Security Best Practices

Never commit credentials:

# Add to .gitignore
echo ".env" >> .gitignore
echo "*.secrets" >> .gitignore

Use secret management:

# AWS Secrets Manager
export DATABASE_URL=$(aws secretsmanager get-secret-value \
  --secret-id prod/database-url \
  --query SecretString \
  --output text)

# Google Secret Manager
export DATABASE_URL=$(gcloud secrets versions access latest \
  --secret="database-url")

# HashiCorp Vault
export DATABASE_URL=$(vault kv get -field=url secret/database)

Restrict file permissions:

chmod 600 .env

Per-Environment Configuration

Development:

# .env.development
DATABASE_URL=duckdb:///dev.duckdb

Staging:

# .env.staging
DATABASE_URL=postgres://user:pass@staging-db:5432/analytics

Production:

# .env.production
DATABASE_URL=snowflake://user:pass@prod-account/db/schema?warehouse=wh

Load based on environment:

ENV=${ENV:-development}
set -a
. .env.${ENV}
set +a

sidemantic query models/ --connection "$DATABASE_URL" \
  --sql "SELECT revenue FROM orders"

Testing Connections

Test your connection string:

# Create a simple test model
cat > test.yml <<EOF
connection: postgres://localhost:5432/test

models:
  - name: test
    table: (SELECT 1 as id, 100 as value)
    primary_key: id
    metrics:
      - name: total
        agg: sum
        sql: value
EOF

# Run a query
sidemantic query test.yml --sql "SELECT total FROM test"
# Should output: 100

CLI Connection Examples

Query Command

# DuckDB file
sidemantic query models/ \
  --connection "duckdb:///data.duckdb" \
  --sql "SELECT revenue FROM orders"

# PostgreSQL
sidemantic query models/ \
  --connection "postgres://localhost:5432/analytics" \
  --sql "SELECT revenue, status FROM orders"

# BigQuery
sidemantic query models/ \
  --connection "bigquery://my-project/analytics" \
  --sql "SELECT revenue FROM orders WHERE status = 'completed'"

Workbench

# Local DuckDB
sidemantic workbench models/ --connection "duckdb:///warehouse.duckdb"

# Snowflake
sidemantic workbench models/ \
  --connection "snowflake://user:pass@account/db/schema"

# ClickHouse
sidemantic workbench models/ \
  --connection "clickhouse://localhost:8123/default"

PostgreSQL Server

# Serve DuckDB over PostgreSQL protocol
sidemantic serve models/ \
  --connection "duckdb:///data.duckdb" \
  --port 5433

# Serve BigQuery over PostgreSQL protocol
sidemantic serve models/ \
  --connection "bigquery://project/dataset" \
  --port 5433 \
  --username admin \
  --password secret

Troubleshooting

Connection Errors

PostgreSQL:

psycopg.OperationalError: connection failed
  • Verify host, port, username, password
  • Check PostgreSQL is running
  • Test with psql first: psql -h host -p 5432 -U user -d database

BigQuery:

google.auth.exceptions.DefaultCredentialsError
  • Run gcloud auth application-default login
  • Or set GOOGLE_APPLICATION_CREDENTIALS environment variable

Snowflake:

snowflake.connector.errors.DatabaseError: 250001
  • Verify account identifier format
  • Check username/password
  • Ensure warehouse is running

Invalid Connection String

ValueError: Unsupported connection URL
  • Check connection string format matches examples above
  • Ensure the required extras are installed: pip install sidemantic[<database>] (e.g., sidemantic[postgres])