Database Connections
Sidemantic supports multiple database engines for executing semantic layer queries. Connection strings are specified in your YAML configuration files or via CLI flags.
Supported Databases
| Database | Production Ready | Notes |
|---|---|---|
| DuckDB | ✅ | Default, fully supported |
| MotherDuck | ✅ | Cloud DuckDB, fully supported |
| PostgreSQL | ✅ | Full feature support |
| BigQuery | ✅ | Google Cloud Platform |
| Snowflake | ✅ | Cloud data warehouse |
| ClickHouse | ✅ | OLAP database |
| Databricks | ✅ | Lakehouse platform |
| Spark SQL | ✅ | Apache Spark (via Thrift server) |
Connection String Format
In YAML Files
Add a connection field to your semantic layer YAML:
# semantic_layer.yml
connection: duckdb:///path/to/database.duckdb
models:
- name: orders
table: orders
primary_key: order_id
  # ...
With CLI
Override the connection via the --connection flag:
# Query command
sidemantic query models/ --connection "postgres://localhost:5432/db" \
--sql "SELECT revenue FROM orders"
# Workbench
sidemantic workbench models/ --connection "bigquery://project/dataset"
# PostgreSQL server
sidemantic serve models/ --connection "snowflake://account/db/schema"
Database-Specific Configuration
DuckDB (Default)
No installation required.
In-memory (default):
connection: duckdb:///:memory:
Persistent file:
connection: duckdb:///data/analytics.duckdb
Read-only:
connection: duckdb:///data/warehouse.duckdb?access_mode=read_only
CLI example:
sidemantic query models/ --connection "duckdb:///data.duckdb" \
--sql "SELECT * FROM orders"
MotherDuck
Cloud DuckDB with no additional installation required (uses DuckDB client).
Format:
connection: duckdb://md:database_name
Example:
connection: duckdb://md:my_analytics_db
Authentication:
Set your MotherDuck service token as an environment variable:
export MOTHERDUCK_TOKEN=your_service_token_here
Get your token from motherduck.com.
CLI example:
export MOTHERDUCK_TOKEN=your_token
sidemantic query models/ --connection "duckdb://md:my_db" \
--sql "SELECT * FROM orders"
Features:
- Cloud-native DuckDB with data persistence
- Share databases across teams
- Access from anywhere
- Same DuckDB SQL syntax and features
- Automatic scaling
See also: MotherDuck example
PostgreSQL
Install: pip install sidemantic[postgres]
Basic:
connection: postgres://username:password@host:port/database
Example:
connection: postgres://analyst:secret@localhost:5432/analytics
With SSL:
connection: postgres://user:pass@db.example.com:5432/analytics?sslmode=require
CLI example:
sidemantic query models/ \
--connection "postgres://analyst:secret@localhost:5432/analytics" \
--sql "SELECT revenue FROM orders"
Features:
- Full semantic layer support
- Symmetric aggregates for fan-out joins
- Arrow format for performance
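The "symmetric aggregates" feature mentioned above addresses a classic pitfall: when a one-to-many join fans out rows, a naive SUM over-counts the "one" side. The sketch below illustrates the problem and the deduplication idea in plain Python; it is an illustration only, not Sidemantic's implementation, which does the equivalent in SQL using engine hash functions (e.g., FARM_FINGERPRINT on BigQuery, halfMD5 on ClickHouse).

```python
# Why fan-out joins over-count, and what a symmetric aggregate fixes.
orders = [
    {"order_id": 1, "revenue": 100},
    {"order_id": 2, "revenue": 50},
]

# Joining orders to line items fans order 1 out into two rows.
joined = [
    {"order_id": 1, "revenue": 100, "item": "a"},
    {"order_id": 1, "revenue": 100, "item": "b"},
    {"order_id": 2, "revenue": 50, "item": "c"},
]

# Naive aggregation counts order 1's revenue twice.
naive_sum = sum(row["revenue"] for row in joined)  # 250

# Symmetric aggregate idea: sum revenue once per distinct primary key.
symmetric_sum = sum({row["order_id"]: row["revenue"] for row in joined}.values())  # 150

print(naive_sum, symmetric_sum)
```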
BigQuery
Install: pip install sidemantic[bigquery]
Format:
connection: bigquery://project-id/dataset-id
Example:
connection: bigquery://my-gcp-project/analytics_dataset
CLI example:
sidemantic query models/ \
--connection "bigquery://my-project/my-dataset" \
--sql "SELECT revenue FROM orders"
Authentication:
Local development:
gcloud auth application-default login
Service account:
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
Features:
- Full semantic layer support
- Uses FARM_FINGERPRINT for symmetric aggregates
- Arrow format via BigQuery Storage API
Snowflake
Install: pip install sidemantic[snowflake]
Format:
connection: snowflake://user:password@account/database/schema?warehouse=wh&role=role
Example:
connection: snowflake://analyst:pass@xy12345.us-east-1/ANALYTICS/PUBLIC?warehouse=COMPUTE_WH&role=ANALYST
CLI example:
sidemantic query models/ \
--connection "snowflake://user:pass@account/db/schema?warehouse=wh" \
--sql "SELECT revenue FROM orders"
Parameters:
- account: Snowflake account (e.g., xy12345.us-east-1)
- database: Database name
- schema: Schema name
- warehouse: Compute warehouse (query parameter)
- role: Role to use (query parameter)
Features:
- Full semantic layer support
- Optimized symmetric aggregates
- Arrow format for performance
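One practical note when embedding credentials in any of these URLs: passwords containing reserved characters such as @, :, or / must be percent-encoded, or the URL will parse incorrectly. A minimal sketch using Python's standard library (the credential values are illustrative, and this helper is not a Sidemantic API):

```python
from urllib.parse import quote

user = "analyst"
password = "p@ss:word/1"  # contains URL-reserved characters
account = "xy12345.us-east-1"

# quote(..., safe="") percent-encodes every reserved character,
# so the password cannot be confused with URL delimiters.
url = (
    f"snowflake://{quote(user, safe='')}:{quote(password, safe='')}"
    f"@{account}/ANALYTICS/PUBLIC?warehouse=COMPUTE_WH"
)
print(url)
```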
ClickHouse
Install: pip install sidemantic[clickhouse]
Format:
connection: clickhouse://user:password@host:port/database
Example:
connection: clickhouse://default:clickhouse@localhost:8123/default
With SSL:
connection: clickhouse://user:pass@host:8443/db?secure=true
CLI example:
sidemantic query models/ \
--connection "clickhouse://default:clickhouse@localhost:8123/default" \
--sql "SELECT revenue FROM orders"
Features:
- Full semantic layer support
- Uses halfMD5 for symmetric aggregates
- Optimized for OLAP workloads
Databricks
Install: pip install sidemantic[databricks]
Format:
connection: databricks://token@server-hostname/http-path?catalog=catalog&schema=schema
Example:
connection: databricks://dapi12345@dbc-a1b2c3d4.cloud.databricks.com/sql/1.0/warehouses/abc123?catalog=main&schema=default
CLI example:
sidemantic query models/ \
--connection "databricks://dapi123@server/sql/1.0/warehouses/abc?catalog=main" \
--sql "SELECT revenue FROM orders"
Parameters:
- token: Personal access token (as username in URL)
- server-hostname: Databricks workspace URL
- http-path: SQL warehouse HTTP path
- catalog: Unity Catalog (query parameter)
- schema: Schema name (query parameter)
Features:
- Full semantic layer support
- Uses xxhash64 for symmetric aggregates
- Optimized for lakehouse queries
Spark SQL
Install: pip install sidemantic[spark]
Format:
connection: spark://host:port/database
Example:
connection: spark://localhost:10000/default
With authentication:
connection: spark://username:password@host:10000/database
CLI example:
sidemantic query models/ \
--connection "spark://localhost:10000/default" \
--sql "SELECT revenue FROM orders"
Requirements:
- Spark Thrift server must be running
- Uses HiveServer2 protocol
Features:
- Full semantic layer support
- Uses xxhash64 for symmetric aggregates
- Compatible with Apache Spark 3.x
Environment Variables
Store credentials in environment variables for security. Never commit credentials to version control.
Environment variables can be used in:
- YAML configuration files, using ${VAR} syntax
- CLI flags, using shell variables like $VAR
In YAML Files
Sidemantic supports environment variable substitution in YAML files using standard syntax:
Syntax:
- ${ENV_VAR} - Substituted with the environment variable's value
- ${ENV_VAR:-default} - Use default if the variable is not set
- $ENV_VAR - Simple form (uppercase variable names only)
Example YAML:
# semantic_layer.yml
connection: postgres://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:5432/${DB_NAME}
models:
- name: orders
table: ${SCHEMA_NAME:-public}.orders
primary_key: order_id
Set environment variables:
export DB_USER=analyst
export DB_PASSWORD=secret
export DB_HOST=localhost
export DB_NAME=analytics
export SCHEMA_NAME=prod
Load configuration:
sidemantic query semantic_layer.yml --sql "SELECT revenue FROM orders"
The environment variables are substituted when the YAML file is loaded.
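The ${VAR} and ${VAR:-default} rules can be sketched with a small regex. This is an illustration of the substitution behavior only, not Sidemantic's actual loader, and it deliberately skips the simple $VAR form:

```python
import os
import re

def substitute(text: str) -> str:
    """Replace ${VAR} and ${VAR:-default} with environment values."""
    def repl(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        value = os.environ.get(name)
        if value is None:
            # Fall back to the inline default, or empty string if none given.
            return default if default is not None else ""
        return value
    return re.sub(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}", repl, text)

os.environ["DB_USER"] = "analyst"
os.environ.pop("DB_NAME", None)  # ensure the default branch is exercised
line = "postgres://${DB_USER}@localhost:5432/${DB_NAME:-analytics}"
print(substitute(line))  # postgres://analyst@localhost:5432/analytics
```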
In CLI Flags
Use shell environment variables in CLI commands:
export DATABASE_URL=postgres://user:pass@localhost:5432/analytics
sidemantic query models/ --connection "$DATABASE_URL" \
--sql "SELECT revenue FROM orders"
Full Connection Strings
Store the complete connection string in an environment variable:
Example .env file:
# PostgreSQL
DATABASE_URL=postgres://user:pass@localhost:5432/analytics
# BigQuery
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
BIGQUERY_URL=bigquery://my-project/my-dataset
# Snowflake
SNOWFLAKE_URL=snowflake://user:pass@account/db/schema?warehouse=wh
# ClickHouse
CLICKHOUSE_URL=clickhouse://user:pass@localhost:8123/default
# Databricks
DATABRICKS_URL=databricks://token@server/http-path?catalog=main
# Spark
SPARK_URL=spark://localhost:10000/default
Use with CLI:
# Load .env file (many tools auto-load)
export $(cat .env | xargs)
# Use environment variable
sidemantic query models/ --connection "$DATABASE_URL" \
--sql "SELECT revenue FROM orders"
sidemantic workbench models/ --connection "$SNOWFLAKE_URL"
sidemantic serve models/ --connection "$BIGQUERY_URL"
Component Environment Variables
Build connection strings from individual components:
Example .env file:
# PostgreSQL components
POSTGRES_HOST=db.example.com
POSTGRES_PORT=5432
POSTGRES_USER=analyst
POSTGRES_PASSWORD=secret
POSTGRES_DB=analytics
# Snowflake components
SNOWFLAKE_ACCOUNT=xy12345.us-east-1
SNOWFLAKE_USER=analyst
SNOWFLAKE_PASSWORD=secret
SNOWFLAKE_DATABASE=ANALYTICS
SNOWFLAKE_SCHEMA=PUBLIC
SNOWFLAKE_WAREHOUSE=COMPUTE_WH
# BigQuery components
BIGQUERY_PROJECT=my-gcp-project
BIGQUERY_DATASET=analytics_dataset
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
Build connection string in shell:
# PostgreSQL
CONNECTION="postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT}/${POSTGRES_DB}"
sidemantic query models/ --connection "$CONNECTION" --sql "SELECT revenue FROM orders"
# Snowflake
CONNECTION="snowflake://${SNOWFLAKE_USER}:${SNOWFLAKE_PASSWORD}@${SNOWFLAKE_ACCOUNT}/${SNOWFLAKE_DATABASE}/${SNOWFLAKE_SCHEMA}?warehouse=${SNOWFLAKE_WAREHOUSE}"
sidemantic workbench models/ --connection "$CONNECTION"
# BigQuery
CONNECTION="bigquery://${BIGQUERY_PROJECT}/${BIGQUERY_DATASET}"
sidemantic serve models/ --connection "$CONNECTION"
Security Best Practices
Never commit credentials:
# Add to .gitignore
echo ".env" >> .gitignore
echo "*.secrets" >> .gitignore
Use secret management:
# AWS Secrets Manager
export DATABASE_URL=$(aws secretsmanager get-secret-value \
--secret-id prod/database-url \
--query SecretString \
--output text)
# Google Secret Manager
export DATABASE_URL=$(gcloud secrets versions access latest \
--secret="database-url")
# HashiCorp Vault
export DATABASE_URL=$(vault kv get -field=url secret/database)
Restrict file permissions:
chmod 600 .env
Per-Environment Configuration
Development:
# .env.development
DATABASE_URL=duckdb:///dev.duckdb
Staging:
# .env.staging
DATABASE_URL=postgres://user:pass@staging-db:5432/analytics
Production:
# .env.production
DATABASE_URL=snowflake://user:pass@prod-account/db/schema?warehouse=wh
Load based on environment:
ENV=${ENV:-development}
export $(cat .env.${ENV} | xargs)
sidemantic query models/ --connection "$DATABASE_URL" \
--sql "SELECT revenue FROM orders"
Testing Connections
Test your connection string:
# Create a simple test model
cat > test.yml <<EOF
connection: postgres://localhost:5432/test
models:
- name: test
table: (SELECT 1 as id, 100 as value)
primary_key: id
metrics:
- name: total
agg: sum
sql: value
EOF
# Run a query
sidemantic query test.yml --sql "SELECT total FROM test"
# Should output: 100
CLI Connection Examples
Query Command
# DuckDB file
sidemantic query models/ \
--connection "duckdb:///data.duckdb" \
--sql "SELECT revenue FROM orders"
# PostgreSQL
sidemantic query models/ \
--connection "postgres://localhost:5432/analytics" \
--sql "SELECT revenue, status FROM orders"
# BigQuery
sidemantic query models/ \
--connection "bigquery://my-project/analytics" \
--sql "SELECT revenue FROM orders WHERE status = 'completed'"
Workbench
# Local DuckDB
sidemantic workbench models/ --connection "duckdb:///warehouse.duckdb"
# Snowflake
sidemantic workbench models/ \
--connection "snowflake://user:pass@account/db/schema"
# ClickHouse
sidemantic workbench models/ \
--connection "clickhouse://localhost:8123/default"
PostgreSQL Server
# Serve DuckDB over PostgreSQL protocol
sidemantic serve models/ \
--connection "duckdb:///data.duckdb" \
--port 5433
# Serve BigQuery over PostgreSQL protocol
sidemantic serve models/ \
--connection "bigquery://project/dataset" \
--port 5433 \
--username admin \
--password secret
Troubleshooting
Connection Errors
PostgreSQL:
psycopg.OperationalError: connection failed
- Verify host, port, username, password
- Check PostgreSQL is running
- Test with psql first: psql -h host -p 5432 -U user -d database
BigQuery:
google.auth.exceptions.DefaultCredentialsError
- Run gcloud auth application-default login
- Or set the GOOGLE_APPLICATION_CREDENTIALS environment variable
Snowflake:
snowflake.connector.errors.DatabaseError: 250001
- Verify account identifier format
- Check username/password
- Ensure warehouse is running
Invalid Connection String
ValueError: Unsupported connection URL
- Check connection string format matches examples above
- Ensure the required extras are installed: pip install sidemantic[database]
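When debugging an "Unsupported connection URL" error, a quick first check is whether the URL's scheme matches one from the supported-databases table above. A standard-library sketch (the helper name is illustrative and not part of Sidemantic):

```python
from urllib.parse import urlparse

# Schemes taken from the supported-databases table in this document.
SUPPORTED = {"duckdb", "postgres", "bigquery", "snowflake",
             "clickhouse", "databricks", "spark"}

def check_scheme(url: str) -> bool:
    """Return True if the URL uses a scheme Sidemantic's docs list as supported."""
    return urlparse(url).scheme in SUPPORTED

print(check_scheme("postgres://localhost:5432/analytics"))  # True
print(check_scheme("mysql://localhost:3306/analytics"))     # False
```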