Configuration

Sidemantic configuration is specified in YAML files and can be overridden via CLI flags.

YAML File Structure

Basic semantic layer YAML file:

# semantic_layer.yml

# Database connection (optional, defaults to in-memory DuckDB)
# Supports environment variable substitution
connection: duckdb:///data/analytics.duckdb

# Models define your tables and metrics
models:
  - name: orders
    table: orders
    primary_key: order_id

    dimensions:
      - name: status
        type: categorical
        sql: status

      - name: order_date
        type: time
        sql: created_at
        granularity: day

    metrics:
      - name: revenue
        agg: sum
        sql: amount

      - name: order_count
        agg: count

    relationships:
      - name: customers
        type: many_to_one
        foreign_key: customer_id

# Optional: Graph-level metrics
metrics:
  - name: total_revenue
    sql: orders.revenue

Connection Configuration

Specify database connection in YAML:

# DuckDB (default)
connection: duckdb:///:memory:
connection: duckdb:///path/to/database.duckdb

# PostgreSQL
connection: postgres://user:pass@host:5432/database

# BigQuery
connection: bigquery://project-id/dataset-id

# Snowflake
connection: snowflake://user:pass@account/database/schema?warehouse=wh

# ClickHouse
connection: clickhouse://user:pass@host:8123/database

# Databricks
connection: databricks://token@server/http-path?catalog=main

# Spark SQL
connection: spark://host:10000/database

Environment Variables in YAML

Use environment variables for sensitive credentials:

# Use ${ENV_VAR} syntax
connection: postgres://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:5432/${DB_NAME}

# With default values
connection: duckdb:///${DB_FILE:-/tmp/default.duckdb}

# Simple form (uppercase vars only)
connection: $DATABASE_URL

Supported syntax:

  • ${ENV_VAR} - Substituted with environment variable value
  • ${ENV_VAR:-default} - Use default if variable not set
  • $ENV_VAR - Simple form (uppercase variables only)

Example:

# Set environment variables
export DB_USER=analyst
export DB_PASSWORD=secret
export DB_HOST=localhost
export DB_NAME=analytics
# semantic_layer.yml
connection: postgres://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:5432/${DB_NAME}

models:
  - name: orders
    table: ${SCHEMA_NAME:-public}.orders
    primary_key: order_id

See Database Connections for complete connection string reference.

Override via CLI

# Use --connection flag to override YAML
sidemantic query models.yml \
  --connection "postgres://localhost:5432/analytics" \
  --sql "SELECT revenue FROM orders"

Model Configuration

Models define tables, dimensions, metrics, and relationships.

Required Fields

models:
  - name: orders              # Model name (required)
    table: orders             # Table name or SQL (required)
    primary_key: order_id     # Primary key column (required)

Optional Fields

models:
  - name: orders
    table: orders
    primary_key: order_id

    # Optional: Description
    description: "Order transactions"

    # Optional: SQL instead of table name
    sql: |
      SELECT *
      FROM raw_orders
      WHERE deleted_at IS NULL

    # Optional: Dimensions for grouping
    dimensions: [...]

    # Optional: Metrics for aggregation
    metrics: [...]

    # Optional: Relationships to other models
    relationships: [...]

    # Optional: Reusable filters
    segments: [...]

See Models for complete model configuration.

Dimension Configuration

Dimensions define columns for grouping and filtering.

dimensions:
  # Categorical dimension
  - name: status
    type: categorical
    sql: status
    description: "Order status"

  # Time dimension
  - name: order_date
    type: time
    sql: created_at
    granularity: day

  # Numeric dimension
  - name: amount
    type: number
    sql: amount

See Models for dimension types and options.

Metric Configuration

Metrics define aggregations and calculations.

Model-Level Metrics

Simple aggregations defined within a model:

models:
  - name: orders
    # ...
    metrics:
      # Sum
      - name: revenue
        agg: sum
        sql: amount

      # Count
      - name: order_count
        agg: count
        sql: order_id

      # Average
      - name: avg_order_value
        agg: avg
        sql: amount

      # Count distinct
      - name: customer_count
        agg: count_distinct
        sql: customer_id

      # With filter
      - name: completed_revenue
        agg: sum
        sql: amount
        filters: ["{model}.status = 'completed'"]

Graph-Level Metrics

Complex metrics defined at the top level:

# Graph-level metrics
metrics:
  # Simple reference
  - name: total_revenue
    sql: orders.revenue

  # Ratio
  - name: conversion_rate
    type: ratio
    numerator: orders.completed_revenue
    denominator: orders.revenue

  # Derived formula
  - name: profit_margin
    type: derived
    sql: "(revenue - cost) / revenue"

  # Cumulative
  - name: running_total
    type: cumulative
    sql: orders.revenue
    window: "7 days"

See Metrics for metric types and advanced features.

Relationship Configuration

Relationships define joins between models.

models:
  - name: orders
    # ...
    relationships:
      # Many-to-one (most common)
      - name: customers
        type: many_to_one
        foreign_key: customer_id

      # One-to-many
      - name: line_items
        type: one_to_many
        foreign_key: order_id  # In line_items table

      # One-to-one
      - name: invoice
        type: one_to_one
        foreign_key: order_id

See Relationships for join configuration.

Segment Configuration

Segments are reusable named filters.

models:
  - name: orders
    # ...
    segments:
      - name: completed
        sql: "{model}.status = 'completed'"
        description: "Only completed orders"

      - name: high_value
        sql: "{model}.amount > 100"

      - name: recent
        sql: "{model}.created_at >= CURRENT_DATE - 30"

Use in queries:

sidemantic query models.yml \
  --sql "SELECT revenue FROM orders WHERE {segments.completed}"

CLI Configuration

Global Flags

Available for most commands:

sidemantic COMMAND [OPTIONS]

Options:
  --connection TEXT    Database connection string
  --help              Show help message

Query Command

sidemantic query PATH [OPTIONS]

Options:
  --sql, -q TEXT           SQL query to execute
  --connection TEXT        Database connection override
  --output, -o TEXT        Output file (defaults to stdout)

Examples:
  # Query to stdout
  sidemantic query models/ --sql "SELECT revenue FROM orders"

  # Save to file
  sidemantic query models/ -q "SELECT * FROM orders" -o results.csv

  # Override connection
  sidemantic query models/ \
    --connection "postgres://localhost:5432/db" \
    --sql "SELECT revenue FROM orders"

Workbench Command

sidemantic workbench PATH [OPTIONS]

Options:
  --connection TEXT    Database connection override
  --demo              Run with demo data

Examples:
  # Local models
  sidemantic workbench models/

  # Demo mode
  sidemantic workbench --demo

  # Custom connection
  sidemantic workbench models/ \
    --connection "bigquery://project/dataset"

Serve Command

sidemantic serve PATH [OPTIONS]

Options:
  --connection TEXT    Database connection override
  --port INTEGER       Port number (default: 5432)
  --host TEXT          Host address (default: 127.0.0.1)
  --username TEXT      Authentication username
  --password TEXT      Authentication password
  --demo              Run with demo data

Examples:
  # Basic server
  sidemantic serve models/ --port 5433

  # With auth
  sidemantic serve models/ \
    --username admin \
    --password secret

  # Custom connection
  sidemantic serve models/ \
    --connection "snowflake://account/db/schema"

Validate Command

sidemantic validate PATH

Examples:
  # Validate all models
  sidemantic validate models/

  # Validate specific file
  sidemantic validate semantic_layer.yml

Info Command

sidemantic info PATH

Examples:
  # Show summary
  sidemantic info models/

See CLI for complete CLI reference.

File Organization

Single File

Simple setup with one YAML file:

semantic_layer.yml
# semantic_layer.yml
connection: duckdb:///data.duckdb

models:
  - name: orders
    # ...

  - name: customers
    # ...

metrics:
  - name: total_revenue
    # ...

Multiple Files

Organize models into separate files:

models/
├── orders.yml
├── customers.yml
└── products.yml

Each file can contain one or more models:

# models/orders.yml
models:
  - name: orders
    table: orders
    primary_key: order_id
    # ...

Load directory with CLI:

sidemantic query models/ --sql "SELECT revenue FROM orders"

Mixed Formats

Combine different semantic layer formats:

semantic_models/
├── cube/
│   └── Orders.yml          # Cube format
├── dbt/
│   └── metrics.yml         # MetricFlow format
└── native/
    └── customers.yml       # Sidemantic format

Sidemantic auto-detects format:

sidemantic query semantic_models/ --sql "SELECT revenue FROM orders"

Environment Variables

Store sensitive values in environment variables:

# .env file
DATABASE_URL=postgres://user:pass@host:5432/analytics
SNOWFLAKE_WAREHOUSE=COMPUTE_WH

Reference in CLI:

sidemantic query models/ \
  --connection "$DATABASE_URL" \
  --sql "SELECT revenue FROM orders"

Validation

Sidemantic validates YAML files automatically:

# Validate before querying
sidemantic validate models/

# Common validation errors:
# - Missing required fields (name, table, primary_key)
# - Invalid metric aggregation types
# - Undefined relationship references
# - Invalid dimension types

Fix validation errors before running queries.

Best Practices

Use Descriptive Names

models:
  - name: orders              # Good: clear, singular
    # vs
  - name: ord                 # Bad: unclear abbreviation

Document Your Models

models:
  - name: orders
    description: "Customer orders from the e-commerce platform"
    table: prod.orders
    # ...

    metrics:
      - name: revenue
        description: "Total order revenue excluding refunds"
        # ...

Organize by Domain

models/
├── sales/
│   ├── orders.yml
│   ├── line_items.yml
│   └── returns.yml
├── customers/
│   ├── customers.yml
│   └── segments.yml
└── products/
    ├── products.yml
    └── categories.yml

Use Segments for Common Filters

# Define once
segments:
  - name: completed
    sql: "{model}.status = 'completed'"

# Reuse everywhere

Keep Connection Strings Secure

# Bad: Hardcoded credentials
connection: postgres://admin:password123@prod-db:5432/analytics

# Good: Environment variable
# Then use CLI flag: --connection "$DATABASE_URL"