dbt hub

dbt test: expect_column_value_lengths_to_equal

USE CASE

String format

APPLIES TO

Column

This page introduces the expect_column_value_lengths_to_equal test from the dbt-expectations package, which checks that all string values in a specified column have a particular length. This test is critical for ensuring that data entries meet format standards or specific data entry requirements.

How it Works

The expect_column_value_lengths_to_equal test assesses the length of the string values within a designated column to ensure each is exactly the length specified by the user.

Steps and Conditions:

  1. Column Selection: Choose the column whose string values will be evaluated.
  2. Define Length Requirement: Set the exact number of characters each string in the selected column must contain using the value argument.
  3. Optional Configuration:
    • Row Condition: The row_condition argument can be utilized to include only those rows that meet certain conditions (e.g., "id is not null").
  4. Execution: The test proceeds to check the length of each string value in the column, applying any row conditions if specified.
  5. Outcome:
    • Pass: The test passes if every string value in the column is of the length defined.
    • Fail: The test fails if any string value does not meet the specified length, pointing out entries that do not comply with the set requirements.

Example Usage: E-commerce

For an E-commerce platform, maintaining standardization in product identifiers or SKU (Stock Keeping Unit) numbers is crucial. These identifiers often need to adhere to a fixed-length format to integrate seamlessly with inventory and POS (Point of Sale) systems.

Consider a scenario in the products table where each product is associated with an SKU number contained in the sku column:


models:
  - name: products
    columns:
        - name: sku
          tests:
            - dbt_expectations.expect_column_value_lengths_to_equal:
                value: 8 # Assuming the SKU numbers must always be 8 characters long
                row_condition: "sku is not null"

Here, the expect_column_value_lengths_to_equal test is applied to ensure that every SKU number in the products table consists of exactly 8 characters. This format consistency is essential for system integration and avoiding mismanagement in inventory tracking and sales processes.

The only data observability platform built into your dbt code

  • Get monitors on your production tables out-of-the-box with zero configuration
  • Add tests to your code in bulk with a simple UI
  • Track test results over time
  • Set owners and create meaningful alerts
  • Triage incidents faster using our end-to-end column-level lineage graph