dbt hub

dbt test: expect_column_values_to_match_like_pattern_list

USE CASE

String matching

APPLIES TO

Column

This page explains the expect_column_values_to_match_like_pattern_list test from the dbt-expectations package, which checks that string entries in a specified column conform to any one or multiple provided SQL LIKE patterns. Particularly suited for validating formats or specific string content in a column, this test is a powerful tool for ensuring data consistency and reliability.

How it Works

The expect_column_values_to_match_like_pattern_list test applies one or several LIKE patterns to string values in a column to confirm that each entry matches expected string formats. This test is useful for verifying that data adheres to format specifications, such as containing specific symbols or patterns.

Steps and Conditions:

  1. Column and Patterns Specification: Choose the column you want to validate and specify the list of LIKE patterns that the values should match.
  2. Pattern Matching:
    • Match Mode: Decide whether the values should match any pattern in the list (any) or all of them (all). The typical setting is any, which is used unless specified otherwise.
    • Optional Configurations:
      • Row Condition: Include a condition to filter the rows that this test will apply to (e.g., exclude null values).
  3. Execution: The test filters the data based on any row conditions, then checks each value in the selected column against the specified LIKE patterns using the chosen match mode.
  4. Outcome:
    • Pass: All checked values in the column match the specified patterns according to the chosen match mode, indicating conformity with expected string formats.
    • Fail: If one or more values do not comply with the specified patterns, the test fails, signaling possible inconsistencies in data entry or processing.

Example Usage: Fintech Company

For a Fintech company, ensuring accurate and consistent formatting in transaction data, like transaction IDs or references, is crucial for data tracking and integrity.

Consider a scenario where the transaction_data table records details about user transactions, including a transaction_reference column that stores unique identifiers for each transaction. It's important that these identifiers maintain a specific format for ease of tracking and consistency.


models:
  - name: transaction_data
    columns:
        - name: transaction_reference
          tests:
            - dbt_expectations.expect_column_values_to_match_like_pattern_list:
                like_pattern_list: ["TRX-%", "%-2023"]
                match_on: any

In this example, the expect_column_values_to_match_like_pattern_list test verifies that each transaction reference follows one of the set formats, either starting with "TRX-" or ending with "-2023". This ensures that all entries are in line with format expectations, facilitating reliable transaction identification and processing.

The only data observability platform built into your dbt code

  • Get monitors on your production tables out-of-the-box with zero configuration
  • Add tests to your code in bulk with a simple UI
  • Track test results over time
  • Set owners and create meaningful alerts
  • Triage incidents faster using our end-to-end column-level lineage graph