dbt hub

dbt test: expect_column_values_to_not_match_like_pattern

USE CASE

String matching

APPLIES TO

Column

This page details the expect_column_values_to_not_match_like_pattern test from the dbt-expectations package. The purpose of this test is to ensure that values in a specific column do not match a given SQL LIKE pattern, which is critical for enforcing specific data formatting rules and maintaining the integrity of the data.

How it Works

The expect_column_values_to_not_match_like_pattern test verifies that no entries in a specified column match a provided SQL LIKE pattern. This check is crucial for preventing the presence of unwanted characters or patterns within data values, which could lead to errors in data processing or analysis.

Steps and Conditions:

  1. Column Selection: Choose the column that you want to validate against the LIKE pattern.
  2. Pattern Specification: Define the LIKE pattern that values in the selected column should not match.
  3. Optional Configurations:
    • Row Condition: Define a condition to filter rows that should be included in the check. This can be helpful for focusing the test on a subset of the data that meets certain criteria.
  4. Execution: The test scans the specified column in the rows that meet the row condition, checking each value to ensure it does not match the LIKE pattern.
  5. Outcome:
    • Pass: If no values in the column match the pattern, the test passes, confirming that the data in the column complies with the specified pattern restrictions.
    • Fail: If any value in the column matches the pattern, the test fails, indicating the presence of unwanted data patterns that need to be addressed.

Example Usage: Fintech Company

For a Fintech company, ensuring that sensitive data fields such as account numbers or transaction codes do not contain potentially harmful or incorrect characters (e.g., special characters) is essential for maintaining data security and integrity.

Imagine a scenario where a transactions table includes a column transaction_code that should consist only of alphanumeric characters and should not include any special characters like % or &.


tests:
  - dbt_expectations.expect_column_values_to_not_match_like_pattern:
      column: transaction_code
      like_pattern: '%&%'
      row_condition: "transaction_type = 'external'"

In this setup, the expect_column_values_to_not_match_like_pattern test ensures that no transaction_code values in transactions marked as external contain the & character. This check is critical to avoiding data entry errors or potential security issues caused by unexpected special characters in transaction codes.

The only data observability platform built into your dbt code

  • Get monitors on your production tables out-of-the-box with zero configuration
  • Add tests to your code in bulk with a simple UI
  • Track test results over time
  • Set owners and create meaningful alerts
  • Triage incidents faster using our end-to-end column-level lineage graph