This page outlines the expect_column_values_to_match_like_pattern
test from the dbt-expectations package, designed to ensure string entries in a specified column conform to a set SQL LIKE
pattern. This type of test is vital for validating data formats, such as email addresses, URLs, or any other text that should follow a specific pattern.
How it Works
The expect_column_values_to_match_like_pattern
test verifies that every entry in a designated column matches an SQL LIKE
pattern. By specifying a pattern, you can enforce consistency and correctness for text data formats within your database.
Steps and Conditions:
- Column Selection: Choose the column you wish to validate.
- Pattern Specification: Provide the SQL
LIKE
pattern that entries in the selected column must match. - Optional Configuration:
- Row Condition: Use the
row_condition
option to apply the test only to rows that meet certain conditions, thereby excluding others.
- Row Condition: Use the
- Execution: The test processes the specified column, checking each value to determine if it fits the provided pattern.
- Outcome:
- Pass: All column values match the like pattern, indicating that they adhere to the required text format.
- Fail: Any column value does not match the like pattern, signaling a potential error or inconsistency that might need correction.
Example Usage: B2B SaaS
For a B2B SaaS company, ensuring that data entries like customer email addresses are stored correctly is crucial for communication and data management. The expect_column_values_to_match_like_pattern
test can be applied to an email
column in the customer_contacts
table to validate the format of each stored email address.
Consider an instance where the customer_contacts
table includes essential contact information for business clients, and you need to verify the correctness of these email addresses.
In this example, the expect_column_values_to_match_like_pattern
test checks that each email address in the customer_contacts
table contains an '@' character, which is fundamental for a valid email address format. Using this pattern, the company ensures that all email data in their system is formatted properly before any email communication or analysis is executed.