This page describes the expect_column_value_lengths_to_be_between
test from the dbt-expectations package. This test confirms the length of string values in a given column fall within a specified range. It is essential for ensuring data conforms to expected formats, potentially influencing data processing or user interface design where specific string lengths are required.
How it Works
The expect_column_value_lengths_to_be_between
test verifies that all string lengths in a designated column lie between an inclusive minimum and maximum value. This verification prevents errors related to data format mismatches, like truncation or misalignment in reports and data feeds.
Steps and Conditions:
- Column Selection: Choose the column whose string values are to be tested for length.
- Set Length Parameters: Define the acceptable range for string lengths using the
min_value
andmax_value
parameters. - Optional Configuration:
- Row Condition: Apply
row_condition
to filter which rows are to be included in the test, based on specific criteria. - Strict Comparison: By setting
strictly
tofalse
(default), the test includes the boundary values in the comparison. Ifstrictly
istrue
, the boundaries are exclusive, not including the minimum and maximum lengths.
- Row Condition: Apply
- Execution: During test execution, it evaluates each string's length in the specified column against the defined range.
- Outcome:
- Pass: The test passes if all string lengths are within the defined range, ensuring consistency in data entry.
- Fail: The test fails if any string length is outside the specified range, indicating an anomaly that might need correction or further investigation.
Example Usage: Fintech
For a Fintech company, ensuring accuracy in data such as account identifiers, which are often formatted as strings of specific lengths, is crucial. Incorrectly formatted account numbers can lead to processing errors affecting transactions and customer satisfaction.
Imagine the customer_accounts
table includes a column account_number
intended to store account identifiers as 10-digit strings.
In this scenario, the expect_column_value_lengths_to_be_between
test checks the length of the account_number
for all active customer accounts to ensure each is exactly 10 characters long. This setup helps in maintaining the integrity and usability of account data across the company's processing systems.