dbt hub

dbt test: expect_row_values_to_have_recent_data

USE CASE

Freshness

APPLIES TO

Column

This page contains information on the expect_row_values_to_have_recent_data test within the dbt-expectations package. This test is specifically designed to check that the rows in a given column contain data dates that are recent, as defined by a specified interval prior to the current timestamp. This ensures that your models are continuously updated and reflect the most current data available.

How it Works

The expect_row_values_to_have_recent_data test verifies that the dates in a specified column are within a certain interval from the current date and time. This test is vital for monitoring the timeliness of data updates in your models, ensuring that data used for reporting or analysis is up-to-date.

Steps and Conditions:

  1. Column Selection: Identify the column which contains date or datetime values that should be checked for recency.
  2. Define Recency Threshold: Specify the interval (such as days, hours) within which the data should be recent relative to the current timestamp.
  3. Configuration Options:
    • Row Condition: Optional parameter to specify a condition to filter the rows that should be considered in the test.
  4. Execution: The test evaluates each row in the selected column to confirm the date is within the defined interval of the present moment.
  5. Outcome:
    • Pass: If all selected rows have dates that meet the recency requirement, the test passes, endorsing the freshness of the data.
    • Fail: If any row contains data outside the specified interval, the test fails, highlighting the presence of potentially outdated information.

Example Usage: Fintech

In a Fintech company, ensuring data, such as transaction or user activity logs, is updated in real-time or near real-time is crucial for maintaining operational efficiency and making timely decisions.

Consider a model, user_transactions, which logs every transaction detail. It's critical for such a model to contain the most up-to-date transaction information for timely processing and analysis.


models:
  - name: user_transactions
    columns:
        - name: transaction_date
          tests:
            - dbt_expectations.expect_row_values_to_have_recent_data:
                datepart: day
                interval: 1
                row_condition: 'transaction_status = "processed"'

This example demonstrates using the expect_row_values_to_have_recent_data test on the transaction_date column of the user_transactions model. By setting the interval to 1 day and the row condition to only include processed transactions, this setup helps ensure that the Fintech company's data for processed transactions remains extremely current, facilitating prompt and accurate financial operations and reporting.

The only data observability platform built into your dbt code

  • Get monitors on your production tables out-of-the-box with zero configuration
  • Add tests to your code in bulk with a simple UI
  • Track test results over time
  • Set owners and create meaningful alerts
  • Triage incidents faster using our end-to-end column-level lineage graph