Data Preview

    Article summary

    This article explains how to use Data Preview to validate and improve data quality.

    What is Data Preview and how does it benefit you?

    With Data Preview, you can validate data quality faster: streaming data previews return an initial result set quickly while the full preview continues to process.

    • Track the progress of your preview with a percent-complete indicator, so you can decide whether to use the initial preview results or wait for the full preview results.

    • Avoid waiting for long-running previews to finish by canceling a preview once the initial result set is available.

    • Validate data quality iteratively with incremental loading of results.


    Accessing Preview

    Initiating the Preview Workflow

    To begin, ensure you are viewing the Draft version:

    Step 1: Start the preview workflow by selecting the port of an operator. This action triggers the preview configuration.

    Configuring the Preview

    Choose one of two options to run the preview: Date Range or Number of Records.

    1. Date Range: The preview processes records within a specified date range, based on the timestamp of the data from the source.

    2. Number of Records: The preview runs until the selected port has processed the configured number of records.

    If you choose the Date Range option, select a start date and an end date.

    Note: Be aware of the time zone selected. You can pick your personal time zone, UTC, or another time zone configured in the environment.

    Step 4: Configure a timeout in seconds. If the data in the selected range has not finished processing when the timeout elapses, the preview stops.

    Step 5: Once ready, select Preview Data to preview results.
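
    As a rough mental model of the two run options and the timeout, the sketch below stops a preview when the configured number of records is reached, skips records outside the date range, and gives up once the timeout elapses. Every name in it (`PreviewConfig`, `run_preview`, the `timestamp` field) is hypothetical and not part of the product, and treating the date range as inclusive at both ends is an assumption.

```python
import time
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Iterator, Optional


@dataclass
class PreviewConfig:
    """Hypothetical settings mirroring the options described above."""
    timeout_seconds: float
    max_records: Optional[int] = None   # "Number of Records" option
    start: Optional[datetime] = None    # "Date Range" option (assumed inclusive)
    end: Optional[datetime] = None


def run_preview(records: Iterable[dict], cfg: PreviewConfig) -> Iterator[dict]:
    """Yield records until the configured limit or the timeout is reached."""
    deadline = time.monotonic() + cfg.timeout_seconds
    emitted = 0
    for rec in records:
        if time.monotonic() >= deadline:
            break  # timeout elapsed: stop even though data remains
        in_range = (
            cfg.start is None or cfg.end is None
            or cfg.start <= rec["timestamp"] <= cfg.end
        )
        if not in_range:
            continue  # record falls outside the selected date range
        yield rec
        emitted += 1
        if cfg.max_records is not None and emitted >= cfg.max_records:
            break  # configured number of records reached


# Five sample records, one per day, starting 2024-01-01 UTC.
base = datetime(2024, 1, 1, tzinfo=timezone.utc)
sample = [{"id": i, "timestamp": base + timedelta(days=i)} for i in range(5)]

by_count = list(run_preview(sample, PreviewConfig(timeout_seconds=5, max_records=2)))
by_range = list(run_preview(sample, PreviewConfig(
    timeout_seconds=5, start=base + timedelta(days=1), end=base + timedelta(days=3))))
print([r["id"] for r in by_count])  # [0, 1]
print([r["id"] for r in by_range])  # [1, 2, 3]
```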

    Streaming Preview Results

    Preview results will load iteratively and are available to view even before the preview is complete.

    The initial set of results will load automatically. This step is quick and offers an early glimpse into the data.

    NOTE:

    Timestamps are in the user's selected time zone.
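
    The time zone matters because the same wall-clock date range maps to different instants in different zones. The sketch below illustrates this with Python's standard `datetime` module; the helper `range_in_utc` and the UTC-05:00 "personal" zone are assumptions made for illustration, not product behavior.

```python
from datetime import datetime, timedelta, timezone


def range_in_utc(start_local: datetime, end_local: datetime, tz: timezone):
    """Attach tz to naive wall-clock datetimes and convert both to UTC."""
    start = start_local.replace(tzinfo=tz).astimezone(timezone.utc)
    end = end_local.replace(tzinfo=tz).astimezone(timezone.utc)
    return start, end


# Hypothetical personal time zone at a fixed offset of UTC-05:00.
personal_tz = timezone(timedelta(hours=-5))

# The same date range as it might be typed into a date picker.
start = datetime(2024, 3, 1, 0, 0)
end = datetime(2024, 3, 2, 0, 0)

utc_start, utc_end = range_in_utc(start, end, personal_tz)
print(utc_start.isoformat())  # 2024-03-01T05:00:00+00:00
print(utc_end.isoformat())    # 2024-03-02T05:00:00+00:00
```

    A record stamped 2024-03-01 02:00 UTC falls inside the range when UTC is selected but outside it when the UTC-05:00 zone is selected, so it is worth confirming the zone before comparing results.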

    Step 7: Select "Update Results" at any time while the preview is running to see the most up-to-date preview results.

    Step 8: Once the final preview is complete, a preview complete status is displayed. Select "Update Results" to view the completed preview results.
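
    Incremental loading can be pictured as a generator that hands back the results so far together with a percent-complete figure. This is only a conceptual sketch: `stream_preview` and its batching are hypothetical, and the product's actual progress calculation is not documented here.

```python
from typing import Iterator, List, Tuple


def stream_preview(records: List[dict], batch_size: int) -> Iterator[Tuple[List[dict], int]]:
    """Yield (results_so_far, percent_complete) after each processed batch."""
    results: List[dict] = []
    total = len(records)
    for i in range(0, total, batch_size):
        results.extend(records[i:i + batch_size])
        percent = round(100 * len(results) / total)
        yield list(results), percent  # copy, so each snapshot stays stable


records = [{"id": i} for i in range(10)]
for snapshot, percent in stream_preview(records, batch_size=4):
    print(f"{percent}% complete, {len(snapshot)} rows available")
    if percent >= 80:
        break  # comparable to canceling once enough results are visible
```

    Breaking out of the loop keeps the last snapshot, which mirrors how a canceled preview keeps the results already loaded.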

    Re-Accessing Preview Results

    It's possible to access existing preview results after exiting the preview.

    Reopen the preview by choosing a port that has a preview result set available. Ports with available results are marked with an on-screen indicator.

    Initiating a New Preview

    To rerun a preview or initiate a new preview, choose "Reset Preview".  

    Canceling a Preview

    You can stop a running preview at any time by choosing "Cancel". Canceling keeps all visible results, plus any data processed since the last table update.


    Preview Results

    The preview result set is shown as a table, with several options for manipulating and filtering the results.

    Preview Statistics: To help you quickly identify columns with potential data issues, each column shows statistics such as % NULL and, for continuous data fields, MIN and MAX values. To find occurrences of a statistic, select its chip and scroll forward or backward to the next instance.

    NOTE:

    Preview statistics are based on the result set after filters are applied, and they are recalculated whenever filters are updated.
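
    To make the statistics concrete, the sketch below computes % NULL, MIN, and MAX for one column and shows how the numbers change once a filter is applied. The function `column_stats` and the sample rows are hypothetical; only the `Gross_Weight` column name comes from the example later in this article.

```python
from typing import Any, Dict, List


def column_stats(rows: List[Dict[str, Any]], column: str) -> Dict[str, Any]:
    """Return % NULL plus MIN and MAX of the non-null values in a column."""
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v is not None]
    pct_null = 100.0 * (len(values) - len(non_null)) / len(values) if values else 0.0
    return {
        "pct_null": pct_null,
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }


rows = [
    {"Gross_Weight": 10},
    {"Gross_Weight": None},
    {"Gross_Weight": 40},
    {"Gross_Weight": 55},
]

all_stats = column_stats(rows, "Gross_Weight")

# Apply a filter first, then recompute the statistics on what remains.
filtered = [r for r in rows if r["Gross_Weight"] is not None and r["Gross_Weight"] >= 33]
filtered_stats = column_stats(filtered, "Gross_Weight")

print(all_stats)       # {'pct_null': 25.0, 'min': 10, 'max': 55}
print(filtered_stats)  # {'pct_null': 0.0, 'min': 40, 'max': 55}
```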


    Expand Preview Panel Height: To expand or reduce the height of the preview table, drag the table to resize it vertically.


    Adjust Selected Columns: You can adjust the visible columns by adding, removing, or reordering them.


    Full-Page Preview Results: You can open preview results in a new tab.



    Download Results: Download the preview results to a CSV file.


    Filter Column Results

    Timestamp Instant Type: Timestamp Instants can be filtered by selecting either absolute or relative time ranges.




    Strings: Strings can be filtered with several operators including null comparisons.
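
    The two filter types above can be thought of as simple predicates. The sketch below models absolute and relative time ranges and two string checks; all function names are hypothetical, the inclusive bounds are an assumption, and "contains" stands in for whichever string operators the product actually offers.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


def absolute_range(start: datetime, end: datetime) -> Callable[[datetime], bool]:
    """Match timestamps between two fixed instants (assumed inclusive)."""
    return lambda ts: start <= ts <= end


def relative_range(window: timedelta, now: datetime) -> Callable[[datetime], bool]:
    """Match timestamps within `window` before `now`."""
    return lambda ts: now - window <= ts <= now


def is_null(value: Optional[str]) -> bool:
    """Null comparison for a string column."""
    return value is None


def contains(needle: str) -> Callable[[Optional[str]], bool]:
    """Substring match that treats null values as non-matching."""
    return lambda value: value is not None and needle in value


utc = timezone.utc
now = datetime(2024, 6, 15, 12, 0, tzinfo=utc)  # fixed so the example is repeatable
rows = [
    {"ts": datetime(2024, 6, 15, 11, 30, tzinfo=utc), "name": "alpha"},
    {"ts": datetime(2024, 6, 14, 9, 0, tzinfo=utc), "name": None},
    {"ts": datetime(2024, 6, 1, 0, 0, tzinfo=utc), "name": "beta"},
]

last_hour = relative_range(timedelta(hours=1), now)
early_june = absolute_range(datetime(2024, 6, 1, tzinfo=utc),
                            datetime(2024, 6, 14, 23, 59, tzinfo=utc))

recent = [r for r in rows if last_hour(r["ts"])]
in_window = [r for r in rows if early_june(r["ts"])]
unnamed = [r for r in rows if is_null(r["name"])]
has_a = [r for r in rows if contains("alp")(r["name"])]
```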



    Preview Filters

    In this guide, we'll walk you through improving preview performance by filtering out unnecessary data before it is processed. This is particularly useful when dealing with large volumes of data.

    Step 1: Initiate your preview by selecting the preview icon or directly selecting the operator port.


    Step 2: Instead of processing all the data and then filtering out the unnecessary records, filter out the data before the preview data is returned by adding a preview filter.


    Step 3: Configure your preview filter using the same syntax as "Apply Filter by Expression". In this example, we're removing records where the Gross_Weight is less than 33.


    Step 4: Select Apply to save your Preview Filter.


    Step 5: Run or refresh your preview.

    Now, the preview returns only the records that pass the preview filter on Gross_Weight. Removing data this early in the process can significantly improve preview performance.
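
    The benefit of filtering early can be sketched by counting how many records reach the expensive part of a preview. In the hypothetical code below, `expensive_step` stands in for the downstream operators, and `keep` follows the Step 3 description by dropping records whose Gross_Weight is less than 33; none of these names come from the product.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Record = Dict[str, float]


def expensive_step(rec: Record) -> Record:
    """Stand-in for the downstream operators a preview has to run."""
    return {**rec, "processed": 1.0}


def preview(records: Iterable[Record],
            preview_filter: Optional[Callable[[Record], bool]] = None
            ) -> Tuple[List[Record], int]:
    """Return (rows, count of records that reached the expensive step)."""
    rows: List[Record] = []
    processed = 0
    for rec in records:
        if preview_filter is not None and not preview_filter(rec):
            continue  # dropped before any expensive work happens
        rows.append(expensive_step(rec))
        processed += 1
    return rows, processed


def keep(rec: Record) -> bool:
    """Keep records whose Gross_Weight is at least 33."""
    return rec["Gross_Weight"] >= 33


records = [{"Gross_Weight": w} for w in (10.0, 20.0, 33.0, 50.0)]

# Filter late: process everything, then discard unwanted rows.
late_rows, late_cost = preview(records)
late_rows = [r for r in late_rows if keep(r)]

# Filter early: discard unwanted records before processing.
early_rows, early_cost = preview(records, preview_filter=keep)

print(late_rows == early_rows)  # True
print(late_cost, early_cost)    # 4 2
```

    Both approaches return the same rows, but the early filter runs the expensive step on half as many records in this small example.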