Pipeline Builder
Overview
Pipeline Builder is a feature in Factory BUILD that automatically converts raw data in various formats and from various sources into a real-time data stream. Built upon a data foundation, it accommodates late, missing, and out-of-order data in real time.
Stateful Processing is critical to real-time data processing: it retains knowledge of previous records and, as new data arrives, calculates (and recalculates) against the existing state.
As late or out-of-order data come in, data models are refreshed in real time (hot path), without the need to rerun through a batch process (cold path).
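The idea of updating existing state when a late record arrives can be sketched in a few lines of Python. This is only an illustration of the general technique, not Pipeline Builder's implementation: the `StatefulAverager` class, the 60-second window, and the `press-1` machine name are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class WindowState:
    """Running aggregate kept for one (machine, window) pair."""
    count: int = 0
    total: float = 0.0

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


class StatefulAverager:
    """Hypothetical operator: keeps per-window state between records."""

    def __init__(self, window_seconds: int = 60):
        self.window_seconds = window_seconds
        self.state = defaultdict(WindowState)

    def process(self, machine: str, event_time: int, value: float) -> float:
        # Assign the record to a window by its event time, not its arrival
        # order, so a late record lands in the window it belongs to.
        window_start = event_time - (event_time % self.window_seconds)
        window = self.state[(machine, window_start)]
        # Update only that window's state; no other window is recomputed.
        window.count += 1
        window.total += value
        return window.mean


averager = StatefulAverager(window_seconds=60)
averager.process("press-1", 5, 10.0)    # first window
averager.process("press-1", 65, 30.0)   # second window
# A late record for the first window arrives after the second window opened.
late_mean = averager.process("press-1", 20, 20.0)
```

Here the late record recalculates the first window's mean from the retained state, while the second window is left untouched. A batch (cold path) approach would instead reprocess the stored records to reach the same result.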
Accessing Pipeline Builder
This tutorial explains how to access Pipeline Builder in Factory BUILD.
Step 1: Open the top navigation and navigate to Factory BUILD.
Step 2: From the displayed workspaces, select the one containing the desired pipeline.
Step 3: Open the pipeline. If a deployed version exists, it is displayed by default. If the pipeline has never been deployed, the draft version appears.
Basic Navigation
Explore the basic navigation of the pipeline builder in this guide.
Factory BUILD Homepage: Select the Workspace Icon to navigate back to the Factory BUILD page as needed.
Auto-Save: The system saves your work automatically. However, save manually after making changes to ensure they persist.
Workspace Navigation: Use the provided dropdown to switch between the pipeline builder and environment builder.
Draft vs Deployed: If available, access the deployed version of the pipeline using the dropdown.
Options: Deploy the pipeline, navigate to JSON mode, manage extensions to upload new operators, and access a full revision history using the options menu.
Error Console: Use the centralized error console to navigate errors within the pipeline context. Selecting an error will select the applicable operator on the canvas as well as open the operator configuration.
Search: Use the search feature to search within the pipeline. The tooltip provides examples of more complex search criteria, including Boolean operators.
Operator Library: The operator library lists every operator that can be added to your pipeline, lets you manage Templates (copy or delete), and provides the option to upload new operators.
Central Error Console
This guide explains how to navigate pipeline errors using the error console. This feature is crucial when dealing with large pipelines, as it helps identify and understand potential errors within the operators on the canvas.
Start by locating the error console in the upper right corner of the screen.
Step 1: Select the error console.
This action reveals all errors within the context of the pipeline. You can search by operator name or error description, or simply scroll through the list.
Step 2: Search for a specific operator name or error description.
Upon selecting an error, the corresponding operator on the canvas is automatically selected and the operator configuration drawer opens.
Step 3: Select an error from the list.
If the selected operator is part of a template, the template also expands.
Explore the dynamic use of pipelines in this guide, which incorporates JINJA syntax, Data Dictionaries, and unpacked JSON configuration.
Step 1: Open an operator that uses JINJA syntax.
Step 2: Unpack the JSON configuration and evaluate it in JSON.
Step 4: Access the tables using the data dictionary option.
Step 5: Within the data dictionary option, navigate to the appropriate Table name using the dropdown.
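As a rough illustration of how a templated operator configuration, its unpacked JSON, and a data dictionary might fit together, consider the sketch below. It rests on several assumptions: the configuration keys (`source_table`, `key`), the table and column names, and the `DATA_DICTIONARY` structure are hypothetical. The `render` function is a minimal stand-in that handles only `{{ name }}` placeholders, not the full JINJA syntax.

```python
import json
import re

# Hypothetical data dictionary: table name -> column names.
DATA_DICTIONARY = {
    "cycles": ["machine_id", "start_time", "end_time"],
    "downtime": ["machine_id", "reason", "duration_s"],
}

# Hypothetical operator configuration containing Jinja-style placeholders.
RAW_CONFIG = '{"source_table": "{{ table }}", "key": "{{ key_column }}"}'

# Matches {{ name }} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: dict) -> str:
    """Replace each {{ name }}; an undefined name raises KeyError."""
    return PLACEHOLDER.sub(lambda m: str(variables[m.group(1)]), template)


def unpack(template: str, variables: dict) -> dict:
    """Render the template, parse the JSON, and check it against the dictionary."""
    config = json.loads(render(template, variables))
    table = config["source_table"]
    if table not in DATA_DICTIONARY:
        raise ValueError(f"unknown table: {table}")
    if config["key"] not in DATA_DICTIONARY[table]:
        raise ValueError(f"unknown column {config['key']!r} in table {table!r}")
    return config


config = unpack(RAW_CONFIG, {"table": "cycles", "key_column": "machine_id"})
```

Evaluating the rendered result as JSON, as in Step 2, makes it easier to see which table and columns the operator resolves to before looking them up in the data dictionary.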