Configurations in FactoryTX

Introduction

You need to understand the basics of the FactoryTX configuration before you start using the tool. The diagram below shows the basic flow in the FTX process.

FTX Configuration Flow

JSON Format for Configuration

JavaScript Object Notation (JSON) is a lightweight data-interchange format. We use JSON as the format for the FTX configuration file.

For specifics on the JSON syntax, refer to: http://json.org
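As a quick illustration of the syntax, the following fragment shows the JSON value types you will encounter in an FTX configuration: objects, strings, numbers, booleans, and arrays. The key names here are made up for illustration and are not part of the FTX schema:

```json
{
  "name": "example_receiver",
  "poll_interval": 10,
  "enabled": true,
  "tags": ["temperature", "pressure"]
}
```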

Overview of the Configuration File Structure

There are three sections of a FactoryTX configuration file:

  • Data Receiver: This section configures a component (in this case, a Data Receiver) that pulls data into FactoryTX from a variety of sources, breaks it apart into "streams," and converts it into Pandas DataFrames.
    NOTE: Pandas is an open-source Python library used for data analysis. A DataFrame is a two-dimensional, labeled data structure comprised of rows and columns (like a spreadsheet or SQL table). You can think of a DataFrame as a group of series objects that share an index (i.e., the column names).
  • Transforms: This section conditions data with any operation that takes a DataFrame as input, and returns a DataFrame as output. For example, set indexes, reorder data, generate new columns, rename columns, etc.
  • Data Transmit: This section configures a component (in this case, a Data Transmit) to send the data to a cloud environment, or forward it to another edge device.
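Putting these together, the overall shape of the configuration file is a single JSON object with one key per section. The skeleton below is a minimal sketch that assumes the top-level key names match the section names described above; each section's entries are filled in as described in the rest of this article:

```json
{
  "data_receiver": [],
  "transforms": [],
  "data_transmit": []
}
```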

Data Receiver Section

Every data receiver will have the following keys:

connections

Pull data from a local directory or remote server, as follows:

· For file-based protocol, specify a root_path and remote server/share information.

· For SQL protocol, specify a SQL server connection.

· For OPC UA or SMB protocol, specify the server connection details for that protocol.

NOTE: One connections key can support multiple streams/queries.

parsers

Convert files to Pandas DataFrames when the data you are pulling in is not already in a tabular or DataFrame format (e.g., files in CSV or Excel format).

protocol

Type of receiver (e.g., localfile, SQL, OPC UA, SMB).

data_receiver_name

Used by the back end to store state and information about progress. You can change the data_receiver_name to whatever you want, as long as it is unique.

streams

List of data streams that will flow through and out of FTX. Streams are a critical element to FactoryTX, as they define how the data will be broken up by type and by asset. A stream is always defined by the following elements:

· file_filter: Wildcard search to only pull certain files from a root_path.

· stream_type: When the data for a given Machine comes from multiple streams, you define in the AI Data Pipeline how each of those streams is analyzed and then blended. The stream type is the identifier used to do so.

· asset: This is what is referred to as a Machine in the Sight Machine platform.

Data Receiver Section Sample

The following is a brief sample of the data_receiver section of code in the FTX configuration file. This sample shows a file-based protocol.

Note the following parameters:

  • connections: All of the files/resources to which you are going to connect.
  • data_receiver_name: How this data receiver appears in the user interface.
  • delete_completed through streams: Data receiver-specific configuration parameters.
  • poll_interval: How frequently the data is checked, in seconds.
  • protocol: The type of receiver.
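In place of the original sample, the sketch below illustrates how these keys might fit together for a file-based (localfile) receiver. The keys connections, data_receiver_name, parsers, delete_completed, poll_interval, protocol, and streams (with file_filter, stream_type, and asset) come from this article; all values, and helper keys such as the parser name and type fields, are placeholders and assumptions, not the authoritative schema:

```json
{
  "data_receiver": [
    {
      "data_receiver_name": "localfile_receiver_1",
      "protocol": "localfile",
      "connections": [
        { "root_path": "/data/factorytx/inbox" }
      ],
      "parsers": [
        { "parser_name": "csv_parser", "parser_type": "csv" }
      ],
      "delete_completed": false,
      "poll_interval": 10,
      "streams": [
        {
          "asset": "ArmRobot_1",
          "stream_type": "plcdata",
          "file_filter": ["ArmRobot_1_*.csv"]
        }
      ]
    }
  ]
}
```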

About Polling

Typically, FTX is a polling rather than a real-time data pipeline. Each receiver has its own independent polling rate, expressed in seconds (can be sub-second). You can adjust the polling rates as necessary, based on the applications/sources being polled:

For real-time data sources, follow the basic rule of signal theory (the Nyquist–Shannon sampling theorem): poll at a minimum of twice the rate of the signal you are trying to capture.

For archival data sources (historians, databases, etc.) that can handle multiple messages, polling is more about the tradeoff between latency (i.e., how much time it takes for a packet of data to get from one designated point to another) vs. efficiency (i.e., moving the highest possible volume of data through the network).

Additional Data Receiver Requirements

Each data receiver may have additional requirements, depending on its protocol. For full details and complete configuration file samples for the various protocols, see the following sections in Configuring a Data Receiver:

  • Configuring a SQL Receiver
  • Configuring an OPC UA Receiver
  • Configuring a File-Based Receiver

Transforms Section

FactoryTX currently supports the set_timestamp and timestamp transforms, which you can use as-is or adjust slightly. If the data has multiple timestamps or date/time fields, the set_timestamp transform helps you identify which one will be used in the data modeling later.

NOTE: In the future, we will support extensibility on FactoryTX transforms: you will be able to write your own transforms in Python and then load them into FactoryTX.

Transforms Section Sample

The following is a brief sample of the transforms section of code in the FTX configuration file.
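As an illustrative sketch only: the set_timestamp transform and the filter_stream key come from this article, while the surrounding key names (transform_name, transform_type, field) and all values are assumptions for illustration, not the authoritative schema:

```json
{
  "transforms": [
    {
      "transform_name": "set_plcdata_timestamp",
      "transform_type": "set_timestamp",
      "filter_stream": ["*:plcdata"],
      "field": "MeasurementTime"
    }
  ]
}
```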

Stream Filters

In order to control which streams are processed by which transforms, FactoryTX has a filter_stream key that you define on each transform.

Stream filters are always described as a list of streams, with each stream described as an asset and stream_type.

Place each stream in quotation marks, using a colon to separate the asset and the stream_type. Separate multiple streams with commas. You can also replace any asset or stream_type with a star ("*") to imply all assets, or all stream types. For example:

  • Filter on a specific stream:
    filter_stream: ["ArmRobot_1:plcdata"]
  • Filter on multiple specific streams:
    filter_stream: ["ArmRobot_1:plcdata", "ArmRobot_1:qualitylab"]
  • Filter on a stream type:
    filter_stream: ["*:plcdata"]
  • Filter on an asset:
    filter_stream: ["ArmRobot_1:*"]
  • Apply the transform to all streams:
    filter_stream: ["*"]

Data Transmit Section

Typically, you will want to configure the transmit to send data to a Sight Machine cloud environment. In that case, these are the required keys:

API_key_ID

The FactoryTX Username found in the AI Data Pipeline in the Sight Machine platform. Used for security purposes, to ensure that FTX has permission to post to the specified cloud environment.

For example:

"API_key_ID": "factory_1234567890ABCDEF@sightmachine_ftx.com",

To access the API keys, in the AI Data Pipeline, on the Edge tab, click API Keys.
For more details, see Managing Edge Devices in Contextualizing Data Using the AI Data Pipeline.

API_key

The FactoryTX API Key found in the AI Data Pipeline in the Sight Machine platform. Used for security purposes, to ensure that FTX has permission to post to the specified cloud environment.

For example:

"API_key": "1234567890ABCDEF1234567890ABCDEF",

To access the API keys, in the AI Data Pipeline, on the Edge tab, click API Keys.
For more details, see Managing Edge Devices in Contextualizing Data Using the AI Data Pipeline.

base_url

URL of the cloud environment that will receive raw data.

For example:

"base_url": "https://XXXXXX.sightmachine.io",

transmit_name

Auto-generated based on the type, and only used in the front end. You can change the transmit_name to whatever you want, as long as it is unique.

For example:

"transmit_name": "transmit_to_SM_cloud"


Data Transmit Section Sample

The following is a brief sample of the data_transmit section of code in the FTX configuration file.
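In place of the original sample, here is a hypothetical sketch combining the required keys listed above (API_key_ID, API_key, base_url, transmit_name). The protocol value and the presence of a filter_stream key are assumptions for illustration:

```json
{
  "data_transmit": [
    {
      "transmit_name": "transmit_to_SM_cloud",
      "protocol": "sm_cloud",
      "API_key_ID": "factory_1234567890ABCDEF@sightmachine_ftx.com",
      "API_key": "1234567890ABCDEF1234567890ABCDEF",
      "base_url": "https://XXXXXX.sightmachine.io",
      "filter_stream": ["*"]
    }
  ]
}
```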

Validating the Configuration

After you edit the FTX configuration file, the system validates it for you and will tell you if any information is missing. You can do the following:

  • Validate the schema: The application will not load the configuration file if it contains errors. If any field is wrong, or any required field is not set properly, an indicator appears both on the left side of the code line in question and in the lower-right corner. In addition, the Submit button is disabled until you resolve all of the errors. You can click the error link in the lower-right corner to open the error console, which lists all errors currently in the configuration.
  • Submit: When you click Submit, the FactoryTX application stops, applies the new configuration, and then starts up again. The Sight Machine platform starts processing data as soon as FactoryTX is running again.
  • Test the configuration: Check the logs for data receiver services to make sure that the connection works.

Managing Streams

After you edit the FactoryTX configuration and define data streams within the data_receiver section, the configured streams are listed in a stream table on the Streams screen.

The Streams screen allows you to do the following:

  • Validate streams configuration: The streams table includes all of the streams configured on the Configuration screen, allowing you to quickly confirm that the configuration file was interpreted correctly and reflects the intended list of data streams. There is no explicit validation action; you verify the list yourself against the Configuration screen.

  • Restream: For each configured stream, you can apply an action called Restream to achieve the following:

    localfile or SMB:
      • Clear all processed data that has not been transmitted.
      • Restart the data receiving process and acquire any raw data available in the localfile directory or the shared folder directory.
        NOTE: If the configuration option delete_completed is set to true, the receiver cannot process data that has already been processed and deleted unless it is copied back into the directory.

    SQL:
      • Clear all processed data that has not been transmitted.
      • If the optional field initial_value is not set under state_fields for a specific SQL stream, data collection restarts from the top of the SQL table. Otherwise, data collection restarts from the starting point defined in initial_value.

    OPC UA:
      • Clear all processed data that has not been transmitted.
        NOTE: Processing resumes for all new data streaming through the OPC UA receiver.