Checksum_record Transform
The checksum_record transform creates a checksum from values in a record and saves it in the factorytx_checksum column. This field is saved outside of the record's data.fieldvalues and is moved outside of the record's data structure when an MDP environment receives it. It is recommended to place this transform last in the list of transforms, so that the record has all of its data as well as an explicit timestamp column.
Example:
If we want to create a checksum from the asset and timestamp columns of a record, our configuration will look something like this:
{
"transform_name": "Checksum",
"transform_type": "checksum_record",
"filter_stream": ["*"],
"record_keys": ["asset", "timestamp"]
}
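To make the effect of this configuration concrete, here is a minimal Python sketch of the idea. It is not FactoryTX's implementation: the function name add_checksum, the "|" separator, the sample record, and the choice of SHA-256 are all assumptions made for illustration, since this page does not state which hash algorithm or joining scheme the transform uses.

```python
import hashlib


def add_checksum(record, record_keys):
    """Return a copy of `record` with a factorytx_checksum entry added.

    Hypothetical helper: the separator and hash algorithm are assumptions,
    not the behavior of the real checksum_record transform.
    """
    # Convert the selected values to strings, in the configured key order.
    joined = "|".join(str(record[key]) for key in record_keys)
    digest = hashlib.sha256(joined.encode("utf-8")).hexdigest()
    # Store the checksum next to the record's other columns.
    return {**record, "factorytx_checksum": digest}


# Hypothetical record with the two columns named in record_keys.
record = {
    "asset": "lasercut_1",
    "stream_type": "cycle",
    "timestamp": "2020-05-17 08:30:00",
}
result = add_checksum(record, ["asset", "timestamp"])
print(result["factorytx_checksum"])
```

Two records with the same asset and timestamp values produce the same checksum in this sketch, regardless of their other columns, which is what makes such a column useful for detecting duplicates.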
Configuration:
Required and optional properties that can be configured for a checksum_record transform.
- transform_name: Unique name for the transform.
- transform_type: Type of transform to apply. Should be checksum_record.
- filter_stream: List of data streams to transform. Each stream can either be * (all) or asset:stream.
- record_keys: List of keys in a record used to generate the record's checksum. Supported keys are: asset, stream_type, timestamp, and fieldvalues (values in a record). Please refer to the Caveats section for more implementation details.
Caveats:
- record_keys order: The order of the specified keys in the record_keys config setting will affect the checksum. For example, checksums from ['asset', 'stream_type'] will not equal checksums from ['stream_type', 'asset'].
- timestamp hash: The value in the timestamp column will be converted into a string for hashing. The data type of the value will affect the hash value. For example, a datetime object (datetime(2019,
- fieldvalues hash: Values in a record will be sorted in alphanumerical order by their keys and then serialized into a JSON string, using this conversion table.
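The three caveats can be illustrated with a short Python sketch. This is an approximation under stated assumptions, not the transform's actual code: the helper names serialize_value and sketch_checksum, the "|" separator, SHA-256, and the sample values are all hypothetical, and json.dumps with sort_keys=True stands in for the serialization described above.

```python
import hashlib
import json
from datetime import datetime


def serialize_value(key, value):
    """Turn one record value into a string (hypothetical helper)."""
    if key == "fieldvalues":
        # Sort by key before serializing so dict ordering does not matter.
        return json.dumps(value, sort_keys=True, default=str)
    # Other values, including timestamps, are simply converted with str().
    return str(value)


def sketch_checksum(record, record_keys):
    """Hash the selected values in the configured order (assumed scheme)."""
    parts = [serialize_value(key, record[key]) for key in record_keys]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


base = {"asset": "A1", "stream_type": "cycle"}

# Caveat 1: the order of record_keys changes the checksum.
forward = sketch_checksum(base, ["asset", "stream_type"])
reverse = sketch_checksum(base, ["stream_type", "asset"])

# Caveat 2: the same instant stored as different types hashes differently,
# because each type has its own string form.
as_datetime = sketch_checksum({"timestamp": datetime(2020, 5, 17, 8, 30)}, ["timestamp"])
as_iso_text = sketch_checksum({"timestamp": "2020-05-17T08:30:00"}, ["timestamp"])

# Caveat 3: fieldvalues are sorted by key, so insertion order is irrelevant.
first = sketch_checksum({"fieldvalues": {"b": 2, "a": 1}}, ["fieldvalues"])
second = sketch_checksum({"fieldvalues": {"a": 1, "b": 2}}, ["fieldvalues"])

print(forward != reverse, as_datetime != as_iso_text, first == second)
```

In practice this suggests keeping record_keys in a fixed order and normalizing the timestamp column to one data type before the checksum is computed, so that equivalent records yield equal checksums.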