# Analytics Tools

This section describes Sight Machine's analytics tools in detail. It covers working with analytics and using the Data Discovery Toolkit.

# Working with Analytics

The Enterprise Manufacturing Analytics (EMA) application includes all of the visibility tools that are available in the Enterprise Manufacturing Visibility (EMV) application. It also includes more sophisticated exploratory data analysis and monitoring tools built on Sight Machine’s data models, such as the Data Discovery Toolkit.

A software development kit (SDK) is also available.

# Using the Data Discovery Toolkit

The Enterprise Manufacturing Analytics (EMA) application includes the Data Discovery Toolkit, Sight Machine’s toolkit for Exploratory Data Analysis. This toolkit allows you to evaluate the thousands of sensors and set points involved in the production process to identify those that may be impacting quality and efficiency.

The Data Discovery Toolkit consists of the following tools, differentiated by function:

| Tool Function | Tool | Description |
| --- | --- | --- |
| Analyze Parameter Variability | Variance Analysis | Surfaces parameters involved in the production process that have high variance and may be impacting quality and efficiency. See Tools that Analyze Parameter Variability. |
| Analyze Parameter Variability | Descriptive Statistics | Allows you to perform an in-depth analysis of one parameter to investigate its impacts on production. See Tools that Analyze Parameter Variability. |
| Evaluate Parameter Relationships | Correlation Heatmap | Allows you to investigate the linear pairwise relationships between any number of parameters. See Tools that Evaluate the Relationship Between Parameters. |
| Evaluate Parameter Relationships | Time-Series Correlation | Allows you to investigate which parameters have the highest level of correlation to a given parameter. See Tools that Evaluate the Relationship Between Parameters. |
| Evaluate Parameter Relationships | Curve Fit Analysis | Provides you with an in-depth look at the relationship between a pair of parameters. See Tools that Evaluate the Relationship Between Parameters. |
| Evaluate Parameter Relationships | Timeline Analysis | Allows you to apply contextual modeling to understand event sequence relationships between multiple parameters. See Tools that Evaluate the Relationship Between Parameters. |

It is important to remember that all of these exploratory analytic tools are built on the data modeled by Sight Machine’s AI Data Pipeline. The AI Data Pipeline uses machine learning and AI to dissect raw data and remove variability that happens outside the cycle that could interfere with the analysis, such as anomalies during planned downtime. In addition, the AI Data Pipeline models summarize parameter readings for each cycle, eliminating noise from higher granularity readings that may be too detailed for an exploratory analysis.
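The idea of summarizing readings per cycle, while ignoring readings that fall outside any cycle, can be illustrated with a minimal sketch. This is not Sight Machine's AI Data Pipeline, which is described above as using machine learning. The data, the cycle boundaries, and the `summarize_by_cycle` helper are all hypothetical, and a simple mean stands in for whatever summary the pipeline actually computes.

```python
from statistics import mean

# Hypothetical raw sensor readings: (timestamp_seconds, value).
raw_readings = [(0, 20.0), (5, 21.0), (10, 22.0), (15, 95.0), (20, 30.0), (25, 32.0)]

# Hypothetical cycle boundaries: (cycle_id, start, end), with end exclusive.
# The reading at t=15 falls between cycles (for example, planned downtime).
cycles = [("cycle-1", 0, 15), ("cycle-2", 20, 30)]


def summarize_by_cycle(readings, cycles):
    """Return {cycle_id: mean value}, using only readings inside each cycle."""
    summary = {}
    for cycle_id, start, end in cycles:
        in_cycle = [value for ts, value in readings if start <= ts < end]
        if in_cycle:
            summary[cycle_id] = mean(in_cycle)
    return summary


cycle_summary = summarize_by_cycle(raw_readings, cycles)
print(cycle_summary)
```

In this sketch the outlying reading at `t=15` never reaches the summary, because it does not belong to any cycle.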

## Tools that Analyze Parameter Variability

The following two tools in the Data Discovery Toolkit focus on analyzing parameter behavior or variability:

### Analyzing Variances in the Parameters

The Variance Analysis tool is an exploratory data tool that identifies parameters that have high levels of variance and are therefore good candidates for additional analysis. It is often one of the first steps in performing data analysis.

**NOTE:** This tool only works on numeric fields, not categorical fields such as Pass/Fail.

Unlike traditional variance analysis done in spreadsheets, this tool is built on a Cycles model, so it ignores variability that happens outside the cycle (non-production data) and could otherwise interfere with the analysis. The Cycles model summarizes parameter readings for each cycle, eliminating noise from higher granularity readings that may be too detailed for an exploratory analysis. Raw data can be analyzed if needed using the Raw Data Visualization tools. For more information, see *Raw Data Visualization* in Enterprise Manufacturing Visibility (EMV).

On the Analysis tab, begin by clicking **Data Discovery: Variance Analysis**.

You can choose your options on the left to generate your chart.

The Variance Analysis options include:

- **Model:** You can analyze only cycles for a specific asset (that is, a machine). You cannot change this option.
- **Assets:** You can select one or more assets of the same type to analyze.
- **Timeframe:** You can define a date range of data points to analyze for those assets. For example, the last 7 days. This establishes the boundaries for the near real-time data that will be part of your analysis.
- **Update:** Click this button to generate your chart.

**Tool Output:**

- The Variance Analysis tool displays and ranks the 10 parameters from this asset that exhibit the highest amount of variability.
- You will see a histogram for each parameter to visualize the overall distribution of its values, and a box plot for each parameter to visualize the overall distribution, quartiles, and the standard deviation.
- This helps you determine the top parameters that are good candidates for further investigation.
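The ranking step can be sketched outside the tool as follows. The documentation does not state how the tool compares variability across parameters with different units, so this sketch assumes the coefficient of variation (standard deviation divided by the absolute mean) as one reasonable choice. The parameter names, values, and the `rank_by_variability` helper are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical per-cycle values for a few numeric parameters of one asset.
cycle_data = {
    "temperature": [200.0, 201.0, 199.0, 200.0],
    "pressure": [10.0, 14.0, 6.0, 10.0],
    "speed": [50.0, 55.0, 45.0, 50.0],
}


def rank_by_variability(data, top_n=10):
    """Rank parameters by coefficient of variation, highest first."""
    scores = {}
    for name, values in data.items():
        avg = mean(values)
        if len(values) < 2 or avg == 0:
            continue  # too few values, or the ratio is undefined
        scores[name] = stdev(values) / abs(avg)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_n]


ranking = rank_by_variability(cycle_data)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

A relative measure is used here so that a parameter with large absolute values, such as `temperature`, does not dominate the ranking only because of its units.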

### Running an In-Depth Parameter Analysis with Descriptive Statistics

The Descriptive Statistics tool allows you to perform an in-depth analysis of one single parameter.

**NOTE:** This tool works on both numeric and categorical fields.

Unlike traditional variance analysis done in spreadsheets, this tool is built on a Cycles model, so it ignores variability that happens outside the cycle (non-production data) and could otherwise interfere with the analysis. The Cycles model summarizes parameter readings for each cycle, eliminating noise from higher granularity readings that may be too detailed for an exploratory analysis. Raw data can be analyzed if needed using the Raw Data Visualization tools. For more information, see *Raw Data Visualization* in Enterprise Manufacturing Visibility (EMV).

On the Analysis tab, begin by clicking **Data Discovery: Descriptive Statistics**.

You can choose your options on the left to generate your chart.

The Descriptive Statistics options include:

- **Model:** You can analyze only cycles for a specific asset (that is, a machine). You cannot change this option.
- **Assets:** You can select one or more assets of the same type to analyze.
- **Timeframe:** You can define a date range of data points to analyze for those assets. For example, the last 7 days.
- **Data Field:** You can select the parameter that you want to investigate.
- **Update:** Click this button to generate your chart.

**Tool Output:**

- The Descriptive Statistics tool displays a histogram for the selected parameter to visualize the overall distribution of its values, and a box plot to visualize the overall distribution, quartiles, and the standard deviation.
- Underneath, you will see box plots for the selected parameter broken down by week to illustrate potential drift over time, as well as a visualization of the records captured per day. This allows you to discover anomalies in the frequency of data capture.
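The statistics behind these charts can be approximated with a short sketch. It assumes hypothetical `(date, value)` records for one numeric parameter, groups them by ISO week, and counts records per day. The quartile method (`inclusive`) is an assumption; the tool may compute quartiles differently.

```python
from collections import defaultdict
from datetime import date
from statistics import quantiles, stdev

# Hypothetical (date, value) records for one parameter, one per cycle.
records = [
    (date(2024, 1, 1), 10.0), (date(2024, 1, 1), 12.0),
    (date(2024, 1, 2), 11.0), (date(2024, 1, 3), 13.0),
    (date(2024, 1, 8), 15.0), (date(2024, 1, 8), 17.0),
    (date(2024, 1, 9), 16.0), (date(2024, 1, 10), 18.0),
]


def box_plot_stats(values):
    """Return the values a box plot is typically drawn from."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values), "stdev": stdev(values)}


by_week = defaultdict(list)  # (ISO year, ISO week) -> values
per_day = defaultdict(int)   # date -> number of records captured
for day, value in records:
    iso = day.isocalendar()
    by_week[(iso[0], iso[1])].append(value)
    per_day[day] += 1

overall = box_plot_stats([value for _, value in records])
weekly = {week: box_plot_stats(values) for week, values in sorted(by_week.items())}

for week, stats in weekly.items():
    print(week, stats["median"])
```

A rising weekly median, as in this hypothetical data, is the kind of drift the weekly box plots are meant to reveal, and an unusually low count in `per_day` would point to a gap in data capture.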

## Tools that Evaluate the Relationship Between Parameters

The following four tools in the Data Discovery Toolkit focus on evaluating the relationship between manufacturing parameters:

### Investigating Relationships Between Parameters

The Correlation Heatmap tool is an exploratory data tool that allows you to investigate the relationships between any number of parameters.

**NOTE:** This tool only works on numeric fields, not categorical fields such as Pass/Fail.

On the Analysis tab, begin by clicking **Data Discovery: Correlation Heatmap**.

You can choose your options on the left to generate your chart.

The Correlation Heatmap options include:

- **Model:** You can evaluate two types of parameter relationships:
  - **Cycles:** Determine if there is a correlation between parameters in the same asset or asset type.
  - **Parts:** For discrete manufacturers, determine if there is a correlation between parameters across different machines producing the same part type.
- **Assets:** You can select a given asset, or multiple assets of the same type, to monitor. For example, you can select the Cycles model to look at the relationships between the parameters of specific machines. If you are using the Parts model, you need to select a part type.
- **Timeframe:** You can define a date range of data points to analyze for those assets. For example, select the last 7 days. This establishes the boundaries for the near real-time data that will be part of your analysis.
- **Data Field:** You can select the specific parameters of interest. There is no limit to the number of parameters that you can select.
- **Update:** Click this button to generate your chart.

**Tool Output:**

- The Correlation Heatmap tool displays a heatmap of the correlation coefficient between all possible pairs of parameters associated with the selected machine type. It provides you with a visualization of the nature of correlation using color (positive in blue or negative in red), and the magnitude of the relationship using intensity or hue.
- The correlation value is the Pearson r coefficient, which is displayed in each cell.
- This is an excellent way for you to discover which parameters are correlated to each other.
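The numbers in each heatmap cell are pairwise Pearson r coefficients. A minimal sketch of that calculation follows, using hypothetical parameter names and per-cycle values. The `pearson_r` and `correlation_matrix` helpers are illustrative and are not part of any Sight Machine API.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-cycle values for three numeric parameters.
cycle_data = {
    "temperature": [1.0, 2.0, 3.0, 4.0],
    "pressure": [2.0, 4.0, 6.0, 8.0],
    "cooling_flow": [8.0, 6.0, 4.0, 2.0],
}


def correlation_matrix(data):
    """Return {row: {column: r}} for every pair of parameters."""
    names = list(data)
    return {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}


matrix = correlation_matrix(cycle_data)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Values near 1 correspond to the strongly positive cells, values near -1 to the strongly negative cells, and values near 0 to the faint cells of the heatmap.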

### Taking an In-Depth Look at the Relationship Between a Pair of Parameters

The Curve Fit Analysis tool allows you to take a more in-depth look at the relationship between a pair of parameters.

**NOTE:** This tool only works on numeric fields, not categorical fields such as Pass/Fail.

On the Analysis tab, begin by clicking **Data Discovery: Curve Fit Analysis**.

You can choose your options on the left to generate your chart.

The Curve Fit Analysis options include:

- **Model:** You can select:
  - **Cycles:** Examine the correlation between two parameters in the same asset or asset type.
  - **Parts:** Examine the correlation between two parameters across different machines producing the same part type.
- **Assets:** You can select assets to monitor. For example, to perform a Cycles analysis, you need to select an asset to analyze, such as 2 specific machines. If you are using the Parts model, you need to select a part type.
- **Live Data:** You can define a date range of data points to analyze for those parameters. For example, the last 7 days.
- **X Axis:** You can select a parameter for the asset to be plotted along the X axis.
- **Y Axis:** You can select a second parameter for the asset to be plotted along the Y axis.
- **Stratification:** You can select a stratification. This introduces a third dimension by coloring the data points according to the value of a third parameter, which lets you determine whether that third parameter has a relationship with the two parameters you already selected. For example, you can stratify by machine so that the output is differentiated for each machine.
- **Update:** Click this button to generate your chart.

**Tool Output:**

- The Curve Fit tool outputs a scatter plot with a regression line and two histograms, one for each parameter. This enables you to gain a deeper understanding of the relationship between the two parameters.
- For example, if you stratify by machine, the data points for each machine appear in a different color, allowing you to see whether the machine impacts the relationship.
- The correlation value is the Pearson r coefficient. Other useful statistical metrics are also displayed, such as r-squared, p-value, and standard error.
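The regression line and most of the listed metrics can be reproduced with a short sketch. It assumes a straight-line (least-squares) fit and interprets "standard error" as the standard error of the slope; the tool may use a different curve or definition. The points and the `linear_fit` helper are hypothetical. The p-value is omitted because it requires a t-distribution, which the Python standard library does not provide.

```python
from collections import defaultdict
from math import sqrt


def linear_fit(xs, ys):
    """Least-squares line through (xs, ys) plus basic fit statistics."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the slope (an assumed reading of "standard error").
    std_err = sqrt(residual_ss / (n - 2)) / sqrt(sxx)
    return {"slope": slope, "intercept": intercept,
            "r": r, "r_squared": r * r, "std_err": std_err}


# Hypothetical (x, y, machine) points; the machine is the stratification.
points = [
    (1.0, 2.0, "Machine A"), (2.0, 4.0, "Machine A"),
    (3.0, 6.0, "Machine A"), (4.0, 8.0, "Machine A"),
    (1.0, 5.0, "Machine B"), (2.0, 6.0, "Machine B"),
    (3.0, 7.0, "Machine B"), (4.0, 8.0, "Machine B"),
]

strata = defaultdict(list)
for x, y, machine in points:
    strata[machine].append((x, y))

overall_fit = linear_fit([p[0] for p in points], [p[1] for p in points])
fits = {
    machine: linear_fit([x for x, _ in pts], [y for _, y in pts])
    for machine, pts in strata.items()
}

print("overall slope:", round(overall_fit["slope"], 3))
for machine, fit in fits.items():
    print(machine, "slope:", round(fit["slope"], 3))
```

Fitting each stratum separately shows why stratification is useful: in this hypothetical data the two machines follow different lines, which a single fit over all points would blur together.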

### Identifying Parameters with the Highest Level of Correlation

The Time-Series Correlation tool allows you to investigate which parameters have the highest level of correlation to a given parameter.

**NOTE:** Correlation is limited to continuous/numeric parameters.

On the Analysis tab, begin by clicking **Data Discovery: Time-Series Correlation**.

You can choose your options on the left to generate your chart.

The Time-Series Correlation options include:

- **Model:** You can select:
  - **Cycles:** Examine the correlation between two parameters in the same asset or asset type.
  - **Parts:** Examine the correlation between two parameters across different machines producing the same part type.
- **Assets:** You can select assets to monitor. For example, to perform a Cycles analysis, you need to select an asset (or multiple assets of the same type) to analyze, such as 2 specific machines. If you are using the Parts model, you need to select a part type.
- **Timeframe:** You can define a date range of data points to analyze for those assets. For example, the last 7 days.
- **Y Axis:** You can select the parameter that you want to analyze.
- **Update:** Click this button to generate your chart.

**Tool Output:**

- The Time-Series Correlation tool computes the correlation between the selected parameter (displayed on the Y axis) and all other continuous/numeric parameters. It then displays the top 10 parameters that have the highest correlation coefficient and produces 11 time series charts, as follows:
  - The first chart displays all 11 parameters: the selected parameter and the top 10 most correlated parameters.
  - Below that are 10 additional time series plots, one for each parameter. These visualizations and associated shapes allow you to gain a deeper understanding of the relationship between two parameters.
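Selecting the most correlated parameters can be sketched as follows. The documentation says only that the parameters with the highest correlation coefficient are shown; this sketch assumes the ranking uses the absolute value of Pearson r, so that strong negative correlations also rank highly. The parameter names, values, and the `top_correlated` helper are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r, or None when either series has no variation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Hypothetical per-cycle time series, aligned on the same timestamps.
series = {
    "torque": [1.0, 2.0, 3.0, 4.0, 5.0],
    "current": [2.0, 4.0, 6.0, 8.0, 10.5],
    "vibration": [5.0, 3.0, 4.0, 1.0, 2.0],
    "ambient_temp": [3.0, 1.0, 4.0, 1.0, 5.0],
}


def top_correlated(data, target, top_n=10):
    """Rank the other parameters by |r| against the target, highest first."""
    scores = {}
    for name, values in data.items():
        if name == target:
            continue
        r = pearson_r(data[target], values)
        if r is not None:
            scores[name] = r
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)[:top_n]


top = top_correlated(series, "torque", top_n=2)
for name, r in top:
    print(f"{name}: r = {r:.3f}")
```

Here `top_n=2` keeps the example small; the tool itself reports the top 10.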

### Identifying the Relationship Among Multiple Parameters

The Timeline Analysis tool is an exploratory data tool that displays the relationship among multiple parameters.

On the Analysis tab, begin by clicking **Data Discovery: Timeline Analysis**.

You can choose your options on the left to generate your chart.

The Timeline Analysis options include:

- **Model:** You can select:
  - **Cycles:** Examine the correlation between two parameters in the same asset or asset type.
  - **Parts:** Examine the correlation between two parameters across different machines producing the same part type.
- **Assets:** You can select assets to monitor. For example, to perform a Cycles analysis, you need to select an asset to analyze, such as 3 specific machines. If you are using the Parts model, you need to select a part type.
- **Timeframe:** You can define a date range of data points to analyze for those assets. For example, the last 7 days.
- **Y Axis:** You can select up to 5 parameters to analyze.
- **Update:** Click this button to generate your chart.

**Tool Output:**

- The Timeline Analysis tool displays up to 5 time series charts sharing an X axis (time), which provide a deeper understanding of how the behavior of multiple parameters is related.
- For all of these, you can drag and zoom to focus in on a certain time period.
- If you select multiple machines, each will be represented by a line. You can toggle the display of each machine’s values by clicking the legend to turn them on or off.