Curve Fit Analysis

Overview

Curve Fit Analysis is a multivariate tool for investigating the relationship between two parameters during exploratory analysis. It fits a linear least-squares regression, plots the data as a scatterplot, and reports metrics such as the Pearson-R correlation coefficient and R-squared. You can run it on the Cycle model for high-granularity analysis within a machine, or on the Part model for cross-machine comparison. The results give you a measured basis for predictions and process-optimization decisions in manufacturing operations.

Before You Begin

Ensure you have:

  • Access to the Application tab

  • Permission to view selected assets or part types

  • Two numeric parameters available (X and Y)

  • At least 30 observations in the selected date range (100+ recommended)

Create a Curve Fit Analysis

1. Open Curve Fit Analysis

  • Navigate to the Application tab.

  • Select Curve Fit Analysis.

  • The configuration panel opens.

2. Select a Data Model

Select how you want the tool to group data:

  • Cycles – cycle-level parameters from the same asset type

  • Parts – part-level data across machines producing the same part type

💡 Tip: Use Cycles for equipment behavior and Parts for final product characteristics.

3. Select Assets

  • Choose an Asset (Cycles) or Part Type (Parts).

  • Select one or more assets of the same type.

  • Confirm selections in the left panel.

📝 Note: Using multiple similar assets increases data volume and regression stability.

4. Set the Time Range

Choose:

  • Relative ranges (Last 7/30/90 days)

  • Absolute ranges (manual dates)

📝 Note: Larger ranges provide more data but may include process changes.

5. Select X and Y Parameters

  • Choose the X-axis parameter (independent variable).

  • Choose the Y-axis parameter (dependent variable).

Both must be numeric fields.

Example:

  • X: Oven Temperature

  • Y: Cure Time

6. Add Stratification (Optional)

To compare different conditions:

  • Select a categorical field (e.g., Shift, Product Type).

  • Each category generates its own regression line.

  • Leave blank for a single-line regression.
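To make the idea of one regression line per category concrete, here is a minimal Python sketch. It is not the tool's implementation: the function name `fit_by_category`, the row layout `(category, x, y)`, and the sample values are all hypothetical.

```python
from collections import defaultdict

def fit_by_category(rows):
    """Fit y = m*x + b separately for each category.

    rows: iterable of (category, x, y) tuples (hypothetical layout).
    Returns {category: (m, b)}; categories that cannot be fitted are skipped.
    """
    groups = defaultdict(list)
    for category, x, y in rows:
        groups[category].append((x, y))

    fits = {}
    for category, points in groups.items():
        n = len(points)
        if n < 2:
            continue  # a line needs at least two points
        mean_x = sum(x for x, _ in points) / n
        mean_y = sum(y for _, y in points) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in points)
        if sxx == 0:
            continue  # all x values identical, slope undefined
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
        m = sxy / sxx
        b = mean_y - m * mean_x
        fits[category] = (m, b)
    return fits

# Hypothetical data stratified by Shift
rows = [
    ("Day", 1.0, 2.0), ("Day", 2.0, 4.0), ("Day", 3.0, 6.0),
    ("Night", 1.0, 5.0), ("Night", 2.0, 6.0), ("Night", 3.0, 7.0),
]
fits = fit_by_category(rows)
```

In this made-up data the two shifts have different slopes and intercepts, which is the kind of difference stratification is meant to reveal.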

7. Configure Carry-Forwards

Choose how forward-filled values should be handled:

  • Keep All (default)

  • First

  • Last

8. Generate Results

  • Select Update.

  • Wait for the scatter plot and regression line to load.

  • Review statistical metrics and visuals.

How Linear Regression Works

Least-Squares Method

Curve Fit Analysis uses linear least-squares regression, which fits the line that minimizes the sum of the squared vertical distances between the data points and the regression line:

  • y = mx + b

  • m = slope

  • b = intercept

Calculation Formulas

Slope (m):

m = (n × Σ(xy) - Σx × Σy) / (n × Σ(x²) - (Σx)²)

Intercept (b):

b = (Σy - m × Σx) / n
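The two formulas above translate directly into code. The following Python sketch is only an illustration of the arithmetic, not the tool's own code; `fit_line` and the sample values are hypothetical.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denominator  # slope formula
    b = (sum_y - m * sum_x) / n                     # intercept formula
    return m, b

# Hypothetical observations that lie exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
m, b = fit_line(xs, ys)
print(m, b)  # 2.0 1.0
```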

Statistical Metrics Provided

1. Pearson-R

Measures linear correlation strength.

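Range: –1.0 to +1.0. The sign gives the direction of the relationship; the absolute value gives its strength:

  • |r| ≥ 0.7 strong

  • 0.4 ≤ |r| < 0.7 moderate

  • |r| < 0.4 weak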

💡 Tip: Pearson-R only measures linear patterns.
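As an illustration, the Python sketch below computes Pearson-R from its standard definition and maps it to the strength bands listed above. It assumes the bands apply to the absolute value of r; the function names and sample data are hypothetical. The last example shows the tip in practice: a perfect but non-linear (quadratic) relationship can still give r = 0.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Map r to the bands above (assumption: bands apply to |r|)."""
    magnitude = abs(r)
    if magnitude >= 0.7:
        return "strong"
    if magnitude >= 0.4:
        return "moderate"
    return "weak"

# A perfectly linear, decreasing relationship: r is -1, which is still "strong"
r_linear = pearson_r([1.0, 2.0, 3.0], [6.0, 4.0, 2.0])

# A perfect quadratic relationship (y = x^2) that Pearson-R does not detect
r_curved = pearson_r([-2.0, -1.0, 0.0, 1.0, 2.0], [4.0, 1.0, 0.0, 1.0, 4.0])
```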

2. R-squared (R²)

Indicates the proportion of the variance in Y that is explained by X.

Range: 0.0–1.0

  • ≥ 0.7 strong predictor

  • 0.4–0.7 moderate

  • < 0.4 weak

Formula: R² = r² (the square of Pearson-R, which holds for a linear fit with a single X parameter)
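For readers who want to see why R² can be read as "variance explained", the sketch below computes it two ways for a single-X linear fit: as 1 minus the ratio of residual variation to total variation, and as the square of Pearson-R. The two agree. This is an illustrative sketch only; `r_squared_two_ways` and the sample values are hypothetical.

```python
def r_squared_two_ways(xs, ys):
    """Return (R² from residuals, r² from correlation) for a one-X linear fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # Least-squares line
    m = sxy / sxx
    b = mean_y - m * mean_x

    # Variation left over after the fit vs. total variation in Y
    ss_residual = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    from_residuals = 1.0 - ss_residual / syy

    # Square of Pearson-R
    from_correlation = (sxy * sxy) / (sxx * syy)
    return from_residuals, from_correlation

# Hypothetical, slightly noisy data
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
from_residuals, from_correlation = r_squared_two_ways(xs, ys)
```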

3. P-value

Indicates statistical significance.

p < 0.05 → significant

p < 0.01 → highly significant

p ≥ 0.05 → not significant

⚠️ Warning: Significance ≠ causation.
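The sketch below shows the test statistic that usually sits behind the p-value for a one-X linear regression, t = r·√(n−2) / √(1−r²) with n−2 degrees of freedom, and applies the thresholds listed above. Whether the tool computes its p-value this way is an assumption. Converting t to a p-value needs a Student's t distribution, which Python's standard library does not provide, so that step is left out. The function names are hypothetical.

```python
import math

def t_statistic(r, n):
    """t statistic for testing whether the correlation (or slope) is zero."""
    if n <= 2:
        raise ValueError("need more than two observations")
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)  # perfect fit
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def significance_label(p_value):
    """Apply the thresholds listed above to a p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must be between 0 and 1")
    if p_value < 0.01:
        return "highly significant"
    if p_value < 0.05:
        return "significant"
    return "not significant"

# The same r gives a larger t (stronger evidence) with more observations
t_small_sample = t_statistic(0.5, 27)
t_large_sample = t_statistic(0.5, 102)
```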

4. Standard Error

Measures how far points deviate from the regression line.

Lower values = better fit.
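One common definition that matches this description is the standard error of the estimate: the square root of the sum of squared residuals divided by n − 2. The sketch below assumes that definition; the tool may report a different quantity (for example, the standard error of the slope). `standard_error_of_estimate` and the sample values are hypothetical.

```python
import math

def standard_error_of_estimate(xs, ys):
    """Typical vertical distance of points from the fitted line, in units of Y."""
    n = len(xs)
    if n != len(ys) or n <= 2:
        raise ValueError("need more than two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    ss_residual = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(ss_residual / (n - 2))  # n - 2: two parameters were fitted

# Hypothetical data: the second set scatters more around its line
se_tight = standard_error_of_estimate([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
se_loose = standard_error_of_estimate([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 9.0])
```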

Data Requirements

Numeric Fields Only

  • X and Y must be continuous numeric fields.

  • Categorical fields can only be used for stratification.

Cycles vs. Parts

  • Cycles → machine cycle data

  • Parts → part-level comparisons

Date Ranges: Use relative or absolute windows.

💡 Tip: Be cautious of long ranges that include process or equipment changes.

Interpreting Results

Scatter Plot: Shows the following:

  • Distribution of data points

  • Regression line alignment

  • Concentration or spread

  • Outliers

Histograms: Display data distributions for X and Y:

  • Identify skew

  • Spot outliers

  • Check data ranges

Common Use Cases

1. Predict Outcomes: Use regression equations to estimate results (e.g., predict Cycle Time from Temperature).

2. Identify Drivers: Understand how inputs relate to outputs.

3. Validate Expectations: Confirm whether expected relationships hold true.

4. Compare Conditions: Use stratification to test differences across shifts, operators, products, or machines.
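For the Predict Outcomes use case, applying the regression equation is a single line of arithmetic. The sketch below also flags predictions made outside the observed X range, where a fitted line is less trustworthy; that guard is an addition for illustration, not a feature of the tool. The function name, slope, intercept, and range are hypothetical.

```python
def predict(m, b, x, x_min, x_max):
    """Return (predicted y, extrapolated?) for the line y = m*x + b.

    x_min and x_max are the smallest and largest X values used in the fit.
    """
    y = m * x + b
    extrapolated = x < x_min or x > x_max
    return y, extrapolated

# Hypothetical fit: slope 2.0, intercept 1.0, fitted on X values from 1.0 to 4.0
inside = predict(2.0, 1.0, 3.0, 1.0, 4.0)    # within the fitted range
outside = predict(2.0, 1.0, 10.0, 1.0, 4.0)  # beyond the fitted range
```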

Limitations & Considerations

Linearity Requirement: Relationships must be linear. Non-linear patterns require other methods.

Outliers: Large deviations can distort the regression line. Always inspect visually.
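To show how strongly a single outlier can pull the line, the sketch below fits the same hypothetical data twice: once clean, and once with the last Y value replaced by an extreme reading. The function name and values are made up for illustration.

```python
def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
clean_ys = [2.0, 4.0, 6.0, 8.0, 10.0]          # exactly y = 2x
with_outlier_ys = [2.0, 4.0, 6.0, 8.0, 30.0]   # last reading is an outlier

clean_slope = slope(xs, clean_ys)
distorted_slope = slope(xs, with_outlier_ys)
print(clean_slope, distorted_slope)  # 2.0 6.0
```

One extreme point out of five triples the slope here, which is why a visual check of the scatter plot matters.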

Causation: Regression describes correlation; it does not prove cause-and-effect.

Sample Size

  • Minimum: 30 points

  • Recommended: 100+

⚠️ Warning: Small datasets can produce misleading slopes or R² values.

Feature Benefits

  • Detailed Relationship Validation: Allows for in-depth investigation of the relationship between two specific parameters, confirming whether a high correlation value is meaningful or just a data artifact.

  • Linear Regression Computation: Computes and visualizes a linear least-squares regression between the two parameters, providing a precise, quantifiable fit line.

  • Comprehensive Scatterplot Visualization: Provides a clear visual representation via a scatterplot, complete with histograms along each axis to simultaneously show the individual distributions of the two parameters.

  • Stratification Capability: Allows users to select a categorical field for stratification, enabling the breakdown of data points into separate traces to analyze how the relationship differs across various categories (e.g., product type).

  • Validation Metrics (R-squared & Pearson-R): Delivers a table of key regression metrics, including the Pearson-R correlation coefficient and the R-squared value (see note below). The R-squared value provides a simple estimate of how well the independent variable determines the dependent variable (closer to 1.0 indicates a stronger fit).
    NOTE: Statisticians can speak volumes on the finer points of interpreting the r-squared value, but it is basically an estimate of how well the independent variable (the variable on the X axis of your chart) determines the dependent variable (the variable on the Y axis of your chart). Small numbers close to zero indicate a weak relationship. Bigger numbers close to one indicate a stronger relationship.

  • Strategic Exploratory Tool: Most helpful later in the exploratory process; serves as an excellent follow-up tool after identifying key pairs using the Correlation Heatmap or Time-Series Correlation.

  • Model Flexibility (Cycle & Part): Supports multivariate analysis on Cycle and Part data models, enabling high-granularity analysis within a machine (Cycle model) or cross-machine comparison (Part model).

  • Integrated Summary Statistics: Includes a table of summary statistics (Count, Standard Deviation, Min, Max) for both selected parameters, providing immediate contextual data for the analysis.
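The summary statistics named in the last item can be reproduced with Python's standard library, as in the sketch below. It uses the sample standard deviation (`statistics.stdev`); whether the tool reports the sample or population value is not stated here, so treat that choice as an assumption. `summarize` and the sample values are hypothetical.

```python
import statistics

def summarize(values):
    """Count, standard deviation, min, and max for one parameter."""
    if len(values) < 2:
        raise ValueError("need at least two values for a standard deviation")
    return {
        "count": len(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (assumption)
        "min": min(values),
        "max": max(values),
    }

# Hypothetical readings for one parameter
summary = summarize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```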

Summary

The Curve Fit tool outputs a scatterplot with a regression line and two histograms, one for each parameter. This enables you to gain a deeper understanding of the relationship between the two parameters.

For example, if you stratify by machine, the data points for each machine appear in a different color, so you can see whether the machine affects the relationship.

The strength of the correlation is represented by the Pearson-R correlation coefficient, and other useful statistical metrics are displayed alongside it, such as r-squared (see the note under Feature Benefits), p-value, and standard error.