Use Cases

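This article describes some typical tasks that you can complete using the SDK. It contains the following sections:

  • Using the SDK to Export Data from the Platform
  • Using the SDK for Exploratory Data Analysis
  • Using the SDK to Develop a Model
  • Using the SDK to Develop Analytics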

Using the SDK to Export Data from the Platform

The SDK can retrieve data from the platform and export this data so that it can be stored in other formats, used with other visualization tools, or integrated with external data sources, such as financial data.

To export data from the platform:

  1. Run:

    twin = cli.get_twin('Machine', MACHINE_TYPE)

    QUERY = {
        'endtime' : {'$gte' : DATE_START, '$lt' : DATE_END},
        'machine.source_type' : MACHINE_TYPE
    }

    df_cycle = twin.fetch_data(cli.session, 'cycle', QUERY, normalize=True)

  2. The data retrieved from the platform is stored in memory as a pandas dataframe. Pandas provides utilities for exporting a dataframe to formats such as CSV:

    df_cycle.to_csv('cycles.csv')
    	

For more details about the CSV export, see: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html

You can reference the complete list of pandas export options at: https://pandas.pydata.org/pandas-docs/stable/api.html#id12
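
As a rough illustration of the export step, the sketch below writes a small dataframe to CSV and JSON text and reads the CSV back. The dataframe is a hand-built stand-in for the result of twin.fetch_data(), and its column names and values are assumptions made for the example, not actual platform output.

```python
import io

import pandas as pd

# Stand-in for the dataframe returned by twin.fetch_data();
# the columns and values here are hypothetical.
df_cycle = pd.DataFrame({
    "machine": ["M1", "M1", "M2"],
    "endtime": pd.to_datetime(
        ["2018-01-01 00:05", "2018-01-01 00:10", "2018-01-01 00:07"]
    ),
    "temperature": [71.2, 70.8, 65.4],
})

# With no path argument, to_csv() returns the CSV as a string.
# index=False omits the row index, which is rarely useful outside pandas.
csv_text = df_cycle.to_csv(index=False)

# orient="records" produces one JSON object per row;
# date_format="iso" writes timestamps as ISO 8601 strings.
json_text = df_cycle.to_json(orient="records", date_format="iso")

# Reading the CSV back recovers the same rows and columns.
df_back = pd.read_csv(io.StringIO(csv_text), parse_dates=["endtime"])
```

Passing a file name instead, as in df_cycle.to_csv('cycles.csv', index=False), writes the same content to disk.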

Using the SDK for Exploratory Data Analysis

Between the SDK’s built-in visualization tools and the structure provided by pandas dataframes, the SDK supports powerful exploratory analyses.

After downloading data, you can visualize it in several ways, including:

  • Creating a histogram of a parameter, stratified by asset.
  • Creating box plots of a parameter, stratified by asset.
  • Creating a correlation heatmap to explore relationships between parameters.

To create a histogram of a parameter, stratified by asset:

  1. Run:

    df_tmp = df_cycle[['temperature', 'machine']]
    
    plt = cli.get_plot('histogram', df_tmp)
    
    iplot(plt.plot())
    	

To create box plots of a parameter, stratified by asset:

  1. Run:

    df_tmp = df_cycle[['model_number', 'temperature', 'machine']]
    
    plt = cli.get_plot('box', df_tmp)
    
    iplot(plt.plot())
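
The stratification shown in the histogram and box plots can also be checked numerically with pandas alone. The following is a minimal sketch that assumes the dataframe has 'machine' and 'temperature' columns, as in the examples above. The data is made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for df_cycle with one parameter and one asset column.
df_cycle = pd.DataFrame({
    "machine": ["M1", "M1", "M1", "M2", "M2", "M2"],
    "temperature": [70.0, 72.0, 74.0, 60.0, 65.0, 70.0],
})

# One row per machine, one column per summary statistic.
summary = (
    df_cycle
    .groupby("machine")["temperature"]
    .agg(["count", "mean", "median", "min", "max"])
)

print(summary)
```

A table like this is a quick way to confirm what the plots suggest, for example that one machine runs consistently hotter than another.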
    	

To create a correlation heatmap to explore relationships between parameters:

  1. Run:

    df_tmp = df_cycle[[
       'temperature',
       'pressure',
       'velocity',
       'flowrate',
       'current',
       'voltage'
    ]]
    
    plt = cli.get_plot('heatmap', 
        df_tmp.corr().unstack().to_frame().reset_index())
    
    iplot(plt.plot())
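
To make the heatmap input easier to follow, the sketch below shows what the expression df_tmp.corr().unstack().to_frame().reset_index() produces: a long-form table with one row per pair of parameters. The data and the column names assigned at the end (param_x, param_y, correlation) are assumptions for illustration. The SDK example above passes the frame without renaming.

```python
import pandas as pd

# Hypothetical data: pressure rises with temperature, velocity falls with it.
df_tmp = pd.DataFrame({
    "temperature": [1.0, 2.0, 3.0, 4.0],
    "pressure": [2.0, 4.0, 6.0, 8.0],
    "velocity": [4.0, 3.0, 2.0, 1.0],
})

# corr() gives a square matrix; unstack() flattens it into a Series indexed by
# (column, row) pairs; to_frame().reset_index() turns that into three columns.
corr_long = df_tmp.corr().unstack().to_frame().reset_index()

# Rename by position for readability (names are illustrative only).
corr_long.columns = ["param_x", "param_y", "correlation"]

print(corr_long)
```

With three parameters the result has nine rows, including each parameter paired with itself.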
    	

Using the SDK to Develop a Model

You can use data retrieved with the SDK to train a model to predict, for example, how a process will perform as an input is adjusted.

The following example fits a linear regression model from scikit-learn to data retrieved using the SDK:

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Features (inputs) and target (value to predict)
x = df_cycle[['voltage', 'current']]
y = df_cycle['temperature']

# Hold out 30% of the rows for testing
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)

# Fit the model on the training rows only
reg = LinearRegression()
reg.fit(x_train, y_train)

You can then explore the model and apply it using methods such as:

reg.coef_
reg.predict(x_test)

For more information, see: http://scikit-learn.org/stable/modules/linear_model.html#ordinary-least-squares
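
Before relying on a fitted model, it usually helps to score it on the held-out rows. The following self-contained sketch uses synthetic data in place of df_cycle (the relationship between voltage, current, and temperature is invented for the example) and reports two common regression metrics from scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for df_cycle; the coefficients below are assumptions.
rng = np.random.default_rng(0)
voltage = rng.uniform(200, 240, size=200)
current = rng.uniform(5, 15, size=200)
temperature = 0.1 * voltage + 2.0 * current + rng.normal(0, 0.5, size=200)
df_cycle = pd.DataFrame(
    {"voltage": voltage, "current": current, "temperature": temperature}
)

x = df_cycle[["voltage", "current"]]
y = df_cycle["temperature"]

# random_state makes the split reproducible between runs.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)

reg = LinearRegression().fit(x_train, y_train)
y_pred = reg.predict(x_test)

# R^2 close to 1 and a small mean absolute error indicate a good fit.
r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print(f"R^2: {r2:.3f}, MAE: {mae:.3f}")
```

Scoring on x_test rather than x_train gives a more honest picture of how the model behaves on cycles it has not seen.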

Using the SDK to Develop Analytics

Because the SDK is an extension of the Sight Machine platform, you can turn analyses developed in the SDK into repeatable analytics that live inside the platform behind a user interface, analogous to the Data Discovery Tools.

Conveniently, the SDK retrieves data from the platform in the same format that platform analytics use, and the visualization tools are powered by Plotly, just as custom analyses are.

An analysis built using the SDK that is structured to be repeatable is a good candidate to publish in the Sight Machine platform. For more information about the structure and how the pieces fit together, see Structuring an Analysis for Use in the Sight Machine Platform in Reference Information.
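
As a loose illustration of what "repeatable" can mean in practice, the sketch below wraps an analysis in a function that takes a dataframe and explicit parameters and returns a result, with no reliance on notebook state. The function name mean_by_group and its parameters are hypothetical, and this is not the structure the platform requires. For that, see the reference article mentioned above.

```python
import pandas as pd


def mean_by_group(df, parameter="temperature", group_by="machine"):
    """Return the mean of `parameter` for each value of `group_by`.

    Hypothetical helper for illustration; not part of the SDK.
    """
    # Fail early with a clear message if the expected columns are absent.
    missing = {parameter, group_by} - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")

    return (
        df.groupby(group_by)[parameter]
        .mean()
        .reset_index()
        .rename(columns={parameter: f"mean_{parameter}"})
    )


# Example input; in practice this would be data retrieved with the SDK.
df = pd.DataFrame({
    "machine": ["M1", "M1", "M2", "M2"],
    "temperature": [70.0, 74.0, 60.0, 66.0],
})

result = mean_by_group(df)
print(result)
```

Because every input is passed in explicitly, the same function can be rerun on a different date range or machine type without editing its body.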