Configure and Initially Train a New Machine Learning Script

These steps describe how to configure a new machine learning script and then train it to create the initial set of training data.

Note: A new machine learning script is one that was not created by copying a script that was delivered as part of the GE Digital APM baseline Catalog content, and therefore it does not include an initial set of training data. If you did create the script by copying a baseline script, you should instead follow the steps to configure a machine learning script for which an initial set of training data already exists.

Steps

  1. Access the script that you want to configure and initially train.
  2. Configure the script as described in the following table.

    Section Description Required / Optional
    Standard List

    Select the standard list that is relevant to the script that you are testing.

    Required
    Query or Dataset

    Select to browse the Catalog and select the query or dataset that you want to use to train the script.

    The query or dataset that you select must meet the following requirements: 

    • Contains data that is relevant to the script.
    • Contains data in which you have a high degree of confidence.
    • Contains labeled data where the labels correspond to values in a standard list.
    • Includes a significant number of records to ensure that the script has a robust set of features.

    Notes:

    • Fields that are used in a query but hidden from display will be ignored by the script.
    • Hyperlinks defined for columns in a query will appear on the Test Results tab. This feature allows you to easily access associated records while testing a script. For example, a column containing work order numbers could be configured with a link to the full work order.
    • Ensure that the query you use is a Select query. Other query types cause errors.

    Tip: For example, to test the IsAFailure.py script, the query should include the short and long descriptions of the work history event, the work order priority, and the breakdown indicator.

    Required
    Field to Classify

    Select the column in the query or dataset that contains the known correct values. The script uses these values to improve its predictions.

    Tip: For example, when testing the IsAFailure.py script, select the breakdown indicator column as the result column.

    Required
    Standard List Reference Field

    Select a column from the query or dataset that contains values that identify subsets of the specified Standard List (e.g., Equipment Class values).

    When the script runs, the values in this field are compared to the values in the List Reference field of the Classifier Standard List records to determine which subsets of the script’s Standard List to use.

    Optional
    Classifier Input Fields

    Select the check box for each column in the query that you want to use as an input to the script.

    Tip: By default, the field selected in the Standard List Reference Field box is not included as an input to the script. If you want the script to process values in this field, include the column twice in your query or dataset.

    Required

    The script is configured.

  3. Select .

  4. Select .

    The Continue with Full Train dialog box appears.

  5. Select OK.

    The script processes the data and the initial set of training data is created.

    Tip: When the training process is complete, you can select  to test the script, and then view the results on the Test Results workspace to ensure that they are as expected. See Test A Machine Learning Script for details about the information that appears in the Test Results tab.

  6. In the Workbench workspace, in the Query or Dataset box, change the query or dataset to one that contains the test data for which you want to predict values.

  7. Select .
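
The dataset requirements listed in step 2 can be checked before you train the script. The following is a minimal sketch in Python and is not part of GE Digital APM. It assumes that the query results have been exported as a list of dictionaries. The function name check_training_rows, the field names, the standard list values, and the min_records threshold are all hypothetical and chosen only for illustration.

```python
def check_training_rows(rows, label_field, standard_list, input_fields, min_records=100):
    """Return a list of problems found in a candidate training dataset.

    Hypothetical helper: nothing here calls or represents an APM API.
    """
    problems = []

    # Requirement: a significant number of records (the threshold is an assumption).
    if len(rows) < min_records:
        problems.append(f"only {len(rows)} records; expected at least {min_records}")

    allowed = set(standard_list)
    for i, row in enumerate(rows):
        # Requirement: every label corresponds to a value in the standard list.
        label = row.get(label_field)
        if label not in allowed:
            problems.append(f"row {i}: label {label!r} is not in the standard list")

        # Every column chosen as a classifier input should be present in the row.
        missing = [f for f in input_fields if f not in row]
        if missing:
            problems.append(f"row {i}: missing input fields {missing}")

    return problems


# Example rows modeled on the IsAFailure.py tip: short and long descriptions,
# work order priority, and the breakdown indicator as the field to classify.
rows = [
    {"short_desc": "Pump seal leaking", "long_desc": "Replaced mechanical seal",
     "priority": "1", "breakdown": "Yes"},
    {"short_desc": "Routine lubrication", "long_desc": "Greased bearings",
     "priority": "3", "breakdown": "No"},
    {"short_desc": "Motor tripped", "long_desc": "Reset overload relay",
     "priority": "2", "breakdown": "Maybe"},
]

problems = check_training_rows(
    rows,
    label_field="breakdown",
    standard_list=["Yes", "No"],
    input_fields=["short_desc", "long_desc", "priority"],
    min_records=3,
)
for problem in problems:
    print(problem)
```

In this example, the third row is reported because its label, "Maybe", is not a value in the assumed standard list. A check of this kind covers only the mechanical requirements. Whether the data is relevant to the script and whether you have a high degree of confidence in it still requires your own review.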

Copyright © 2018 General Electric Company. All rights reserved.