Challenges & Best Practices in Industrial Artificial Intelligence

Cobus van Heerden

Senior Product Manager, Analytics, AI and Machine Learning Software

GE Vernova’s Proficy® Software & Services

Cobus van Heerden has 25 years of experience in developing, implementing and commercializing industrial analytics software globally with expertise across manufacturing industries. He specializes in helping industrial organizations realize transformational productivity gains through applying digital technology, advanced analytics and machine learning.

Dec 23, 2024 Last Updated

How quickly are machine learning (ML) and artificial intelligence (AI) technologies moving? Take the case of AlphaGo. In 2015, AlphaGo, powered by ML, became the first computer program to beat a professional player at Go, considered one of the most complex strategy games in existence. In 2017, AlphaGo Master, the next generation of the program, beat the player ranked No. 1 in the world at that time. Together, AlphaGo and AlphaGo Master demonstrated that ML could surpass human performance.

Many people get excited and attach ML/AI to every aspect of human activity, claiming these technologies will replace the majority of human jobs – even professional ones.

However, there are many challenges when applying ML to real-world applications. Imagine if AlphaGo could only see part of the Go board, or if there were hidden rules not defined upfront. In the real world, problems are rarely defined as clearly as the black and white stones on a Go board. The industrial space is more complicated still: human behavior and machine operations are tangled with physical, chemical, and biological processes running on mechanical, electrical, and electronic equipment. This complexity introduces specific challenges for industrial AI applications from both algorithm and data perspectives.

Algorithm Challenges

An analogy: AI/ML is to the digital revolution what the steam or gas-powered engine was to the industrial revolution. Imagine each AI/ML algorithm is an engine. You need different kinds of engines for different applications; there is no one-size-fits-all. For example, an engine designed for a Ferrari is not the best fit for a tractor used on a farm.

High fidelity requirement: Industrial applications place specific requirements on the AI engine. The most critical is the high expectation around sensitivity (the share of real failures that are caught) and specificity (the share of normal operation that is correctly left alone). Take, for example, an online shopping or movie recommendation system. If you find a couple of items you like among the listed recommendations, you may think this AI-driven functionality is amazing. But in a machine-failure prediction system, missing one failure will likely cause you to question the reliability of the system, even if it catches the other 99% of failures. Why? One false prediction in an industrial environment may cause production loss, labor cost, and project delay, or even a catastrophic failure, which could cause millions of dollars in damages or lost production, environmental impacts, or severe injuries.
Explainable/actionable result requirement: Because the stakes are high, engineers and technicians who have worked in the field for many years may not trust black-box recommendations when no one can explain how the predictions were made. There are real-world consequences for each action taken (or not taken). To build trust, AI output must be explainable and actionable.
Domain boundary requirement: AI/ML has to provide useful information within the boundary of domain knowledge. AI/ML relies on data, and that data is collected from physical systems that follow physical laws. Often I hear domain experts say, “Do not tell me something I already know, or something that does not make sense in my domain or violates certain ethics/standards/guidelines.”
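
To make the sensitivity and specificity expectation concrete, here is a minimal Python sketch that computes both from predicted and actual failure labels. The counts are invented for illustration and do not come from any real system; the function name is hypothetical.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary labels, where 1 = failure."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # failures caught
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # failures missed
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # normal, left alone
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Invented example: 100 real failures and 9,900 healthy operating periods.
actual = [1] * 100 + [0] * 9900
# The hypothetical model catches 99 failures, misses 1, and raises 99 false alarms.
predicted = [1] * 99 + [0] * 1 + [1] * 99 + [0] * 9801

sens, spec = sensitivity_specificity(actual, predicted)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Both figures look strong on paper, yet the single missed failure is the one operators will remember, and the 99 false alarms each cost an inspection. That is why industrial users tend to judge a model on both numbers together, not on overall accuracy.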

Data Challenges

If we think of AI/ML as a gas engine, then data is the oil to power the AI/ML algorithms. Owning data is more valuable and crucial than owning algorithms, but there are many specific challenges associated with data in the industrial space.

Engines cannot consume crude oil, so an oil refinery is necessary to transform crude oil into clean gasoline. Industrial data has to go through a similar refining or cleaning process before ML algorithms can consume it. During this process, domain knowledge is the key: it is that knowledge that decides how data is processed.
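
As a rough illustration of how domain knowledge can drive the refining step, the Python sketch below drops missing readings and readings outside a physically plausible range. The temperature limits, the function name, and the sample values are assumptions made up for this example; in practice a domain expert would supply the limits for each sensor.

```python
import math

# Hypothetical plausible range for a temperature sensor, in degrees Celsius.
# In a real project this would come from a domain expert, not from the code.
TEMP_LIMITS_C = (-40.0, 150.0)


def refine(readings, limits=TEMP_LIMITS_C):
    """Keep only readings that are present and physically plausible."""
    low, high = limits
    cleaned = []
    for value in readings:
        # Skip gaps in the record (missing samples or NaN placeholders).
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        # Skip values the physical process could not have produced.
        if not (low <= value <= high):
            continue
        cleaned.append(value)
    return cleaned


raw = [71.2, None, 70.9, 9999.0, float("nan"), 72.4, -273.0]
clean = refine(raw)
print(clean)
```

Simply discarding bad readings is only one possible policy. Depending on the application, a gap might instead be flagged or interpolated, and that choice is again a domain decision.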

Dirty data: First and foremost is the “dirty data” problem, every data scientist’s headache. This is not unique to industrial applications, but in industry it goes beyond missing or redundant data. The data is collected with a lot of noise (it may be corrupted, distorted, low in signal-to-noise ratio, or padded with meaningless information), and its quality varies from source to source. Environmental issues, budget constraints, human factors, and other limitations make this one of the most challenging parts of the process.
Class imbalance: Most ML algorithms need to be taught with examples; this is called training data. Ideally, training data includes all possible patterns with clearly labeled outcomes. For failure detection in industrial applications, there are no gold standards, and normal/faulty patterns are usually context dependent. The reasons for machine failure continue to evolve, and there is no black-and-white boundary that lets an algorithm draw clear distinctions. Furthermore, failures are rare in industrial environments because of all the safety designs and features. This has two consequences: (1) the training data set contains too few failure patterns, and (2) some failure types have no data at all.
Data labeling: Even if you have enough raw data, building a labeled training data set for ML algorithms is still challenging. For ML algorithms to learn, the data needs to be categorized into good/bad classes or into multiple classes. However, few industrial experts are available to find the failure patterns in data, and they are expensive. Not every company has the capacity, as GE does, to maintain a team of domain experts with decades of industry experience.
Tacit knowledge: Another big challenge is context and situational understanding. Not everything is recorded in a standardized data format. There is context information, tacit knowledge, and domain-specific information. For example, in the maintenance logs of a computerized maintenance management system (CMMS), engineers may use jargon and abbreviations to record failure modes, symptoms, and repair actions. Without proper domain knowledge, one may not fully understand the information in a maintenance log.
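
The class imbalance problem above can be illustrated with a short Python sketch. It uses an invented data set of 990 normal and 10 failure records, shows that a model that always predicts “normal” still scores high accuracy, and then computes inverse-frequency class weights, one common heuristic for making rare failures count more during training. The labels, counts, and function name are assumptions for illustration only.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Invented data set: failures make up only 1% of the records.
labels = ["normal"] * 990 + ["failure"] * 10

# A useless model that never predicts a failure.
always_normal = ["normal"] * len(labels)
accuracy = sum(a == p for a, p in zip(labels, always_normal)) / len(labels)
failures_caught = sum(
    a == "failure" and p == "failure" for a, p in zip(labels, always_normal)
)

weights = balanced_class_weights(labels)
print(f"accuracy={accuracy:.2%}, failures caught={failures_caught}")
print(weights)
```

The always-normal model looks accurate while catching nothing, which is why accuracy alone is a poor yardstick here. Weighting helps only when at least some failure examples exist; it cannot make up for failure types that have no data at all.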

In summary, industrial data sets and industrial requirements pose challenges for AI/ML. Success depends on understanding these specific challenges and the best practices that address them in industrial AI application design. Check out how our Proficy CSense software can help you jumpstart your projects, including six hours of free consulting.
