Machine Learning Models for Nitrogen Footprint Reduction
About Our Client
The Client is a global provider of gas turbine solutions designed to reduce the environmental impact of fossil fuels.
Challenge to Minimize the Nitrogen Footprint of Gas Turbines
The Client wanted to minimize the nitrogen oxide (NOx) emissions of its popular gas turbine solutions by determining their optimal operating conditions (e.g., constant vs. changing load, various fuel compositions). To find dependencies between these factors and determine how they affect the turbines’ nitrogen footprint, the Client decided to utilize machine learning (ML) technology.
The company chose ScienceSoft as a reliable provider of data science and machine learning services with a solid oil and gas industry background.
Developing 3 Different ML Models to Address Turbine Data Complexity
ScienceSoft’s Senior Data Scientist analyzed the data sets provided by the Client. The data included turbine telemetry (e.g., rotation speed, load, fuel consumption, temperature) and NOx emission data. Our expert started with exploratory data analysis (EDA) and identified the following characteristics of the data:
- Possible time-series, non-linear, and long-term dependencies between telemetry variables and emission values (e.g., emission volume growing in proportion to fuel consumption).
- A substantial number of outliers and noisy data points (e.g., a sudden spike in turbine temperature or random fluctuations in emission values).
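To illustrate the kind of check that can surface such spikes during EDA, here is a minimal sketch that flags readings far from their local median. It is not the method used in the project: the function name, the window size, the threshold, and the temperature readings are all assumptions made for illustration.

```python
from statistics import median


def flag_outliers(values, window=5, threshold=3.5):
    """Return indices of readings that deviate strongly from the local median.

    Uses the modified z-score, 0.6745 * |x - median| / MAD, computed over a
    centered window around each point (truncated at the series edges).
    """
    half = window // 2
    flagged = []
    for i, x in enumerate(values):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        neighbours = values[lo:hi]
        med = median(neighbours)
        mad = median(abs(v - med) for v in neighbours)
        if mad == 0:
            continue  # flat window: no spread to compare against
        if 0.6745 * abs(x - med) / mad > threshold:
            flagged.append(i)
    return flagged


# Hypothetical temperature readings (degrees C) with one sudden spike.
temperature = [520.0, 521.5, 519.8, 522.1, 640.0, 521.0, 520.4, 519.9, 521.7]
spikes = flag_outliers(temperature)
print(spikes)
```

A median-based score is used here because a single extreme reading barely moves the median, whereas it would inflate a mean and standard deviation and partly hide itself.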
Since the data was complex and diverse, our expert proposed developing three different ML models, each optimized for a particular dependency type. This way, the Client would be able to test all three models on the actual turbine data from its own facilities and pick the most fitting one. The Client approved this approach, and our data scientist proceeded with building the following models:
XGBoost model
This model identifies complex non-linear relationships among variables. For instance, a change in turbine temperature may not trigger a proportional shift in emission values.
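The idea behind gradient boosting, the technique XGBoost builds on, can be sketched without the library itself. The toy below is a simplified stand-in, not XGBoost and not the Client's model: it boosts one-split decision trees (stumps) on a single feature, and the data, function names, and hyperparameters are invented for illustration.

```python
def fit_stump(xs, residuals):
    """Pick the single threshold split that minimises squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # every split that leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]


def fit_boosted(xs, ys, rounds=50, learning_rate=0.3):
    """Fit stumps sequentially, each one to the residuals left by the previous ones."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm) for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + sum(learning_rate * (lm if x <= t else rm) for t, lm, rm in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Made-up data: emissions stay flat at low temperature steps, then climb steeply.
temperature_step = list(range(10))
nox = [1, 1, 1, 1, 1, 2, 4, 8, 16, 32]

# Straight-line least-squares fit as a proportional-response baseline.
n = len(nox)
mx, my = sum(temperature_step) / n, sum(nox) / n
slope = sum((x - mx) * (y - my) for x, y in zip(temperature_step, nox)) / sum(
    (x - mx) ** 2 for x in temperature_step
)
linear_mse = mse(nox, [my + slope * (x - mx) for x in temperature_step])

model = fit_boosted(temperature_step, nox)
boosted_mse = mse(nox, [predict(model, x) for x in temperature_step])
print(f"linear MSE: {linear_mse:.2f}, boosted MSE: {boosted_mse:.2f}")
```

Both errors are measured on the training points, so the comparison only shows that a boosted ensemble can follow a non-proportional response that a straight line cannot. It says nothing about performance on unseen data.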
Recurrent neural network (RNN) model
The RNN model is optimal for capturing time-series dependencies. For example, it can analyze how the fluctuations in hourly fuel consumption rates affect NOx emissions over time.
While creating the model, our expert performed cross-validation of time-series data to avoid overfitting (when a model memorizes the training data and performs poorly on unseen data) and underfitting (when a model is too simple to identify dependencies).
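The source does not say which cross-validation scheme was used. One common option for time-ordered data is an expanding-window split, where each fold trains only on observations that precede its test block. The sketch below assumes that scheme; the function name and the fold sizes are hypothetical.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each fold trains on everything before its test block, so the model is
    never evaluated on data older than what it was trained on.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for k in range(n_splits):
        start = first_test_start + k * test_size
        yield list(range(start)), list(range(start, start + test_size))


# Hypothetical example: 12 hourly observations, 3 folds of 2 hours each.
splits = list(expanding_window_splits(n_samples=12, n_splits=3, test_size=2))
for train_idx, test_idx in splits:
    print(f"train on hours 0-{train_idx[-1]}, test on hours {test_idx}")
```

Shuffled k-fold splits would let the model see future readings during training, which tends to make validation scores look better than they would be in operation.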
Long short-term memory (LSTM) model
The LSTM model was developed as an advanced variant of the RNN model to discover long-term dependencies in the data. For example, it can show how the air-pressure ratio of the turbine at a certain point in time influences NOx emission values at a later date.
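How an LSTM keeps information over many time steps can be shown with a single-unit cell written out by hand. This is a sketch of the standard LSTM update equations, not the Client's model: the weights are hand-picked rather than trained, and the input sequence is invented.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit LSTM cell.

    w maps each gate name to (input weight, recurrent weight, bias).
    """
    def gate(name, activation):
        wx, wh, b = w[name]
        return activation(wx * x + wh * h_prev + b)

    f = gate("forget", sigmoid)        # share of the old cell state to keep
    i = gate("input", sigmoid)         # share of the new candidate to write
    g = gate("candidate", math.tanh)   # the new information itself
    o = gate("output", sigmoid)        # share of the cell state to expose
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hand-picked (untrained) weights: keep almost all memory, write only on large inputs.
weights = {
    "forget": (0.0, 0.0, 6.0),
    "input": (4.0, 0.0, -6.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 6.0),
}

# Hypothetical normalised signal: one early event followed by 20 quiet steps.
signal = [3.0] + [0.0] * 20
h, c = 0.0, 0.0
for x in signal:
    h, c = lstm_step(x, h, c, weights)
print(f"cell state after {len(signal)} steps: {c:.3f}")
```

Because the forget gate stays close to 1 and the input gate stays close to 0 on quiet steps, the cell state written by the first event is still largely intact at the end of the sequence. A plain RNN has no such gated memory path, which is why it tends to lose early signals.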
Our expert optimized all three models to handle the irregular and unexpected values caused by outliers and noisy data.
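The source does not describe how the models were made robust. One widely used option, shown here purely as an assumption, is to train with a loss that grows linearly instead of quadratically for large errors, such as the Huber loss. The residual values and the delta setting below are invented.

```python
def huber_loss(residual, delta=1.0):
    """Quadratic for small residuals, linear for residuals larger than delta."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual ** 2
    return delta * (a - 0.5 * delta)


def squared_loss(residual):
    return 0.5 * residual ** 2


# Hypothetical prediction errors: three ordinary readings and one outlier.
residuals = [0.2, -0.4, 0.1, 15.0]

squared_total = sum(squared_loss(r) for r in residuals)
huber_total = sum(huber_loss(r) for r in residuals)
print(f"squared loss total: {squared_total:.3f}")
print(f"Huber loss total:   {huber_total:.3f}")
```

With squared loss, the single outlier contributes almost the entire total, so training would be pulled toward fitting it. With the Huber loss its contribution is much smaller, and the ordinary readings keep more influence on the fit.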
3 Machine Learning Models Ready Within 2 Months
Within two months, the Client received three ML models (XGBoost, RNN, and LSTM) for identifying complex dependencies between gas turbine operating conditions and NOx emissions. The Client appreciated the comprehensive approach of our data scientist and his dedication to tailoring the models to the peculiarities of the turbine data.
The Client proceeded to deploy all three ML models in its IT infrastructure to continue nitrogen emissions research and reduce the nitrogen footprint of its gas turbine solutions.