In this article, we explain how our data science expert Timofej goes about developing the AI model for workforce planning. You can find out why he is working on AI-based workforce planning for the warehouse in the interview "AI-based workforce planning: the invisible helper for the warehouse" (Part 1/3).
We asked Timofej how he processes the data from the logs.
This involves several steps. I start by extracting relevant data from the SuPCIS-L8 warehouse management system. Next, I clean and correct it where necessary. The processed data then goes into machine learning: I develop and train AI models on it. Finally, I evaluate the models and carry out further iterations if necessary.
This sounds easy, but there is much more to it than that. We explain the individual steps in more detail.
The first step is to extract the relevant data from various files. These files store information in text form and document actions in the SuPCIS-L8 warehouse management system. Automated scripts take over the extraction.
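The extraction step can be sketched as a small parsing script. The actual SuPCIS-L8 log layout is not described in the article, so the line format, field names, and sample lines below are illustrative assumptions.

```python
import re
from datetime import datetime

# Hypothetical log line format -- the real SuPCIS-L8 layout is not shown
# in the article, so this pattern is an illustrative assumption.
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<user>\w+) (?P<action>\w+) (?P<quantity>\d+)"
)

def parse_log_line(line):
    """Extract one action record from a raw log line, or None if it
    does not match the expected format."""
    match = LOG_PATTERN.match(line)
    if not match:
        return None
    record = match.groupdict()
    record["timestamp"] = datetime.strptime(record["timestamp"], "%Y-%m-%d %H:%M:%S")
    record["quantity"] = int(record["quantity"])
    return record

lines = [
    "2024-03-01 08:15:42 worker01 PICK 12",
    "malformed line",                        # skipped by the parser
    "2024-03-01 08:16:10 worker02 PACK 3",
]
records = [r for r in (parse_log_line(l) for l in lines) if r]
```

In an automated setting, a script like this would run over all rotated log files and append the parsed records to a structured store.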
After extraction, the data is cleansed where necessary. Incorrect, incomplete or irrelevant data is removed or corrected. This is crucial, as the quality of the data influences the accuracy of the predictions and models derived from it. Formatting is also standardized to ensure consistent data formats and units.
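A minimal cleansing pass might look like the following sketch. The field names and the plausibility rules (no missing values, no negative quantities, upper-case action codes) are assumptions chosen for illustration, not the author's actual rules.

```python
def clean_records(records):
    """Drop incomplete or implausible records and standardize formats.
    The specific rules here are illustrative assumptions."""
    cleaned = []
    for rec in records:
        # remove incomplete records
        if rec.get("action") is None or rec.get("quantity") is None:
            continue
        # remove implausible values, e.g. negative pick quantities
        if rec["quantity"] < 0:
            continue
        # standardize formatting: consistent upper-case action codes
        cleaned.append(dict(rec, action=rec["action"].upper()))
    return cleaned

raw = [
    {"action": "pick", "quantity": 5},
    {"action": "pack", "quantity": None},  # incomplete -> removed
    {"action": "pick", "quantity": -2},    # implausible -> removed
]
cleaned = clean_records(raw)
```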
Other data sources are also used, which are then linked to the extracted data. For this purpose, the data records are reconciled using common keys. These links provide a complete picture of workflows and resource utilization.
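Reconciling data sets via a common key is essentially an inner join. The sketch below assumes an `order_id` key and invented records; the real key fields are not named in the article.

```python
def join_on_key(left, right, key):
    """Link two data sets via a common key (a simple inner join):
    only records whose key appears in both sets are kept."""
    index = {rec[key]: rec for rec in right}
    return [dict(l, **index[l[key]]) for l in left if l[key] in index]

# Invented records: picking events and order master data
picks  = [{"order_id": 1, "quantity": 4}, {"order_id": 2, "quantity": 7}]
orders = [{"order_id": 1, "zone": "A"}]
linked = join_on_key(picks, orders, "order_id")
```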
In the next step, features for the prediction models are created from the cleansed and integrated data. The aim is to prepare the data in such a way that it can be used for machine learning.
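Feature creation can be sketched as deriving model inputs from each record. Which features actually matter is not stated in the article; time-of-day and weekday are typical choices for workload data and serve here as assumed examples.

```python
from datetime import datetime

def build_features(records):
    """Derive simple model inputs from cleaned records.
    The chosen features are illustrative assumptions."""
    features = []
    for rec in records:
        ts = rec["timestamp"]
        features.append({
            "hour": ts.hour,                  # workload varies by hour...
            "weekday": ts.weekday(),          # ...and by day of the week
            "is_weekend": ts.weekday() >= 5,  # Saturday=5, Sunday=6
            "quantity": rec["quantity"],
        })
    return features

recs = [{"timestamp": datetime(2024, 3, 2, 14, 0), "quantity": 9}]  # a Saturday
feats = build_features(recs)
```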
This data can then be used for exploratory analyses that reveal the underlying dynamics. These analyses identify patterns and trends that point to opportunities for improvement. This step is crucial for understanding which factors influence workload and workforce planning.
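One very small exploratory question is how events distribute over the day. The event list below is made-up illustrative data.

```python
from collections import Counter

# Exploratory sketch: how is the workload distributed over the day?
# The list of event hours is invented for illustration.
events_by_hour = Counter(hour for hour in [8, 8, 9, 9, 9, 14, 14])
peak_hour, peak_count = events_by_hour.most_common(1)[0]
```

In practice this kind of aggregation is usually visualized, e.g. as an hourly histogram, to spot peaks that drive staffing needs.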
The analyzed data and the developed features are used to feed the AI models. This is done using machine learning techniques such as decision trees, random forests or neural networks. These models are trained to recognize patterns and make predictions, such as the number of employees required for upcoming shifts.
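The article names decision trees, random forests, and neural networks; those are normally trained with a library. As a dependency-free stand-in, the sketch below fits a one-split regression tree (a "decision stump"), the smallest possible tree model, on invented hour-to-staffing data. This is not the author's actual model.

```python
def fit_stump(xs, ys):
    """Fit a one-split regression tree ('decision stump'): choose the
    threshold that minimizes the squared error of the two leaf means."""
    best = None
    for threshold in sorted(set(xs)):
        left  = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((y - lm) ** 2 for y in left)
               + sum((y - rm) ** 2 for y in right))
        if best is None or err < best[0]:
            best = (err, threshold, lm, rm)
    _, threshold, lm, rm = best
    return lambda x: lm if x < threshold else rm

# Toy training data: hour of day -> staff needed (invented numbers)
hours = [8, 9, 10, 14, 15, 16]
staff = [3, 3, 3, 7, 7, 7]
predict = fit_stump(hours, staff)
```

A random forest is, conceptually, many such trees trained on resampled data, with their predictions averaged.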
After development, the models are evaluated to check their accuracy and effectiveness. This is done through tests with real data. Based on these results, the models are adapted and refined to improve their performance.
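A common accuracy check for staffing forecasts is the mean absolute error on held-out data. The numbers below are invented; the article does not publish evaluation figures.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute deviation between real values and predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Invented hold-out data: predicted vs. actually required staff per shift
actual    = [5, 7, 6, 8]
predicted = [5, 6, 6, 9]
mae = mean_absolute_error(actual, predicted)
```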
Now we will look at the analysis of time series to see how patterns are recognized.
The first step is to collect data. Data points such as sales figures, stock levels, weather data and other relevant values are recorded at regular intervals.
After collection, the data is cleansed, outliers are removed, missing values are added and the consistency of the data is ensured. An exploratory data analysis (EDA) is then carried out. The data is then examined visually in order to gain initial insights. Time series diagrams are created to identify patterns, trends and seasonalities.
An important step is the decomposition of the time series. This involves breaking it down into trends, seasonality and a random component. These steps help to understand the long-term direction of the data, identify recurring patterns and isolate irregular fluctuations.
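An additive decomposition can be sketched with per-phase means. Real decompositions (e.g. statsmodels' `seasonal_decompose`) estimate a moving trend; here a flat level stands in for the trend, which is enough for a toy series without one.

```python
def decompose(series, period):
    """Split a series into a level (trend stand-in), a repeating seasonal
    component, and the leftover random component. A simplified sketch:
    assumes the series length is a multiple of `period`."""
    n = len(series)
    level = sum(series) / n
    seasonal_means = [
        sum(series[i] for i in range(phase, n, period)) / (n // period) - level
        for phase in range(period)
    ]
    seasonal = [seasonal_means[i % period] for i in range(n)]
    residual = [series[i] - level - seasonal[i] for i in range(n)]
    return level, seasonal, residual

# Toy alternating low/high workload with period 2 and no trend
level, seasonal, residual = decompose([1, 3, 1, 3, 1, 3], period=2)
```

On this toy series the residual is exactly zero: the pattern is fully explained by level plus seasonality.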
Established modeling approaches exist, such as ARIMA (AutoRegressive Integrated Moving Average), seasonal ARIMA (SARIMA), and newer methods such as fbprophet and LSTM (Long Short-Term Memory) networks from the field of deep learning. These models are used to analyze patterns and dependencies in the data and to predict future values. However, we use a combination of proven approaches and our own models, which are specially tailored to the individual requirements of our applications.
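To give a feel for the autoregressive core of the ARIMA family, the sketch below fits an AR(1) model, y[t] = c + φ·y[t-1], by ordinary least squares on an invented series. Full ARIMA fitting involves differencing and moving-average terms and is normally done with a library such as statsmodels.

```python
def fit_ar1(series):
    """Fit y[t] = c + phi * y[t-1] by ordinary least squares --
    the one-lag autoregressive core of the ARIMA family."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    phi = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
           / sum((x - mx) ** 2 for x in xs))
    c = my - phi * mx
    return c, phi

series = [1.0, 2.0, 4.0, 8.0, 16.0]  # invented: each value doubles
c, phi = fit_ar1(series)
forecast = c + phi * series[-1]       # one-step-ahead prediction
```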
Once we have developed the models, we check them. We use methods such as cross-validation and look at the residuals. Residuals are the differences between the real data and the predictions of the model. These differences should not have any particular patterns and should be evenly distributed.
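One simple residual check is the lag-1 autocorrelation: if residuals are pure noise, consecutive residuals should be roughly uncorrelated. The residual values below are invented to show a failing case.

```python
def lag1_autocorrelation(residuals):
    """Lag-1 autocorrelation of residuals. Values near 0 suggest the
    model has captured the structure and only noise remains."""
    n = len(residuals)
    mean = sum(residuals) / n
    centred = [r - mean for r in residuals]
    num = sum(centred[i] * centred[i + 1] for i in range(n - 1))
    den = sum(c * c for c in centred)
    return num / den

# Invented residuals that alternate in sign: strongly negatively
# correlated, i.e. the model has missed a remaining pattern.
rho = lag1_autocorrelation([1, -1, 1, -1, 1, -1])
```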
We use the models to make predictions. We regularly check the models and update them with new data to make the forecasts more accurate. This method is particularly helpful in areas such as finance, weather forecasts and now also in warehouse management.
Examples of patterns
A trend in a time series shows a long-term change in data values. This could be a continuous increase in sales figures over the years or a gradual decline in the use of an old service. Trends can be linear or non-linear.
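A linear trend can be quantified as the least-squares slope over time. The yearly sales figures below are invented to show a steady rise.

```python
def trend_slope(series):
    """Least-squares slope of the series over time: positive means an
    upward trend, negative a downward trend, near zero no linear trend."""
    n = len(series)
    ts = range(n)
    mt, my = sum(ts) / n, sum(series) / n
    num = sum((t - mt) * (y - my) for t, y in zip(ts, series))
    den = sum((t - mt) ** 2 for t in ts)
    return num / den

# Invented yearly sales figures rising by ~10 per year
slope = trend_slope([100, 110, 120, 130, 140])
```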
Seasonality describes regular fluctuations in a time series that are repeated at fixed intervals, such as daily, weekly or annually. Retailers often observe higher purchasing behavior during the Christmas period, while hotels record more bookings in the summer.
In contrast to seasonality, cyclical fluctuations do not follow a fixed calendar pattern and can occur more irregularly. These fluctuations are often linked to economic cycles.
Irregularities, also known as "outliers", are unexpected spikes or dips in the data that are not explained by the usual trend or seasonal patterns. They are often caused by unforeseen events such as a viral marketing campaign or supply bottlenecks.
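A simple screen for such outliers flags points far from the mean in standard-deviation units. The two-standard-deviation threshold is a common rule of thumb, not a universal rule, and the demand figures are invented.

```python
from statistics import mean, stdev

def find_outliers(series, threshold=2.0):
    """Return indices of points more than `threshold` standard
    deviations from the mean -- a simple outlier screen."""
    m, s = mean(series), stdev(series)
    return [i for i, v in enumerate(series) if abs(v - m) > threshold * s]

# Steady demand with one spike, e.g. from a viral marketing campaign
outliers = find_outliers([10, 11, 10, 12, 11, 50, 10, 11])
```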
A level shift occurs when a time series suddenly moves to a new value range and remains there. This can be caused by structural changes such as the introduction of a new product line or a change in business strategy.
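A naive way to locate a level shift is to find the split point where the means before and after differ most. Real change-point detection is more robust; this sketch and its data are purely illustrative.

```python
def level_shift_point(series):
    """Find the split where the gap between the mean before and the
    mean after is largest -- a naive change-point detector."""
    best_split, best_gap = None, 0.0
    for split in range(1, len(series)):
        before, after = series[:split], series[split:]
        gap = abs(sum(after) / len(after) - sum(before) / len(before))
        if gap > best_gap:
            best_split, best_gap = split, gap
    return best_split, best_gap

# Invented demand that jumps to a new level, e.g. after a new product line
split, gap = level_shift_point([5, 5, 5, 5, 20, 20, 20, 20])
```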
The variance of a time series can change over time. This is the case, for example, when periods of stability are replaced by volatility, as often occurs in financial markets.
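A change in variance can be made visible by comparing the variance of consecutive windows. The window size and the calm-then-volatile figures below are illustrative assumptions.

```python
from statistics import pvariance

def window_variances(series, window):
    """Variance within consecutive non-overlapping windows; a jump
    between windows signals a shift from stability to volatility."""
    return [
        pvariance(series[i:i + window])
        for i in range(0, len(series) - window + 1, window)
    ]

# A calm phase followed by a volatile one (invented figures)
variances = window_variances([10, 10, 10, 10, 2, 18, 3, 17], window=4)
```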
In Part 2, we explain the advantages and possibilities in more detail: "AI-based workforce planning: optimum scheduling with just a few clicks" (Part 2/3).
We use data from our customer Hermann Müller Elektrogrosshandel GmbH to develop the tool. You can find out more in this article.