Dear Dr. Berk,
My name is Irina Chuchueva. I’m a power analyst and have dedicated more than a decade to the practical side of electricity price and consumption forecasting. I would like to ask you and your colleagues a question about the paper “Probabilistic forecasting of industrial electricity load with regime switching behavior.”
Firstly, I beg your pardon for being emotional, which is not recommended for a letter like this. My excuse is that it will make the letter less boring. Secondly, I suspect that you have, in a way, become hostages to the paper’s success. Perhaps many scientists and practitioners are asking questions and making comments these days. I opened the paper because I wanted to learn from the best how to write the best paper in my area. Otherwise, I doubt I would be paying much attention to industrial load forecasting right now, as I am occupied with an algorithmic power trading problem.
Speaking plainly, my reaction to the paper is the following, with your permission: “Dear Doctors, nice job, but what about gradient boosting (XGBoost, LightGBM, CatBoost)?”
My question is: why did you leave aside highly efficient nonlinear models, given the strong nonlinearity of the industrial electricity load forecasting problem?
In the paper, nonlinear models such as decision trees and neural networks are barely mentioned, only briefly referenced in the overview section. Ignoring decision trees and neural networks raises doubts about the reported scores, especially given the large differences in scores between the compared models. As a practitioner, I’m used to a constant struggle for every 0.1% of improvement, and from that perspective, a two-fold improvement looks pretty much like trickery.
The question above is my main one, but I also have a few less important questions and comments.
- Units kW/h. The paper uses these units, but I have never come across them before. For instantaneous power, kW is used; for average power over a 1-hour interval, kWh/h is used. So what does kW/h mean in this context?
- The forecast horizon is described as medium-term, and the economic effects are attributed to imbalance costs. Imbalance costs are a consequence of errors in short-term, or even ultra-short-term, forecasts and of inefficient day-ahead and intraday trading. I would suggest that a medium-term forecast can be used to reduce the overall cost of power for an industrial consumer through medium-term financial instruments (OTC contracts, futures, etc.) and is barely related to imbalance costs.
- Four consumers in the study. The paper does not explicitly state why and how the four consumers in question were chosen from the 75 available (after separating switching-regime from non-switching-regime consumers).
- It feels like the scores of the D31, D35, and D41 models contribute to neither the practical nor the theoretical side of the problem, because they are obvious. Is it good practice to include such numbers in the paper?
- Why did you choose MATLAB as your tool? I coded in MATLAB for 10 years and moved to Python in April 2019. I do realize that I’m the latecomer here, because I’m restricted by my client’s framework and can’t choose. Since moving to Python, I’ve been catching up with the world’s best practices in time series processing. Of course, MATLAB doesn’t provide any “boostings.”
Lately, I’ve been feeling that there is a huge gap between the “scientific community” (staff of highly titled institutes) and the “practical community” (data analysts, some of them with Kaggle experience). That feeling resonated with the recent fabulous “Energy Forecasting: A Review and Outlook” by Tao Hong et al. On one side, many doctors strive to be practical yet avoid looking into complex business practice; on the other side, many practitioners know little about the theoretical background of the models and avoid reading confusing scientific papers.
After 12 years dedicated to modeling in the power market, I had a tough entry into the “practical community” via Kaggle. It was the most sobering experience of my professional life. But it has encouraged me to dare to ask you such an unpleasant question.
This letter was sent to Dr. Berk on 1 December 2020. It remained unanswered.