M5 competition

The M5 Competition ran from 2 March to 30 June 2020. It differed from the previous four competitions in six important ways, some of which were suggested by the discussants of the M4 Competition.

  • It used hierarchical sales data, generously made available by Walmart, starting at the item level and aggregating to that of departments, product categories and stores in three geographical areas of the US: California, Texas, and Wisconsin.
  • Besides the time series data, it also included explanatory variables such as price, promotions, day of the week, and special events (e.g. Super Bowl, Valentine’s Day, and Orthodox Easter) that affect sales and can be used to improve forecasting accuracy.
  • The distribution of uncertainty was assessed by asking participants to provide information on four indicative prediction intervals and the median.
  • The majority of the 42,840 time series display intermittency (sporadic sales including zeros).
  • Instead of a single competition to estimate both the point forecasts and the uncertainty distribution, there were two parallel tracks using the same dataset: the first required 28-day-ahead point forecasts, and the second required 28-day-ahead probabilistic forecasts for the median and four prediction intervals (50%, 67%, 95%, and 99%).
  • For the first time, it focused on series that display intermittency, i.e., sporadic demand including zeros.
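
Three of the ideas in the list above (prediction intervals, hierarchical aggregation, and intermittency) can be illustrated with a small Python sketch. It assumes the four prediction intervals are central intervals, so each one contributes a lower and an upper quantile around the median. The toy sales figures, the (state, store, item) key layout, and the function names are all hypothetical and are not taken from the M5 dataset.

```python
# Coverages of the four prediction intervals named in the text.
COVERAGES = [0.50, 0.67, 0.95, 0.99]


def quantile_levels(coverages):
    """Quantile levels for the median plus each interval.

    Assumption: intervals are central, so a 95% interval spans
    the 2.5% and 97.5% quantiles.
    """
    levels = {0.5}
    for coverage in coverages:
        tail = (1.0 - coverage) / 2.0
        levels.add(round(tail, 4))
        levels.add(round(1.0 - tail, 4))
    return sorted(levels)


def aggregate(item_sales, level):
    """Sum item-level daily sales up to a coarser hierarchy level.

    Keys are (state, store, item) tuples; level 0 groups by state,
    level 1 groups by store.
    """
    totals = {}
    for key, series in item_sales.items():
        group = key[level]
        if group not in totals:
            totals[group] = list(series)
        else:
            totals[group] = [a + b for a, b in zip(totals[group], series)]
    return totals


def zero_share(series):
    """Fraction of days with no sales, a simple intermittency indicator."""
    return sum(1 for value in series if value == 0) / len(series)


# Hypothetical one-week toy data, invented for illustration only.
toy_sales = {
    ("CA", "CA_1", "item_a"): [0, 2, 0, 0, 1, 0, 3],
    ("CA", "CA_1", "item_b"): [1, 0, 0, 4, 0, 0, 0],
    ("TX", "TX_1", "item_a"): [0, 0, 5, 0, 0, 2, 0],
}

levels = quantile_levels(COVERAGES)
by_state = aggregate(toy_sales, level=0)

print("quantile levels:", levels)
print("CA total sales:", by_state["CA"])
print("zero share, single item:", zero_share(toy_sales[("CA", "CA_1", "item_a")]))
print("zero share, CA aggregate:", zero_share(by_state["CA"]))
```

In this toy example the share of zero-sales days falls as the series are summed up the hierarchy, which is one reason intermittency is mainly a feature of the lowest, item-level series.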

M5 competition aim

The aim of the M5 Competition was similar to that of the previous four: to identify the most appropriate method(s) for different types of situations that require predictions and uncertainty estimates. Its ultimate purpose was to advance the theory of forecasting and improve its use by businesses and non-profit organizations. A further goal was to compare the accuracy and uncertainty estimates of machine learning (ML) and deep learning (DL) methods with those of standard statistical ones, and to weigh any improvements against the extra complexity and higher costs of the various methods.

Expectations & Methods Content

The previous four M Competitions were successful: they attracted a considerable number of participants and made significant contributions that fundamentally changed the field of forecasting. Higher achievements were therefore expected from the M5 Competition, which was aimed at the fast-growing data science community and gave it easy access to the M5 dataset. The M5 ran on the Kaggle platform and attracted close to 6,000 participants.

The M5 Forecasting Competition ended on 30 June 2020 with around 100,000 submissions from more than 100 countries, making it one of the largest of its kind and the fourth most popular Kaggle competition ever. Following the competition, the M5 Conference presented and discussed the most accurate winning forecasting methods and suggested how the lessons learned could be incorporated into future forecasting methods.