If we know one thing, it’s that the Dutch love to cycle. It’s quick, easy, and low cost. Even better, it’s great for the environment! But what if you’re out and about without your bike? Our friends over at NS (Nederlandse Spoorwegen, the Dutch Railways) offer a handy service for that situation: OV-fiets bicycle rentals. All you need is a Dutch public transport card and you can pick up a bike in a couple of minutes, for just a few euros per day. It’s an awesome concept, but how can they make sure there are enough bicycles available for everyone who wants to use an OV-fiets? 🤔
That’s where the Summer of AI student team comes in! Summer of AI is an annual hackathon for students in the Netherlands and Portugal, allowing them to hone their skills on real cases. Led by Data Scientists from NS, the NS team includes Alexis Arrey Bhakor, Koen Kraaijveld, Sannerien van der Toorn, Sihan Zhu, Yahan Ke, and Zeynep Metin.
These ambitious AI-focused graduates have spent their summer working on defining, measuring and forecasting OV-fiets availability. At the closing ceremony on 16 August, the students involved presented their projects and the best teams received awards. The NS team received the award for “Most Responsible AI Solution” 🏆, as well as an honorable mention in the category “Best Pitch”!
Predicting whether a bicycle will be available at any location at a given moment is no easy task. There are many factors to consider: different rental locations have different numbers of bikes in their inventories, for instance, and demand can be influenced by the time of day, the day of the week, or even the weather. To get a firm grasp on the problem, the NS team started by interviewing OV-fiets customers at Utrecht Centraal, as well as the employees working at the rental locations. They also invested time in cleaning and preprocessing the OV-fiets availability data before starting their analyses.
The students decided to focus on a subset of OV-fiets rental locations based on certain criteria. The locations included had at least 25 OV-fiets bicycles available, were currently open (i.e. not closed for maintenance), and featured diverse rental patterns. With this subset of the data selected, the team carried out statistical EDA (Exploratory Data Analysis) to better understand the data. Among other things, they looked into autocorrelation and seasonal decomposition.
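To give a feel for what an autocorrelation check looks like, here is a minimal sketch in plain Python. It is not the team's code: the function name and the availability numbers are made up for illustration, and in practice a library such as statsmodels would likely be used instead.

```python
from statistics import mean


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag."""
    if not 0 < lag < len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    mu = mean(series)
    # Total variation around the mean (the lag-0 term).
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        return 0.0  # a constant series has no variation to correlate
    # Co-variation between each value and the value `lag` steps earlier.
    num = sum(
        (series[t] - mu) * (series[t - lag] - mu)
        for t in range(lag, len(series))
    )
    return num / denom


# Made-up availability counts with a repeating 4-step pattern (illustrative only).
availability = [30, 12, 5, 20] * 6

for lag in (1, 2, 4):
    print(lag, round(autocorrelation(availability, lag), 3))
```

A value close to 1 at a given lag hints at a repeating pattern of that length, which is the kind of signal a seasonal decomposition then separates from trend and noise.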
With the groundwork done, it was time to start experimenting! 👩‍🔬 The students first implemented the baseline model that was already available (a weighted moving average) so they could measure every improvement against the baseline performance. Then they tested the suitability of various models, including XGBoost, LightGBM, Prophet, SARIMA, DeepAR and LSTM. They also experimented with multivariate models, such as SARIMAX, that take into account the time series of other locations, as well as additional features generated from the time series data, such as lags and weighted averages.
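For readers curious about the mechanics, a weighted moving average forecast and lag features can be sketched in a few lines of plain Python. The weights, lags, and function names below are assumptions chosen for illustration; the post does not say which ones the existing baseline or the team actually used.

```python
def weighted_moving_average(history, weights):
    """Forecast the next value from the most recent observations.

    weights[-1] applies to the most recent observation.
    """
    if len(history) < len(weights):
        raise ValueError("need at least len(weights) observations")
    recent = history[-len(weights):]
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)


def lag_features(series, lags):
    """Build (features, target) rows, where each feature is the value `lag` steps earlier."""
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        rows.append(([series[t - lag] for lag in lags], series[t]))
    return rows


# Hypothetical counts of available bikes at one location.
history = [10, 20, 30]
forecast = weighted_moving_average(history, weights=[1, 2, 3])  # newest value weighs most

rows = lag_features([1, 2, 3, 4, 5], lags=[1, 2])
print(forecast, rows)
```

Rows like these are what a tree-based model such as XGBoost would be trained on, with calendar features added alongside the lags.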
After only a couple of weeks, the team produced promising results. However, they realised that they needed a solid evaluation framework to fairly compare the performance of the different models with the baseline model. The team implemented a nested cross-validation process suited to time-series forecasting, where a model must always be evaluated on data that comes after the data it was trained on. In the end, the results showed that XGBoost had the most potential for forecasting OV-fiets availability.
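The idea behind nested cross-validation for time series can be sketched as follows. This is a simplified illustration, not the team's framework: a plain moving average stands in for the real models, its window length plays the role of the hyperparameter, the inner loop uses a single validation split for brevity, and all names and data are hypothetical.

```python
def expanding_splits(n, n_folds, test_size):
    """Yield (train_end, test_end) index pairs; training data always precedes test data."""
    first_train_end = n - n_folds * test_size
    if first_train_end <= 0:
        raise ValueError("series too short for the requested folds")
    for k in range(n_folds):
        train_end = first_train_end + k * test_size
        yield train_end, train_end + test_size


def moving_average_forecast(history, window):
    """Stand-in model: the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def one_step_mae(series, start, end, window):
    """Mean absolute error of one-step-ahead forecasts for series[start:end]."""
    errors = [
        abs(series[t] - moving_average_forecast(series[:t], window))
        for t in range(start, end)
    ]
    return sum(errors) / len(errors)


def nested_cv(series, windows, n_folds=3, test_size=4, val_size=4):
    """Return (chosen window, test MAE) for each outer fold."""
    results = []
    for train_end, test_end in expanding_splits(len(series), n_folds, test_size):
        # Inner step: tune the window on the tail of the training data only.
        val_start = train_end - val_size
        best = min(windows, key=lambda w: one_step_mae(series, val_start, train_end, w))
        # Outer step: score the tuned model on later, unseen data.
        results.append((best, one_step_mae(series, train_end, test_end, best)))
    return results


# Made-up availability counts with a repeating 4-step pattern (illustrative only).
series = [30, 12, 5, 20] * 6
results = nested_cv(series, windows=[1, 4])
print(results)
```

Because tuning never touches the outer test window, the outer scores give a fairer estimate of how each candidate would perform on future data, which is what makes the comparison with the baseline trustworthy.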
After fine-tuning their chosen model, the NS team had a working solution ready for implementation that can greatly improve accuracy 💪 compared to the existing forecasts. They also uncovered some fascinating (and useful) insights along the way: for instance, public holidays featured very different availability patterns compared to weekdays and weekends. And surprisingly, the weather had little impact on people’s decision to use a bicycle!
We’ll be keeping an eye out for enhancements in the OV-fiets availability feature in the NS app. Major kudos to everyone on the NS Summer of AI team for their excellent work! Did you know that a student team from Kickstart AI also participated in a Summer of AI challenge? Check out our blog post about their project to predict food insecurity using data.