The ability to forecast breakdowns allows train manufacturers to schedule additional maintenance, resulting not only in lower servicing costs, but also in even higher standards of safety and reliability for both train operators and their passengers.



A centralised place to manage permissions and security with full audit

The goal of the project was to build a machine learning system capable of identifying which train components are likely to break down in the near future. The team used a dataset of diagnostic codes and failure notifications generated by 38 trains over three years. SherlockML provided the customisation needed to solve this very specific problem.



Millions of data points

The dataset consisted of millions of lines and was too large to be handled comfortably by a laptop. SherlockML allowed the team to scale computational resources depending on the needs of the project, and to iterate quickly.



Pioneering deep learning

No interpretation of the diagnostic codes was provided. Using phylogenetic analysis, the team categorised the codes into ‘families’ with specific meanings.
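
The grouping of codes into families can be illustrated with a small sketch. It is not the project's actual method: it assumes that codes firing in the same time windows are related, measures that with Jaccard distance, and merges the closest groups by average linkage (the UPGMA scheme used for simple phylogenetic trees). The code names, window ids and threshold below are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Distance between two codes, from the sets of time windows they fired in."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_codes(occurrences, threshold):
    """Group codes into families by average-linkage (UPGMA-style) clustering.

    occurrences: dict mapping a code to the set of window ids in which it fired.
    threshold: merging stops once the two closest groups are further apart than this.
    """
    clusters = [[code] for code in sorted(occurrences)]

    def linkage(c1, c2):
        # Average of all pairwise distances between the two groups.
        dists = [
            jaccard_distance(occurrences[x], occurrences[y])
            for x in c1
            for y in c2
        ]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        # Find the closest pair of groups.
        (i, j), best = min(
            (
                ((i, j), linkage(clusters[i], clusters[j]))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical toy data: which time windows each diagnostic code appeared in.
occurrences = {
    "D101": {1, 2, 3, 4},
    "D102": {1, 2, 3},
    "D103": {2, 3, 4},
    "B201": {7, 8, 9},
    "B202": {7, 8},
}

families = cluster_codes(occurrences, threshold=0.5)
print(families)
```

On this toy input the three "D" codes end up in one family and the two "B" codes in another, because each group shares most of its time windows. On real data the distance measure and the threshold would both need tuning.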

A neural network was then developed to predict failures of train components across the fleet. This pioneered the use of deep learning for forecasting breakdowns in the rail industry.


Improved safety and reliability

  • Lower servicing costs
  • Higher safety and reliability standards

Meaning in data

The model enables the interpretation of 1.8 million diagnostic codes every year, whose meaning was previously not known or understood.

Upgraded safety and reliability

For each train, this is equivalent to 5,000 additional data points that can be used to upgrade the safety and reliability of the fleet.

£1 million a year savings

The number of doors that needed to be inspected to find a fault fell from 10,000 to 2. Estimated savings are £1 million per year in passenger compensation alone.
