Challenge | Improve debt collection effectiveness with the help of predictive analytics |
Solution | A machine learning model for predicting the probability of promise to pay |
Technologies and tools | Python data analysis ecosystem, VPN Checkpoint, SQL Server, LightGBM package |
The Client is a debt collection agency that collects debts for customers across various industries. Its main clients are banks, retailers, telcos, and state-owned enterprises.
More than 1,500 collection agents across the country handle approximately 3.5 million debtors per month and reach out to approximately 2 million of them.
The debt collection process includes the following steps:
Together with the Client's Head of Data Science, whose department had already begun implementing machine learning to improve decision making throughout the collections lifecycle, it was decided that InData Labs would explore the potential of predictive analytics for identifying the customers who are most likely to repay.
An indispensable condition of the assignment was that predictions run on the Client's existing MS SQL infrastructure.
InData Labs started working on a machine learning model for predicting the probability of a promise to pay from the verified accounts. Accurate predictions should lead to better-prioritized targeting of accounts and thus to improved collection rates and reduced costs.
Developing the predictive model involved several major steps: building a pipeline for data processing and feature creation in SQL Server, training the predictive model with LightGBM, and building a pipeline for getting predictions.
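The training and prediction steps could look roughly like the sketch below. It is an illustration only, not the code delivered to the Client: the function names and hyperparameters are hypothetical, and it assumes the features have already been aggregated upstream (in the project, in SQL Server) and arrive as an array-like such as a NumPy array or pandas DataFrame.

```python
from typing import Any, Dict, List, Optional

# Hypothetical hyperparameters; the case study does not state the real ones.
DEFAULT_PARAMS: Dict[str, Any] = {
    "objective": "binary",
    "n_estimators": 200,
    "learning_rate": 0.05,
}


def train_model(features: Any, labels: Any,
                params: Optional[Dict[str, Any]] = None) -> Any:
    """Fit a LightGBM binary classifier on pre-aggregated features."""
    import lightgbm as lgb  # lazy import: only needed when training

    model = lgb.LGBMClassifier(**{**DEFAULT_PARAMS, **(params or {})})
    model.fit(features, labels)
    return model


def score_accounts(model: Any, features: Any) -> List[float]:
    """Return the predicted probability of a promise to pay for each row."""
    # predict_proba yields one [P(class 0), P(class 1)] pair per row.
    return [float(row[1]) for row in model.predict_proba(features)]
```

Keeping training and scoring as separate functions mirrors the split between the model-training step and the prediction pipeline described above.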
A team consisting of a data engineer and a data scientist was assigned to the project, which comprised the following stages:
Stage | Scope of work |
1. Data Preparation | Data analysis, data cleansing, building a data pipeline for data processing/aggregation |
2. Modeling | Models development and testing |
3. Deployment | Deployment into MS SQL 2017, Integration Testing |
As a result of the project, we've provided the following deliverables to the Client:
The predictive model delivered by InData Labs accurately predicts the probability of promise to pay from an account.
Model performance was measured by the ROC AUC score, which reached ≈0.775, a significant improvement for the Client.
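For readers unfamiliar with the metric, ROC AUC can be read as the probability that a randomly chosen account that did promise to pay receives a higher score than a randomly chosen account that did not. The sketch below computes it directly from that definition on toy data. The function name and the sample values are illustrative assumptions, not the Client's data, and a production evaluation would more likely use a library routine.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so ≈0.775 means the model orders roughly three out of four such pairs correctly.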
This gives the Client the ability to optimize collection agents’ time, allowing them to target the most promising accounts first.
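As an illustration of how predicted probabilities could drive prioritization, the sketch below ranks accounts by score and keeps as many as an agent can handle. The account identifiers, the scores, and the `capacity` parameter are hypothetical; the case study does not describe the Client's actual allocation rules.

```python
from typing import List, Sequence, Tuple


def prioritize(accounts: Sequence[Tuple[str, float]], capacity: int) -> List[str]:
    """Return the IDs of the `capacity` accounts with the highest scores.

    Each account is an (account_id, predicted_probability) pair.
    """
    ranked = sorted(accounts, key=lambda account: account[1], reverse=True)
    return [account_id for account_id, _ in ranked[:capacity]]


# Hypothetical scored accounts; an agent with capacity for two calls
# would contact B first, then C.
scored = [("A", 0.2), ("B", 0.9), ("C", 0.5)]
print(prioritize(scored, capacity=2))  # ['B', 'C']
```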