Case Study

Building Predictive Models to Improve Debt Collection Process

Key Details

Challenge: Improve debt collection effectiveness with the help of predictive analytics
Solution: A machine learning model for predicting the probability of promise to pay
Technologies and tools: Python data analysis ecosystem, VPN Checkpoint, SQL Server, LightGBM package

Client

The Client is a debt collection agency that collects debts across various industries and customer types. Its main clients are banks, retailers, telcos, and state-owned enterprises.

Challenge: improve debt collection effectiveness with the help of predictive analytics

More than 1,500 collection agents across the country handle approximately 3.5 million debtors per month and reach out to approximately 2 million of them.

The debt collection process includes the following steps:

  1. connection with an account
  2. account verification
  3. promise to pay
  4. collection

InData Labs and the Client's Head of Data Science, whose department had already begun applying machine learning to improve decision making throughout the collections lifecycle, decided that InData Labs would explore the potential of predictive analytics for identifying the customers most likely to repay.

A mandatory condition of the assignment was that predictions run on the Client's existing MS SQL infrastructure.

Solution: machine learning model for predicting the probability of promise to pay

InData Labs started working on a machine learning model for predicting the probability of promise to pay from verified accounts. Accurate predictions should allow accounts to be prioritized more effectively, which in turn should improve collection rates and reduce costs.

Developing the predictive model involved several major steps, including building a pipeline for data processing and feature creation in SQL Server, training a predictive model based on LightGBM, and building a pipeline for generating predictions.

A team consisting of a data engineer and a data scientist was assigned to the project, which comprised the following stages:

Stage | Scope of work
1. Data Preparation | Data analysis; data cleansing; building a data pipeline for data processing/aggregation
2. Modeling | Model development and testing
3. Deployment | Deployment into MS SQL 2017; integration testing

As a result of the project, we provided the following deliverables to the Client:

  • Deployable Python module with:
    • Data Processing Engine
    • Predictive Engine
  • Python module deployed into MS SQL 2017
  • Project source code and documentation

Result: improved efficiency of the debt collection process

The predictive model delivered by InData Labs accurately predicts the probability of promise to pay from an account.

Model performance was measured by the ROC AUC score, which reached ≈0.775, a significant improvement for the Client.
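For readers unfamiliar with the metric, ROC AUC can be read as the probability that a randomly chosen account that did promise to pay receives a higher score than a randomly chosen account that did not. The sketch below computes it directly from that definition in plain Python. The labels and scores are made up for illustration and are not project data, and the function name roc_auc is our own.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = promise to pay obtained, 0 = no promise.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.7]

print(f"{roc_auc(labels, scores):.3f}")  # prints 0.778 (7 of 9 pairs ranked correctly)
```

A score of 0.5 corresponds to random ordering and 1.0 to a perfect ranking. The pairwise loop is quadratic, so library implementations are preferable for large datasets.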

This gives the Client the ability to optimize collection agents’ time, allowing them to target the most promising accounts first.
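Targeting the most promising accounts first amounts to sorting verified accounts by their predicted probability and working down the list as far as agent capacity allows. Below is a minimal sketch of that idea. The ScoredAccount type, the build_call_queue function, the account identifiers, and the capacity value are all hypothetical and do not describe the Client's actual workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredAccount:
    account_id: str   # hypothetical identifier
    p_promise: float  # predicted probability of a promise to pay


def build_call_queue(accounts, capacity):
    """Return up to `capacity` accounts, highest predicted probability first."""
    if capacity < 0:
        raise ValueError("capacity must be non-negative")
    ranked = sorted(accounts, key=lambda a: a.p_promise, reverse=True)
    return ranked[:capacity]


# Made-up scores for four verified accounts.
accounts = [
    ScoredAccount("A-1", 0.12),
    ScoredAccount("A-2", 0.81),
    ScoredAccount("A-3", 0.47),
    ScoredAccount("A-4", 0.66),
]

# Suppose agents can handle two calls in this time slot.
queue = build_call_queue(accounts, capacity=2)
print([a.account_id for a in queue])  # prints ['A-2', 'A-4']
```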
