Understanding the basics of lift charts
Artificial Intelligence and Machine Learning have been around for a long time, but the wide applicability of ML has made it a key area of focus for businesses in recent years. There are many good articles covering concepts like exploratory data analysis, feature engineering, and classification/regression modelling. However, model evaluation is rarely covered in the same depth, and when it is, the discussion is usually limited to the handful of performance measures that ship with common machine learning frameworks. This blog aims to close that gap by focusing on one of the important ways to evaluate a model: gain charts and lift charts.
Lift charts
Lift charts are used to evaluate classification models with a binary target variable. While evaluating a model there are many metrics we can use, such as accuracy, precision, recall and the ROC curve. Some of these metrics can be misleading. For example, if you use accuracy on a dataset where the event rate is only 2%, a model can score 98% accuracy even though every point in the minority class is classified wrongly.
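To make this concrete, here is a minimal sketch (my own toy example, not from the original analysis) scoring a classifier that always predicts the majority class on a dataset with a 2% event rate:

import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([1] * 2 + [0] * 98)  # 2 events out of 100 observations (2% event rate)
y_pred = np.zeros_like(y_true)         # a "model" that always predicts the majority class
print(accuracy_score(y_true, y_pred))  # 0.98 accuracy, even though every event is missed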
Metrics like precision, recall and F1-score give an idea of how the model is performing, but they do not provide deeper insight into its behaviour. Lift charts add further detail, such as whether the model rank-orders the probability scores correctly and how much better it performs than a random classifier. Rank ordering is covered in the sections below.
How to build a lift chart
The steps to calculate gain and lift are as follows:
- Calculate the predicted probability for each observation.
- Sort the data by predicted probability in descending order.
- Group the data into equal-sized bins of the population; typically these are deciles, i.e. 10 groups.
- Get the maximum and minimum probability score of each decile.
- Calculate the population percentage, cumulative population percentage, percentage of events and cumulative percentage of events for each decile. The cumulative percentage of events is also called the gain percentage.
- Calculate lift by dividing the gain percentage by the cumulative population percentage, as illustrated in the sketch below.
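As a quick illustration of the last step, here is a toy calculation with made-up numbers (not taken from any dataset):

# Toy gain/lift arithmetic for the top decile of a hypothetical model.
cumulative_population_perc = 10.0  # the top decile holds 10% of all observations
gain_perc = 25.0                   # and captures 25% of all events so far (the gain)

lift = gain_perc / cumulative_population_perc
print(lift)  # 2.5 -> in the top decile the model finds events 2.5x faster than a random pick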
Use case
Let’s consider the very famous Titanic dataset from Kaggle as an example. We have 891 samples in the training dataset; for now let’s set the test dataset aside, since we need the actual target to evaluate performance. The dataset has X variables such as age, passenger class, sex, ticket fare, etc. With the training data we build a model to predict which passengers are most likely to survive; the model assigns each passenger a probability between 0 and 1. I have built three models, LogisticRegression, SVM and RandomForestClassifier, on the same data to compare their lift charts so that you can see their importance better.
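A minimal sketch of this modelling step could look like the code below; the preprocessing, feature selection and variable names (train.csv, X, y, models) are illustrative assumptions rather than the exact code behind the figures in this post.

# Fit the three classifiers on the Kaggle Titanic training data.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

train = pd.read_csv("train.csv")                           # Kaggle Titanic training file
train["Sex"] = train["Sex"].map({"male": 0, "female": 1})  # encode sex as numeric
train["Age"] = train["Age"].fillna(train["Age"].median())  # simple imputation for missing ages

X = train[["Pclass", "Sex", "Age", "Fare"]]
y = train["Survived"]

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "SVM": SVC(probability=True),            # probability=True so predict_proba is available
    "RandomForest": RandomForestClassifier(random_state=42),
}
for name, model in models.items():
    model.fit(X, y)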
Gain chart
Gain charts give you an idea of which model is performing better, which segments to choose, and how many segments to target.
Let’s assume that we have lifeboats that can save up to 270 passengers, and we have to pick only those with the highest probability of survival. That is where the gain chart helps: it lets us target the specific passengers whose probability of survival is highest.
If you look at the plots in the figure above, you can see that up to the 3rd segment (i.e. 30% of the total population) we have 268 passengers. If we use the SVM model, the number of survivors we capture is 49% of 268, which is approximately 131, i.e. by targeting the top 3 segments we correctly pick 131 passengers who will survive. If we adopt Logistic Regression, the number of survivors we can pick is 64% of 268, approximately 171. With Random Forest, the number of survivors among the 268 is 209, which is 78%. The last plot shows what happens if we pick passengers at random. It is a no-brainer that the Random Forest model is performing best.
This type of analysis also checks that rank order is maintained, i.e. the incidence rate in the higher-probability deciles should be higher than in the lower-probability deciles. It indicates that an event is likely to receive a higher probability score than a non-event. The KS value (the maximum difference between the cumulative incidence rate and the cumulative non-incidence rate across deciles) can also be extracted by adding the cumulative incidence and non-incidence rates to the table.
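As an aside, here is a small sketch (my addition) of how the KS value could be extracted from the decile table built by the lift_chart function shown later, assuming its Actual, Total and Gain_Percentage columns:

# KS = maximum gap between the cumulative event (incidence) percentage and the
# cumulative non-event percentage across the deciles of the gain/lift table.
def ks_from_deciles(output_df):
    non_events = output_df["Total"] - output_df["Actual"]
    cum_non_event_perc = non_events.cumsum() / non_events.sum() * 100
    return (output_df["Gain_Percentage"] - cum_non_event_perc).max()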
Lift chart
The lift chart basically shows how much better the model is at picking survivors than selecting passengers at random. Let’s analyse the results with a lift chart.
Here the lift in the first decile is 1, 2.02, 2.54 and 2.61 for the random model, SVM, Logistic Regression and Random Forest classifier respectively. It means that, in its first decile, the Random Forest model captures survivors 2.61 times better than a random pick.
Coding the gain and lift chart in Python
import pandas as pd


def lift_chart(X, actual_target, model):
    """
    DESCRIPTION
    Takes the X features, the actual target and a fitted model object and
    builds a decile-wise table with gain percentage and lift.

    PARAMETERS
    X : DataFrame
        The X features that are used by the model.
    actual_target : Series or DataFrame
        Actual target that was used to train the model.
    model : fit object
        The fit object returned by the training algorithm.

    RETURNS
    output_df : DataFrame
        Output dataframe with the columns:
        Max_Scr               : maximum probability score in the decile
        Min_Scr               : minimum probability score in the decile
        Actual                : number of events captured by the decile
        Total                 : total population of the decile
        Population_perc       : percentage of the population in the decile
        Per_Events            : percentage of all events captured by the decile
        Gain_Percentage       : cumulative percentage of events (gain)
        Cumulative_Population : cumulative percentage of population down the deciles
        Lift                  : lift provided by that particular decile
    """
    df_data = X.copy()
    df_data['Actual'] = actual_target
    # Probability of the positive class (class 1) for every observation
    df_data['Prob_1'] = model.predict_proba(X)[:, 1]
    # Sort by predicted probability, highest first, and cut into 10 equal deciles
    df_data.sort_values(by=['Prob_1'], ascending=False, inplace=True)
    df_data.reset_index(drop=True, inplace=True)
    df_data['Decile'] = pd.qcut(df_data.index, 10, labels=False)

    grouped = df_data.groupby('Decile')
    output_df = pd.DataFrame({
        'Max_Scr': grouped['Prob_1'].max(),
        'Min_Scr': grouped['Prob_1'].min(),
        'Actual': grouped['Actual'].sum(),
        'Total': grouped['Actual'].count(),
    })
    output_df['Population_perc'] = output_df['Total'] / len(actual_target) * 100
    output_df['Per_Events'] = output_df['Actual'] / output_df['Actual'].sum() * 100
    output_df['Gain_Percentage'] = output_df['Per_Events'].cumsum()
    output_df['Cumulative_Population'] = output_df['Population_perc'].cumsum()
    output_df['Lift'] = output_df['Gain_Percentage'] / output_df['Cumulative_Population']
    return output_df
Here X is the DataFrame of X features, actual_target is the actual y, and the model parameter is the fitted model object.
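As an illustration, a hypothetical call to the function and a quick plot of the resulting gain and lift curves could look like this (rf_model, X and y are placeholder names for a fitted classifier and its training data, not variables from the original post):

import matplotlib.pyplot as plt

table = lift_chart(X, y, rf_model)   # rf_model: any fitted classifier with predict_proba

fig, (ax_gain, ax_lift) = plt.subplots(1, 2, figsize=(10, 4))

# Gain chart: cumulative % of events captured vs cumulative % of population targeted
ax_gain.plot(table["Cumulative_Population"], table["Gain_Percentage"], marker="o", label="model")
ax_gain.plot([0, 100], [0, 100], linestyle="--", label="random")
ax_gain.set_xlabel("Cumulative population %")
ax_gain.set_ylabel("Gain %")
ax_gain.legend()

# Lift chart: how many times better than random the model is at each cumulative depth
ax_lift.plot(table["Cumulative_Population"], table["Lift"], marker="o", label="model")
ax_lift.axhline(1, linestyle="--", label="random")
ax_lift.set_xlabel("Cumulative population %")
ax_lift.set_ylabel("Lift")
ax_lift.legend()

plt.tight_layout()
plt.show()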
The lift_chart function returns the decile-wise gain percentage and lift as a pandas DataFrame. The output looks like the image below.
Figure 3 shows the lift chart table for the Random Forest model mentioned above on the Titanic dataset. With correct rank ordering, the highest number of events falls in the first decile and then decreases progressively down the table. If the event rate is not monotonically decreasing, the model is not performing as expected. If you look at the figure, the decreasing target rate (the Per_Events column in figure 3) actually breaks at the 5th decile. Although the difference between the 4th and 5th deciles is small, it suggests the model could be improved by parameter tuning or by trying another model.
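A quick programmatic check of this rank ordering (my addition, using the table from the hypothetical call above) could be:

# True only if the percentage of events captured per decile never increases down
# the table, i.e. the model rank-orders the probability scores as expected.
print(table["Per_Events"].is_monotonic_decreasing)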
End notes
Just like every other metric, lift charts are not a one-stop solution, but they do help in evaluating the overall performance of a model. You can quickly spot flaws in the model when the slope of the chart is not monotonic, and the chart also helps you set a threshold for choosing which segments are worth targeting, far better than targeting at random.