TDSM 4.11

From The Data Science Design Manual Wikia
Revision as of 22:03, 9 December 2017 by Caitaozhan (talk | contribs) (credit risk scoring)

A credit risk scoring model predicts the probability that a borrower will default on a debt (financial default, not "default" in the computing sense). The lower the predicted probability of default, the higher the credit score you will receive.

To test a new credit risk scoring model, we take the following steps:

1. Collect data

We should collect not only "good-looking" data but also "bad-looking" data. Bad-looking data includes data with many categorical variables and data that is skewed, with a very high or very low default rate.

2. Train, Test, KS test

Split the data into a training set and a test set. Train the model on the training set, then score the test set. We can then evaluate the model with a Kolmogorov-Smirnov (KS) test, as in the KS graph below.
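The split step itself can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `split_train_test`, the 30% test fraction, the fixed seed, and the `(borrower_id, defaulted)` record layout are all assumptions made for the example.

```python
import random


def split_train_test(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of `records` and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(records)               # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records: (borrower_id, defaulted) pairs.
records = [(i, i % 5 == 0) for i in range(10)]
train, test = split_train_test(records)
```

With skewed default rates, a purely random split like this one can leave very few defaulters in the test set, so it is worth checking the default rate in each part after splitting.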

Ks-graph.jpg

Reference: [1]

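The KS statistic is the largest gap between the cumulative score distributions of the "good" and the "bad" people. The larger the KS, the better the model, because a large gap means the model successfully separates the two groups: it gives "good" people high credit and "bad" people low credit.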
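To make the KS statistic concrete, here is a minimal sketch that computes it from model scores and observed outcomes on the test set. The function name `ks_statistic` and the toy data are assumptions for illustration. The scores may be predicted default probabilities or credit scores; the result is the same either way, because only the size of the gap matters.

```python
from bisect import bisect_right


def ks_statistic(scores, defaulted):
    """Largest gap between the empirical score CDFs of the two groups."""
    bad = sorted(s for s, d in zip(scores, defaulted) if d)
    good = sorted(s for s, d in zip(scores, defaulted) if not d)
    if not bad or not good:
        raise ValueError("need at least one defaulter and one non-defaulter")
    ks = 0.0
    for threshold in sorted(set(scores)):
        # Fraction of each group scoring at or below the threshold.
        cdf_bad = bisect_right(bad, threshold) / len(bad)
        cdf_good = bisect_right(good, threshold) / len(good)
        ks = max(ks, abs(cdf_bad - cdf_good))
    return ks


# Toy data: predicted default probabilities and observed outcomes.
separated = ks_statistic([0.1, 0.2, 0.8, 0.9], [False, False, True, True])
overlapping = ks_statistic([0.1, 0.4, 0.35, 0.8], [False, False, True, True])
uninformative = ks_statistic([0.1, 0.9, 0.1, 0.9], [False, False, True, True])
```

In this toy data, `separated` is 1.0 because every defaulter scores above every non-defaulter, `overlapping` is 0.5, and `uninformative` is 0.0 because both groups have the same score distribution.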