Risk practitioners often use accuracy ratio (AR) to measure the discriminatory power of binary classification models, such as models of credit default and insurance fraud. The closer AR is to 1, the higher the discriminatory power of the model.
The above diagram shows the cumulative accuracy profiles of a realistic model, a random model, and a perfect model.
As the proportion of observations increases, a perfect model would correctly classify all events before all non-events, but a random model would indiscriminately classify events and non-events together. A realistic model would be somewhere in between.
AR is the ratio of the area between the cumulative accuracy profiles of the realistic and random models (B) to the area between the cumulative accuracy profiles of the perfect and random models (A+B).
Unfortunately, Excel does not come with a native function for calculating AR. It is possible to calculate AR in Excel manually, but the process involves auxiliary rows and columns with complicated formulas that have to be adjusted as observations are added or removed. The problem is exacerbated with the standard error of AR.
The workbook contains 2 worksheets. In the UNGROUPED worksheet:
- Cells A2–A2001 contain credit scores for 2000 borrowers. The scores range between 0 for least creditworthy and 100 for most creditworthy.
- Cells B2–B2001 contain ones if credit default occurred and zeros otherwise.
In the GROUPED worksheet, cells A2–E8 contain data from the previous worksheet grouped into 7 score ranges, each with an alphabetical rating grade.
To calculate AR of the ungrouped data, open the UNGROUPED worksheet and enter the formula
=QRS.DISC.AR(A2:A2001, B2:B2001) in cell D1. The result is -0.794, which is generally considered to be a large AR in absolute terms.
The result is negative-valued, because the credit scores and credit default events in this example are negatively correlated by design. In a perfect model, score=0 corresponds to event=1, and score=100 corresponds to event=0.
=QRS.DISC.AR(A2:A2001, B2:B2001, "TEST", "RAG")
To determine the statistical significance of the AR, add
"TEST", "RAG" to the formula. The result now contains a second row with a red/amber/green rating that summarizes the significance test.
The AR in this example has a green rating. A green/amber rating means the AR is significant at the 5%/10% significance level. A red rating means the AR is not significant at the 10% significance level.
Please read the documentation to learn how to return the p-value and other useful information about the significance test, as well as how to change the significance levels of the ratings.
=QRS.DISC.AR(A2:A2001, B2:B2001, "TEST", "RAG", "LABELS", TRUE)
To add labels to the result, add
"LABELS", TRUE to the formula. To swap the rows and columns of the result, add
"TRANSPOSE", TRUE to the formula.
To calculate AR of the grouped data, switch to the GROUPED worksheet and enter the formula
=QRS.DISC.AR(B2:B8, D2:E8) in cell F9. The result is -0.797, which is similar to the AR of the ungrouped data.
TRANSPOSE options can be used as before.
If you find QRS.DISC.AR useful, please share this page with other potential users.