migrate

Using migrate

This package is intended to serve as a set of tools to help convert credit risk data at two time points into traditional credit state migration (aka, “transition”) matrices. At a higher level, migrate is intended to help an analyst understand how risk moved in their credit portfolio over a time interval.

Background

One of the more difficult aspects of making a credit state migration matrix in R (or Python, for that matter) is the fact that the output doesn’t satisfy the structure of a traditional data frame object. Rather, the output needs to be a matrix, which is a data structure that R does support. In the past, there has been difficulty converting a matrix to something more visual-friendly. More recently, however, tools like the kableExtra and gt packages allow us to present visually appealing output that extends the structure of a data frame. Using the matrix-style output of migrate’s functions with a visual formatting package such as the two mentioned above will hopefully help analysts streamline the presentation of their credit portfolio’s state migration matrices to an audience.

Getting Started

If you haven’t done so already, first install migrate with the instructions in the README section.

First, load the package & the mock dataset (as a data frame) using library()

library(migrate)
data("mock_credit")

We can get a look at the data using dplyr::glimpse()

library(dplyr)
glimpse(mock_credit)
#> Rows: 900
#> Columns: 4
#> $ customer_id       <chr> "Customer_1083", "Customer_1468", "Customer_1430", "Customer_1498", "Customer_1272", "Customer_1058", "Customer_1397", "Customer_1061", "Customer_1199", "Customer_1120", "Customer_1053", "Customer_1106", "Customer_1165", "Customer_1008", "Customer_1196", "Customer_1064", "Customer_1231", "Customer_1315", "Customer_1127", "Customer_1184", "Customer_1361", "Customer_1290", "Customer_1004", "Customer_1096", "Customer_1130", "Customer_1297", "Customer_1401", "Customer_1425", "Customer_1437", "Customer_1005", "Customer_1357", "Customer_1316", "Customer_1208", "Customer_1065", "Customer_1423", "Customer_1087", "Customer_1162", "Customer_1147", "Customer_1251", "Customer_1268", "Customer_1245", "Customer_1206", "Customer_1236", "Customer_1063", "Customer_1246", "Customer_1079", "Customer_1371", "Customer_1091", "Customer_1261", "Customer_1284", "Customer_1157", "Customer_1278", "Customer_1441", "Customer_1037", "Customer_1176", "Customer_1181", "Customer_1171",...
#> $ date              <date> 2020-06-30, 2020-06-30, 2020-09-30, 2020-09-30, 2020-09-30, 2020-06-30, 2020-09-30, 2020-09-30, 2020-06-30, 2020-06-30, 2020-09-30, 2020-09-30, 2020-06-30, 2020-09-30, 2020-06-30, 2020-06-30, 2020-09-30, 2020-06-30, 2020-09-30, 2020-06-30, 2020-09-30, 2020-09-30, 2020-06-30, 2020-09-30, 2020-06-30, 2020-06-30, 2020-06-30, 2020-06-30, 2020-06-30, 2020-06-30, 2020-09-30, 2020-06-30, 2020-06-30, 2020-09-30, 2020-09-30, 2020-06-30, 2020-09-30, 2020-09-30, 2020-06-30, 2020-09-30, 2020-09-30, 2020-06-30, 2020-06-30, 2020-09-30, 2020-09-30, 2020-06-30, 2020-09-30, 2020-09-30, 2020-09-30, 2020-09-30, 2020-09-30, 2020-06-30, 2020-06-30, 2020-06-30, 2020-06-30, 2020-06-30, 2020-06-30, 2020-09-30, 2020-06-30, 2020-06-30, 2020-06-30, 2020-09-30, 2020-09-30, 2020-09-30, 2020-06-30, 2020-09-30, 2020-09-30, 2020-09-30, 2020-09-30, 2020-06-30, 2020-09-30, 2020-06-30, 2020-06-30, 2020-06-30, 2020-06-30, 2020-06-30, 2020-09-30, 2020-09-30, 2020-06-30, 2020-06-30, 2020-06...
#> $ risk_rating       <fct> 6, 10, 8, 5, 5, 6, 8, 8, 8, 11, 7, 8, 6, 10, 10, 5, 6, 5, 6, 7, 10, 4, 11, 10, 8, 5, 5, 6, 9, 10, 9, 10, 10, 8, 7, 7, 11, 7, 6, 7, 7, 8, 9, 6, 8, 7, 8, 10, 11, 8, 9, 8, 11, 8, 11, 7, 6, 6, 8, 9, 6, 6, 6, 6, 7, 9, 10, 10, 7, 11, 11, 6, 7, 6, 7, 6, 11, 9, 8, 5, 6, 5, 7, 6, 11, 7, 7, 9, 11, 11, 8, 12, 7, 6, 8, 6, 8, 9, 7, 7, 5, 6, 10, 9, 7, 8, 7, 8, 9, 6, 10, 6, 6, 4, 7, 12, 11, 12, 9, 8, 8, 11, 6, 4, 11, 5, 6, 7, 7, 10, 9, 8, 6, 8, 6, 10, 12, 7, 7, 6, 11, 10, 8, 6, 8, 12, 7, 11, 6, 11, 6, 7, 6, 4, 5, 9, 7, 10, 4, 4, 8, 7, 6, 8, 10, 9, 11, 6, 6, 9, 5, 11, 9, 6, 9, 11, 6, 6, 11, 6, 7, 5, 6, 8, 10, 7, 10, 8, 10, 9, 7, 9, 9, 8, 6, 7, 9, 7, 8, 10, 9, 10, 8, 9, 9, 4, 6, 10, 8, 6, 11, 4, 6, 7, 7, 9, 6, 5, 7, 8, 9, 6, 9, 11, 6, 7, 6, 6, 7, 8, 4, 9, 8, 6, 8, 7, 5, 7, 7, 9, 7, 7, 7, 9, 7, 11, 6, 9, 8, 8, 9, 4, 7, 12, 9, 7, 8, 8, 9, 6, 7, 8, 10, 9, 5, 6, 9, 8, 4, 7, 9, 8, 11, 9, 8, 11, 7, 6, 10, 9, 6, 7, 8, 9, 11, 6, 7, 6, 10, 9, 5, 8, 9, 11, 6, 8, 6, 8, 9, 7, 6, 8, ...
#> $ principal_balance <dbl> 2201000, 1538000, 3447000, 1302000, 2306000, 666000, 938000, 737000, 799000, 1747000, 722000, 1206000, 1040000, 1372000, 510000, 400000, 890000, 1021000, 1849000, 1205000, 503000, 637000, 750000, 673000, 3658000, 932000, 856000, 2881000, 989000, 1177000, 242000, 659000, 375000, 669000, 546000, 976000, 253000, 1232000, 1850000, 743000, 1156000, 787000, 1737000, 2218000, 1284000, 1283000, 1233000, 2013000, 853000, 1600000, 774000, 695000, 2672000, 1659000, 1273000, 1420000, 363000, 224000, 1932000, 2161000, 3197000, 2763000, 916000, 4921000, 253000, 619000, 877000, 618000, 2194000, 464000, 334000, 854000, 1493000, 515000, 4343000, 983000, 943000, 814000, 1259000, 1108000, 833000, 576000, 1223000, 492000, 2963000, 327000, 1261000, 961000, 1374000, 1424000, 376000, 2269000, 299000, 1065000, 1149000, 1964000, 947000, 390000, 1041000, 1270000, 2598000, 933000, 1793000, 551000, 1268000, 837000, 1686000, 967000, 312000, 593000, 484000, 223000, 509000, 1380000, 2...

Note that an important feature of the dataset is that there are exactly two (2) unique values in the date column variable

unique(mock_credit$date)
#> [1] "2020-06-30" "2020-09-30"

To summarize the migration within the data, use the migrate() function

migrated_df <- migrate(
  data = mock_credit, 
  date = date, 
  rating = risk_rating, 
  metric = principal_balance
)

head(migrated_df)
#> # A tibble: 6 x 3
#>   risk_rating_start risk_rating_end principal_balance
#>   <fct>             <fct>                       <dbl>
#> 1 4                 4                        12720000
#> 2 4                 5                         3807000
#> 3 4                 6                         1979000
#> 4 4                 7                               0
#> 5 4                 8                               0
#> 6 4                 9                               0

To create the state migration matrix, use the build_matrix() function

build_matrix(migrated_df)
#> Using  risk_rating_start  as the 'rating_start' column variable
#> 
#> Using  risk_rating_end  as the 'rating_end' column variable
#> 
#> Using  principal_balance  as the 'metric' column variable
#> 
#>           4        5        6        7        8        9       10       11      12      13
#> 4  12720000  3807000  1979000        0        0        0        0        0       0       0
#> 5   8690000 20029000  2655000        0        0        0        0        0       0       0
#> 6         0 10050000 73565000 15292000        0        0        0        0       0       0
#> 7         0  1760000  7489000 67429000 23386000  2778000        0        0       0       0
#> 8         0        0  1096000  2095000 68301000 10637000   220000        0       0       0
#> 9         0        0        0        0        0 41371000  9939000  2191000 1737000       0
#> 10        0        0        0        0  1861000  8134000 28044000 12288000  216000       0
#> 11        0        0        0        0        0        0  2335000 26242000 9398000 2672000
#> 12        0        0        0        0        0        0        0   968000 4393000 4221000
#> 13        0        0        0        0        0        0        0        0       0       0

Or, to do it all in one shot, use the %>%

mock_credit %>% 
  migrate(
    date = date, 
    rating = risk_rating, 
    metric = principal_balance
  ) %>% 
  build_matrix()