Page 282 - Data Science Algorithms in a Week

Agent-Based Modeling Simulation and Its Application to Ecommerce    263

                       Data

                          Real data on the LC business model are available via its platform. Arrival patterns
                       and arrival intervals are generated stochastically according to the data collected for
                       2013 and 2014. There were 235,629 accepted loan requests during the period of interest.
                       Table 1 summarizes descriptive statistics for variables relating to the funded
                       (accepted) borrowers within the time period.
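
                          As a rough illustration of stochastic arrival generation, the sketch below derives
                       the mean daily arrival rate from the figures quoted above and draws inter-arrival
                       gaps from an exponential distribution. The exponential (Poisson-process) choice and
                       the function names are assumptions made for this example; the text does not state
                       which distribution was fitted to the 2013 and 2014 data.

```python
import random

ACCEPTED_LOANS = 235_629   # from the text: accepted requests in 2013 and 2014
DAYS_IN_PERIOD = 365 * 2   # 2013 and 2014, neither of which is a leap year


def mean_daily_arrivals():
    """Average number of accepted loan requests per day over the period."""
    return ACCEPTED_LOANS / DAYS_IN_PERIOD


def sample_arrival_times(n, rate_per_day, seed=None):
    """Return n cumulative arrival times in days.

    ASSUMPTION: gaps between arrivals are exponentially distributed with
    the given rate. The source only says arrivals are generated
    stochastically from the collected data.
    """
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(n):
        t += rng.expovariate(rate_per_day)  # gap until the next arrival
        times.append(t)
    return times


if __name__ == "__main__":
    rate = mean_daily_arrivals()  # roughly 323 arrivals per day
    print(sample_arrival_times(5, rate, seed=1))
```

                       An empirical alternative would be to resample observed inter-arrival intervals
                       directly instead of assuming a parametric form.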

                                                  Table 1. Borrower Profiles

                        Variable name      Minimum        Maximum       Mean           Std. Deviation
                        funded_amnt ($)    1000           35000         14870          8438
                        int_rate (%)       6.00           26.06         13.78          4.32
                        annual_inc ($)     3000           7500000       74854          55547
                        dti                0              39.99         18.04          8.02
                        delinq_2yrs        0              22.00         0.34           0.89
                        inq_last_6mths     0              6.00          0.76           1.03
                        revol_util (%)     0              892.30        55.69          23.10
                        total_acc          2.00           156.00        26.01          11.89

                          Variables of interest include loan amount (funded_amnt), interest rate assigned
                       based on user characteristics (int_rate), annual income of the borrower (annual_inc),
                       debt-to-income ratio (dti), number of delinquencies in the past 2 years (delinq_2yrs),
                       number of inquiries in the past 6 months (inq_last_6mths), revolving utilization ratio
                       (revol_util), verification status of the user, number of accounts open in the last
                       2 years (total_acc) and the term of the loan (36 or 60 months).
                          The loan status values are Charged Off, Current, Default, Fully Paid, In Grace
                       Period, Late (16-30 days) and Late (31-120 days). Only completed loans are considered,
                       i.e., those that have been fully paid or charged off.
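
                          The completed-loan filter described above can be sketched as follows. The field
                       name loan_status and the sample records are hypothetical, chosen for illustration
                       only; the status labels are the ones listed in the text.

```python
# Statuses that count as "completed" according to the text.
COMPLETED_STATUSES = {"Fully Paid", "Charged Off"}


def completed_loans(loans):
    """Keep only loans that were fully paid or charged off.

    ASSUMPTION: each loan is a dict with a "loan_status" key. The key
    name is hypothetical and not taken from the source.
    """
    return [loan for loan in loans if loan["loan_status"] in COMPLETED_STATUSES]


# Hypothetical records, invented purely to exercise the filter.
sample_loans = [
    {"id": 1, "loan_status": "Fully Paid"},
    {"id": 2, "loan_status": "Current"},
    {"id": 3, "loan_status": "Charged Off"},
    {"id": 4, "loan_status": "Late (31-120 days)"},
]

if __name__ == "__main__":
    print([loan["id"] for loan in completed_loans(sample_loans)])
```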


                       Neural Network

                          The neural network (NN) is used to map the characteristics of users to different
                       risk decisions and to replicate trust. Profiles of completed loans are used to build
                       the NN model representations using combined datasets of the accepted and rejected
                       loans. A random sample of 2,062 data points from the combined dataset forms the
                       training data used in the learning process. The input is normalized by dividing
                       amount requested by 3.5, FICO score by 850 and employment length by 10.
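
                          A minimal sketch of the normalization step, using the three divisors given above.
                       It assumes the requested amount is expressed in units of $10,000, so that the
                       $35,000 maximum maps to 1.0; the text does not state the unit, and the function
                       and parameter names are hypothetical.

```python
AMOUNT_DIVISOR = 3.5       # from the text
FICO_DIVISOR = 850.0       # from the text
EMP_LENGTH_DIVISOR = 10.0  # from the text


def normalize_inputs(amount_10k, fico, emp_length_years):
    """Scale the three inputs by the divisors quoted in the text.

    ASSUMPTION: amount_10k is the requested amount in units of $10,000
    (e.g. 3.5 for $35,000). The source does not specify the unit.
    """
    return (
        amount_10k / AMOUNT_DIVISOR,
        fico / FICO_DIVISOR,
        emp_length_years / EMP_LENGTH_DIVISOR,
    )


if __name__ == "__main__":
    # Hypothetical applicant: $14,000 requested, FICO 680, 5 years employed.
    print(normalize_inputs(1.4, 680, 5))
```

                       The scaling of dti is not given in the text, so it is left out of this sketch.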
                          The network structure consists of four layers (Fig. 2). The first layer has 4 neurons,
                       one for each of the following variables: amount, FICO, dti and employment length.