Four Digital Credit Underwriting Strategies to Expand Borrower Base Without Added Risks

Dimitry Senko

Lending IT Consultant and Senior Business Analyst, ScienceSoft

Published: Dec 19, 2024

Contributors: Stacy Dubovik, Financial Technology Researcher, and Alex Savanovich, Senior Data Scientist.

Editor’s note: Dimitry Senko, a Lending IT Consultant at ScienceSoft, shares loan underwriting strategies and tools that may help lenders grow a quality customer base, address limited credibility data, ensure accurate risk scoring, and prevent credit decisioning delays.

As a lender, you can win thousands of potential borrowers a month via viral marketing tactics and simplified application experiences. But how many of them will become your actual customers after underwriting? Some will not qualify; some may abandon the application midway due to extended checks. And there are over 26 million “credit invisible” US consumers who may pass up your offer, doubting they will ever be granted a loan with no credit history.

The competition from fintech makes it even tougher for traditional banks and credit providers to acquire and retain new borrowers. Fully digital lending firms set the new bar for credit accessibility and servicing speed, which incumbents have been struggling to reach so far. Initially focused on underserved segments and younger audiences, alternative financing today attracts more and more consumers creditworthy enough to qualify for conventional loans, stealing a share of the traditional lenders’ pie.

Before switching completely to lending IT consultancy, I spent a decade working for a large commercial bank. There, I had ample opportunity to witness how dated risk-scoring models and tools can hamper the growth of the borrower base and, as a result, compromise potential loan portfolio returns. The good news is that I also found ways to shake this system out of stagnation, achieving a 25–200%+ growth in loan approvals solely through underwriting and without added default risks.

Strategy #1. New Data Points for Covering Previously Overlooked Thin-File Borrowers

Old-school risk scoring models, including the widely used FICO Score and VantageScore, rely on traditional data like credit history, repayment behaviors, and debt-to-income ratio to quantify loan applicant risks. One huge drawback of such models is that they disfavor applicants with limited credit histories who are potentially able to repay loans but may be rejected at the door due to the lack of data, with little chance of getting them back.

Meanwhile, there are plenty of alternative data points that can give a fuller picture of a consumer’s financial position and creditworthiness. Enriching your underwriting models with new data types is a strategy that can let you capture 20%+ of previously unscorable thin-file applicants and drive up to a 15% increase in loan approvals without taking on additional risk.

So, what is this alternative data, and where do you get it from?

Bank transaction data sheds light on an applicant’s spending and saving patterns. A study by the Consumer Financial Protection Bureau (CFPB) found that consumer cash flow data can help lenders predict delinquencies among individuals with equal credit scores and identify applicants who are 20% more likely to meet their repayment obligations. For example, VantageScore’s new model, which uses both traditional credit scores and banking data, gives lenders a predictive lift of up to 10% compared to its conventional scoring algorithms.

Many large banks today offer prebuilt APIs for automated sharing of consumer transaction data, and some of my clients have already embraced banking data for holistic credit risk assessments. To avoid building multiple direct integrations, consider using secure APIs from open banking aggregators like Plaid or Mastercard’s Finicity, which connect to thousands of financial institutions worldwide.

Employment and payroll data, which is typically not accessible directly to credit organizations, offers a clear view of an applicant’s financial stability and repayment capacity. Again, you can use market-available APIs. For example, income & employment verification APIs by Plaid cover 85% of the US workforce and enable access to detailed earning, deduction, and tax info.
Utility and rent payment data helps evaluate the financial conduct of underbanked consumers. Steady, timely payments indicate applicants who are capable of repaying loans on time, while frequent delays signal high-risk prospects. Rent payment histories are available from specialized platforms focused on the rental industry. One example is Experian RentBureau, which shares rental payment details with lenders so that they can underwrite thin-file individuals. APIs from utility aggregators like Arcadia will help you access applicants’ metering and payment data.
Telecom payment data can give insights into a loan applicant’s overall payment discipline. Consider integrating with Equifax Telco Insights to bring consumers’ telecom and pay TV data directly into your underwriting system. Equifax suggests that telecom and utility data can improve the chances of securing a loan for 81 million thin-file, young, and unscorable consumers.
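
As a rough illustration of how raw bank transaction data might be turned into underwriting inputs, here is a minimal Python sketch. The transaction format, the `cash_flow_features` helper, and the chosen features are assumptions made for this example; they are not the schema of any aggregator’s API or a validated feature set.

```python
from collections import defaultdict
from datetime import date

def cash_flow_features(transactions):
    """Summarize monthly cash flow from (date, amount) pairs.

    Positive amounts are inflows (e.g., payroll), negative amounts are outflows.
    The feature set is illustrative only.
    """
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for day, amount in transactions:
        month = (day.year, day.month)
        if amount >= 0:
            inflow[month] += amount
        else:
            outflow[month] += -amount

    months = sorted(set(inflow) | set(outflow))
    if not months:
        return None  # no data: the applicant stays unscorable on this source

    net = [inflow[m] - outflow[m] for m in months]
    return {
        "months_observed": len(months),
        "avg_monthly_inflow": sum(inflow[m] for m in months) / len(months),
        "avg_monthly_net": sum(net) / len(months),
        "share_of_negative_months": sum(1 for n in net if n < 0) / len(months),
    }

sample = [
    (date(2024, 1, 5), 3000.0),    # payroll
    (date(2024, 1, 20), -2500.0),  # rent and bills
    (date(2024, 2, 5), 3000.0),    # payroll
    (date(2024, 2, 18), -3200.0),  # spending exceeded income this month
]
features = cash_flow_features(sample)
print(features)
```

Features like these would sit next to, not replace, the bureau data a scoring model already uses.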

What if your legacy loan underwriting system doesn’t support API-enabled integrations?

One budget-friendly strategy is to implement API-driven integration middleware. You may employ prebuilt products, e.g., by MuleSoft, or build custom middleware (engineers at ScienceSoft usually use frameworks like Apache Camel to speed up the development process). The latter option is a staple for funneling new lending data into outdated custom software.
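
The core job of such middleware is to translate each provider’s payload into the one record shape the legacy system already understands. Below is a minimal Python sketch of that idea. The source names, payload fields, and the `LegacyIncomeRecord` target are all hypothetical and do not reflect MuleSoft, Apache Camel, or any real provider’s API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LegacyIncomeRecord:
    """Hypothetical flat record a legacy underwriting system can import."""
    applicant_id: str
    monthly_income_cents: int
    source: str

def from_payroll_provider(payload: dict) -> LegacyIncomeRecord:
    # Hypothetical payload: annual gross pay in dollars.
    monthly = round(payload["annual_gross_usd"] * 100 / 12)
    return LegacyIncomeRecord(payload["applicant"], monthly, "payroll")

def from_bank_aggregator(payload: dict) -> LegacyIncomeRecord:
    # Hypothetical payload: a non-empty list of monthly deposit totals in cents.
    deposits = payload["monthly_deposits_cents"]
    average = sum(deposits) // len(deposits)
    return LegacyIncomeRecord(payload["user_id"], average, "bank")

# One adapter per source; supporting a new provider means adding one entry.
ADAPTERS = {
    "payroll": from_payroll_provider,
    "bank": from_bank_aggregator,
}

def normalize(source: str, payload: dict) -> LegacyIncomeRecord:
    """Convert a provider-specific payload into the legacy record shape."""
    try:
        adapter = ADAPTERS[source]
    except KeyError:
        raise ValueError(f"no adapter registered for source {source!r}") from None
    return adapter(payload)

record = normalize("payroll", {"applicant": "A-1", "annual_gross_usd": 60000})
print(record)
```

Keeping the translation logic in small per-source adapters means the legacy system itself does not have to change when a data provider is added or swapped.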

What data can be obtained legally?

Using the above types of alternative data for credit scoring without an applicant’s explicit consent may violate data protection laws and regulations like GLBA, FCRA, CCPA, and the NYDFS Cybersecurity Regulation (23 NYCRR 500). Make sure to include privacy notices and consent fields in your loan application forms to prevent compliance breaches.
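
One simple way to enforce this in software is to gate every alternative data pull on the consent flags captured in the application form. The sketch below assumes hypothetical source names and a hypothetical `permitted_sources` helper; what counts as valid consent depends on the regulations that apply to you.

```python
ALTERNATIVE_SOURCES = {"bank_transactions", "payroll", "rent", "utilities", "telecom"}

def permitted_sources(requested, consents):
    """Return only the alternative data sources the applicant explicitly agreed to.

    `consents` maps a source name to True/False as captured on the application
    form; a missing entry is treated as no consent.
    """
    unknown = set(requested) - ALTERNATIVE_SOURCES
    if unknown:
        raise ValueError(f"unknown data sources: {sorted(unknown)}")
    return sorted(s for s in set(requested) if consents.get(s) is True)

consents = {"bank_transactions": True, "payroll": False}
allowed = permitted_sources(["bank_transactions", "payroll", "telecom"], consents)
print(allowed)
```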

Some of my lending clients have recently asked me about the potential of social media data for consumer risk assessment. My take is that, for now, inferring creditworthiness based on social media footprints is legally and ethically unfeasible. Self-reported information may be inconsistent or fake, and there are currently no mechanisms to validate its trustworthiness. The use of potentially flawed data not only poses risks to scoring precision but also misaligns with FCRA requirements for the accuracy and relevance of data used for underwriting.

Strategy #2. Advanced Analytics for Identifying More Low-Risk Clients While Sifting Out Likely Defaulters

Statistical risk assessment algorithms in lending depend on pre-mapped relationships between variables and can effectively accommodate only structured consumer data. This may still be OK for traditional scoring approaches. However, as soon as you decide to use alternative data, you will need new, more sophisticated data processing technology and risk rating models.

Here’s where AI-powered predictive risk analytics shine. Machine learning (ML) algorithms can quickly and effectively process unstructured data, such as borrower transaction histories and employment details. They can identify subtle dependencies between disparate data and accurately predict their impact on borrower behaviors. For example, ML can determine the correlations between payroll dates and utility payment trends and suggest how these may affect delinquency risks. It can also spot deviations that shouldn’t be considered for risk profiling, e.g., growth in spending before major holidays. Such comprehensive analytics will let you capture implicit signals of an applicant’s strong repayment potential, which would be missed in a traditional scoring scenario.

Sample architecture of ML-powered credit risk analytics by ScienceSoft. Follow the link to read a detailed description of the main solution components and data flows.

From my experience, traditional banks and credit organizations are extremely cautious about changing what works well, and some prefer to retain their heritage scoring models and tools, even knowing this limits their revenue potential. But those who do adopt AI for risk assessment report impressive results: a 25–50% uplift in general approval rates, a 200%+ increase in approvals across protected classes, and a 30–40% reduction in delinquency rates. Plus, the slow AI adoption pace by incumbents gives you the chance to come early and win a large share of previously overlooked consumers.

What AI tools should you consider for risk assessments?

AI underwriting platforms like Zest AI offer cost-effective prebuilt frameworks for constructing and managing risk-scoring ML models. However, such platforms may not allow deep model customization to your unique lending practices and borrower profiles, which can hamper model precision. Plus, integrating with specialized data sources may require substantial investments. In contrast, customized ML models provide full adaptability, smooth integrations with any system, and adherence to your proprietary underwriting standards. Although you’ll need to invest $100,000–$200,000 upfront in model engineering, my practice at ScienceSoft shows that custom ML can reach 95%+ accuracy of risk predictions, a rate that out-of-the-box (OOTB) models rarely attain.

How explainable is AI risk scoring logic?

Some of my lending clients worry that it will be impossible to demystify the logic of intelligent scoring models and, as such, prove the compliance of AI-supported credit decisions with ECOA, FHA, and the like. While it is a valid concern, there are proven ways to interpret AI outputs. At ScienceSoft, we create explainable ML models and apply LIME and SHAP techniques to interpret how each input contributes to a score. By opting for tree-based gradient boosting frameworks like LightGBM and XGBoost, which pair well with these interpretation techniques, our data scientists secure both high accuracy and explainability of intelligent scores. Keep in mind that compliance rules must be mapped at early planning stages to ensure model design and training in accordance with the necessary regulations.
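
To give a feel for what SHAP-style explanations report, the sketch below computes exact Shapley values by brute force for a toy three-feature scoring function. It does not use the SHAP or LIME libraries, and the scoring function, feature names, and baseline values are invented for illustration.

```python
from itertools import permutations

def toy_score(x):
    """Hypothetical risk score: higher means riskier. Has one interaction term."""
    score = 0.5
    score -= 0.2 * x["on_time_rent_share"]
    score += 0.3 * x["debt_to_income"]
    score += 0.1 * x["debt_to_income"] * x["overdrafts"]
    return score

def shapley_values(model, applicant, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    names = list(applicant)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)              # start from the reference applicant
        previous = model(current)
        for name in order:
            current[name] = applicant[name]   # switch one feature to its real value
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}

baseline = {"on_time_rent_share": 0.8, "debt_to_income": 0.3, "overdrafts": 1}
applicant = {"on_time_rent_share": 1.0, "debt_to_income": 0.5, "overdrafts": 3}

contributions = shapley_values(toy_score, applicant, baseline)
for name, value in contributions.items():
    print(f"{name}: {value:+.3f}")
print(f"total shift: {toy_score(applicant) - toy_score(baseline):+.3f}")
```

The per-feature contributions add up to the difference between the applicant’s score and the baseline score, which is what allows a single score to be broken down into per-feature reasons. Brute force is only practical for a handful of features; libraries like SHAP use faster algorithms for real models.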

Strategy #3. Debiased Scoring for Higher Loan Accessibility to Historically Discriminated Applicants

According to Experian, employing alternative consumer data and ML-powered models for credit risk assessment may bring lenders up to a 70% improvement in the Gini index, a measure of how well a model separates likely repayers from likely defaulters. More accurate risk ranking, in turn, can reduce the risk of bias against minorities and low-income populations.
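
For reference, in credit scoring the Gini index is commonly derived from the area under the ROC curve as Gini = 2 × AUC − 1. The minimal Python sketch below computes it for two sets of made-up risk scores; the scores and outcomes are invented and are not Experian’s data or methodology.

```python
def auc(scores, defaulted):
    """Probability that a randomly chosen defaulter has a higher risk score
    than a randomly chosen non-defaulter (ties count as half)."""
    bad = [s for s, d in zip(scores, defaulted) if d]
    good = [s for s, d in zip(scores, defaulted) if not d]
    if not bad or not good:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = sum(
        1.0 if b > g else 0.5 if b == g else 0.0
        for b in bad
        for g in good
    )
    return wins / (len(bad) * len(good))

def gini(scores, defaulted):
    """Gini index of a scoring model: 0 is random ranking, 1 is perfect ranking."""
    return 2 * auc(scores, defaulted) - 1

defaulted = [1, 0, 1, 0, 0, 0]                   # 1 = borrower defaulted
old_scores = [0.6, 0.5, 0.4, 0.7, 0.2, 0.3]      # hypothetical traditional model
new_scores = [0.8, 0.5, 0.45, 0.4, 0.2, 0.3]     # hypothetical enriched model

print("old Gini:", gini(old_scores, defaulted))
print("new Gini:", gini(new_scores, defaulted))
```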

However, underwriting AI models are only as fair as the data used to train them. This means that any discriminatory lending practices from the past reflected in the training data propagate to the model’s scoring decisions, limiting their objectivity. Now, let’s be honest: historical underwriting data from US lenders is not neutral. It mirrors enormous disparities in credit opportunities between applicants of different races, ethnicities, genders, and income levels. If you’re an established market player, you should acknowledge that your as-is data used for model training may inadvertently infuse historical bias into AI-produced risk scores.

My colleagues from data science admit that mitigating the impact of historical underwriting bias is by far the biggest challenge for lending AI engineers. Yet they have developed some practices to promote fairer intelligent scoring:

Techniques like statistical parity tests and disparate impact analysis can help pinpoint and quantify biases in historical underwriting data. The data can be further rebalanced via oversampling and new data synthesis for minorities to ensure a more equitable representation of various demographic segments in the training set.
Synthetic data for underserved groups should reflect their hypothetically accurate risk scores. This will ensure model training in alignment with your fairness goals. Economists Laura Blattner and Scott Nelson, who applied a similar technique in their study, achieved a 50% reduction in the scoring accuracy gap between low-income and wealthy applicants.
Applying a fairness-aware approach to ML model training and penalizing machine decisions that unreasonably discriminate against certain borrower groups helps prevent the live model from reinforcing historical scoring bias.
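
The first of these practices can be sketched in a few lines of Python. The example below computes a disparate impact ratio and a statistical parity difference on made-up approval records and then oversamples the underrepresented group. Group labels, data, and helper names are hypothetical, and the oversampling is deterministic only to keep the example reproducible.

```python
from collections import Counter
from itertools import cycle, islice

def approval_rates(records):
    """records: iterable of (group, approved) pairs -> approval rate per group."""
    total, approved = Counter(), Counter()
    for group, ok in records:
        total[group] += 1
        approved[group] += int(ok)
    return {g: approved[g] / total[g] for g in total}

def disparate_impact_ratio(records, protected, reference):
    """Approval rate of the protected group divided by that of the reference group."""
    rates = approval_rates(records)
    return rates[protected] / rates[reference]

def statistical_parity_difference(records, protected, reference):
    """Approval rate of the protected group minus that of the reference group."""
    rates = approval_rates(records)
    return rates[protected] - rates[reference]

def oversample(records, group, target_count):
    """Repeat a group's records in order until the group has target_count rows."""
    members = [r for r in records if r[0] == group]
    if not members:
        raise ValueError(f"no records for group {group!r}")
    extra = max(0, target_count - len(members))
    return list(records) + list(islice(cycle(members), extra))

# Made-up history: group A has 10 records, group B only 4.
history = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 2 + [("B", False)] * 2
)

print("disparate impact ratio:", disparate_impact_ratio(history, "B", "A"))
print("statistical parity difference:", statistical_parity_difference(history, "B", "A"))

balanced = oversample(history, "B", target_count=10)
print("group sizes after oversampling:", Counter(g for g, _ in balanced))
```

A disparate impact ratio well below 1 is often compared against the four-fifths (0.8) rule of thumb as a flag for further review. Note that oversampling evens out representation in the training set but does not correct the historical labels themselves, which is why it is combined with the other practices listed above.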

Sample interfaces of credit risk analytics by ScienceSoft. Follow the link to discover what data and KPIs you can control with a comprehensive lending analytics solution.

It’s also important to differentiate bias from statistical averages. For example, market-wide data may show that, on average, women indeed have lower default rates than men, and applicants from rural areas indeed have lower disposable income than applicants from New York. Including up-to-date market averages on segment-specific creditworthiness and repayment behaviors in your model training dataset is necessary to avoid AI-produced reverse biases.

Strategy #4. Underwriting Automation for Attracting Borrowers via Superior Experiences

One of the strongest competitive perks of new-age vs. traditional consumer lenders is their high loan origination speed. Some digital lenders who natively rely on automation handle application processing in minutes, need a few hours to underwrite a loan, and offer same-day funding. This is a massive leap from the conventional process, which may take 2–7 days, and a go-to way to attract and retain borrowers of any income level and demographic group.

How can traditional banks and credit providers achieve the same speed?

Risk scoring itself, whether rule-based or AI-powered, is the fastest part of the pipeline. Delays mainly occur at the pre-qualification and creditworthiness verification stages, which involve extensive document processing and data analysis. So, the primary task for you is to bring the right loan automation tools to remove the manual slog across these areas:

Intelligent image analysis can be applied to automatically retrieve data from digital loan applications and consumer documents and pull it into underwriting forms. You can introduce such algorithms already at the application stage so that borrowers can just upload their documents and not fill out forms manually.
ML-powered data validation algorithms can automatically cross-reference borrower-reported data with the data from your corporate systems and third-party sources (e.g., identity and AML/OFAC databases). They can instantly spot fraudulent data, incomplete disclosures, and servicing hard stops like applicant presence in sanction lists.
Collecting borrower creditworthiness data relies on integrations with conventional sources like credit rating platforms and new sources like employment verification platforms and utility aggregators. An AI-powered analytics system can further process the obtained multi-format data, predict applicant risks, and give a decision-ready score, all within minutes.
Your scoring solution can be designed to auto-approve low-risk loans, immediately communicate credit decisions to borrowers, and trigger fund disbursements. Such automation can be established via AI or custom rules. One major benefit of intelligent engines is that they can be trained to suggest alternative loan terms if an applicant doesn’t qualify for the initially requested financing.
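
The cross-referencing step from the list above can be illustrated with a simple rule-based stand-in (not an ML model). The field names, the 10% income tolerance, and the exact-name sanctions match are assumptions for this sketch; production screening relies on dedicated identity and AML/OFAC services with fuzzy matching.

```python
def validate_application(reported, verified, sanctioned_names, income_tolerance=0.10):
    """Compare self-reported data with verified third-party data.

    Returns a list of issue codes; an empty list means nothing was flagged.
    `sanctioned_names` is a set of lowercased full names (naive exact match).
    """
    issues = []

    if reported["full_name"].strip().lower() in sanctioned_names:
        issues.append("hard_stop:sanctions_match")

    verified_income = verified.get("monthly_income")
    if not verified_income or verified_income <= 0:
        issues.append("incomplete:income_not_verified")
    else:
        gap = abs(reported["monthly_income"] - verified_income) / verified_income
        if gap > income_tolerance:
            issues.append("mismatch:income")

    reported_employer = reported["employer"].strip().lower()
    verified_employer = verified.get("employer", "").strip().lower()
    if reported_employer != verified_employer:
        issues.append("mismatch:employer")

    return issues

reported = {"full_name": "Jane Doe", "monthly_income": 5000, "employer": "Acme Corp"}
verified = {"monthly_income": 4200, "employer": "ACME Corp"}
issues = validate_application(reported, verified, sanctioned_names={"john roe"})
print(issues)
```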

From my experience, intelligent automation can help lenders speed up end-to-end consumer loan origination by more than 90% and automate 70–85%+ of credit decisions. Yet, even the smartest algorithms can fail to adequately handle non-standard cases. Plus, it’s unlikely you will trust the machine to approve large loan amounts. Usually, credit decisioning models are designed to route complex and high-sum applications for manual processing.
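
The routing logic described above can be sketched as a small rule set. The thresholds, the amount limit, and the halved counter-offer below are hypothetical placeholders; real cut-offs would come from your own credit policy and model validation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Decision:
    outcome: str                      # "auto_approve", "counter_offer", or "manual_review"
    reason: str
    offered_amount: Optional[float] = None

def route(risk_score, amount, approve_below=0.10, counter_below=0.25,
          manual_above_amount=50_000):
    """Route an application based on predicted default risk and requested amount."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be a probability between 0 and 1")
    if amount > manual_above_amount:
        # High-sum applications always go to a human underwriter.
        return Decision("manual_review", "amount above auto-decision limit")
    if risk_score < approve_below:
        return Decision("auto_approve", "low predicted risk", amount)
    if risk_score < counter_below:
        # Suggest alternative terms instead of rejecting outright.
        return Decision("counter_offer", "moderate risk: smaller amount offered",
                        round(amount * 0.5, 2))
    return Decision("manual_review", "risk above auto-decision range")

for score, amount in [(0.05, 10_000), (0.05, 80_000), (0.20, 10_000), (0.60, 10_000)]:
    print(score, amount, route(score, amount))
```

Checking the amount limit before the risk score ensures that large loans reach a human reviewer even when the model rates them as low-risk.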

Notably, the speed and efficiency enabled by underwriting automation not only cater to consumer expectations but also drive a 10–50%+ reduction in operational costs. This gives you the opportunity to offer more competitive loan prices and further broaden your borrower base.

Start Small to Speed Up the ROI and Secure Future Budgets

I usually recommend that my lending IT clients start the underwriting transformation with a single area and scale up gradually. For example, you may pilot alternative borrower data and new scoring models for auto or student loans and progressively extend the innovations to other lending products. Or you can bring image analysis to application processing and then employ advanced intelligence to analyze borrower creditworthiness. Such an approach will help you optimize investments, speed up payback, and secure the buy-in from the leadership for further digital initiatives.

If you need advice on the best-fitting underwriting technology for your case, feel free to contact me or other consultants at ScienceSoft.
