Methodology
How MSME, Gender & NPL Metrics Are Defined
This page summarises the core classification rules behind the public dashboards: how we define MSMEs, how we attribute gender to borrowers, how the gender gap indicator is calculated, and how Non-Performing Loans (NPLs) are scoped.
1. MSME classification (Micro, Small, Medium)
MSMEs are defined in Kenyan law by annual turnover, not by product labels alone. However, turnover data in bank systems is often missing, zero, or populated with dummy values. To address this, we compute an msme_classification column in the fact table during ingestion, using a waterfall algorithm that classifies by turnover first and falls back on loan size when turnover is unusable.
1.1 Legal & statistical thresholds
Based on the Micro and Small Enterprises Act (2012) and KNBS MSME surveys, annual turnover bands are:
- Micro: Turnover < KES 500,000
- Small: KES 500,000 – 5,000,000
- Medium: KES 5,000,000 – 100,000,000
- Large / Corporate: Turnover > KES 100,000,000
For the MSME dashboards, loans above the Medium threshold are treated as large corporates and excluded from the MSME view.
1.2 Step 1 – Exclude large corporates
Before assigning an MSME class we remove large exposures that are unlikely to be MSMEs:
- Exclude if ANNUAL_TURNOVER_AMOUNT > 100,000,000; OR
- Exclude if LOAN_VALUE_AT_ISSUANCE > 50,000,000.
The KES 50M issuance threshold is a conservative proxy: a single facility above this size is almost always lower-corporate rather than an MSME.
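As a minimal sketch of this exclusion step (not the ingestion code itself), the two rules can be written as a single predicate. The function name and the choice to let missing values pass through without exclusion are assumptions made for illustration; only the two thresholds come from the rules above.

```python
from typing import Optional

TURNOVER_CAP_KES = 100_000_000  # upper bound of the Medium turnover band
ISSUANCE_CAP_KES = 50_000_000   # conservative loan-size proxy for large corporates


def is_large_corporate(annual_turnover: Optional[float],
                       loan_value_at_issuance: Optional[float]) -> bool:
    """Return True when either exclusion rule fires.

    Assumption: a missing (None) value never triggers exclusion on its own.
    """
    turnover_too_high = (annual_turnover is not None
                         and annual_turnover > TURNOVER_CAP_KES)
    loan_too_large = (loan_value_at_issuance is not None
                      and loan_value_at_issuance > ISSUANCE_CAP_KES)
    return turnover_too_high or loan_too_large
```

Both comparisons are strict, so a loan of exactly KES 50,000,000 stays in the MSME view.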
1.3 Steps 2–4 – Micro, Small, Medium (waterfall)
For the remaining loans we assign msme_classification using turnover first, then loan size as a proxy when turnover is missing or implausible:
- Micro
  - Turnover > 0 and ≤ 500,000; OR
  - Turnover is 0 / null / garbage and loan value at issuance ≤ 500,000; OR
  - Product is mobile / overdraft and CLIENT_TYPE = Individual (typical for mobile business credit lines).
- Small
  - Turnover > 500,000 and ≤ 5,000,000; OR
  - Turnover is 0 / null and LOAN_VALUE_AT_ISSUANCE between 500,000 and 5,000,000.
- Medium
  - Turnover > 5,000,000 and ≤ 100,000,000; OR
  - Turnover is 0 / null and loan value at issuance between 5,000,000 and 50,000,000 (the MSME dashboard cut-off).
Sole proprietors recorded as CLIENT_TYPE = Individual but using business products (e.g. business working capital) are treated as MSMEs and classified through the same waterfall.
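The waterfall above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production logic: the function name, the product_type values "mobile" and "overdraft", and the treatment of any null or non-positive turnover as unusable are all hypothetical. It also assumes the Micro conditions are checked first, so the mobile / overdraft rule takes precedence over the turnover bands.

```python
from typing import Optional


def classify_msme(turnover: Optional[float],
                  loan_value: Optional[float],
                  client_type: str = "",
                  product_type: str = "") -> Optional[str]:
    """Return 'Micro', 'Small', 'Medium', or None (excluded / unclassifiable)."""
    # Step 1: exclude large corporates.
    if turnover is not None and turnover > 100_000_000:
        return None
    if loan_value is not None and loan_value > 50_000_000:
        return None

    # Micro shortcut: mobile / overdraft products held by individuals.
    if client_type == "Individual" and product_type in {"mobile", "overdraft"}:
        return "Micro"

    # Use turnover when it is usable, otherwise fall back to loan size.
    # Assumption: "garbage" turnover means null or <= 0.
    has_turnover = turnover is not None and turnover > 0
    size = turnover if has_turnover else loan_value
    if size is None:
        return None  # assumption: nothing to classify on

    # Steps 2-4: same band edges for turnover and for the loan-size proxy.
    if size <= 500_000:
        return "Micro"
    if size <= 5_000_000:
        return "Small"
    return "Medium"  # upper bounds were already enforced in Step 1
```

For example, classify_msme(None, 2_000_000) falls back to loan size and returns "Small", while classify_msme(300_000, 2_000_000) uses the reported turnover and returns "Micro".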
2. Gender classification
All dashboards use a categorical gender_flag with four possible values:
- Female-Owned
- Male-Owned
- Jointly-Owned
- Unknown
Gender attribution rules are:
- Individuals: 100% of the loan is attributed to the registered person's gender.
- Legal entities: ownership shares are weighted across natural-person owners; institutional owners are excluded from the gender calculation.
- Female-Owned: at least 51% of natural-person ownership is female.
- Male-Owned: at least 51% of natural-person ownership is male.
- Jointly-Owned: neither female nor male owners reach the 51% threshold.
- Unknown: ownership or gender data are missing or unusable.
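A minimal sketch of these attribution rules is shown below. The owner tuple layout, the gender codes "F" and "M", and the function name are assumptions for illustration; the sketch also assumes that a single natural-person owner with a missing gender makes the whole record Unknown, which may be stricter than the actual pipeline.

```python
from typing import List, Optional, Tuple

# (gender code, ownership share, is_natural_person) -- hypothetical layout
Owner = Tuple[Optional[str], float, bool]


def gender_flag(owners: List[Owner]) -> str:
    """Derive the gender_flag from a list of owners."""
    # Institutional owners are excluded from the gender calculation.
    natural = [(g, s) for g, s, is_person in owners if is_person]
    total = sum(s for _, s in natural)
    if not natural or total <= 0:
        return "Unknown"
    # Assumption: any missing or unrecognised gender makes the record Unknown.
    if any(g not in ("F", "M") for g, _ in natural):
        return "Unknown"

    # Shares are re-based on natural-person ownership only.
    female_share = sum(s for g, s in natural if g == "F") / total
    male_share = sum(s for g, s in natural if g == "M") / total
    if female_share >= 0.51:
        return "Female-Owned"
    if male_share >= 0.51:
        return "Male-Owned"
    return "Jointly-Owned"
```

An individual borrower is simply the one-owner case, e.g. gender_flag([("M", 100, True)]) gives "Male-Owned". A 50/50 split between a female and a male owner gives "Jointly-Owned".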
The public Summary, Gender, Loan Performance, and Average Loan Size dashboards all use this same gender_flag to disaggregate results.
3. Gender gap indicator
The key KPI on the Gender dashboards is the gender gap, summarised in the cards and charts as “Gender gap (Male − Female)”.
For the selected metric (for example, outstanding value, number of loans, or unique borrowers) and current filters, we sum totals for Male-Owned and Female-Owned accounts only (excluding Jointly-Owned and Unknown), convert these to portfolio shares, and then take the difference:
gender_gap = male_share − female_share
where male_share = male_total / (male_total + female_total) and female_share = female_total / (male_total + female_total). The dashboards display this difference in percentage points.
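As a worked illustration of the formula, the sketch below computes the gap in percentage points from the two totals. The function name and the choice to return None when both totals are zero are assumptions; the arithmetic follows the definition above.

```python
from typing import Optional


def gender_gap_pp(male_total: float, female_total: float) -> Optional[float]:
    """Gender gap (Male - Female) in percentage points.

    Totals must already exclude Jointly-Owned and Unknown accounts.
    """
    denominator = male_total + female_total
    if denominator == 0:
        return None  # assumption: no gap is reported for an empty selection
    male_share = male_total / denominator
    female_share = female_total / denominator
    return (male_share - female_share) * 100
```

With male_total = 750 and female_total = 250, the shares are 75% and 25%, so the gap is 50 percentage points. A negative value means the female share is larger.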
4. NPL metrics and scope
On the Loan Performance pages, Non-Performing Loan (NPL) ratios and amounts are calculated on all loans in scope, not only business-lending products. This means NPL analysis always reflects the full loan portfolio (subject to the standard loan-status filters), even when other dashboards default to MSME-focused business products.
NPLs follow prudential practice: a loan is treated as non-performing when days_in_arrears ≥ 90 or when its prudential risk class is Substandard, Doubtful, or Loss. The npl_classification field (Normal, Watch, Substandard, Doubtful, Loss) is derived from days_in_arrears and underpins the NPL ratios shown in the dashboards.
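The non-performing test can be sketched as below. The function names and the loan tuple layout are hypothetical, and the ratio shown (non-performing outstanding value divided by total outstanding value) is an assumed, conventional definition, since this page does not spell out the denominator.

```python
from typing import List, Optional, Tuple

NON_PERFORMING_CLASSES = {"Substandard", "Doubtful", "Loss"}


def is_npl(days_in_arrears: Optional[int], risk_class: Optional[str]) -> bool:
    """True when days_in_arrears >= 90 or the risk class is non-performing."""
    overdue = days_in_arrears is not None and days_in_arrears >= 90
    return overdue or risk_class in NON_PERFORMING_CLASSES


def npl_ratio(loans: List[Tuple[float, Optional[int], Optional[str]]]) -> Optional[float]:
    """Assumed ratio: NPL outstanding / total outstanding.

    Each loan is (outstanding_value, days_in_arrears, risk_class).
    """
    total = sum(value for value, _, _ in loans)
    if total == 0:
        return None
    non_performing = sum(value for value, days, cls in loans if is_npl(days, cls))
    return non_performing / total
```

Either condition alone is sufficient, so a loan classed as Doubtful counts as non-performing even if its recorded arrears are below 90 days.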
