Machine Learning Based Data Optimization of Merchant Onboarding Processes in Fintech Ecosystems: A Time-Series and Process-Mining Study

Authors

  • Iftekhar Ahmed, Master of Science in Business Analytics, St. Francis College, Brooklyn, USA

DOI:

https://doi.org/10.63125/kn2ycp72

Keywords:

Machine Learning, Process Mining, Onboarding, Fintech, Time-Series

Abstract

This study examined how machine learning-based data optimization and process analytics influence merchant onboarding performance within a fintech ecosystem by integrating supervised modeling, process mining, and time-series evaluation. Using a retrospective dataset of 3,200 onboarding cases observed across 26 operational weeks, the analysis reconstructed event-driven workflows and quantified relationships between data quality, workflow behavior, and decision outcomes. Descriptive results showed moderate data missingness (mean = 7.8%), high median document completeness (92%), and substantial operational heterogeneity, reflected in manual-touch intensity (41.7%) and escalation rate (9.6%). Median onboarding cycle time was 24.7 hours, although the mean reached 38.4 hours because of delay-heavy exception cases. Waiting-time concentration averaged 63.2%, indicating that queue-related idle time accounted for most of the total duration. Logistic regression showed that the Data Quality Optimization Index significantly increased approval likelihood (OR = 1.65, p < .001), while Document Risk Intensity (OR = 0.72, p < .001) and Workflow Friction (OR = 0.66, p < .001) significantly reduced it. Linear regression further showed that Workflow Friction (+12.1 hours, p < .001) and Operational Effort (+9.3 hours, p < .001) substantially increased onboarding cycle time, whereas improved data quality reduced it (−6.8 hours, p < .001). The final approval model achieved a McFadden pseudo R² of 0.29, and the cycle-time model explained 42% of duration variance (R² = 0.42; RMSE = 28.9 hours), with coefficient stability confirmed across rolling time-window robustness checks. Overall, the findings demonstrated that onboarding performance was systematically shaped by measurable data conditions and trace-level workflow characteristics rather than solely by merchant risk segmentation. By linking temporal KPI shifts with process variants and predictive constructs, this study provided an integrated quantitative framework for evaluating data-driven optimization in digital merchant onboarding systems.
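
To make the reported odds ratios easier to read, the following minimal sketch converts each one into an approval probability relative to a baseline. It is an illustration only, not the study's code. The 70% baseline approval probability is a hypothetical value chosen for the example, the function name `apply_odds_ratio` is invented here, and the abstract does not state the unit of change behind each odds ratio, so "one unit" of a predictor is assumed.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability obtained by multiplying the baseline odds by odds_ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    shifted_odds = baseline_odds * odds_ratio
    return shifted_odds / (1.0 + shifted_odds)


# Odds ratios as reported in the abstract.
REPORTED_ODDS_RATIOS = {
    "Data Quality Optimization Index": 1.65,
    "Document Risk Intensity": 0.72,
    "Workflow Friction": 0.66,
}

# Hypothetical baseline approval probability (an assumption, not a study result).
BASELINE_APPROVAL_PROB = 0.70

if __name__ == "__main__":
    for construct, odds_ratio in REPORTED_ODDS_RATIOS.items():
        shifted = apply_odds_ratio(BASELINE_APPROVAL_PROB, odds_ratio)
        print(f"{construct}: OR = {odds_ratio:.2f}, "
              f"approval probability {BASELINE_APPROVAL_PROB:.1%} -> {shifted:.1%}")
```

Because an odds ratio multiplies odds rather than probabilities, the same ratio produces a different change in percentage points at different baselines, so these figures should be read as relative effects and not as fixed shifts in approval rate.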

Published

2025-06-30

How to Cite

Iftekhar Ahmed. (2025). Machine Learning Based Data Optimization of Merchant Onboarding Processes in Fintech Ecosystems: A Time-Series and Process-Mining Study. International Journal of Scientific Interdisciplinary Research, 6(1), 506-535. https://doi.org/10.63125/kn2ycp72
