Property and casualty insurers face a persistent challenge: homeowner insurance fraud is both more costly per claim and significantly rarer than auto fraud, which creates severe class imbalance in training data. When MAPFRE extended its existing AI fraud detection system, originally built for auto claims, to homeowner policies, the team encountered a dataset with too few genuine fraud examples to produce a reliable model. Standard remediation techniques such as under-sampling were off the table: with so few fraud cases, shrinking the legitimate-claim majority to match would have left too little data to train on. Without a viable training corpus, the model struggled to generalize, and costly fraudulent claims slipped through detection.
MAPFRE addressed the data scarcity problem by augmenting real claims data with AI-generated synthetic records using DataCebo's CTGAN model from the open-source Synthetic Data Vault framework. Unlike standard generative adversarial networks, CTGAN conditions its generator on a vector constructed from the categorical variables, which helps it learn the complex distributions inherent in tabular insurance data and reduces the risk of mode collapse on rare categories. The team conducted systematic experiments varying both the volume and composition of synthetic data added to the training set, drawing on a rich feature set that included claims history, policy attributes, graph-based interconnection data, geocode information, and weather inputs. The validated synthetic augmentation pipeline was subsequently deployed to production, integrating directly with MAPFRE's existing fraud detection infrastructure.
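To make the conditional-vector idea concrete, the following is a minimal sketch in plain Python of how such a vector can be sampled, following the training-by-sampling scheme described for CTGAN. It is not MAPFRE's pipeline or the SDV implementation: the function name, the column names (`is_fraud`, `peril`), and the toy rows are all hypothetical, and the `log(count + 1)` smoothing is an assumption of this sketch.

```python
import math
import random
from collections import Counter


def build_conditional_vector(rows, discrete_columns, rng):
    """Sample one CTGAN-style conditional vector (illustrative sketch only)."""
    # Fix a category order per column so mask positions are stable.
    categories = {
        col: sorted({row[col] for row in rows}) for col in discrete_columns
    }

    # Step 1: pick one discrete column uniformly at random.
    chosen_col = rng.choice(discrete_columns)

    # Step 2: pick a category with probability proportional to the log of its
    # frequency, so rare categories are drawn more often than their raw share.
    # The "+ 1" avoids a zero weight for singletons (assumption of this sketch).
    counts = Counter(row[chosen_col] for row in rows)
    weights = [math.log(counts[cat] + 1) for cat in categories[chosen_col]]
    chosen_cat = rng.choices(categories[chosen_col], weights=weights, k=1)[0]

    # Step 3: concatenate one-hot masks for every discrete column; only the
    # chosen category of the chosen column is set to 1.
    vector = [
        1 if (col == chosen_col and cat == chosen_cat) else 0
        for col in discrete_columns
        for cat in categories[col]
    ]
    return chosen_col, chosen_cat, vector


# Hypothetical, heavily imbalanced toy data: 95 legitimate rows, 5 fraud rows.
rows = [{"is_fraud": "no", "peril": "water"} for _ in range(95)]
rows += [{"is_fraud": "yes", "peril": "fire"} for _ in range(5)]

rng = random.Random(0)
chosen_col, chosen_cat, vector = build_conditional_vector(
    rows, ["is_fraud", "peril"], rng
)
print(chosen_col, chosen_cat, vector)
```

With log-frequency weighting, the 5% fraud category in this toy data is selected roughly 28% of the time whenever `is_fraud` is the chosen column, which is the mechanism that lets the generator see rare categories often enough to learn them.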
Synthetic data augmentation delivered measurable improvement in both primary detection metrics, recall and precision. Because these two metrics typically move in opposite directions, the simultaneous gain is uncommon in fraud modeling, and it gave MAPFRE the confidence to move the model from experimentation into full production deployment.
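For readers less familiar with the two metrics, the short sketch below shows how precision and recall are computed from confusion counts and what a simultaneous gain looks like. The counts are invented purely for illustration and are not MAPFRE's results; the function name is likewise hypothetical.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw confusion counts."""
    # Precision: of the claims flagged as fraud, how many really were fraud.
    precision = true_positives / (true_positives + false_positives)
    # Recall: of the truly fraudulent claims, how many were flagged.
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


# Hypothetical counts on the same 50 fraudulent claims (TP + FN = 50 in both).
baseline = precision_recall(true_positives=20, false_positives=30, false_negatives=30)
augmented = precision_recall(true_positives=30, false_positives=25, false_negatives=20)

print("baseline  precision/recall:", baseline)
print("augmented precision/recall:", augmented)
```

Simply lowering a decision threshold usually raises recall while adding false positives, which lowers precision. A gain in both, as in the hypothetical `augmented` row, means the model catches more fraud while also raising fewer false alarms, which points to a better model rather than a shifted threshold.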