Comparing Generative AI for Fraud Detection: VAE, GAN, and Diffusion in Auto Insurance

Wednesday, November 5th, 2025

This study introduces and evaluates 18 hybrid machine learning models for detecting fraudulent auto insurance claims. It compares Variational Autoencoders (VAE), Generative Adversarial Networks (GAN), and Diffusion Models (DM), each combined with classifiers such as XGBoost, LightGBM, and Random Forest. The analysis addresses key industry challenges, including class imbalance and interpretability, and incorporates anomaly detection (via Isolation Forest) and oversampling techniques (SMOTE, ADASYN) into the pipeline.
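To make the oversampling step concrete, here is a minimal sketch of the interpolation idea behind SMOTE, written in plain Python. It is a simplified illustration, not the paper's pipeline: the function name `smote_oversample`, the toy feature rows, and the choice of `k` are all assumptions made for this example, and a production system would more likely rely on a maintained implementation such as the one in imbalanced-learn.

```python
import math
import random

def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by interpolating between a
    randomly chosen row and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by Euclidean distance to `base`.
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical fraud rows with two scaled features (illustrative values only).
fraud_rows = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.15], [0.85, 0.05]]
new_rows = smote_oversample(fraud_rows, n_new=6)
print(len(new_rows), "synthetic fraud rows generated")
```

Each synthetic row lies on a line segment between two real fraud rows, so the minority class grows without exact duplicates. ADASYN follows the same interpolation idea but generates more synthetic rows around minority samples that are harder to classify.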

Findings show that models using Diffusion Models, particularly DM_XGBoost_SMOTE and DM_LightGBM_SMOTE, significantly outperform the other configurations on classification accuracy, probabilistic calibration, and stability. DM_XGBoost_SMOTE achieved 83% accuracy, high recall, and the lowest log loss and Brier score, making it a top contender for real-world fraud prevention in claims handling.
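For readers less familiar with the calibration metrics mentioned above, the following sketch computes the Brier score and log loss by hand for binary labels. The labels and predicted probabilities are made up for illustration and are not taken from the paper; lower values are better for both metrics.

```python
import math

def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of the true labels (binary case)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

# Hypothetical labels (1 = fraud) and two sets of predicted fraud probabilities.
labels = [1, 0, 0, 1]
confident = [0.9, 0.1, 0.2, 0.8]   # sharp and correct
hedging = [0.6, 0.4, 0.5, 0.5]     # correct direction, but barely committed

for name, probs in [("confident", confident), ("hedging", hedging)]:
    print(name, round(brier_score(labels, probs), 3), round(log_loss(labels, probs), 3))
```

Both sets of predictions classify the rows correctly at a 0.5 threshold (ties aside), yet the confident model scores better on both metrics. This is why a model can match another on accuracy and still be preferred for claims triage, where the predicted probability itself drives prioritization.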

Importantly, the research emphasizes SHAP values for interpretability, which give deeper insight into feature importance. That insight is critical for regulatory compliance and claims investigations. For claims adjusters and insurance tech teams, the paper provides a replicable AI framework that blends fraud detection precision with operational feasibility, including recommendations for deployment in real-time systems using tools like Kafka or Flink.
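SHAP values are built on Shapley values from cooperative game theory. As a rough illustration of what they measure, the sketch below computes exact Shapley values for a single claim by enumerating every feature coalition. The scoring function `toy_score`, its coefficients, and the all-zero baseline are hypothetical stand-ins, not the paper's model; for tree ensembles such as XGBoost, the SHAP library uses far more efficient algorithms than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one row x. Features outside a coalition
    are replaced by their baseline values."""
    n = len(x)

    def value(coalition):
        row = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(row)

    phi = []
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(rest, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

# Hypothetical fraud score with an interaction between amount and late reporting.
def toy_score(row):
    amount, prior_claims, late_report = row
    return 0.2 * amount + 0.1 * prior_claims + 0.3 * amount * late_report

claim = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, claim, baseline)
print([round(v, 3) for v in phi])
```

The attributions sum to the difference between the claim's score and the baseline score, so an investigator can read each value as that feature's share of the flagged risk. The interaction term is split evenly between the two features involved.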

External References & Further Reading

https://link.springer.com/article/10.1007/s44163-025-00574-5