Big Data For Insurance Claim Fraud Detection

Machine Learning and Big Data solution for detecting claim fraud based on past data and external signals

Project Synopsis

A large health insurance company wanted to capitalize on its existing claim data and use it to estimate the probability that an incoming claim is fraudulent. This was the company's first use of machine learning, and it wanted an approach to build an algorithm, verify its predictions, and then put it to use.

The insurer needed a solution with the following features:

  • Predictive models based on the past claim data
  • Verification of predictions against past data
  • Ability to explain each prediction, describing why a given claim looks fraudulent

Rare Mile Solution

We designed and developed a machine learning solution based on existing claim data. The delivered solution had the following features:

  • Analyzed existing claim data to build predictive models.
  • Enriched claim data with publicly available data about geographies and health care providers.
  • Trained, refined and retrained the system to achieve more than 95% accuracy against the existing data.
  • Built an explanation model that quantified a fraud score for each claim and explained which factors influenced that score.
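
The train, verify, and explain steps listed above can be illustrated with a minimal sketch. It is not the delivered system: the feature names, the synthetic claims, and the choice of a plain logistic regression are all assumptions made for illustration, since the case study does not publish its data or algorithms. The sketch trains on one part of the "past" claims, checks accuracy on a held-out part, and reports per-feature contributions to a claim's fraud score.

```python
import math
import random

# Hypothetical feature names; the project's real features are not published.
FEATURES = ["claim_amount", "provider_claim_rate", "region_risk"]

def make_synthetic_claims(n, seed=0):
    """Generate toy claims with features in [0, 1]; labels follow an invented rule."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.uniform(0, 1) for _ in FEATURES]
        # Invented ground truth: large claims from busy providers count as fraud.
        label = 1 if 2.0 * x[0] + 1.5 * x[1] - 1.8 > 0 else 0
        rows.append((x, label))
    return rows

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well within range
    return 1.0 / (1.0 + math.exp(-z))

def fraud_score(weights, bias, x):
    """Probability-like fraud score in (0, 1) from a linear model."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

def train(rows, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(FEATURES)
    bias = 0.0
    for _ in range(epochs):
        for x, y in rows:
            err = fraud_score(weights, bias, x) - y
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def accuracy(weights, bias, rows):
    """Share of claims whose thresholded score matches the known label."""
    hits = sum((fraud_score(weights, bias, x) >= 0.5) == (y == 1) for x, y in rows)
    return hits / len(rows)

def explain(weights, x):
    """Per-feature contribution to the log-odds, largest magnitude first."""
    contributions = [(name, w * v) for name, w, v in zip(FEATURES, weights, x)]
    return sorted(contributions, key=lambda item: abs(item[1]), reverse=True)

claims = make_synthetic_claims(400, seed=1)
train_rows, holdout_rows = claims[:300], claims[300:]  # verify on unseen claims
weights, bias = train(train_rows)
print("holdout accuracy:", accuracy(weights, bias, holdout_rows))

sample_features, _ = holdout_rows[0]
print("fraud score:", fraud_score(weights, bias, sample_features))
for name, contribution in explain(weights, sample_features):
    print(name, round(contribution, 3))
```

In a linear model like this one, the contributions plus the bias add up exactly to the log-odds behind the score, which is what makes the explanation easy to read. A production system would more likely use a library implementation and a richer explanation method.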

Project Highlights

  • Created a machine learning system based on internal and external data.
  • Developed predictive models for claim fraud.
  • Delivered highly accurate predictions, with more than 95% accuracy against the existing claim data.

About Project

Data Science Algorithm & Implementation of Fraud Detection System

Technologies Used

Rare Mile iGimlet Text Analytics Solution
Machine Learning Solution
Rules Engine for Data Extraction
Spark & Storm

Client Details