Fighting Fraud with Real-time Data Hubs

For financial services firms, fraud is not just a cost and a regulatory concern: each false positive or false negative also represents a disruptive experience for the customer.

Unsurprisingly, enhancing fraud detection is an ongoing priority, and many firms have turned to advanced analytics and machine learning models to deliver improvements. In fact, fraud detection has shown up as the number one use case for machine learning and AI in banking in a number of surveys. Vendors are building machine learning and AI into their fraud detection products, and banks have been experimenting with building their own models.

Common to many of these initiatives is the use of new data streams. In particular, there are three data-related trends that stand out:

  1. Enriched transaction data: Payment schemes worldwide are adopting the ISO 20022 standard. This can mean a richer set of data travels with the payment instruction, which may be useful in payments-related fraud detection models. Similarly, with the introduction of 3D Secure 2 (3DS2) for online card payments, richer data is exchanged during the transaction and is used to estimate risk and to decide what form of additional verification may be required. 3DS2 moves beyond the original 3D Secure approach of enforcing a basic multi-factor verification for ecommerce transactions.

The remainder of this article focuses on the second trend above.

Streaming data into a hub offers the obvious benefit of detecting abnormal customer behaviour, and hence possible fraud, in a more detailed way. Online card purchases, cardholder-present transactions in store, authentications, location information, payments initiated in mobile banking interfaces, ATM visits and more can all be combined as part of the anomaly detection process. For example, a cardholder-present transaction made in a different location to an ATM withdrawal at the same time may be identified as anomalous. Beyond near real-time fraud detection, these models may also help reveal longitudinal changes in customer behaviour that should be investigated as part of a broader know-your-customer responsibility.
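To make the ATM example concrete, here is a minimal sketch of one very simple rule a hub could apply once events from different channels sit side by side: flag two events for the same customer when the distance between them could not plausibly be travelled in the time available. The Event fields, the 900 km/h speed ceiling and the 50 km minimum distance are all assumptions chosen for illustration, not values from any particular product, and a real hub would more likely feed such a signal into a model than act on it directly.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Event:
    """A hypothetical, simplified event record as it might land in the hub."""
    customer_id: str
    channel: str          # e.g. "ATM" or "CARD_PRESENT" (illustrative labels)
    timestamp: datetime
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def is_implausible_pair(
    a: Event,
    b: Event,
    max_speed_kmh: float = 900.0,    # assumed ceiling, roughly airliner speed
    min_distance_km: float = 50.0,   # assumed tolerance for imprecise locations
) -> bool:
    """Return True when one customer could not plausibly have made both events."""
    if a.customer_id != b.customer_id:
        return False
    distance = haversine_km(a.lat, a.lon, b.lat, b.lon)
    if distance < min_distance_km:
        return False
    hours = abs((b.timestamp - a.timestamp).total_seconds()) / 3600.0
    if hours == 0:
        return True  # far apart at exactly the same moment
    return distance / hours > max_speed_kmh


atm_london = Event("c-1", "ATM", datetime(2024, 1, 1, 12, 0), 51.5074, -0.1278)
store_paris = Event("c-1", "CARD_PRESENT", datetime(2024, 1, 1, 12, 5), 48.8566, 2.3522)
print(is_implausible_pair(atm_london, store_paris))
```

A pairwise rule like this only works when both channels are visible in one place, which is the point of the hub. The thresholds would need tuning against real location accuracy and verified fraud cases.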

Let’s look at some of the considerations in successfully implementing a fraud hub.

Ingesting data at the rate required (which may vary considerably by time of month and season) is a key challenge. In particular, consideration must be given to:

  • Integration into the source system pipelines — either via an enterprise service bus (which might present capacity constraints), through direct integration to a pipeline or via the use of pre-built adaptors that might work with queuing or event hub systems in use.

Security for the data and modelling environment is a further important consideration — particularly when public cloud resources are used for the hub:

  • Some form of anonymization of key PII attributes might be applied in the source pipeline before transmission to the hub. Security principles at the bank may dictate the technique used. Where possible, hashing attributes is more useful because it preserves patterns that models may recognize. For example, in one case the customer’s email domain was significant in detecting phishing-originated attacks that led to anomalous transactions.
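As an illustration of how hashing can preserve a pattern such as the email domain, the sketch below tokenizes the full address and the domain separately using a keyed hash (HMAC-SHA256 from the Python standard library). The function names, the output field names and the choice of HMAC are assumptions for illustration; the bank's own security principles would decide the actual technique and how the key is managed.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Keyed hash of a value: the same input and key always give the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize_email(email: str, key: bytes) -> dict:
    """Tokenize an email address while keeping the domain as a separate signal.

    Hypothetical output fields:
      email_token   identifies the same address across events
      domain_token  identifies the same domain across customers
    """
    normalized = email.strip().lower()
    _local, separator, domain = normalized.rpartition("@")
    if not separator:
        domain = ""  # no "@" found, so there is no domain to tokenize
    return {
        "email_token": pseudonymize(normalized, key),
        "domain_token": pseudonymize(domain, key) if domain else None,
    }


# Assumption: in practice the key would come from a secrets store in the
# source environment and would never be sent to the hub.
demo_key = b"example-key-for-illustration-only"

first = pseudonymize_email("Alice@Example.com", demo_key)
second = pseudonymize_email("bob@example.com", demo_key)
print(first["domain_token"] == second["domain_token"])
```

Because both customers share a domain, their domain tokens match even though neither address is visible in the hub, so a model can still learn that a given domain is associated with anomalous transactions.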

Banks may choose to use a combination of ISV devised detection models and bespoke models developed by themselves. Model development tooling is likely to be driven by the data science team’s experience and preference. Considerations may include:

  • The availability of training data that connects verified fraud incidents with the combination of data that will be aggregated in the hub. In some cases training data may have to be accumulated once the hub has been built.

Other areas to give thought to include:

  • Data models to be used when landing data in the hub, particularly for the master data. An option that has become available recently is to use Microsoft’s Banking Accelerator open Common Data Model for master customer and product data and to land transactional datasets in Azure Data Lake, which supports interaction with data stored in the Common Data Model.

Fraud hubs offer exciting potential, with uses that extend beyond fraud detection. Business results will vary considerably based on the data ingested, the sophistication of existing fraud solutions and the quality of the models built and maintained, but reductions in missed fraud events and in false positives, both in excess of 10%, appear to be achievable.

With the right attention given to security, cloud environments are often a great fit for building a fraud hub.

In my role at Microsoft I define blueprints for what our Services teams worldwide do to help our Financial Services customers achieve more. Views are my own.