ML Data Pipelines for Real-Time Fraud Prevention @PayPal | Software Development Conference QCon New York

What You’ll Learn

Understand how real-time inference is supported by near-real-time data streaming and offline analytical computation of features.

Learn how trained models are packaged, configured, and served in a large-scale production environment.

Hear how the data tier is organized and how data is managed at PayPal.

Abstract

PayPal processes about a billion dollars of payment volume daily ($451bn in FY2017); complex decisions are made for each transaction or user action to manage risk and compliance while also ensuring a good user experience. PayPal users can make payments immediately in 200 regions with the assurance that the company’s transactions are secure. How does PayPal achieve this goal in today's complex environment filled with "high-level" fraudsters as well as constantly increasing customer demand? While many industry solutions rely on fast analytics performed in near-real time over streaming data, our business requirements demand real-time, millisecond-range response. This talk will focus on the architectural approach to our internally built real-time service platform, which leverages Machine Learning models and delivers unparalleled performance and quality of decisions. This platform has established a fine balance between Big Data and sustainable support for a high volume of real-time decision requests. A well-structured design, along with a domain modeling methodology, provides high adaptability to emerging fraud patterns and behavioral variations. Deployment on a real-time, event-driven, fast-data in-memory architecture accelerates detection and decisions, thereby reducing losses, improving customer experience, and allowing efficient new integrations.

Question:

Is this talk a repeat or has it changed?

Answer:

This talk builds on my previous talks at QCon London and QCon.ai earlier this year. The main focus will be on production inference, rather than on covering the complete end-to-end ML development pipeline. Model inception, training, and testing will be mentioned only briefly, while most of the time will go to a deeper level of detail on the large-scale production stack.

Previous Talks on this Topic:

QCon London 2018: Real-Time Data Analysis and ML for Fraud Prevention

QCon.ai: Data Pipelines for Real-Time Fraud Prevention at Scale

Question:

What is the level of experience someone attending this talk should have?

Answer:

The talk is geared towards existing as well as aspiring practitioners of ML with a general understanding of the Machine Learning landscape: not necessarily data scientists, but rather architects and engineers focused on delivering ML inference compute as a production capability at scale.

Speaker: Mikhail Kourjanski

Lead Data Architect, Risk and Compliance Management Platform @PayPal

Mikhail Kourjanski is the Lead Data Architect at PayPal, responsible for the data architecture of the PayPal real-time decisioning platform, which handles billions of events per day and maintains dozens of petabytes of data. For the fraud prevention function alone, this platform saves more than $500M in annual profits.

Mikhail has over 20 years of work experience, including high-tech software engineering, academic research, and consulting for the Financial Services industry. Mikhail’s architecture work includes a number of innovative developments such as high-performance distributed processing over eventually consistent data, a multi-layer security model for data-in-transit middleware, and service domain models for banking and Fintech clients. Mikhail has delivered multiple engagements for Top-10 banks in the roles of trusted advisor up to CIO level, lead architect, and IT delivery executive. Prior to the consulting period of his career, Mikhail was a successful entrepreneur running his own company, winning and delivering R&D projects for US Government agencies. Mikhail earned his Ph.D. degree in applied mathematics from the Moscow State (Lomonosov) University, Russia, followed by a post-doctoral research position at UC Berkeley. Mikhail’s academic research focused on large-scale distributed systems and real-time simulations for the Transportation industry and Smart Cars technologies.

Find Mikhail Kourjanski at

Speaker page

Similar Talks

Programming for Hostile Environments

SVP, Engineering @packethost

Nathan Goulding

Platforms at Twilio: Unlocking Developer Effectiveness

Senior Director Platform Engineering @twilio

Justin Kitagawa

Help! I Accidentally Distributed My System!

Software Engineer & Engineering Manager @Honeycombio

Emily Nakashima

Help! I Accidentally Distributed My System!

Developer Programs Engineer @Google

Rachel Myers

Heretical Resilience: To Repair is Human

Staff Infrastructure Engineer @travisci

Ryn Daniels

Effective Java, Third Edition - Keepin' it Effective

Author of Effective Java, Lead Designer of the Java Collections API & Carnegie Mellon Professor

Joshua Bloch

AutoCAD & WebAssembly: Moving a 30 Year Code Base to the Web

Software Architect @autodesk

Kevin Cheung

Software Is Eating the World, ML Is Going to Eat Software

Language Designer Working on Tooling @Facebook, worked on TypeScript, F#, & Swift

Joe Pamer

Smart Speakers: Designing for the Human

UX Lead @Google

Charles Berg

Tracks

Microservices: Patterns & Practices

Evolving, observing, persisting, and building modern microservices

Developer Experience: Level up Your Engineering Effectiveness

Improving the end-to-end developer experience - design, dev, test, deploy, operate/understand. Tools, techniques, and trends.

Modern Java Reloaded

Modern, modular, fast, and effective Java. Pushing the boundaries of JDK 9 and beyond.

Modern User Interfaces: Screens and Beyond

Zero UI, voice, mobile: Interfaces pushing the boundary of what we consider to be the interface

Practical Machine Learning

Applied machine learning lessons for SWEs, including tech around TensorFlow, TPUs, Keras, Caffe, & more

Ethics in Computing

Inclusive technology, ethics, and politics of technology. Considering bias. Societal relationship with tech. Also the privacy problems we have today (e.g., GDPR, right to be forgotten)

Architectures You've Always Wondered About

Next-gen architectures from the most admired companies in software, such as Netflix, Google, Facebook, Twitter, Goldman Sachs

Modern CS in the Real World

Thoughts pushing software forward, including consensus, CRDTs, formal methods, & probabilistic programming

Container and Orchestration Platforms in Action

Runtime containers, libraries, and services that power microservices

Finding the Serverless Sweetspot

Stories about the pains and gains from migrating to Serverless.

Chaos, Complexity, and Resilience

Lessons building resilient systems and the war stories that drove their adoption

Real World Security

Practical lessons building, maintaining, and deploying secure systems

Blockchain Enabled

Exploring smart contracts, oracles, sidechains, and what can/cannot be done with blockchain today.

21st Century Languages

Lessons learned from languages like Rust, Go, Swift, Kotlin, and more.

Empowered Teams

Safely running inclusive teams that are autonomous and self-correcting

Schedule

Track: Practical Machine Learning

Location: Empire Complex, 7th fl.

Duration: 11:50am - 12:40pm

Day of week: Wednesday

Level: Advanced

Persona: Architect, Data Scientist, Developer
