Presentation: ML Data Pipelines for Real-Time Fraud Prevention @PayPal
What You’ll Learn
- Understand how real-time inference is supported by near-real-time data streaming and offline analytical computation of features
- Learn how trained models are packaged, configured, and served in a large-scale production environment
- Hear how the data tier is organized and how data is managed at PayPal
Abstract
PayPal processes more than a billion dollars of payment volume daily ($451bn in FY2017). Complex decisions are made for each transaction or user action to manage risk and compliance while also ensuring a good user experience. PayPal users can make payments immediately in 200 regions with the assurance that their transactions are secure. How does PayPal achieve this goal in today's complex environment, filled with "high-level" fraudsters as well as constantly increasing customer demand? While many industry solutions rely on fast analytics performed in near-real time over streaming data, our business requirements demand real-time, millisecond-range responses. This talk will focus on the architectural approach behind our internally built real-time service platform, which leverages Machine Learning models and delivers unparalleled performance and quality of decisions. This platform strikes a fine balance between Big Data and sustainable support for a high volume of real-time decision requests. A well-structured design, along with a domain modeling methodology, provides high adaptability to emerging fraud patterns and behavioral variations. Deployment on a real-time, event-driven, fast-data in-memory architecture accelerates detection and decisions, thereby reducing losses, improving customer experience, and allowing efficient new integrations.
Is this talk a repeat or has it changed?
This talk builds on my previous talks at QCon London and QCon.ai earlier this year. The main focus will be on production inference rather than on the complete end-to-end ML development pipeline. Model inception, training, and testing will be mentioned only briefly, while most of the time will go to a more detailed look at the large-scale production stack.
Previous Talks on this Topic:
- QCon London 2018: Real-Time Data Analysis and ML for Fraud Prevention
- QCon.ai: Data Pipelines for Real-Time Fraud Prevention at Scale
What is the level of experience someone attending this talk should have?
The talk is geared towards existing as well as aspiring ML practitioners with a general understanding of the Machine Learning landscape: not necessarily data scientists, but rather architects and engineers focused on delivering ML inference compute as a production capability at scale.
Tracks
- Microservices: Patterns & Practices: Evolving, observing, persisting, and building modern microservices
- Developer Experience: Level up Your Engineering Effectiveness: Improving the end-to-end developer experience - design, dev, test, deploy, operate/understand. Tools, techniques, and trends.
- Modern Java Reloaded: Modern, modular, fast, and effective Java. Pushing the boundaries of JDK 9 and beyond.
- Modern User Interfaces: Screens and Beyond: Zero UI, voice, mobile: interfaces pushing the boundary of what we consider to be the interface
- Practical Machine Learning: Applied machine learning lessons for SWEs, including tech around TensorFlow, TPUs, Keras, Caffe, & more
- Ethics in Computing: Inclusive technology, ethics and politics of technology. Considering bias. Societal relationship with tech. Also the privacy problems we have today (e.g., GDPR, right to be forgotten)
- Architectures You've Always Wondered About: Next-gen architectures from the most admired companies in software, such as Netflix, Google, Facebook, Twitter, Goldman Sachs
- Modern CS in the Real World: Thoughts pushing software forward, including consensus, CRDTs, formal methods, & probabilistic programming
- Container and Orchestration Platforms in Action: Runtime containers, libraries, and services that power microservices
- Finding the Serverless Sweetspot: Stories about the pains and gains from migrating to Serverless.
- Chaos, Complexity, and Resilience: Lessons building resilient systems and the war stories that drove their adoption
- Real World Security: Practical lessons building, maintaining, and deploying secure systems
- Blockchain Enabled: Exploring smart contracts, oracles, sidechains, and what can/cannot be done with blockchain today.
- 21st Century Languages: Lessons learned from languages like Rust, Go-lang, Swift, Kotlin, and more.
- Empowered Teams: Safely running inclusive teams that are autonomous and self-correcting