Rearchitecting Uber's Payment System
How Uber rearchitected its payment system for auditability and IPO readiness
Note: This article has several weaknesses. It skips key details. It doesn’t flow quite as well as I would want it to. I spent an inordinate amount of time editing it but gave up. I have decided to publish it and I hope it helps people out.
The magical Uber experience
The Uber experience is now taken for granted, but when it started, it was close to magical. You could book a ride without haggling over the price, knowing exactly what you would pay. The app showed you how far away your ride was. You no longer needed to wait on the street for a cab or call a taxi service. Your driver would pick you up exactly where you were, without needing detailed directions. After your ride, there was no fumbling with change; you simply got out and left, automatically charged the correct amount. You then received a receipt in your email. Behind this magical experience was a payments system that had to scale rapidly while keeping things simple.
The original payments system and how it helped Uber to grow
When I joined the Uber money team in early 2016, the original payments infrastructure was simple and effective. It was split between two teams, each with one job:
Charging riders after each trip
Paying drivers on a weekly schedule at 4 AM.
Other teams used this information to send receipts to riders, generate tax reports for drivers, and produce payment breakdowns for drivers.
The decoupled approach offered advantages during Uber's rapid expansion. The system's flexibility allowed for easy international onboarding of new payment methods and empowered city-level General Managers to manage driver payments through spreadsheets and SQL scripts until automation became feasible.
It’s hard to overstate how adept and motivated Uber’s Ops teams were. They would work day and night to make new cities operational. It seems like a distant memory, but originally all Uber offered was Uber Black. Lyft was the “original” ridesharing service at scale. Lyft had launched ridesharing in 2012, but Uber’s expansion was swift. Within a couple of years, Uber had not only caught up but leapfrogged Lyft to the point where Uber became a verb.
Why build a new system?
As Uber approached its IPO, it needed a robust, auditable, and immutable financial system. The existing system had mutability baked in: records could be changed without leaving a trace. It was also hard at times to trace payments back to the business events that caused them. This made it difficult to ensure financial integrity and traceability, both key requirements for a company preparing to go public. To solve these issues, we started designing Gulfstream, guided by a few core principles.
Around late 2016, Daniel Issen was hired by Uber specifically to develop Gulfstream. The whole money team met in an all-hands room where Daniel outlined the vision. At the end he asked if there were questions and, surprisingly, there were none. In hindsight, one of us should probably have asked whether we could repurpose the existing system to deliver on the requirements. New systems are shiny but migrations are an absolute pain.
From an organizational perspective, the rider and driver payment teams were merged into one team to build out the new infrastructure. To develop the initial prototype, the team worked out of a war room in San Francisco for 30 days. Engineers were also flown in from the Amsterdam office so that everyone was in the same room.
Core Design primitives
Business Events: Any business event such as a customer booking a trip, a driver asking to be paid or a customer ordering food.
Accounts: Holders of money
Orders: Used to move money between accounts
The design drew inspiration from Martin Fowler's analysis patterns and the centuries-old principles of double-entry bookkeeping. [1]
The fundamental insight was that money cannot be created or destroyed. It can only be moved between accounts. Orders are used to move money between accounts and orders are immutable.
Uber rides
Let’s say a rider pays $10 as fare. Uber would keep $2.50 as commission and pay the driver $7.50 for the trip.
User’s credit card account → -$10
Driver’s account → +$7.50
Uber’s commission account → +$2.50
Notice that no money was created or destroyed. The rider paid 10 dollars which was distributed between the driver and Uber.
Uber also used to give out incentives to drivers for having good ratings. Let’s say the incentive payment was $10. This would be modeled in the system as:
Uber’s marketing account → -$10
Driver’s account → +$10
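The two examples above can be sketched in code. This is a minimal Python sketch, not Uber's actual implementation: the `Entry` and `Order` names, the account identifiers, and the choice of integer cents are all assumptions made for illustration. It shows the one invariant the design relies on, namely that the signed amounts in an order must sum to zero.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entry:
    """One leg of an order: a signed amount applied to one account."""
    account: str
    amount_cents: int  # negative = money leaves the account, positive = money arrives


@dataclass(frozen=True)
class Order:
    """An immutable movement of money between accounts (hypothetical model)."""
    order_id: str
    entries: Tuple[Entry, ...]

    def __post_init__(self):
        # Money is only moved, never created or destroyed,
        # so the signed amounts must cancel out.
        total = sum(e.amount_cents for e in self.entries)
        if total != 0:
            raise ValueError(f"order {self.order_id} is unbalanced by {total} cents")


# The $10 ride: rider pays, driver and Uber receive.
ride = Order("ride-1", (
    Entry("rider:credit_card", -1000),
    Entry("driver", 750),
    Entry("uber:commission", 250),
))

# The $10 incentive: funded by Uber's marketing account.
incentive = Order("incentive-1", (
    Entry("uber:marketing", -1000),
    Entry("driver", 1000),
))
```

Because the dataclasses are frozen, an order cannot be edited after it is created, and an order whose entries do not balance is rejected at construction time.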
Uber Eats
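Suppose a customer places an order for a burrito worth $24. In this scenario Uber needs to charge the eater and pay both the restaurant and the driver.
User’s credit card → -$24
Uber’s commission → +$7.20
Restaurant’s account → +$16.80
Driver’s account → +$10 for delivering the burrito
Note that the $24 is fully split between the commission and the restaurant ($7.20 + $16.80), so the driver’s $10 has to be debited from another account that this simplified example leaves out, for instance a delivery fee charged to the customer or an Uber-funded account.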
What if an order needs to be changed?
The original orders are immutable, so a change is recorded as a new adjustment order rather than an edit.
For example, if a driver took a longer route than necessary, an adjustment order is created to refund the customer the excess amount.
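A minimal sketch of that idea follows. The `Ledger` class, its method names, and the $2 refund split between the driver and Uber are all made up for illustration; the article does not say which accounts funded such refunds. The point is only that the ledger is append-only and a correction is a new order that references the original.

```python
class Ledger:
    """Append-only list of orders; balances are derived by replaying them."""

    def __init__(self):
        self._orders = []

    def append(self, order_id, entries, adjusts=None):
        # entries: {account: signed amount in cents}; they must sum to zero.
        if sum(entries.values()) != 0:
            raise ValueError("unbalanced order")
        if any(o["order_id"] == order_id for o in self._orders):
            raise ValueError("order already recorded")
        self._orders.append(
            {"order_id": order_id, "entries": dict(entries), "adjusts": adjusts}
        )

    def balance(self, account):
        return sum(o["entries"].get(account, 0) for o in self._orders)

    def history(self):
        # Return copies so callers cannot rewrite recorded orders.
        return [dict(o, entries=dict(o["entries"])) for o in self._orders]


ledger = Ledger()
ledger.append("ride-1", {"rider": -1000, "driver": 750, "uber": 250})

# The route was too long: refund $2.00 with a new order that points at ride-1.
# The original order stays in the history untouched.
ledger.append(
    "ride-1-adj-1",
    {"rider": 200, "driver": -150, "uber": -50},
    adjusts="ride-1",
)
```

After both orders, the rider's net position is -$8.00, and an auditor can still see the original $10 charge and the reason it was corrected.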
The Auditor’s perspective
You could show the fundamental building blocks of the system to an auditor, watch a light bulb go off in their head about how it works, and assure them that recorded orders cannot be tampered with.
Other requirements:
Latency: Surprisingly, low latency is not as critical as you would think. The primary requirement is that once the ride is done, the rider should get an invoice/receipt within about 30 seconds, because that is roughly how long it takes the rider to take their phone out of their pocket and check their email for it. [2]
Exactly-once charging: Charge the rider exactly once. Once is ideal, but if something goes wrong, charging zero times is preferable to charging twice.
Technical Implementation
Uber's engineers built one of the company's first Java-based services, establishing foundational infrastructure using Kafka for pub/sub messaging.
Two microservices were spun up.
Order Creator → The order creator would take the business event and generate an order.
Order Processor → The order processor would subscribe to the orders generated by the order creator, charge the customer and move the money between the accounts. The order processor also creates orders to record the fact that the customer has been charged and the driver has been paid.
Because all the orders were published using Kafka, other services at Uber could subscribe to the orders, send receipts, compute taxes and build dashboards/analytics.
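The flow described above can be sketched with a toy in-memory publisher standing in for Kafka. Everything here is an assumption for illustration: the `InMemoryTopics` class, the topic names, the 25% commission, and the receipt format are hypothetical, and the real services were written in Java against a real broker.

```python
from collections import defaultdict


class InMemoryTopics:
    """Toy stand-in for a pub/sub system such as Kafka (no real broker involved)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self._subscribers[topic]:
            handler(message)


bus = InMemoryTopics()
receipts = []


def order_creator(event):
    # Turn a business event into an order; the 25% commission is illustrative.
    fare = event["fare_cents"]
    commission = fare // 4
    order = {
        "order_id": "order-" + event["event_id"],
        "entries": {"rider": -fare, "driver": fare - commission, "uber": commission},
    }
    bus.publish("orders.created", order)


def order_processor(order):
    # A real processor would call a payment provider here; this sketch only
    # marks the order as charged and republishes it for downstream consumers.
    bus.publish("orders.processed", dict(order, status="charged"))


def receipt_service(order):
    # One of many possible downstream subscribers (taxes, analytics, ...).
    amount = -order["entries"]["rider"] / 100
    receipts.append(f"Receipt for {order['order_id']}: ${amount:.2f}")


bus.subscribe("business.events", order_creator)
bus.subscribe("orders.created", order_processor)
bus.subscribe("orders.processed", receipt_service)

bus.publish("business.events", {"event_id": "trip-42", "fare_cents": 1000})
```

The receipt service never talks to the order processor directly. It only subscribes to a topic, which is what lets other teams add consumers without touching the payment services.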
Charging the customer exactly once
One of the critical requirements was that every rider was charged, and every driver was paid, exactly once. If that guarantee ever failed, charging zero times was better than charging twice: nobody wants to be overcharged. Idempotency was enforced at several levels.
Database-Level Idempotency: Each order had a unique ID. Once a customer was charged for a business event, the order was written to a database, so a later attempt to process the same order could be recognized as a duplicate.
Transaction-Level Idempotency: The Order Processing Service placed a hold on an account before actually charging the customer. If a concurrent process tried to charge the customer, only one of the processes would successfully acquire a hold.
Provider-Level Idempotency: Payment providers like Stripe and Braintree can detect and block duplicate payment requests. Ensuring that retries used the same transaction reference extended idempotency across the system.
Use the same PSP for retries: For a given order, if the payment service provider (PSP) returns a network error or fails to give a clear signal that the payment failed, retry with the same PSP, because it’s possible that the PSP charged the customer but wasn’t able to tell Uber about it.
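The four layers above can be combined in one small sketch. This is an illustration under stated assumptions, not Uber's code and not any provider's real API: `FakeProvider`, `Charger`, the in-memory sets standing in for a database, and the idea of using the order ID as the idempotency key are all hypothetical.

```python
import threading


class FakeProvider:
    """Stand-in for a payment provider that deduplicates on an idempotency key."""

    def __init__(self, name):
        self.name = name
        self.charges = {}  # idempotency key -> amount actually charged
        self.drop_next_response = False

    def charge(self, idempotency_key, amount_cents):
        # A repeated key keeps the earlier charge instead of charging again.
        self.charges.setdefault(idempotency_key, amount_cents)
        if self.drop_next_response:
            self.drop_next_response = False
            raise TimeoutError("charged, but the response was lost")
        return "ok"


class Charger:
    def __init__(self, providers):
        self._providers = providers
        self._lock = threading.Lock()
        self._completed = set()  # database-level: order IDs already charged
        self._holds = set()      # transaction-level: accounts being charged now
        self._pinned = {}        # order ID -> provider chosen on first attempt

    def charge(self, order_id, account, amount_cents, preferred):
        with self._lock:
            if order_id in self._completed:
                return "already charged"
            if account in self._holds:
                return "hold not acquired"
            self._holds.add(account)
            # Retries stick to the provider used on the first attempt.
            name = self._pinned.setdefault(order_id, preferred)
        try:
            # Provider-level: the order ID doubles as the idempotency key.
            self._providers[name].charge(order_id, amount_cents)
            with self._lock:
                self._completed.add(order_id)
            return "charged via " + name
        finally:
            with self._lock:
                self._holds.discard(account)


psp_a, psp_b = FakeProvider("psp_a"), FakeProvider("psp_b")
charger = Charger({"psp_a": psp_a, "psp_b": psp_b})

psp_a.drop_next_response = True
try:
    charger.charge("order-1", "rider-7", 1000, preferred="psp_a")
except TimeoutError:
    pass  # outcome unknown: the rider may or may not have been charged

# The retry asks for psp_b, but the order is pinned to psp_a, which
# recognizes the key and does not charge a second time.
result = charger.charge("order-1", "rider-7", 1000, preferred="psp_b")
```

In this run the rider ends up with a single $10 charge at `psp_a` and nothing at `psp_b`, even though the first attempt failed from the caller's point of view.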
The Migration Process: Rolling Out Rider Payments
Rolling out any new system is the trickiest part of the whole process. The migration can’t break anything for customers.
Rider payments were migrated first. Each rider payment was independent, while driver payments required aggregating the amounts payable across all the trips in a week.
For rider payments, the following steps were taken:
Extensive Testing: Each service was unit tested. The system as a whole had integration tests.
Validation: At some point the rubber has to hit the road. A payments system has to charge the customer. It was time to test the system on actual business events. Engineers "dogfooded" the system by taking short trips/ordering food and verifying invoices and credit card charges.
Gradual migration: Customers were gradually migrated over to the new system. Slowly at first and then suddenly. At a certain point you are better off pulling the trigger and just migrating to the new system (even if you discover faults) because keeping two parallel systems running is hard.
Writebacks to the old rider payment system were done because there were many systems that depended on the old rider payments data.
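The article does not say how riders were assigned to the new system during the gradual migration step above. One common way to ramp traffic, shown here purely as an assumption, is to hash a stable rider ID into a bucket from 0 to 99 and compare it with a rollout percentage. The function names and the rider ID format are hypothetical.

```python
import hashlib


def bucket(rider_id: str) -> int:
    """Map a rider ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(rider_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def use_new_system(rider_id: str, rollout_percent: int) -> bool:
    # A rider whose bucket is below the threshold is routed to the new system.
    # Raising the threshold only adds riders; nobody flips back to the old one.
    return bucket(rider_id) < rollout_percent


riders = [f"rider-{i}" for i in range(1000)]
at_5 = {r for r in riders if use_new_system(r, 5)}
at_50 = {r for r in riders if use_new_system(r, 50)}
```

Because the bucket depends only on the rider ID, every rider in `at_5` is also in `at_50`, so ramping from 5% to 50% never moves a rider back and forth between the two systems.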
A Last-Minute Change That Simplified Everything
Initially, the "Order Creator" service used ListenableFutures for handling asynchronous operations, while "Order Processor" relied on Akka, a Scala-based library for building concurrent systems. However, we found debugging Akka issues in Java to be nearly impossible. Just a couple of weeks before Gulfstream went live and started processing orders, I decided to switch the order processor from Akka [3] to ListenableFutures. Making the change before going live was far easier than it would have been once Gulfstream was handling customer payments. This last-minute change also unified the technology stack across both services and made the system easier to maintain.
Ultimately, building Gulfstream was a crash course in balancing speed, scale, and reliability. It taught me that while new systems can bring shiny new features, successful migrations require grit, relentless simplification, and a commitment to getting the fundamentals right.
Sources for this article:
Apart from my personal experience of building the system, I have borrowed liberally from the following sources:
Evolution of Payments at Uber by Nimish Sheth & Steven Karis
To the Nines: Building Uber’s Payments Processing System by Paul Sorenson
Reliable Processing in a Streaming Payment System by Uber engineers Emilee Urbanek and Manas Kelshikar
https://underhood.blog/uber-payments-platform also has a fairly comprehensive overview of the whole system.
[1] From a Hacker News comment:
The historical significance of Double-Entry bookkeeping can not be understated. Prior to Leonardo "Fibonacci" Bonacci bringing it to Pisa from the Arabs in Algiers, North Africa, there was no concept of "budget" for most merchants, each transaction was a separate affair, and consequently, finance could not scale. There were several attempts at international lending, and it bankrupted the Bardi and Paruzzi companies who lent lots of money to King Edward III, who ended up defaulting on it. The Medici family succeeded where others failed, providing a level of service in banking previously unparalleled, thanks to being able to manage complex cash flows through Double-Entry bookkeeping. (Fibonacci also introduced the Hindu-Arabic numeral system at a time where most of Europe was using Roman Numerals for arithmetic). It made the Republican city states of northern Italy on par with the political power of noble families, and as other countries caught up with Florence, one could, for the first time, seek an education outside of the Church and in their own country.
[2] For the sake of clarity, I have omitted one place where low latency is required, which is placing an authorization hold on the customer’s account. Before the ride is booked, Uber places an authorization hold on the customer’s credit card/payment method to be sure that the customer can pay. This authorization hold is latency sensitive as it directly affects the time taken to book a trip/order food.
[3] My personal recommendation would be to use ListenableFutures when using Java. Use Akka only if you are using Scala. In Java it’s not worth it, because Akka and actors are hard to debug and hard to trace with built-in Java tooling.