As a company that builds products for the next generation of financial services around the world, we face a unique set of engineering challenges when expanding into new geographies. Chief among them is ensuring compliance with stringent regional data regulations while simultaneously minimising latency for users across the globe.
This blog post chronicles our Engineering team’s journey through the complexities of global expansion, and how, within a demanding six-month go-live timeline, we crafted an architecture that now successfully supports our business customers across regions.
The Challenge: Global Reach vs. Local Compliance
Our initial payments network, while successful in the UK, faced limitations as we expanded into the EU, US, and Australia. Local data processing and storage regulations, varying significantly across regions, presented a critical hurdle. We needed to ensure our system:
Complies with data residency requirements: Adhering to stringent data privacy regulations in every region we operate within.
Delivers an optimal user experience: Minimising latency to ensure lightning-fast transaction speeds for our global user base.
Streamlines merchant integration: Providing a single, unified integration point for our merchants, regardless of their location.
Maintains engineering agility: Meeting our six-month go-live timeline while preserving lean development practices, without burdening engineers with complex regional processing decisions.
Technical Foundation: Banked's Infrastructure
Banked's infrastructure is cloud-native, built on Google Cloud Platform (GCP). Our microservices architecture runs on Kubernetes for container orchestration and is distributed across multiple Availability Zones (AZs) in each region to ensure reliability and fault tolerance.
We use Istio as our service mesh to manage internal system communication, enhance security, reduce latency, and improve reliability.
Our system is built on stateless services that rely on Google-managed solutions: Cloud SQL and Cloud Spanner for data storage, Pub/Sub for streaming, Secret Manager for secrets management, and Cloud KMS for cryptographic operations.
Each domain and utility service exposes well-defined APIs supporting both synchronous communication (via gRPC) and asynchronous communication (via Google Pub/Sub).
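To make this dual communication model concrete, here is a minimal sketch in Go of a caller invoking a domain service synchronously over gRPC while publishing a domain event to Pub/Sub. The service address, topic name, and message shape are illustrative only, not taken from our actual system.

```go
package main

import (
	"context"
	"log"

	"cloud.google.com/go/pubsub"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	ctx := context.Background()

	// Synchronous path: dial a (hypothetical) payments service over gRPC.
	// Plaintext is used here on the assumption that mTLS is handled by the
	// Istio sidecar; a generated client stub would normally be used next, e.g.
	//   client := paymentspb.NewPaymentsClient(conn)
	//   resp, err := client.CreatePayment(ctx, &paymentspb.CreatePaymentRequest{...})
	conn, err := grpc.NewClient("payments.internal:8080",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("dial payments service: %v", err)
	}
	defer conn.Close()

	// Asynchronous path: publish a domain event to a (hypothetical) topic.
	psClient, err := pubsub.NewClient(ctx, "example-gcp-project")
	if err != nil {
		log.Fatalf("create pubsub client: %v", err)
	}
	defer psClient.Close()

	topic := psClient.Topic("payment-events")
	defer topic.Stop()
	result := topic.Publish(ctx, &pubsub.Message{
		Data:       []byte(`{"type":"payment.created","payment_id":"pay_123"}`),
		Attributes: map[string]string{"region": "eu"},
	})
	if _, err := result.Get(ctx); err != nil {
		log.Fatalf("publish event: %v", err)
	}
}
```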
The core of our system is a global payments network comprising over 50 services. These include global components that handle data standardisation, flow management, and state maintenance, as well as region-specific services that interact with local banks, data providers, and payment aggregators. We use event sourcing, recording every interaction as an event - the foundation of our payment network. This approach enables accurate state reconstruction and efficient management of payment and consent data across regions.
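As a rough illustration of how event sourcing supports state reconstruction, the sketch below folds an ordered stream of payment events into a current payment state. The event types and fields are hypothetical and much simpler than our real schema.

```go
package eventsourcing

import "time"

// PaymentEvent is a simplified, hypothetical event; the real payment network
// records many more event kinds across domains and regions.
type PaymentEvent struct {
	PaymentID  string
	Type       string // e.g. "payment.initiated", "payment.authorised", "payment.settled"
	OccurredAt time.Time
	Amount     int64 // minor units
	Currency   string
}

// PaymentState is the projection rebuilt from the event stream.
type PaymentState struct {
	PaymentID string
	Status    string
	Amount    int64
	Currency  string
	UpdatedAt time.Time
}

// Replay folds an ordered slice of events into the current payment state.
// In practice the events would be read from the regional event store.
func Replay(events []PaymentEvent) PaymentState {
	var state PaymentState
	for _, e := range events {
		state.PaymentID = e.PaymentID
		state.UpdatedAt = e.OccurredAt
		switch e.Type {
		case "payment.initiated":
			state.Status = "initiated"
			state.Amount = e.Amount
			state.Currency = e.Currency
		case "payment.authorised":
			state.Status = "authorised"
		case "payment.settled":
			state.Status = "settled"
		}
	}
	return state
}
```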
While most of our microservices are region-agnostic, some are region-specific, primarily those integrating with local aggregators and banks. Both global and regional services handle PII data.
Architectural Considerations: A Multi-faceted Decision
To address the challenges of global expansion, we evaluated three distinct architectural approaches:
1. Global Mesh
This approach involves a multi-cluster setup where global services are replicated across all clusters, while local services remain regionally deployed. An incoming request hits the closest cluster, and the internal services determine which other services they need to communicate with to comply with data residency requirements.
Pros:
Unified system: Enables seamless communication between services across regions.
Potentially lower latency: Can leverage service proximity for certain use cases.
Cons:
High complexity: Setting up and managing complex network infrastructure and multi-cluster configurations would significantly increase operational overhead. This setup would also require real-time replication of all global services' data, including rapidly changing data, which could itself add latency.
Single fault domain: Any maintenance or outages have global impact. We should aim for a greater number of smaller fault domains where possible to lower operational risk.
Developer cognitive load: Inter-cluster communication and data residency requirements place a significant burden on engineers, as each service within the system must make these decisions independently. For instance, when an EU transaction request reaches the US cluster, it first interacts with our API payments orchestrator. This orchestrator, a global service that does not store data, must then communicate with the customer's service to store and retrieve payer information before engaging the payments service. Given that payer information constitutes PII, the orchestrator is responsible for identifying and contacting the appropriate EU customer's service, rather than the local US service.
Risk of errors: The increased complexity raises concerns about potential human errors and slower development cycles.
Decision: We ruled out this option because of its high complexity and the risk it posed to our engineering agility and to meeting our initial requirements; it would simply have made our system too complex to maintain.
2. Master Cluster with Partial Local Clusters
This model centres on a primary cluster in the UK that serves the global components, supported by regional clusters running local services.
Pros:
Simplified management: Reduced complexity compared to the Global Mesh approach.
No global data synchronisation: Eliminates the need for complex data synchronisation between global services.
Lower costs: Achieving redundancy and scalability with fewer total instances of each service would keep infrastructure costs down.
Cons:
Increased latency: Higher latency for clients far from the master cluster, as every request must first hit the master cluster before routing to a local cluster.
Data locality restrictions: Local consumer data could not be stored in other domains, which would require restructuring entire domains. For example, the customer’s service would need to encapsulate local data behind a more complex setup than a single service, with similar complexity repeated across other domains.
Centralised bottleneck: The master cluster could become a single point of failure or performance bottleneck.
Decision: This option was unfeasible due to unacceptable latency, especially for Australian clients. Tests showed median latency exceeding 350ms for Australian clients to the master cluster, which would double with an additional hop to the regional AU cluster, severely impacting user experience.
3. Regional System Deployment
This model deploys independent systems in each region while maintaining a unified view for observability and reporting. Incoming traffic hits the closest region and is routed to the correct processing region by a routing component at the very top of the system: the API gateway.
Benefits of Regional System Deployment:
Optimal data residency: Aligns perfectly with data residency requirements.
Low latency: Minimises latency for clients. Most requests hit the cluster in their processing region, and those that don't are routed in one hop to the correct processing region.
Enhanced agility: Simplifies maintenance and updates within each region. The complexity is contained to one component - the API gateway.
Challenges of Regional System Deployment:
Data consistency challenges: Configuration and context data must be available in all clusters. Though this data changes slowly, it requires near real-time replication.
Increased cost: Running multiple complete clusters of the system increases infrastructure costs.
Fragmented system view: The global system comprises several independent clusters, making it more challenging to view as "one system" for observability and system configuration.
After careful consideration and rigorous testing, we chose Regional System Deployment as the most viable solution. This approach best balances data residency requirements, user experience, and engineering agility while allowing us to meet the aggressive timeline.
Execution: A Race Against Time
With a six-month deadline looming, we planned the execution. Key milestones included:
System infrastructure deployment: Deploying and configuring identical GKE clusters across regions, with network peering enabling limited cross-cluster communication for request routing. Each cluster also runs its own database instances, used only by the services deployed in that cluster.
Routing setup: Implementing robust routing logic to direct traffic to the appropriate regional cluster. The routing logic sits at the very top of the system to keep regional processing decisions out of internal system domains. The API gateway handles authentication and authorisation; during authentication, it extracts context from the request to determine the processing region. The request is then either processed directly in the current cluster (if it matches the target region) or forwarded, along with the context data, to the internal API gateway of the appropriate processing region. For requests that do not require authentication, the routing logic first checks whether the request can be served locally for optimal performance; otherwise, context is extracted from the request to determine the region. A simplified sketch of this decision appears after this list.
Configuration and context data replication: While most data in this architecture remains local and intentionally non-replicated, configuration and merchant context data must be replicated to support request routing. This slowly changing data requires near real-time replication across regions (a sketch of one possible replication pattern also follows this list).
CI/CD and service deployments: We already had well-established, robust CI/CD tooling and processes. To accommodate global deployment, we needed to introduce regional awareness to our CI/CD pipeline. This meant enhancing our ArgoCD setup (used for GitOps deployments) to synchronise all regions to the desired service versions. We also improved our pipeline to progress through environments sequentially, ensuring thorough testing in each region before advancing to higher-level environments.
Observability setup: We already had robust tooling and infrastructure in place that ensured excellent reliability, latency, and success rates. However, managing a global system required aggregated metrics views and cross-region tracing capabilities. This led us to implement a global management cluster - a GKE cluster running global tooling such as Mimir and our global Tempo instance. To meet our tight deadlines, we compromised by relying on a regional observability stack for three months and developing global observability after the first merchant go-live.
Datahub setup: Our datahub stores, models, and transforms data to serve both internal reporting and analytics-based features. It functions as a separate component from our real-time system and relies on event data produced by that system. The new setup involves deploying local datahubs for each region, plus a global datahub that stores anonymised and aggregated data.
Admin console setup: Our admin console allows internal stakeholders to add various configurations to the system through API communication with our internal services. Due to time constraints, we deployed separate admin consoles for each regional cluster - a compromise we plan to address by implementing a single, centralised admin console to manage the entire system.
End-to-End testing: We leveraged our existing comprehensive regression suite while adding cross-region testing capabilities to verify that requests are processed and data is stored in the correct region, regardless of origin. We implemented test jobs that serve both as quality gates and blackbox tests, running from various regions to validate processing in specific target regions (a simplified example of such a test also follows this list).
Transition plan: Developing a detailed plan for a seamless transition to the global system with minimal downtime. The main concern was ensuring current UK merchants experienced no disruptions during the transition. We achieved this by implementing multi-region support in the API gateway, including context extraction - but defaulting to the UK region. We then gradually activated routing rules after validating the context extraction results.
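Below is a heavily simplified Go sketch of the routing decision described in the routing setup above: the gateway derives a processing region from the request context, serves the request locally if the region matches its own, and otherwise forwards it to the internal gateway of the target region. The handler, header, and endpoint names are hypothetical, and real-world concerns such as proxying, retries, and authentication are omitted.

```go
package gateway

import (
	"fmt"
	"net/http"
)

// localRegion is the region this gateway instance runs in, e.g. "uk", "eu", "us", "au".
// All names below are hypothetical and simplified for illustration.
var localRegion = "eu"

// internalGateways maps a processing region to its internal API gateway endpoint.
var internalGateways = map[string]string{
	"uk": "https://gw.uk.internal",
	"eu": "https://gw.eu.internal",
	"us": "https://gw.us.internal",
	"au": "https://gw.au.internal",
}

// regionFromRequest derives the processing region from context established
// during authentication (for example, the merchant's configured region).
// During the transition described above, this defaulted to the UK.
func regionFromRequest(r *http.Request) string {
	if region := r.Header.Get("X-Processing-Region"); region != "" {
		return region
	}
	return "uk" // transition default while routing rules were being validated
}

func routeRequest(w http.ResponseWriter, r *http.Request) {
	region := regionFromRequest(r)
	if region == localRegion {
		processLocally(w, r) // handled by the services in this cluster
		return
	}
	forwardToRegion(w, r, internalGateways[region]) // one extra hop, then local processing
}

func processLocally(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "processed in %s\n", localRegion)
}

func forwardToRegion(w http.ResponseWriter, r *http.Request, target string) {
	// A real gateway would proxy the request (with its context data) to the
	// target region's internal API gateway, e.g. via a reverse proxy.
	fmt.Fprintf(w, "forwarded to %s\n", target)
}
```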
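As one possible shape for the configuration and context data replication mentioned above, the sketch below subscribes each region to a change feed and applies updates to the local store. The topic, subscription, and field names are hypothetical; this illustrates the pattern rather than our production mechanism.

```go
package replication

import (
	"context"
	"encoding/json"
	"log"

	"cloud.google.com/go/pubsub"
)

// MerchantContext is the slowly changing routing data each region needs,
// e.g. which region a merchant's traffic must be processed in.
// Field names are hypothetical.
type MerchantContext struct {
	MerchantID       string `json:"merchant_id"`
	ProcessingRegion string `json:"processing_region"`
}

// ReplicateMerchantContext subscribes to a global change feed and applies
// each change to the local region's store via the supplied callback.
func ReplicateMerchantContext(ctx context.Context, projectID, subscriptionID string, apply func(MerchantContext) error) error {
	client, err := pubsub.NewClient(ctx, projectID)
	if err != nil {
		return err
	}
	defer client.Close()

	sub := client.Subscription(subscriptionID)
	return sub.Receive(ctx, func(ctx context.Context, m *pubsub.Message) {
		var mc MerchantContext
		if err := json.Unmarshal(m.Data, &mc); err != nil {
			log.Printf("skip malformed context update: %v", err)
			m.Ack() // drop poison messages rather than redeliver forever
			return
		}
		if err := apply(mc); err != nil {
			m.Nack() // retry on transient local-store failures
			return
		}
		m.Ack()
	})
}
```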
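And here is a simplified example of the kind of cross-region blackbox test described above: a test run from one region exercises a merchant pinned to another region and asserts where the request was processed. The URL, header, and assertion mechanism are hypothetical.

```go
package e2e

import (
	"net/http"
	"testing"
)

// TestCrossRegionRouting is an illustrative blackbox test run from one region
// against a merchant configured for another region. The endpoint, token, and
// response header used for the assertion are hypothetical.
func TestCrossRegionRouting(t *testing.T) {
	// Request sent from the US test runner for a merchant pinned to the EU.
	req, err := http.NewRequest(http.MethodGet, "https://api.example-gateway.com/v1/payments/pay_123", nil)
	if err != nil {
		t.Fatalf("build request: %v", err)
	}
	req.Header.Set("Authorization", "Bearer <test-merchant-token>")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		t.Fatalf("call gateway: %v", err)
	}
	defer resp.Body.Close()

	// Assert the request was processed in the merchant's region, not the
	// region it entered from (here via a hypothetical response header).
	if got, want := resp.Header.Get("X-Processed-Region"), "eu"; got != want {
		t.Errorf("processed in %q, want %q", got, want)
	}
}
```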
This was a challenging but rewarding endeavour. By focusing on clear goals, maintaining a relentless focus on execution, and leveraging the strengths of our team, we successfully launched our global payments network within the six-month timeframe, setting the stage for continued growth and innovation.
Here’s what we learned in the process:
Trade-offs are inevitable: Choosing the right architecture requires careful consideration of trade-offs between performance, scalability, compliance, and development effort.
Agility is paramount: In a fast-paced environment, maintaining engineering agility is crucial for delivering projects on time and within budget. Consider also the day-to-day cognitive load on engineers carefully as it directly impacts their productivity and the system's quality.
Continuous improvement is key: The journey towards a truly global payments network is ongoing. We will continue to monitor, optimise, and evolve our system to meet the changing demands of the global market.
In part 2 of this blog, we will drill down into the global routing component, the transition plan, and execution.