Go at Scale: Engineering Patterns from Uber, Stripe, and High-Growth Teams
Learn how large engineering teams organize Go codebases for scale. Discover Uber's monorepo strategy, dependency management, testing patterns, and workflows that handle thousands of developers.
The Problem That Only Big Teams Face
Imagine you are a junior developer at a startup. You have 10 developers. Every morning, someone pushes code. Everyone pulls it. Things work or they break immediately. Fix or revert. Simple.
Now imagine you are at Uber. You have 5,000 developers. Ten platforms. Thousands of services. Millions of lines of code. Code changes happen thousands of times per day across the codebase. Most changes succeed. Some break systems serving hundreds of millions of users.
The question shifts from "does the code work?" to "how do we prevent disasters when thousands of people touch the same codebase daily?"
Uber engineers solved this problem not with better tools, but with better structure. They built systems around organization, ownership, and verification rather than complexity.
This guide reveals those patterns. Not theory. Actual strategies used by companies building at massive scale.
Part 1: The Monorepo - Uber's Foundation
Uber does not manage hundreds of separate Git repositories. Uber uses a monorepo: a single Git repository containing code for virtually every service, library, and tool.
This sounds chaotic. It is actually the opposite.
Why a Monorepo Makes Sense at Scale
Problem at small scale: Each service in its own repo works fine. But when you have 500 services that depend on each other, managing versions becomes impossible. Every change to a shared library forces coordinated updates across dozens of repositories. This is dependency hell.
Solution: One repository. The entire codebase is visible. A change to a shared library is immediately visible to all code that uses it. Dependencies are up to date automatically. No version mismatches, because there is no versioning: everything builds at the latest commit.
uber-monorepo/
├── services/
│   ├── auth-service/
│   ├── payment-service/
│   ├── ride-matching/
│   ├── user-platform/
│   └── driver-platform/
├── libraries/
│   ├── common/
│   │   ├── errors/
│   │   ├── logging/
│   │   └── metrics/
│   └── storage/
│       ├── postgres/
│       ├── redis/
│       └── cassandra/
├── tools/
│   ├── deploy/
│   ├── monitoring/
│   └── testing/
└── vendor/
    └── dependencies/
Advantages at scale:
- Atomic commits: You commit a library change and its consumers' updates at the same time. No partial states.
- Easier refactoring: Rename a function in a library and the compiler tells you every place that breaks. You fix them all in one commit.
- Dependency transparency: See exactly what depends on what. No hidden transitive dependencies.
- Shared standards: One lint configuration, one test standard, one deployment process; everyone follows it.
Challenges:
- Repository size: Uber's monorepo is hundreds of gigabytes. Cloning it takes time. Diffing is slow.
- Tool overhead: Stock Git struggles with a repository this large, so Uber layers custom tooling on top of it.
- Access control complexity: How do you give Person A access to Service B but not Service C when they live in the same repo?
Most companies adopt the monorepo strategy as they grow beyond 100 engineers and 50 services. Before that, separate repos are simpler.
Managing Monorepo Scale
For large monorepos, Uber uses tools like:
Bazel (build system)
- Replaces go build
- Understands the entire dependency graph
- Caches build artifacts across the organization
- Enables building only changed code and its dependents
Phabricator/Arcanist (code review)
- Understands which files changed
- Routes reviews to responsible teams
- Blocks merge if tests fail
- Enforces ownership rules
Git with filtering
# Only clone the code you need
git clone --filter=blob:none <monorepo-url> monorepo
cd monorepo
git sparse-checkout set services/my-service/
This reduces clone size from 100GB to what you actually use.
Part 2: Ownership and Responsibility - The OWNERS File
In a monorepo with thousands of developers, you cannot have everyone changing everything. You need clear ownership.
Uber uses OWNERS files throughout the codebase:
# services/auth-service/OWNERS
auth-team@uber.com
@mchen
@priya.patel
# services/auth-service/internal/jwt/OWNERS
@mchen (primary)
@priya.patel (backup)
How it works:
- Developer submits a code review
- System checks which OWNERS files apply
- Those people are automatically added as reviewers
- Code cannot merge without their approval
Benefits:
- Clarity: Everyone knows who owns what
- Accountability: A clear owner for each piece
- Knowledge: You know who to ask questions about a service
- Quality gates: Experienced people must approve changes
The rule: Every directory has an OWNERS file. No exceptions. No "anyone can change this."
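The reviewer-routing step amounts to walking from each changed file up to the repo root and collecting every OWNERS file along the way. A minimal sketch of that lookup; the directory names and reviewer handles below are illustrative, not Uber's actual tooling:

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// ownersFor walks from a changed file up toward the repo root,
// collecting reviewers from every OWNERS file that applies.
func ownersFor(changed string, owners map[string][]string) []string {
	var reviewers []string
	dir := path.Dir(changed)
	for {
		if rs, ok := owners[dir]; ok {
			reviewers = append(reviewers, rs...)
		}
		if dir == "." || !strings.Contains(dir, "/") {
			break // reached the top of the tree
		}
		dir = path.Dir(dir)
	}
	return reviewers
}

func main() {
	// Hypothetical OWNERS contents, keyed by directory.
	owners := map[string][]string{
		"services/auth-service":              {"auth-team@uber.com"},
		"services/auth-service/internal/jwt": {"@mchen", "@priya.patel"},
	}
	fmt.Println(ownersFor("services/auth-service/internal/jwt/token.go", owners))
}
```

The nearest OWNERS file comes first, so a review system can treat deeper entries as primary reviewers and shallower ones as fallbacks.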
Part 3: Testing at Scale - The Test Pyramid
Uber runs millions of tests daily. Not all at once. But the testing strategy creates confidence:
The Three-Level Pyramid
              /\
             /  \
            / E2E \         End-to-end tests across services
           /______\         (slow, realistic, few tests)
          /        \
         /Integration\      Tests involving real databases, caches
        /____________\      (medium speed, medium count)
       /              \
      /   Unit Tests   \    Tests for individual functions
     /__________________\   (fast, many tests, instant feedback)
Level 1: Unit Tests (Vast Majority)
Fast. Isolated. No database. No network. Mock everything.
// userservice/user_test.go
package userservice

import (
	"testing"

	"github.com/stretchr/testify/assert"
)

func TestCreateUserValidation(t *testing.T) {
	user, err := CreateUser("", "invalid") // empty name, malformed email
	assert.Error(t, err)
	assert.Nil(t, user)
}
Developers run unit tests locally before pushing. Takes seconds. Catches 80% of bugs immediately.
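"Mock everything" in practice means the service depends on interfaces, so the test can swap in an in-memory fake. A self-contained sketch; UserStore, Service, and the validation rule are illustrative names, not code from the document:

```go
package main

import (
	"errors"
	"fmt"
)

// UserStore is the dependency the service needs. Production code
// passes a database-backed implementation; unit tests pass a fake.
type UserStore interface {
	Save(email string) error
}

type Service struct{ store UserStore }

// Register validates input, then delegates persistence to the store.
func (s *Service) Register(email string) error {
	if email == "" {
		return errors.New("email required")
	}
	return s.store.Save(email)
}

// fakeStore records saves in memory: no database, no network.
type fakeStore struct{ saved []string }

func (f *fakeStore) Save(email string) error {
	f.saved = append(f.saved, email)
	return nil
}

func main() {
	fake := &fakeStore{}
	svc := &Service{store: fake}
	fmt.Println(svc.Register(""))              // validation error, store untouched
	fmt.Println(svc.Register("a@example.com")) // nil, recorded in the fake
	fmt.Println(fake.saved)
}
```

Because the fake lives entirely in memory, thousands of tests like this run in seconds.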
Level 2: Integration Tests
Real database (test database). Real services talking to each other. Slower. Fewer of them.
// auth_test/integration_test.go
package auth_test

import (
	"testing"

	"github.com/stretchr/testify/assert"
)

// setupTestDB and userID are helpers defined elsewhere in this test package.
func TestAuthServiceWithRealDatabase(t *testing.T) {
	db := setupTestDB()
	defer db.Close()

	svc := auth.NewService(db)
	token, err := svc.GenerateToken(userID)
	assert.NoError(t, err)

	valid := svc.ValidateToken(token)
	assert.True(t, valid)
}
These run in CI after unit tests pass. Takes minutes. Catches integration issues.
Level 3: E2E Tests
Real services deployed in test environment. Real HTTP calls. Extremely slow and expensive.
// Minimal E2E tests
// Only test critical user journeys
// - User signup → verification → login
// - Rider books ride → sees driver → gets rating
Only critical paths. Maybe 50 E2E tests for entire platform. Run on each release, not on every commit.
Test Execution Strategy
Commit pushed
    ↓
Run unit tests (30 sec)
    ↓
    ├─ PASS → Run integration tests in parallel (5 min)
    └─ FAIL → Notify developer, stop
    ↓
    ├─ PASS → Run E2E tests on staging (30 min)
    └─ FAIL → Notify developer, stop
    ↓
    ├─ PASS → Merge to main, trigger deployment
    └─ FAIL → Block merge, investigate
Key principle: Fast feedback for common cases, thorough verification before production.
Part 4: Code Organization - The Service Structure
At Uber, each service follows a consistent structure. This uniformity allows any developer to navigate any service.
services/
  payment/
    cmd/
      payment-server/
        main.go              // Service entry point
    internal/
      service/
        payment_service.go   // Core business logic
        refund_service.go
      repository/
        payment_repo.go      // Database layer
      handler/
        payment_http.go      // HTTP handlers
    api/
      v1/
        payment.proto        // API definition (gRPC or OpenAPI)
    config/
      config.go              // Configuration loading
    tests/
      integration/
        payment_test.go      // Integration tests
    Makefile                 // Local development
    go.mod
    go.sum
Why this structure:
- cmd/: Entry points. One service = one binary = one main.go
- internal/: Private code. The compiler enforces it; no one can import another service's internal/
- api/: API contracts. Humans and machines read this
- config/: Centralized configuration
- tests/: Test structure mirrors the production structure
The rule: Every developer can find code in any service using the same mental model. No surprises.
Part 5: Dependency Management at Scale
In a monorepo, dependencies are both a strength and a hazard.
The strength: All code is visible. If Service A depends on Library B, you see that relationship clearly.
The hazard: Circular dependencies. Tight coupling. Complex dependency graphs.
Dependency Rules
Uber enforces strict rules:
1. Acyclic Dependency Graph
Services can depend on libraries and lower-level services. But never circular.
User Service
    ↓
Auth Library
    ↓
Common Library
    ↓
(depends on nothing)

Payment Service
    ↓
User Service          ← allowed (depends on a lower layer)
    ↓
Common Library

Not allowed:
User Service → Payment Service → User Service   (cycle)
These are caught by the build system. They prevent building.
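The build-system check boils down to cycle detection on the import graph. A minimal depth-first-search sketch; the graph contents are hypothetical:

```go
package main

import "fmt"

// hasCycle reports whether the dependency graph contains a cycle,
// using three-color depth-first search.
func hasCycle(graph map[string][]string) bool {
	const (
		white = 0 // unvisited
		gray  = 1 // on the current DFS path
		black = 2 // fully explored
	)
	color := map[string]int{}
	var visit func(n string) bool
	visit = func(n string) bool {
		color[n] = gray
		for _, dep := range graph[n] {
			if color[dep] == gray {
				return true // back edge to the current path: cycle
			}
			if color[dep] == white && visit(dep) {
				return true
			}
		}
		color[n] = black
		return false
	}
	for n := range graph {
		if color[n] == white && visit(n) {
			return true
		}
	}
	return false
}

func main() {
	layered := map[string][]string{
		"user-service":    {"auth-library"},
		"auth-library":    {"common-library"},
		"payment-service": {"user-service", "common-library"},
	}
	cyclic := map[string][]string{
		"user-service":    {"payment-service"},
		"payment-service": {"user-service"},
	}
	fmt.Println(hasCycle(layered), hasCycle(cyclic)) // false true
}
```

Running a check like this in CI, against the graph the build system already knows, is enough to keep the dependency graph acyclic without Bazel.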
2. Explicit Dependencies
Every import must be in go.mod. No implicit dependencies. The build system verifies it.
import (
"uber/services/user" // OK - explicit in go.mod
"uber/libraries/common/errors" // OK - explicit in go.mod
"some-external/package" // Must be in go.mod
)
3. Dependency Versioning Strategy
In the monorepo, everything uses HEAD (the latest commit). But external dependencies are pinned.
// go.mod: external dependencies are pinned
require (
    github.com/some-library v1.2.3 // Pinned
    github.com/another-lib v2.0.0 // Pinned
)

// In Go source files, internal imports always build against HEAD:
import "uber/libraries/common" // Auto-latest
This prevents external library breakage while ensuring internal consistency.
Part 6: Deployment and Rollback
When thousands of developers are pushing code, deployment becomes safety-critical.
Canary Deployments
Instead of deploying to all servers at once, Uber deploys to a small percentage first:
Version 2.0.0
    ↓
Deploy to 1% of servers (canary)
    ↓
Monitor for 5 minutes
    ├─ Error rate normal? → Continue
    ├─ Error rate spiked? → Automatic rollback
    └─ Manual inspection needed? → Pause for review
    ↓
Deploy to 25% (still safe to roll back)
    ↓
Monitor for 10 minutes
    ├─ All good? → Continue
    └─ Issues? → Rollback
    ↓
Deploy to 100% (full rollout)
Key metric: Error rate. If error rate increases more than X% between versions, automatic rollback triggers.
Implementation:
// In the deployment system (sketch)
if currentErrorRate > baselineErrorRate*1.2 { // a 20% increase in errors
    rollback()
    notifyOnCall()
    createIncident()
}
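The staged flow above can be driven by a small loop over rollout stages, checking health after each monitoring window. A sketch under stated assumptions: Stage, the healthy callback, and the percentages are illustrative, not Uber's deploy system:

```go
package main

import "fmt"

// Stage pairs a rollout percentage with how long to monitor
// before advancing to the next stage.
type Stage struct {
	Percent     int
	MonitorMins int
}

// rollout walks the staged percentages, consulting healthy() after
// each monitoring window; any unhealthy reading aborts the rollout
// and signals a rollback.
func rollout(stages []Stage, healthy func(percent int) bool) bool {
	for _, s := range stages {
		fmt.Printf("deploy to %d%%, monitor %d min\n", s.Percent, s.MonitorMins)
		if !healthy(s.Percent) {
			fmt.Println("error rate spiked: rolling back")
			return false
		}
	}
	return true
}

func main() {
	stages := []Stage{{1, 5}, {25, 10}, {100, 0}}
	ok := rollout(stages, func(percent int) bool {
		return true // in reality: compare canary error rate to baseline
	})
	fmt.Println("completed:", ok)
}
```

The healthy callback is where the error-rate comparison from the snippet above plugs in, which keeps the rollout driver independent of how health is measured.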
Deployment Windows
Uber does not deploy during peak hours. Deployment happens during low-traffic windows when issues are easiest to spot and rollback is fastest.
Part 7: Code Review Culture
Code review is where quality is enforced. At Uber scale, code review must be:
- Fast: Average review time under 2 hours
- Thorough: All tests pass before review
- Clear: Comments explain the why, not the what
- Non-blocking: Reviewers suggest, not dictate
Code Review Checklist
Reviewers check:
- Tests cover happy path and error cases
- No new complexity without justification
- Dependencies are explicit
- Performance implications considered
- Error handling is correct
- Logging is sufficient for debugging
- Security implications reviewed (if relevant)
- Documentation updated
The culture: Disagreements are about ideas, not people. A senior developer deferring to a junior developer's better solution is celebrated, not seen as weakness.
Handling Large Changes
For large refactorings or architectural changes:
- RFC (Request for Comments): A written proposal describing the problem, solution, and trade-offs
- Discussion: Async comments and suggestions
- Decision: The tech lead makes the final call
- Implementation: Small incremental commits, each reviewed
- Rollout: Canary deployment with monitoring
This prevents "surprises" where a massive change breaks assumptions people didn't know existed.
Part 8: Workflow for New Developers
A new engineer at Uber follows this onboarding:
Week 1: Local Setup
# Clone monorepo (sparse checkout, minimal)
git clone --filter=blob:none --sparse <monorepo>
# Check out just their team's code
git sparse-checkout set services/my-team/
# Build and test
bazel build //services/my-team/...
bazel test //services/my-team/...
Week 2: First PR
- Small bug fix or doc improvement
- Assigned to experienced reviewer
- Learn code review culture
Week 3-4: Small Feature
- Feature in their service only
- Understanding: how testing works, how to review others' code
Month 2-3: Cross-Service Feature
- Feature spanning multiple services
- Understanding: dependencies, coordination, larger systems
This gradual escalation prevents "new developer breaks prod" while building knowledge.
Part 9: Lessons for Your Team
You don't need Uber's scale to benefit from these patterns.
Applicable immediately:
- OWNERS files: Even for 10 developers, know who owns what
- Test pyramid: Unit tests fast, integration tests thorough
- Consistent structure: New service = same directory layout
- Explicit dependencies: go.mod is always complete and accurate
- Code review standards: Checklist, async, fast
- Dependency acyclicity: Check this as you grow
Adopt as you scale:
- Monorepo: When you have 50+ services
- Build system: Custom tooling for caching and parallelization
- Ownership automation: OWNERS files + CI enforcement
- Canary deployments: Error rate monitoring, automatic rollback
- Staged rollouts: 1% → 25% → 100%
Part 10: The Uncomfortable Truth
Uber's infrastructure is not better because Uber has more engineers. Uber has better infrastructure because of decisions made when they were smaller, decisions that scaled.
Most companies approach scale reactively. They have 100 developers working on 50 services, each in separate repos, no clear ownership, inconsistent code structure. Then they hire 500 more developers and wonder why everything breaks.
Uber built patterns that scaled from 10 developers to 5,000. That is the insight. Not the tools. Not the budget. The intentional architecture.
And here is the thing: You can adopt these patterns today.
You don't need Bazel to enforce acyclic dependencies. You need an OWNERS file and someone checking PRs. You don't need Uber's deployment system to do canary releases. You need to deploy to 10% first and monitor. You don't need a monorepo to have consistent code structure. You need a template that every service follows.
The limiting factor is not tools. It is discipline.
The best large engineering teams are not distinguished by their tools or their budget. They are distinguished by their discipline in maintaining clarity when systems grow complex. That discipline, applied early, scales infinitely.