THE EVOLUTION OF
CONTINUOUS DELIVERY AT SCALE
QCon SF
Nov 2014
Jason Toy
jtoy@linkedin.com
Watch the video with slide synchronization on InfoQ.com!
http://www.infoq.com/presentations/cd-linkedin
Presented at QCon San Francisco
www.qconsf.com
How did we evolve our solution so that developers could keep iterating quickly
on products as LinkedIn engineering grew from 30 to 1800 technologists?

We will be talking about that evolution today.
• How we have improved developer productivity and the release pipeline
• The pitfalls we’ve seen
• How we’ve tackled them
• What it took
• What we have learned

What have we accomplished as we scaled?
• Scaling: from 2007 to today
  • 5 services -> 550+ services
  • 30 -> 1800+ technologists
  • 13 million members -> 332 million members
• At the same time
  • Monolithic deployments to prod once every several weeks -> independent deployments when ready
  • Manual -> automated commit-to-production pipeline
  • Faster iterations on the technology stack

LinkedIn 2007
• ~30 developers, 5-10 services
• Trunk-based development
• Testing
  • Mostly manual
  • Nightly regressions: automated JUnit, manual functional
• Release (every couple of weeks)
  • Create branch and deployment ordering
  • Rehearse deployment, run tests in staging
  • Site downtime to push release (all eng + ops party)

Problems in 2007
• Testing and development
  • Trunk stability: large changes, manual/local/nightly testing
  • Codebase increasing in size
• Release
  • Infrequent and time-consuming

LinkedIn 2008-2011
• ~300 developers, ~300 services
• Branch-based development, merge for release
• Testing
  • Added automated ‘Feature Branch Readiness’
  • Before merge, prove the branch had 0 test failures / issues
• Release (every couple of weeks)
  • Exactly as before: create, rehearse, and execute a deployment ordering

Improvements in 2008-2011
• Branches supported more developers
• More automated testing

Tradeoff: Branch Hell
• Qualifying 20-40 branches
• Stabilizing the release branch was hard
• Point of friction: fragile/flaky/unmaintained tests
• Impact:
  • A frustrating process became a power struggle

Problem: Deployment Hell
• Monolithic change with 29 levels of ordering
• Must fix forward: too complex to roll back
• Manual prod deployment did not scale:
  • Dangerous, painful, and long (2 days)
• Impact:
  • Operations very expensive and distracting
  • Missing a release became expensive for developers
  • More hotfixes and alternative processes created

LinkedIn 2011: The Turning Point
• Company-wide Project Inversion
  • Build a well-defined release process
  • Move to trunk development
  • Automated deployment process
  • Build the tooling to support this!
• Enforcing good engineering practices
  • No more isolated development (no branches)
  • No backwards-incompatible changes
  • Remove deployment dependencies
  • Simplify architecture (complexity has a cascading effect)
  • Code must be able to go out at any time

LinkedIn 2011
• ~600 developers, ~250 services
• Trunk-based development
• Testing:
  • Mostly automated
  • Source code validation: post-commit test automation
  • Artifact validation: automated jobs in the test environment
• Release:
  • On your own timeline per service
  • One button to push to deploy to testing or prod

How did we make this work?
(A mixture of people, process, and tooling)

Commit Pipeline
• Pre/Post commit (PCX) machinery
  • On each commit, tests are run
  • Focused test effort: scope based on change set
  • Automated remediation: either block or rollback
  • Small team maintains machinery and stability
  • Creates new artifact upon success
• Working Copy Test
  • PCX machinery to test local changes before commit
  • Great for qualifying massive/horizontal changes

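As a rough illustration of "scope based on change set" and "either block or rollback", here is a minimal sketch. The directory-to-suite mapping, the suite names, and the rule that pre-commit failures block while post-commit failures roll back are all assumptions made for the example; the slides do not describe how PCX actually decides either.

```python
# Hypothetical mapping from a source directory to the test suites covering it.
TEST_SCOPE = {
    "profile/": ["profile-unit", "profile-integration"],
    "search/": ["search-unit"],
    "common/": ["profile-unit", "search-unit", "common-unit"],
}


def select_tests(changed_files, scope=TEST_SCOPE):
    """Return the sorted suites whose scope covers at least one changed file."""
    suites = set()
    for path in changed_files:
        for prefix, covered in scope.items():
            if path.startswith(prefix):
                suites.update(covered)
    return sorted(suites)


def remediate(stage, failed_suites):
    """Assumed policy: block before commit, roll back after commit."""
    if not failed_suites:
        return "publish-artifact"
    return "block" if stage == "pre" else "rollback"
```

With this mapping, `select_tests(["search/query.py"])` returns `["search-unit"]`, so a search-only change never pays for the profile suites.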
Shared Test Environment
• Continuously test artifacts with automated jobs
• Stability treated with the same respect as trunk
• Can test local changes against the environment

Deployment vs Release
• New distinction:
  • Deployment (new change to the site)
    • Trunk must be deployable at all times
  • Release (new feature for customers)
    • Feature exposure ramped through configs
• Predictable schedule for releasing change
• Product teams can release functionality at will without interfering with change

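One common way to ramp feature exposure through configs is a percentage gate keyed on a stable hash of the member ID. The sketch below assumes that approach; the feature name, the config shape, and the hashing scheme are hypothetical and are not taken from the talk.

```python
import hashlib

# Hypothetical ramp config: feature name -> percent of members exposed.
RAMP_CONFIG = {"new-inbox": 25}


def bucket(feature, member_id):
    """Map (feature, member) to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{feature}:{member_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_exposed(feature, member_id, config=RAMP_CONFIG):
    """A member sees the feature when their bucket is below the ramp percent."""
    return bucket(feature, member_id) < config.get(feature, 0)
```

Because the bucket is stable, raising the percentage only adds members: anyone exposed at 25% stays exposed at 50%. The code is already deployed either way; only the config value changes when the feature is released.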
Deployment Process
• Deployment sequence:
  1. Canary deployment (New!)
  2. Full rollout
  3. Ramp feature exposure (New!)
  4. Problem? Revert step. (New!)
• No deployment dependencies allowed
• Fully automated
  • Owners or automation nominate a deployment or rollback
  • All the deployment / rollback information is in plans

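The sequence above can be sketched as a plan of steps, each carrying its own revert action. This is only one reading of "Problem? Revert step.": the sketch reverts the step that failed its health check and stops. The plan format, step names, and `healthy` callback are hypothetical.

```python
def run_plan(steps, healthy):
    """Apply steps in order; revert the failing step and stop on a problem.

    steps: list of (name, apply, revert) tuples.
    healthy: callable taking a step name and returning True if all is well.
    """
    log = []
    for name, apply, revert in steps:
        apply()
        log.append(("apply", name))
        if not healthy(name):
            revert()
            log.append(("revert", name))
            return False, log
    return True, log


# Toy stand-in for the real site, used only to make the example runnable.
state = {"canary": False, "fleet": False, "ramp": 0}


def set_state(key, value):
    def action():
        state[key] = value
    return action


PLAN = [
    ("canary", set_state("canary", True), set_state("canary", False)),
    ("full-rollout", set_state("fleet", True), set_state("fleet", False)),
    ("ramp-feature", set_state("ramp", 100), set_state("ramp", 0)),
]
```

Keeping the apply and revert actions together in the plan mirrors the slide's point that all deployment and rollback information lives in one place.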
People
• Everyone had to be willing to change
• Greater engineering responsibility
  • No backwards-incompatible changes
  • Rethink architecture, practices (piecewise features)
• In return, gave ownership of products and quality back to engineers
  • Release on your own schedule
  • Local decision making
  • You are responsible for your quality, not a central team
  • You own a piece of the codebase, not a branch (ACLs)

Tooling
• ACLs for code review
• Pre/Post commit CI framework / pipeline
• CRT: Change Request Tracker
  • Developer commit lifecycle management
• Deployment automation plans / canaries
• Performance
  • e.g. evaluate canaries on things like exceptions
• Test Manager
  • Manage automated tests (mostly in test environment)
• Monitoring for environment / service stability
• Config changes to ramp features

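"Evaluate canaries on things like exceptions" could look something like the check below, which compares the canary's exception rate with the rest of the fleet. The thresholds (`max_ratio`, `floor`, `min_requests`) and the input shape are assumptions for illustration, not values from the talk.

```python
def exception_rate(exceptions, requests):
    """Exceptions per request, or 0.0 when there was no traffic."""
    return exceptions / requests if requests else 0.0


def canary_passes(canary, baseline, max_ratio=1.5, floor=0.001, min_requests=100):
    """Decide whether a canary looks healthy enough to continue the rollout.

    canary and baseline are (exceptions, requests) pairs.
    The canary fails if it saw too little traffic to judge, or if its
    exception rate exceeds max_ratio times the baseline rate (with a small
    absolute floor so a zero-exception baseline does not fail every canary).
    """
    c_exc, c_req = canary
    b_exc, b_req = baseline
    if c_req < min_requests:
        return False
    allowed = max(exception_rate(b_exc, b_req) * max_ratio, floor)
    return exception_rate(c_exc, c_req) <= allowed
```

A result of `False` would be the signal for the deployment plan to revert the canary step instead of proceeding to full rollout.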
Improvements in 2011
• No merge hell
• Find failures faster
• Keep testing sane and automated
• Independent and easy deployment and release
• Create greater ownership
  • More control over, and responsibility for, your decisions
• Breaking the barriers: easier to work with others

Challenges in 2011 (Overcome)
• Breakages immediately affect others, so find and remove failures fast
  • Pre- and post-commit automation
• Hard to save off work in progress
  • Break down your feature into commits that are safe to push to production; use configs to ramp

Problems in 2011
• Monolithic codebase
  • Not flexible enough to accommodate
    • Acquisitions
    • Exploration
  • Iterations needed to be even faster (non-global block)
• Ownership could be clearer
  • Of code
  • Of failures
• Developer count and codebase grew significantly (again)

Multiproduct
• ~1500 products, ~1800 devs, ~550 services
• Ecosystem of smaller individual products, each with an individual release cycle
  • Can depend on artifacts from other products
• Uniform process of lifecycle and tasks
  • Abstractions allow us to build generic tooling to accommodate a variety of technologies and products
  • Lifecycle / tasks (i.e. build, test, deploy) are owner-defined
• Testing and release mostly the same
  • During your post-commit we test everything that depends on you, to ensure you aren’t breaking anything

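"Test everything that depends on you" amounts to walking the product dependency graph in reverse. A minimal sketch follows, assuming dependencies are declared per product and that transitive dependents are included (the slide does not say whether only direct dependents are tested). The product names are made up.

```python
from collections import deque

# Hypothetical declarations: product -> products whose artifacts it consumes.
DEPENDS_ON = {
    "web-app": ["profile-api", "ui-kit"],
    "profile-api": ["data-client"],
    "search-api": ["data-client"],
    "ui-kit": [],
    "data-client": [],
}


def dependents_of(product, depends_on=DEPENDS_ON):
    """Return every product that directly or transitively depends on `product`."""
    # Invert the graph: dependency -> list of consumers.
    reverse = {}
    for consumer, deps in depends_on.items():
        for dep in deps:
            reverse.setdefault(dep, []).append(consumer)

    # Breadth-first walk over consumers, starting from the changed product.
    seen = set()
    queue = deque([product])
    while queue:
        current = queue.popleft()
        for consumer in reverse.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return sorted(seen)
```

Here a commit to `data-client` would trigger tests for `profile-api`, `search-api`, and `web-app`, while a commit to `web-app` triggers no downstream tests.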
Improvements with Multiproduct
• No monolithic codebase
• Flexible
• Easier, faster to validate and not block

Challenges with Multiproduct
• Architecture
  • Versioning hell
  • Circular dependencies
• How to work across many products
• How to work with others
  • Give people full control (no central police)

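Circular dependencies between products can at least be detected mechanically. The sketch below is a standard depth-first search for a cycle over a hypothetical product-to-dependencies mapping; it is not a description of LinkedIn's tooling.

```python
def find_cycle(depends_on):
    """Return one dependency cycle as a list of products, or None if acyclic.

    depends_on maps each product to the products it depends on.
    The returned list starts and ends with the same product.
    """
    IN_PROGRESS, DONE = 1, 2
    status = {}
    path = []

    def visit(node):
        status[node] = IN_PROGRESS
        path.append(node)
        for dep in depends_on.get(node, []):
            if status.get(dep) == IN_PROGRESS:
                # dep is already on the current path: we closed a loop.
                return path[path.index(dep):] + [dep]
            if dep not in status:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        status[node] = DONE
        return None

    for node in depends_on:
        if node not in status:
            found = visit(node)
            if found:
                return found
    return None
```

For `{"a": ["b"], "b": ["c"], "c": ["a"]}` this returns `["a", "b", "c", "a"]`. A check like this could run when a product declares a new dependency, before the cycle ever reaches a build.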
Conclusion: Key Successes
• 0 Test Failures
• Multitude of automated testing options
• Automated, independent, frequent deployments
• Distinguish between Deployments and Release
• More accountability and ownership for teams

Conclusion: Takeaways
• Notice any trends?
  • Validate fast, early, often
  • Simplify
  • Build the tooling to succeed
  • Creating more digestible pieces, giving more control to owners
• It’s all a matter of tradeoffs and priorities
  • They change over time
  • Ours seem to be getting better!
• It’s not only about technology: culture matters
  • Change, Ownership, Craftsmanship
  • People, process, technology
• Invest in improvements, and stick with them

Thanks!

Questions?
