liquidx-studio.github.io


Author: Nobel Khandaker

Engineering Excellence Strategy

Contemporary research shows that high-performing software development teams are essential for creating high-performing organizations.

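This document describes how we plan to improve our performance as a technology organization. It is organized around four measures: change lead time, deployment frequency, change fail percentage, and mean time-to-restore (MTTR).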

1. Change lead time

Software development lifecycle

  1. Product team provides a well-defined set of features and user stories
    • Product or feature description should contain associated UI design
    • The critical features and the release criteria for those features should be well-defined
    • Product team will hold design meetings with engineers to clarify the product requirements
  2. Dev team designs the solution by prototyping and building proof-of-concepts and provides time/cost estimates
  3. The product team defines the release criteria
  4. Dev team prepares the end-to-end test scenarios and cases based on the release criteria
  5. Dev team completes development, testing and code review
  6. Everyone, including engineers, product managers, and business owners, tests the product
  7. Product is released to internal and external customers for user acceptance testing (UAT) and preview, using feature flighting
  8. Product reaches general availability (GA), that is, it goes live
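
Step 7 relies on feature flighting, which can be sketched as a deterministic percentage rollout. This is a minimal illustration, not the team's actual flighting system: the function name `is_flighted`, the feature and user identifiers, and the hash-bucket approach are all assumptions made for the example.

```python
import hashlib


def is_flighted(feature: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if this user falls inside the rollout bucket for the feature.

    Hypothetical helper: the name and the bucketing scheme are illustrative only.
    """
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    # Hash feature and user together so each feature gets its own stable bucket.
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable value in the range 0..99
    return bucket < rollout_percent


if __name__ == "__main__":
    # Preview phase: expose a hypothetical feature to roughly 10% of users.
    for user in ("alice", "bob", "carol"):
        print(user, is_flighted("new-dashboard", user, 10))
```

Because the bucket depends only on the feature and user identifiers, a user who sees the feature at 10% keeps seeing it as the rollout grows toward 100% at GA.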

Reduced inter-team dependency

The engineering teams (application, platform, blockchain, and infrastructure) agree on common interfaces, data formats, and contracts during the product design phase. Each team is responsible for developing and testing against those contracts and for delivering its components at the specified milestones. Final integration testing is performed once all components are available.
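
One way to picture an agreed contract is a shared interface plus a contract check that any implementation must pass. The sketch below is illustrative only: `LedgerService`, `TransferRequest`, `TransferReceipt`, `FakeLedger`, and `check_contract` are hypothetical names, not part of any existing codebase.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class TransferRequest:
    """Agreed request format (hypothetical)."""
    account_id: str
    amount_cents: int


@dataclass(frozen=True)
class TransferReceipt:
    """Agreed response format (hypothetical)."""
    transaction_id: str
    accepted: bool


class LedgerService(Protocol):
    """The interface both teams sign off on during the design phase."""

    def submit_transfer(self, request: TransferRequest) -> TransferReceipt: ...


class FakeLedger:
    """Stand-in the consuming team can use until the real component ships."""

    def submit_transfer(self, request: TransferRequest) -> TransferReceipt:
        return TransferReceipt(
            transaction_id=f"fake-{request.account_id}",
            accepted=request.amount_cents > 0,
        )


def check_contract(service: LedgerService) -> None:
    """Contract check that the fake and the real implementation should both pass."""
    receipt = service.submit_transfer(TransferRequest("acct-1", 500))
    assert isinstance(receipt, TransferReceipt)
    assert receipt.accepted
    rejected = service.submit_transfer(TransferRequest("acct-1", 0))
    assert not rejected.accepted


if __name__ == "__main__":
    check_contract(FakeLedger())
    print("contract satisfied")
```

With this shape, the consuming team develops against `FakeLedger` while the owning team builds the real service, and running the same `check_contract` against both reduces surprises at final integration.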

High quality bar

Developers own the products or features they work on. They write unit tests (with >=95% coverage) and review each other's code. They also perform end-to-end, security, scalability, and performance testing, and ensure the software meets the release criteria.
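
The coverage requirement can be enforced as a simple gate in a build step. The following is a minimal sketch assuming coverage is measured as covered lines over total lines; the function names and the line-based definition are assumptions, and a real pipeline would read these numbers from its coverage tool.

```python
REQUIRED_COVERAGE = 95.0  # percent, taken from the quality bar above


def coverage_percent(covered_lines: int, total_lines: int) -> float:
    """Return line coverage as a percentage (hypothetical helper)."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    if not 0 <= covered_lines <= total_lines:
        raise ValueError("covered_lines must be between 0 and total_lines")
    return 100.0 * covered_lines / total_lines


def meets_quality_bar(covered_lines: int, total_lines: int,
                      required: float = REQUIRED_COVERAGE) -> bool:
    """Return True when coverage is at or above the required threshold."""
    return coverage_percent(covered_lines, total_lines) >= required


if __name__ == "__main__":
    print(meets_quality_bar(950, 1000))  # exactly at the threshold
    print(meets_quality_bar(940, 1000))  # below the threshold
```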

2. Deployment frequency

3. Change fail percentage

4. Mean time-to-restore (MTTR)

Severity  Description        Examples                                                   Response time
1         Critical incident  Service outage, data loss, DDoS attack                     30 minutes
2         Major incident     One or more major features unavailable, no workarounds     1 hour
3         Minor incident     One or more major features unavailable, with workarounds   24 hours
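
The response times in the table can be encoded so that tooling computes a response deadline when an incident is reported. This is a sketch under stated assumptions: the names `RESPONSE_TIME` and `response_deadline` are hypothetical, and only the three severity levels and durations come from the table.

```python
from datetime import datetime, timedelta

# Response times per severity level, copied from the table above.
RESPONSE_TIME = {
    1: timedelta(minutes=30),  # critical incident
    2: timedelta(hours=1),     # major incident
    3: timedelta(hours=24),    # minor incident
}


def response_deadline(severity: int, reported_at: datetime) -> datetime:
    """Return the latest time by which the incident must receive a response."""
    if severity not in RESPONSE_TIME:
        raise ValueError(f"unknown severity: {severity}")
    return reported_at + RESPONSE_TIME[severity]


if __name__ == "__main__":
    reported = datetime(2024, 1, 1, 12, 0)
    for level in sorted(RESPONSE_TIME):
        print(level, response_deadline(level, reported))
```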