Create robust, secure systems

Description

Robust systems are designed to tolerate the failure of individual components without interrupting service. By building solutions from independent, redundant components, we can achieve high availability even in an environment where systems are closely intertwined. We consider potential component failure modes when engineering systems, and we validate the security and fault tolerance of our designs as part of pre-deployment testing. We study unanticipated failures of deployed systems so that we can adapt and improve our architecture over time.
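
The effect of redundancy on availability can be estimated with a short calculation. The sketch below assumes that replicas fail independently and that the service stays up as long as at least one replica is up; the function name and the figures are illustrative only, not measurements of any real system.

```python
def parallel_availability(component_availability: float, replicas: int) -> float:
    """Estimate availability of a service backed by redundant replicas.

    Assumes independent failures and that one working replica is enough.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("component_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The service is down only when every replica is down at the same time.
    return 1.0 - (1.0 - component_availability) ** replicas
```

Under these assumptions, two replicas that are each available 99% of the time give roughly 99.99% combined availability. Shared dependencies (power, network, configuration) break the independence assumption, which is why the estimate should be treated as an upper bound.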

Components

Data

  • Define who owns the data
  • Document data security requirements
  • Use role-based access mechanisms when possible
  • Validate input to ensure consistency and avoid unexpected behaviors
  • "Be conservative in what you do, be liberal in what you accept from others" (RFC 793)
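
The access and validation points above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the role names, the permission table, and the Record fields (owner, quota_gb) are all hypothetical. The parser is liberal about harmless variation in its input (case, whitespace, numbers sent as strings) and strict about what it passes on.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission table; a real system would load this
# from its identity or directory service.
ROLE_PERMISSIONS = {
    "owner": {"read", "write", "delete"},
    "editor": {"read", "write"},
    "viewer": {"read"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the role grants the action; unknown roles get nothing."""
    return action in ROLE_PERMISSIONS.get(role, set())


@dataclass(frozen=True)
class Record:
    owner: str
    quota_gb: int


def parse_record(raw: dict) -> Record:
    """Validate untrusted input and return a normalized record."""
    # Tolerate differences in case and surrounding whitespace.
    owner = str(raw.get("owner", "")).strip().lower()
    if not owner:
        raise ValueError("owner is required")
    try:
        quota = int(raw.get("quota_gb", 0))  # accept "10" as well as 10
    except (TypeError, ValueError):
        raise ValueError("quota_gb must be an integer") from None
    if not 0 <= quota <= 1024:  # illustrative upper bound
        raise ValueError("quota_gb out of range")
    return Record(owner=owner, quota_gb=quota)
```

Denying by default for unknown roles and rejecting out-of-range values early keeps unexpected input from reaching the rest of the system.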

Infrastructure

  • Monitor the health of system components
    • In redundant systems, ensure that the failure of a single component can be detected independently of total system failure
  • Keep accurate measurements of system performance and usage over time
  • Design systems expecting failures to occur
  • "Defense in depth": implement appropriate security measures at the appropriate places
    • Scale security measures to the value of the data and the level of risk
    • Do not rely on a single "security layer" to protect infrastructure and data
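
One way to detect a component failure before it becomes a total system failure is to probe every redundant component individually rather than only the service as a whole. The sketch below assumes each component exposes some health probe; the check_pool function and the status names are hypothetical.

```python
from typing import Callable, Dict


def check_pool(probes: Dict[str, Callable[[], bool]]) -> dict:
    """Probe each redundant component and summarize the pool's health.

    A probe that raises an exception counts as a failed component.
    """
    failed = []
    for name, probe in probes.items():
        try:
            healthy = bool(probe())
        except Exception:
            healthy = False
        if not healthy:
            failed.append(name)

    if not failed:
        status = "healthy"
    elif len(failed) < len(probes):
        # The service is still up, but redundancy has been reduced.
        status = "degraded"
    else:
        status = "down"
    return {"status": status, "failed": sorted(failed)}
```

Reporting "degraded" separately from "down" lets operators replace a failed replica while the service is still running, instead of learning about the first failure only when the last replica also fails.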

Services

  • Design applications for fault tolerance and performance
  • Prototype applications before implementation
  • Test for proper operation and behavior in unexpected situations
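
As one illustration of fault-tolerant application design, a call to an unreliable dependency can be retried with increasing delays and then fall back to a degraded result. This is a sketch under stated assumptions: call_with_retry and its parameters are hypothetical, and catching every exception is a simplification that a real service would narrow to known transient errors.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    fallback: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try the operation up to `attempts` times, then use the fallback.

    The delay doubles after each failed attempt (exponential backoff).
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            # Wait before the next attempt, but not after the last one.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return fallback()
```

The sleep function is passed in as a parameter so that tests can exercise the failure path without waiting, which supports testing for proper behavior in unexpected situations.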

Support

  • Understand the dependencies and relationships between systems, so that problems can be more rapidly isolated
  • Be prepared to restart and restore a system
  • Define solution components abstractly to leverage common configurations
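
A recorded dependency map can help narrow down where a problem started. The sketch below is a simplification that assumes the map lists direct dependencies only and is kept up to date; the function name and the example system names are hypothetical.

```python
from typing import Dict, Set


def likely_root_causes(depends_on: Dict[str, Set[str]], failing: Set[str]) -> Set[str]:
    """Return failing systems whose direct dependencies are all working.

    A failing system that depends on another failing system is more likely
    a symptom than the origin, so it is filtered out.
    """
    return {
        system
        for system in failing
        if not (depends_on.get(system, set()) & failing)
    }


# Hypothetical example: web depends on api, and api depends on db.
DEPENDS_ON = {"web": {"api"}, "api": {"db"}, "db": set()}
```

With this map, if web, api, and db all report problems, the function points at db as the place to start; if only web and api are failing, it points at api.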