Software Architecture Matters
Plain Spoken Guidance for Software
There is evidence that good software architectures and high performing teams go together. Do good architectures create good teams, or do good teams create good architectures? I am not sure which is the cause and which is the effect, but it did get me thinking: what would I write down for architecture guidance? Below is my list. Expect future posts with example scenarios.
- Keep it Simple
- Easy to Modify
- Built to Last
- Speak the Same Language
- Strive to Shift Left
- Limit Vendor Lock-in
Keep it simple
- Simple to Explain Write down the must-have features, required scale/capacity, and implicit security, privacy, latency, and availability requirements. Try to leverage a lightweight documentation approach that involves different roles and teams. Why? When the team that writes the code and builds the software collaborates on requirements, it leads to faster execution and better solutions. In addition, it is a lot more fun.
- Create Clarity Use analysis to minimize the number of must-haves, and clarify what is a must-have vs should-have vs nice-to-have. Why? Inventing and simplifying is one of the best parts about building software. Simplifying enables better, faster, and cheaper software.
- Predictable and Repeatable Test to ensure behaviors remain consistent. Zero tolerance for flaky tests. Yes, judgement is required to understand what constitutes a behavior. Why? Speaking personally, I enjoy spending time thinking about how my service or feature will be used. Test driven development has made me a better coder by helping me understand the contracts I have with customers.
- Assume a Trusted Core Tightly coupled code written by a small team should be trusted and leveraged when you are a member of that team. Why? It is way more fun. Write code and move at the speed of trust!
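To make "Predictable and Repeatable" concrete, here is a minimal sketch of one way to keep a time-dependent test from going flaky: pass the clock in instead of reading the wall clock inside the code under test. The `Session` class, its contract (valid until `ttl` has elapsed), and the `FakeClock` helper are all hypothetical names made up for this illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


def utc_now() -> datetime:
    """Default clock: the real wall clock, in UTC."""
    return datetime.now(timezone.utc)


class Session:
    """Hypothetical session. Contract: valid until `ttl` has elapsed."""

    def __init__(self, ttl: timedelta, clock: Callable[[], datetime] = utc_now) -> None:
        self._clock = clock
        self._expires_at = clock() + ttl

    def is_valid(self) -> bool:
        return self._clock() < self._expires_at


class FakeClock:
    """Test double: time only moves when the test says so."""

    def __init__(self, start: datetime) -> None:
        self.now = start

    def __call__(self) -> datetime:
        return self.now

    def advance(self, delta: timedelta) -> None:
        self.now += delta


def test_session_stays_valid_until_ttl_then_expires() -> None:
    clock = FakeClock(datetime(2024, 1, 1, tzinfo=timezone.utc))
    session = Session(ttl=timedelta(minutes=30), clock=clock)

    clock.advance(timedelta(minutes=29, seconds=59))
    assert session.is_valid()       # one second before the deadline

    clock.advance(timedelta(seconds=1))
    assert not session.is_valid()   # exactly at the deadline
```

Because the test owns the clock, it gives the same answer on a fast laptop and a slow build agent, and the boundary of the contract is written down in one place.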
Easy to modify
- Enable Personal Versions Enable personal code branches, services that can run on a local PC, access to native app emulators, and test requests. Why? Speaking personally, I want a safe place to learn.
- Loosely Coupled Decompose hard problems into a set of self-service APIs. Why? Open APIs enable teams to work independently and move fast. It feels amazing to make small targeted changes with just a few hours of ramp-up time.
- Automate build, image, verify, release Continuous integration and testing create fast feedback loops, and built-in gates eliminate the fear of screwing things up. Why? It empowers engineers to make changes and improvements.
- Keep an Inventory Build a list of services, applications, and useful libraries with details on who owns each one, how to find the source, and how to build/release it. Why? It feels good to self-navigate to the system/service that needs improvement. From a compliance perspective, having a list of things makes the assessment phase easier. For example, think about adding metadata like provenance for your open source dependencies and the most secure version of the software you should leverage.
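An inventory does not need special tooling to get started. As a rough sketch, assuming a plain in-code list is good enough for a first version, each entry could carry the owner, source location, and build/release commands, plus free-form metadata. The field names, the `example.com` URLs, and the `make` commands below are placeholders I invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InventoryEntry:
    """One service, application, or library in the inventory (hypothetical schema)."""
    name: str
    kind: str                 # "service", "application", or "library"
    owner: str                # team or person to contact
    source_url: str           # where the source lives
    build_command: str        # how to build it
    release_command: str      # how to release it
    metadata: dict = field(default_factory=dict)  # e.g. open source provenance


REQUIRED_FIELDS = ("owner", "source_url", "build_command", "release_command")


def missing_details(entry: InventoryEntry) -> list:
    """Return the names of required fields that were left blank."""
    return [name for name in REQUIRED_FIELDS if not getattr(entry, name).strip()]


def owned_by(inventory: list, owner: str) -> list:
    """Return every entry owned by the given team or person."""
    return [entry for entry in inventory if entry.owner == owner]


# Placeholder data for illustration only.
INVENTORY = [
    InventoryEntry(
        name="checkout-api",
        kind="service",
        owner="team-payments",
        source_url="https://git.example.com/payments/checkout-api",
        build_command="make build",
        release_command="make release",
        metadata={"open_source_provenance": "reviewed"},
    ),
    InventoryEntry(
        name="retry-utils",
        kind="library",
        owner="",  # nobody has claimed this one yet
        source_url="https://git.example.com/shared/retry-utils",
        build_command="make build",
        release_command="make release",
    ),
]
```

A check like `missing_details` can run in CI so that an entry without an owner gets noticed before someone needs to find that owner in a hurry.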
Build it to last
- Explicit scale/capacity Know what scale/capacity is needed and write down a plan to get there. Socialize the plan. Why? Brownouts/Blackouts make customers unhappy.
- Forward/Backward Compatibility APIs need forward and backward compatibility. Why? Without support for forward/backward compatibility, changes require synchronized deployment between provider and consumer. In a large scale system, it can take hours or days to roll out a change, and intentionally failing during software updates is not a level of quality any org aspires to meet. Another reason is complexity. Lack of forward/backward compatibility creates complexity, because changes deep in the call stack cause chaos when all upstream layers need to deeply understand the change and compensate.
- Build in coarse grain mitigation and validate with coarse grain failure injection. Let hosts (or a DC) fail and route requests to healthy hosts (or a healthy DC). Journal requests and follow up until success or exhaustion. Enable cache entries to go stale when providers go down. Create partial responses or reasonable static responses when the full answer isn’t possible. Enable fast rollback to last known good (applications, config, and data). Note: Rollback often requires backward compatibility. Why do it? Software engineering is complicated and sometimes we get it wrong. When we don't have mitigation we feel like the weight of the world is on our shoulders as we generate custom hot-fixes. So Sad :(
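Two of the mitigations above, letting cache entries go stale when a provider is down and returning a reasonable static response when there is no answer at all, fit in a few lines. This is a minimal single-threaded sketch, not a production cache: `StaleOnErrorCache`, its parameters, and the "fresh"/"stale"/"static" labels are names I made up, and a real version would also want size limits, locking, and metrics.

```python
import time
from typing import Any, Callable, Dict, Tuple


class StaleOnErrorCache:
    """Serve fresh data when possible, stale data when the provider fails,
    and a static fallback when nothing was ever cached (hypothetical helper)."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        ttl_seconds: float,
        static_fallback: Any,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._static_fallback = static_fallback
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Tuple[Any, str]:
        """Return (value, source) where source is 'fresh', 'stale', or 'static'."""
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1], "fresh"
        try:
            value = self._fetch(key)
        except Exception:
            # Coarse grain on purpose: any provider failure triggers mitigation.
            if cached is not None:
                return cached[1], "stale"
            return self._static_fallback, "static"
        self._entries[key] = (now, value)
        return value, "fresh"


def demo() -> list:
    """Failure injection in miniature: take the provider down and see what is served."""
    state = {"up": True, "now": 0.0}

    def fetch(key: str) -> str:
        if not state["up"]:
            raise ConnectionError("provider is down")
        return f"live answer for {key}"

    cache = StaleOnErrorCache(
        fetch, ttl_seconds=60, static_fallback="default answer",
        clock=lambda: state["now"],
    )
    results = [cache.get("a")]       # provider up: live answer, cached
    state["up"] = False
    state["now"] = 120.0             # the cached entry is now past its TTL
    results.append(cache.get("a"))   # provider down: stale entry is served
    results.append(cache.get("b"))   # never cached: static fallback is served
    return results
```

Returning the source label alongside the value is a design choice: callers and dashboards can tell a degraded answer from a healthy one instead of guessing.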
Speak the same language
- Use HTTP and JSON Use HTTP and JSON docs often and leverage the RFCs for response codes and tricky situations. Why? It is simple. It makes it way easier to understand what is going on, and engineering teams tend to be more empathetic to each other when using a common protocol.
- Leverage A Few Good, Widely Used Languages and Packages When starting a new project, look around your organization to leverage widely used languages and packages. Why? Over time the different systems grow. Having 20 different languages and 300 different software stacks kills engineering fungibility. Ramp up time goes way up. In the end variance of software is a tax on every engineer in the organization. Increasing the variance of software increases the tax.
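For the HTTP and JSON guidance, here is a small sketch of what leaning on the RFCs can look like, using only the Python standard library. The status codes come from `http.HTTPStatus`. The error body follows the problem details format from RFC 9457, which is one reasonable option rather than the only one. The `get_order` handler and the `ORDERS` data are hypothetical.

```python
import json
from http import HTTPStatus
from typing import Dict, Tuple

Response = Tuple[int, Dict[str, str], bytes]


def json_response(status: HTTPStatus, body: dict,
                  content_type: str = "application/json") -> Response:
    """Build a (status code, headers, payload) triple with a JSON body."""
    payload = json.dumps(body).encode("utf-8")
    headers = {"Content-Type": content_type, "Content-Length": str(len(payload))}
    return status.value, headers, payload


def problem(status: HTTPStatus, detail: str) -> Response:
    """Error response shaped like an RFC 9457 problem details document."""
    body = {
        "type": "about:blank",     # no problem-specific type is defined here
        "title": status.phrase,    # standard reason phrase, e.g. "Not Found"
        "status": status.value,
        "detail": detail,
    }
    return json_response(status, body, "application/problem+json")


# Hypothetical data for illustration only.
ORDERS = {"42": {"id": "42", "state": "shipped"}}


def get_order(order_id: str) -> Response:
    """Hypothetical handler: 400 for a malformed id, 404 if unknown, 200 otherwise."""
    if not order_id.isdigit():
        return problem(HTTPStatus.BAD_REQUEST, "order id must be numeric")
    order = ORDERS.get(order_id)
    if order is None:
        return problem(HTTPStatus.NOT_FOUND, f"no order with id {order_id}")
    return json_response(HTTPStatus.OK, order)
```

When every team answers a missing resource with a 404 and the same error shape, a client written for one service mostly works against the next one.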
Strive to Shift left
- Include Security/Privacy/Accessibility In Code Reviews Why? Engineers have context on code they just wrote, and it is much easier to solve problems when they are little.
- Include Code Scan Tools in CI Pipeline Add Black Duck and tools like it to scan for security vulnerabilities at build time. Why? Engineers have context on code they just wrote, and it is much easier to solve problems when they are little.
- Validate a Key GDPR Feature, the Right to Be Forgotten, in CI Pipeline Create negative test cases that attempt to track users who do not want to be tracked. Create negative test cases that attempt to access history after a request to delete history. Why? This is an important challenge to address, and it will take a lot of engineering time to get right. Shifting left is needed to make this more effective and efficient.
- Add Audit Controls as part of CI Pipeline Every business has some key audit controls. Best to work with audit and build in the historical records, reconciliations, and access roles early. Why? This is an important challenge to address, and it will take a lot of engineering time to get right. Shifting left is needed to make this more effective and efficient.
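The negative test cases for the right to be forgotten can start very small. The sketch below assumes a toy in-memory `HistoryStore` (a hypothetical class, not a real library) and shows the two tests described above: try to track a user who opted out, and try to read history after a delete request. A real system would also have to cover backups, logs, and downstream copies, which this sketch does not attempt.

```python
from typing import Dict, List, Set


class HistoryStore:
    """Toy in-memory user history store (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._events: Dict[str, List[str]] = {}
        self._opted_out: Set[str] = set()

    def opt_out(self, user_id: str) -> None:
        """Record that this user does not want to be tracked."""
        self._opted_out.add(user_id)

    def record(self, user_id: str, event: str) -> bool:
        """Store an event unless the user opted out. Returns True if stored."""
        if user_id in self._opted_out:
            return False
        self._events.setdefault(user_id, []).append(event)
        return True

    def delete_history(self, user_id: str) -> None:
        """Honor a delete request by dropping everything stored for the user."""
        self._events.pop(user_id, None)

    def history(self, user_id: str) -> List[str]:
        """Return a copy of the stored events for the user."""
        return list(self._events.get(user_id, []))


def test_opted_out_user_is_not_tracked() -> None:
    store = HistoryStore()
    store.opt_out("user-1")
    # Negative case: the tracking attempt must be refused and leave no trace.
    assert store.record("user-1", "viewed:item-9") is False
    assert store.history("user-1") == []


def test_history_is_gone_after_delete_request() -> None:
    store = HistoryStore()
    assert store.record("user-2", "viewed:item-3") is True
    store.delete_history("user-2")
    # Negative case: reading history after deletion must return nothing.
    assert store.history("user-2") == []
```

Running tests like these on every build means a regression in tracking or deletion behavior shows up while the engineer still has context on the change.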
Limit Vendor Lock-in
- Choose Technologies with Adoption Across Vendors. Specifically, I am thinking of technologies that work across Amazon AWS, Microsoft Azure, and Google Cloud. There is some cool stuff out there that, with a little work, enables compute workloads to run in multiple cloud ecosystems (for example, on both AWS and Azure). A caveat: storage and machine learning across cloud providers are harder to pull off, and it may not be worth building vendor-agnostic APIs across all capabilities. Why? Selecting cloud providers is a big choice, and leveraging a single vendor can have negative business impact. Past examples of business impact include long outages and temporarily running out of capacity in a specific region. In addition, diversity of providers enables organizations to better manage cost and capacity.
- Shared Code Make key application code portable across iOS, Android, and Web applications. Why? Think of how amazing it would be to write code once and have it run everywhere. It would be very cool if common code could power accessibility features through adaptive rendering.
Making software leaders better