We have found Formal Architecture Verification to be an affordable quality assurance (QA) measure that is suitable for a wide variety of applications. TLA+ is one of the most widespread technologies in this context. Don’t let the dated-looking project website fool you. Amazon, for example, reported using it for DynamoDB, and other major players have reported using it before and since. We think this is a widely underestimated tool that deserves more attention. In this article we would like to talk about areas where the use of formal architecture verification makes sense. To foreshadow a bit: we consider many applications in the health management and health devices sector suitable. These include HIS (hospital information systems; German: KIS, Krankenhausinformationssystem), LIS (laboratory information systems; German: Laborinformationssystem), medical records (German: ePA, elektronische Patientenakte) and devices such as pacemakers and insulin pumps, as well as the apps that control them.
Notice: There is a German version of this article as well.
A short paragraph on what formal architecture verification is: When you design your architecture, you already design how the different architecture components interact. If you now want to know whether your algorithmic conditions hold up (for example timing conditions, but not interface compatibility), formal architecture verification can help. In UML terms, this method can mainly be applied to state diagrams, flow charts and sequence diagrams. You model the architecture in the language of your formal architecture verification tool. The next step is to find a mathematical proof for your conditions with the assistance of the tool. If no proof can be found, TLA+ will usually tell you why. This way, formal architecture verification helps you find and fix both very obvious and very subtle deficiencies before you have written a single line of production code.
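To make the idea more tangible, here is a toy illustration in Python. It is not TLA+ and not how TLA+ works internally; it is only a minimal sketch of the principle of exploring every reachable state of a small model and checking a condition in each one. The model is a hypothetical example of ours: two workers, A and B, each increment a shared counter in two separate steps (read, then write), and the condition is that the counter equals 2 once both are done.

```python
from collections import deque

# Hypothetical toy model: state = (counter, pc_a, tmp_a, pc_b, tmp_b).
# Each worker's program counter (pc) is "read", "write" or "done".
INITIAL = (0, "read", 0, "read", 0)


def step(counter, pc, tmp):
    """Perform one step of a single worker, or return None if it is done."""
    if pc == "read":
        return counter, "write", counter   # remember the shared value
    if pc == "write":
        return tmp + 1, "done", tmp        # write back the remembered value + 1
    return None


def successors(state):
    """Yield (label, next_state) for every worker that can take a step."""
    counter, pc_a, tmp_a, pc_b, tmp_b = state
    nxt = step(counter, pc_a, tmp_a)
    if nxt is not None:
        yield "A " + pc_a, (nxt[0], nxt[1], nxt[2], pc_b, tmp_b)
    nxt = step(counter, pc_b, tmp_b)
    if nxt is not None:
        yield "B " + pc_b, (nxt[0], pc_a, tmp_a, nxt[1], nxt[2])


def invariant(state):
    """Once both workers are done, the counter must be 2."""
    counter, pc_a, _, pc_b, _ = state
    return not (pc_a == "done" and pc_b == "done") or counter == 2


def check(initial):
    """Breadth-first search over all reachable states.

    Returns None if the invariant holds everywhere, otherwise the
    shortest sequence of step labels that leads to a violation.
    """
    parents = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while parents[state] is not None:
                state, label = parents[state]
                trace.append(label)
            return list(reversed(trace))
        for label, nxt in successors(state):
            if nxt not in parents:
                parents[nxt] = (state, label)
                queue.append(nxt)
    return None


counterexample = check(INITIAL)
print(counterexample)
```

The search reports a four-step interleaving in which both workers read the counter before either of them writes, so one increment is lost. This is the kind of subtle design flaw that is easy to miss in a diagram and cheap to find in a model.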
Before we get into the applications, we need to answer one important question though:
Is this different from Formal Verification, aka Formal Code Verification?
TL;DR: Yes, both in cost and in how strong the result is.
You’ve probably heard of Formal Verification before. What people usually mean by that term is Formal Code Verification, that is, mathematically proving the correctness of the actual code. Formal Code Verification makes a much stronger statement about your system, but it is also much more involved than Formal Architecture Verification. To us, Formal Code Verification seems more complex, has implications for which tech ecosystem you should use, and is thus much more costly.
We love Formal Architecture Verification because it is cost efficient and can be applied even before any line of code has been written. Thus it enables much shorter, more efficient iterative cycles.
Ok, so when should I use Formal Architecture Verification?
We think that every good application for Formal Code Verification is also a good application for Formal Architecture Verification. But because of the lower cost, there are more cases where we think Formal Architecture Verification is a good idea. What follows is an informal list of overlapping categories of applications for which we consider formal architecture verification suitable.
Risk for basic services incl. public security
Have you ever turned on your tap only to find smelly water, or none at all? I’m sure you wouldn’t like that. Water treatment and valve control systems are vital to us, and formal architecture verification would be worthwhile here. The same goes for electricity, network routing systems and emergency hotlines with their connected dispatching services.
Risk for integrity of critical data
Elections are very important to democratic nations. Some use electronic systems for voting and some just use electronic systems further down the line. Where these systems are in use, it is vital that they can be trusted. Here is an example where some architecture verification of the runtime algorithm might have helped. Since voting results are usually passed across multiple layers, a verification of the overall architecture would definitely be good.
Other critical data where integrity is crucial include applications for asylum or benefits, stored blood units with their metadata, transplantation lists, and data that may exclude people from taking part in modern life, such as no-fly lists (especially the recorded reason) or account blocking by gatekeepers like Google (to illustrate the problem: try buying a bus ticket in Strasbourg without a Google or Apple account).
Risk for substantial financial losses
When your own or your institution’s existence is at financial risk, it is probably worth investing a couple of days of effort in formal architecture verification. This is basic risk management. Given that in simple cases formal architecture verification takes less than a week of work for the whole system, it is probably a worthwhile investment for risks with an estimated loss of at least 100k€ and decent probability (once in 3 years), or maybe even 50k€ with high probability (once a year). These are obviously example values, which you need to reevaluate in your own risk management.
From this abstract risk management perspective, let’s come back to some examples. We have two examples that we think indisputably make sense: (stock) trading software can burn a lot of money very quickly with little recourse possible. Space exploration is another area where usually quite a lot of money is at risk, sometimes with little to no way of fixing things on the go.
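The arithmetic behind these example values can be sketched in a few lines of Python. The loss figures and frequencies are the ones from the paragraph above; the day rate and the five working days are purely hypothetical assumptions for illustration, and the comparison deliberately ignores that verification reduces a risk rather than eliminating it.

```python
def expected_annual_loss(loss_eur, events_per_year):
    """Expected loss per year: size of the loss times how often it occurs."""
    return loss_eur * events_per_year


# Example values from the text above.
decent_probability = expected_annual_loss(100_000, 1 / 3)  # once in 3 years
high_probability = expected_annual_loss(50_000, 1.0)       # once a year

# Hypothetical assumption: one week of work at an assumed day rate.
ASSUMED_DAY_RATE_EUR = 1_500
ASSUMED_WORKING_DAYS = 5
verification_cost = ASSUMED_DAY_RATE_EUR * ASSUMED_WORKING_DAYS

print(f"Expected annual loss (100k€, once in 3 years): {decent_probability:,.0f} €")
print(f"Expected annual loss (50k€, once a year):      {high_probability:,.0f} €")
print(f"Assumed one-off verification cost:             {verification_cost:,.0f} €")
```

Under these assumptions, both expected annual losses are several times the one-off cost of the verification. Plug in your own figures before drawing conclusions.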
Risk for life and limb
This is an obvious one. Whether it is a pacemaker, an insulin pump app, a radiation treatment machine, a vehicle’s control-by-wire system, a CNC door-shut-off system or a fire alarm system, formal architecture verification really should be done. It should obviously not be the only quality assurance measure undertaken. A non-exhaustive list of additional examples we could think of: electronic prescription systems, patient health records, surgery ticketing systems, or presence systems in mines that serve as the basis for the go/no-go decision on blasting.
Risk for huge disaster
Remember when Amsterdam was nearly flooded in 2023? That system, as well as other lock control systems, would be a good candidate for formal architecture verification. And it wouldn’t hurt to include the manual operation of the locks and the trip to get there in the architecture either. We think that could have prevented that incident.
Other candidates in this category are systems guarding against reactor meltdowns, emergency power on (cruise) ships, disaster notification systems or, to return to natural disasters, earthquake warning systems.
Wrap Up
Especially, but not exclusively, in health-related applications, we see formal architecture verification as a worthwhile QA measure that should become the default. If you want to see the technical details of how this looks in practice, stay tuned for another post that goes into them.