[Please read Introduction first]
The difference between configuration and state is the most basic software concept that almost nobody has heard of. I had the good fortune of working in a company where I was exposed to great engineers: passionate, experienced people who had decades of knowledge in software development yet were humble enough to listen to you. This is the most precious jewel I have picked up in my working life. It comes from a renowned telecom CEO who has contributed to several basic RFCs in the industry.
"Configuration is what the user has entered. It must only be changed by the end user. State is what the machine has reported; it must not be modified in any manner."
This is the basic rule, which I will now explain by example. It all started as a heated discussion about a reference value entered by a user into the system. The machine uses this reference value to do something, and there are conditions under which the value the machine actually uses must increase or decrease. The machine also reports back a value. The VP of Engineering wanted to use just one variable for the first two and another one for the returned value: the user enters it, the machine changes it as needed and adjusts itself. This violates the rule of configuration and state. To make a long story short, we ended up understanding why and implemented it with three separate values, while the VP got demoted.
What is so great about this? Why not use one variable? The reason is that there are three concepts here:
- The configuration value which the user entered. If the user sees the value changed the next day without having changed it, they will be confused.
- The "real" reference value used by the machine. Initially it is the configuration value, but the moment the machine needs to modify it, we must store it as a separate value, so that we can track that something has changed and see the difference between the initially configured value and the current value.
- The value returned by the machine. This is "state": the value picked up by a sensor and returned to the user. We cannot modify it for any reason, except maybe localization or unit conversion for display, because that would be falsifying the truth.
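The three concepts above can be sketched in a few lines of Python. This is only an illustration: the class, its field names and the numbers are hypothetical and are not taken from the system discussed here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Controller:
    """Hypothetical controller that keeps the three values apart."""

    # Configuration: what the user entered. Only the user may change it.
    configured_reference: float
    # The reference the machine is really using. None means "same as configured".
    effective_reference: Optional[float] = None
    # State: raw sensor readings. Appended as reported, never edited.
    readings: List[float] = field(default_factory=list)

    def current_reference(self) -> float:
        """Return the value the machine is working with right now."""
        if self.effective_reference is None:
            return self.configured_reference
        return self.effective_reference

    def machine_adjust(self, delta: float) -> None:
        """The machine moves its own working value, never the user's."""
        self.effective_reference = self.current_reference() + delta

    def record_reading(self, value: float) -> None:
        """Store a sensor reading exactly as it was reported."""
        self.readings.append(value)

    def drift(self) -> float:
        """How far the machine has moved away from what the user asked for."""
        return self.current_reference() - self.configured_reference


controller = Controller(configured_reference=20.0)
controller.machine_adjust(2.5)
controller.record_reading(21.7)
```

After the adjustment the user still sees the 20.0 they entered, the machine works with 22.5, and the sensor reading sits untouched in its own list. With a single shared variable, the difference returned by `drift()` could not be recovered.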
I didn't understand this concept initially. But unlike many of the arrogant "know it all" software engineers and architects today, I looked at my experienced colleagues nodding their heads, and I realized there was something here I had to learn, because it seemed to be important. So I went back to the person and spent time understanding what exactly he was trying to say. When this person finished his contract and left the company, I went to him and said something which I have never said to anybody else, and which I hope I can say to someone else tomorrow:
"Working with you has been the greatest privilege in my life, because you have made me realize that one should be humble, and there is always something more to learn. We can never underestimate someone, or think we know everything, because when we meet people with real knowledge, we realize we were fools, and we were so ignorant before and there is so much out there to learn, if only we had an open mind."
Once you understand the concept fully, it is amazing to see the number of places where it crops up. Recently I was involved in a design which needed us to bring data into a database from multiple providers. I felt our initial database design was "wrong" in some way, but I could not put my finger on the exact reason. What we were doing was merging configuration and state into a single table with shared columns.
If you have a table which is supposed to define the various fields sent to us from remote sensors, this table must contain the definition as we define it. We had a column called "Name" which was also used as the default display name shown to the user. This was wrong, because the default display name shown to a user is a different concept from the "real" name of the field defined in the row. If we merge the two concepts, we get a skewed database where we cannot tell what the real field name is, because we modified the value to make it understandable to the user. This was configuration overriding state. It would have undermined the data-driven model we were trying to build, because one of the primary tables driving that approach would corrupt the information coming in.
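As a rough sketch of the alternative, the real field name and the display name can live in separate columns. The table, column and field names below are made up for illustration (using Python's built-in `sqlite3`); they are not the schema we actually had.

```python
import sqlite3

# In-memory database, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE field_definition (
        field_name   TEXT PRIMARY KEY,  -- the real name of the field, never edited
        display_name TEXT               -- optional label chosen by a person
    )
    """
)
conn.execute("INSERT INTO field_definition VALUES ('tmp_c_01', NULL)")


def rename_for_display(field_name: str, label: str) -> None:
    """Change only the label shown to users; the real field name stays intact."""
    conn.execute(
        "UPDATE field_definition SET display_name = ? WHERE field_name = ?",
        (label, field_name),
    )


def label_for(field_name: str) -> str:
    """Return the display label, falling back to the real name if none is set."""
    row = conn.execute(
        "SELECT COALESCE(display_name, field_name) "
        "FROM field_definition WHERE field_name = ?",
        (field_name,),
    ).fetchone()
    return row[0]
```

Calling `rename_for_display("tmp_c_01", "Boiler temperature")` changes what the user sees, while `field_name` still holds the value that incoming data is matched against.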
This popped up in another place, where we had to store something the user entered into the system, while another field was populated by data coming in from another system. The DBA wanted to merge the two fields into one column, but I stood my ground: these are two different things and cannot be merged into one column. One is what the user entered (configuration), the other is what the other system gave us (state). One must be stored and used for a purpose, the other must be logged. Even if they look like very similar concepts (or even the same thing), we cannot merge them into one variable or store them in the same place, because that is a basic flaw in the design.
There is one more odd case, where the right route in a very complex design decision becomes clear once we apply the separation between configuration and state. Say you have a device in the field, and to configure it you go through the UI and set a value describing what the device is. This value is configuration. Now, when the device sends data to the server, you cannot use the stored configuration to determine what device sent the message, because what the device sends is state. Not only should it be stored separately; what the device is should come as part of the packet and not be determined from configuration entered by a person in the UI. The problem with identifying state based on configuration is that another user may change the configuration while the device in the field has not physically changed, or the configuration may get lost from the table, and now we don't know what the device is. It could even be that an SQL statement changed the configuration to a wrong value, and now we are doing the wrong things based on it. I guess what this really means is that state information should be self-sufficient for storage on the server, and also "stateless" in some respects. We cannot decide what to do with state based on a value which is user configured. This must be determined from the state packet itself.
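A minimal sketch of this idea, assuming a hypothetical packet that carries its own device identity. The packet fields, the `ui_config` table and the function names are invented for illustration and do not describe any real protocol.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class StatePacket:
    """Hypothetical packet: the device says what it is, in the packet itself."""

    device_id: str
    device_type: str  # reported by the device, not looked up from the UI
    value: float


# Configuration entered by a person through the UI. It may be stale or wrong.
ui_config: Dict[str, str] = {"dev-17": "pressure"}

# State is stored as it arrived, separately from configuration.
state_log: List[StatePacket] = []


def ingest(packet: StatePacket) -> str:
    """Store the packet and decide how to treat it from the packet alone."""
    state_log.append(packet)
    return packet.device_type


def config_mismatches() -> List[str]:
    """List devices whose reported type disagrees with the UI configuration."""
    return [
        p.device_id
        for p in state_log
        if ui_config.get(p.device_id) != p.device_type
    ]
```

If `dev-17` reports itself as a temperature device, `ingest` treats it as one regardless of what the UI says, and `config_mismatches()` flags the disagreement for a person to resolve instead of silently trusting either side.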
You may think you know all about software because you can code well, or know the technology well. But without understanding and using this core concept in your designs, you become just another "good" coder. This is lesson 1, because it is the perfect example of a concept which has nothing to do with technology, resembles philosophy and yet should be one of the binding principles of good software development.