Sunday, June 13, 2010

Lesson 1: The Difference between Configuration & State


[Please read Introduction first]

The difference between configuration and state is the most basic software concept which nobody has ever heard of. I had the good fortune of working in a company where I was exposed to great engineers: passionate, experienced people who had decades of knowledge in great software development, yet were humble enough to listen to you. This is the most precious jewel which I have learnt in my work life. It comes from a renowned telecom CEO who has contributed to several basic RFCs in the industry.

"Configuration is what the user has entered. It must only be changed by the end user. State is what the machine has reported; it must not be modified in any manner."


This is the basic rule, which I will now explain by example. It all started as a heated discussion which revolved around a reference variable entered by a user into the system. The machine uses this reference value to do something, and there are conditions where the value used by the machine as the reference must increase or decrease. The machine also reports back a value. The VP of Engineering wanted to use just one variable for the first two (the value the user enters and the value the machine actually uses) and another one for the returned value: the user enters it, and the machine changes it as needed and adjusts itself. This violates the rule of configuration and state. To make a long story short, we ended up understanding why and implementing it according to the rule, while the VP got demoted.

What is so great about this? Why not use one variable? The reason is that there are three concepts here:
  1. The configuration value which the user entered. If he sees the value changed the next day, and he did not change it, he would be confused.
  2. The "real" reference value used by the machine. Initially it is the configuration value, but the moment the machine needs to modify it, we need to store it as a separate value, so that we can track that something has changed and see the difference between the initially configured value and the current value.
  3. The value returned by the machine. This is "state": the value picked up by a sensor and returned to the user. We cannot modify it for any reason, except maybe for localization or unit conversion (such as metric), because that would be falsifying the truth.
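
To make the three concepts concrete, here is a minimal Python sketch. The class name ReferenceControl, its fields and its methods are all invented for illustration; they are not from the system described here. In particular, resetting the working value when the user enters a new one is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReferenceControl:
    """Hypothetical model that keeps the three concepts in separate fields."""

    # 1. Configuration: only the user changes this.
    configured_reference: float
    # 2. The "real" reference the machine is currently using.
    effective_reference: float = field(init=False)
    # 3. State: values reported by the sensor, stored as received.
    readings: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Initially the machine uses exactly what the user configured.
        self.effective_reference = self.configured_reference

    def user_configure(self, value: float) -> None:
        # Assumption: a new user entry also resets the working value.
        self.configured_reference = value
        self.effective_reference = value

    def machine_adjust(self, delta: float) -> None:
        # The machine moves its working value; the user's entry is untouched.
        self.effective_reference += delta

    def record_reading(self, value: float) -> None:
        # State is appended exactly as reported and never rewritten.
        self.readings.append(value)

    def drift(self) -> float:
        # How far the machine has moved from what the user asked for.
        return self.effective_reference - self.configured_reference


control = ReferenceControl(configured_reference=50.0)
control.machine_adjust(2.5)
control.record_reading(51.8)
print(control.configured_reference, control.effective_reference, control.drift())
```

Because the configured value and the working value live in different fields, the user still sees what he entered, and the difference between the two can be reported instead of silently disappearing.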
"Good software systems always separate the concepts of configuration and state. This is how really stable, well designed systems become that way. It is also something many telecom engineers know about because their firmware has to be better than desktop software."

I didn't understand this concept initially. But unlike many of the arrogant "know it all" software engineers/architects today, I looked at my experienced colleagues nodding their heads, and I realized there was something here which I had to learn, because it seemed to be important. So I went back to the person and spent time understanding what exactly he was trying to say. When this person finished his contract and left the company, I went to him and said something which I have never said to anybody else, and which I hope I can say to someone else tomorrow:


"Working with you has been the greatest privilege in my life, because you have made me realize that one should be humble, and there is always something more to learn. We can never underestimate someone, or think we know everything, because when we meet people with real knowledge, we realize we were fools, and we were so ignorant before and there is so much out there to learn, if only we had an open mind."


Once you understand the concept fully, it is amazing to see the number of places where this crops up. Recently I was involved in a design which needed us to bring some data into a database from multiple providers. Our initial database design felt "wrong" to me in some way, and I could not put my finger on the exact reason. What we were doing was merging configuration and state into a single table with shared columns.


If you have a table which is supposed to define the various fields sent to us from remote sensors, this table must contain the definition as we define it. We had a column which we referred to as "Name", which would also be used as the default display name shown to the user. This was wrong, because the default display name shown to a user is a different concept from the "real" name of the field defined in the row. If we merge the two concepts, we get a skewed database where we cannot tell what the real field name is, because we modified the value to make it understandable to the user. This was configuration overriding state. It would have damaged the data-driven model which we were trying to build, because one of the primary tables driving that approach would corrupt the information coming in.
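
As a rough sketch of the separation, using Python's built-in sqlite3 module: the table name field_definition, its two columns and the sample values are hypothetical, chosen only to illustrate the idea, and are not the schema we actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE field_definition (
        field_name   TEXT PRIMARY KEY,  -- real name of the field, never edited for display
        display_name TEXT               -- optional label shown to the user (configuration)
    )
    """
)
conn.execute("INSERT INTO field_definition VALUES (?, ?)", ("tmp_c_01", None))

# A user changes the label; the real field name stays untouched.
conn.execute(
    "UPDATE field_definition SET display_name = ? WHERE field_name = ?",
    ("Boiler temperature", "tmp_c_01"),
)

# Fall back to the real name when no label has been configured.
row = conn.execute(
    "SELECT field_name, COALESCE(display_name, field_name) FROM field_definition"
).fetchone()
print(row)
```

With two columns, changing the label can never destroy the information about what the field really is.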


This popped up in another place, where we had a requirement to store something which the user entered into the system, while another field was populated by data coming into our system from another system. The DBA wanted to merge the two fields into one column, but I stood my ground, saying that these are two different things and cannot be merged into one column. One is what the user entered (configuration); the other is what the other system gave us (state). One must be stored and used for a purpose; the other must be logged. Even if they look like very similar concepts (or even the same thing), we cannot merge them into one variable or store them in the same place, because that is a basic flaw in the design.

There is one more odd case, where a very complex design decision becomes clear once we apply the separation between configuration and state. Say you have a device in the field, and to configure it, you go through the UI and set a value describing what it is. This value is configuration. Now, when the device sends data to the server, you cannot use the stored configuration to determine what device sent the message, because what the device sends is state. Not only should it be stored separately; what the device is should come as part of the packet, and not be determined from the configuration done by a person in the UI. The problem with identifying state based on configuration is that another user may change the configuration while the device has not physically changed in the field, or the configuration may get lost from the table, and now we don't know what the device is. It could even be that an SQL statement changed the configuration to a wrong value, and now we are doing wrong things based on it. I guess what this really means is that state information should be self-sufficient for storage on the server, and also "stateless" in some respects. We cannot decide what to do with state based on a value which is user configured. This must be determined from the state packet itself.
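
A small Python sketch of a self-describing state packet follows. Everything in it (the StatePacket fields, the ui_config dictionary, the ingest function and the sample device names) is an assumption made up for illustration, not the protocol of any real device.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class StatePacket:
    """Hypothetical packet: the device describes itself in every message."""

    device_id: str
    device_type: str  # reported by the device, not looked up from the UI
    value: float


# Configuration entered by a person in the UI; it can be edited, lost, or wrong.
ui_config: Dict[str, str] = {"dev-17": "flow-meter"}

# State as received, stored without consulting the configuration.
stored_packets: List[StatePacket] = []


def ingest(packet: StatePacket) -> List[str]:
    """Store the packet exactly as reported and return any configuration mismatches."""
    stored_packets.append(packet)
    mismatches: List[str] = []
    configured_type = ui_config.get(packet.device_id)
    if configured_type is None:
        mismatches.append(f"{packet.device_id}: no configuration on record")
    elif configured_type != packet.device_type:
        mismatches.append(
            f"{packet.device_id}: configured as {configured_type}, "
            f"reports itself as {packet.device_type}"
        )
    return mismatches


mismatches = ingest(StatePacket("dev-17", "thermometer", 21.4))
print(stored_packets[0].device_type)
print(mismatches)
```

The packet is stored with the type the device reported. The configuration is only compared against it, so a wrong or missing configuration row produces a warning rather than a wrong interpretation of the data.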

You may think you know all about software because you can code well, or know the technology well. But without understanding and using this core concept in your designs, you become just another "good" coder. This is lesson 1, because it is the perfect example of a concept which has nothing to do with technology, resembles philosophy and yet should be one of the binding principles of good software development.

Introducing Software Concept Design & Modelling

I come up with my best ideas at night. And today, 06/13/2010, I am happy to introduce software concept design & modelling, or SCDM, as a new area in software engineering. This sits somewhere between requirements gathering and software design. Most people don't know that something like this is needed for software work. This is also the key to understanding what that "unquantifiable" thing is which makes some engineers better than others, even though they have fewer degrees or less knowledge or experience. Most people who have it don't even know that this is their key strength.

I never knew I had it. Then I worked at a company where I saw more senior people who had it too. I knew it was a great help in the software design process, but it was just unquantifiable.

"Software Concept Design & Modelling is all about understanding the requirements and designing the concepts around it, so that what we build as software makes sense as a whole, and provides value to the end user."

Sounds like meaningless jargon? Not really. To explain this in more detail, I will be providing concrete real-world examples of where this was done, and where it provided immense value to the software development process.

"Understanding Software Concept Design is equivalent to understanding the basic rules which make stable software. Such software has less entropy than others. Honing such skills is essential to make software development a repeatable success in the Enterprise."

But right now, I just want to pique your interest. What do these words really mean? They mean that, often, we fudge software design badly because we lose sight of what is really supposed to be built, by using big words or confusing the hell out of ourselves. In such situations, development either comes to a standstill or, worse, something gets built which turns out to be totally useless later.

The field of software concept design is not about technology specifics, or even domain specifics. It is about how we organize the concepts around a solution to ease the technical design. Requirements tell us what the user needs; technical design tells us how to implement a solution. SCD comes right in between, and it shows us how to effectively translate the requirements into a technical design.

This can totally change the direction of a software project:

1) Turns out that what we were discussing the past few weeks was of no use because as a complete solution, this makes no sense.

2) Somewhere along the line, we forgot what it is that we were building and why we were building it, because we were too focused on how to implement something.

The first time I encountered this dichotomy was when we were designing an application which would run some "tests" and produce data which would then be shown in various graphs and reports in the system as a reference value. There was a major flaw in this system, right through the technical design and implementation.

We were calling it a "test" all the time, and never realized that a test can either pass or fail. Even if it failed, we were using the results of the test as the reference value. What we should have realized was that, behind the thousands of variables, domain experts, and technical terminology, a failed test indicated a problem, and we could not use its value as a "reference" anymore. Without a valid reference value, the reports and graphs would have no meaning.
This is a good example of how a complete software development process with very intelligent people with lots of experience can completely fail, and the reasons for the failure really have nothing to do with the "technical design" other than that, there was NO conceptual design done here.
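
The missing concept can be sketched in a few lines of Python. The names RunResult and reference_from are hypothetical, and returning None for a failed run is just one possible way to say "there is no valid reference"; it is not what the original application did.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunResult:
    """Hypothetical result of one "test": it either passed or failed."""

    name: str
    passed: bool
    measured_value: float


def reference_from(run: RunResult) -> Optional[float]:
    """Return the measured value as a reference only if the test passed."""
    if run.passed:
        return run.measured_value
    # A failed test indicates a problem, so its value is not a reference.
    return None


print(reference_from(RunResult("calibration", True, 10.2)))
print(reference_from(RunResult("calibration", False, 93.7)))
```

Reports and graphs would then have to handle the "no reference" case explicitly, instead of plotting against a value which came from a failed test.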

Much of this may have been difficult to follow, or confusing because the issue seems so obvious. But we often sit quietly and listen to nonsense when everyone else is doing so, or when someone very senior is going in a specific direction. Simple things get convoluted, and everything seems directionless. All of these are as important as technical stuff.

In the next post, we will discuss the difference between configuration and state, why every engineer should know it, and why it is the most important thing which I never knew earlier. It is an excellent example of something which came down to a person as "tribal knowledge", "from experience", but can be considered a very significant concept in Software Conceptual Design.