Monday, May 9, 2011

Dealing with Ambiguity: Zero does not mean anything

In software we often come across complex, difficult scenarios that can muddle our thinking. One of the key skills to learn is how to deal with ambiguity.

Basically, if the data we have is not adequate to reach a satisfactory conclusion, we need to generate the additional information the computer needs to understand what has to be done. It follows that simply assuming something should be done a certain way is the wrong thing to do. In such scenarios, always add extra properties that go down to the most basic level and can be parsed to understand the requirements clearly.

A simple example of this is how we deal with nullable types, enums, etc. Enums make it easier to assign a meaning to an esoteric number that would otherwise be unreadable and unmaintainable. Nullable types for values like dates and numbers let us know for sure that nothing was present to begin with. In a similar manner, we often abuse zero to mean something in applications. Zero is zero; it means nothing. A number can be initialized to zero to begin with, so even if we use nullable types, it is still good practice to say that 0 = default = nothing.
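As a rough illustration of the paragraph above, here is a minimal Python sketch. The names Order, OrderStatus, and describe are hypothetical and chosen only for this example; the point is that the enum reserves 0 for "nothing set" and the optional date distinguishes "never recorded" from any real value.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class OrderStatus(Enum):
    # 0 is reserved for "default / nothing set"; real meanings start at 1.
    NONE = 0
    PENDING = 1
    SHIPPED = 2


@dataclass
class Order:
    # None says "no date was ever recorded", which no real date can express.
    shipped_on: Optional[date] = None
    status: OrderStatus = OrderStatus.NONE


def describe(order: Order) -> str:
    """Explain an order without guessing at missing information."""
    if order.status is OrderStatus.NONE:
        return "status not set"
    label = order.status.name.lower()
    if order.shipped_on is None:
        return f"{label}, no ship date recorded"
    return f"{label}, shipped on {order.shipped_on.isoformat()}"
```

Because zero never carries a business meaning here, a freshly initialized record cannot be mistaken for one that somebody deliberately filled in.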

What is even more interesting is that all these concepts boil down to something as simple as naming conventions. I have seen reams of seemingly meaningless code start making sense after I renamed "xyz" to something more meaningful like "index" or "counter". In this case, the name of the variable was making its usage ambiguous. I gave it a concrete name and everything fell into place.

Another aspect of this that confuses many engineers is that most of the time, when something feels ambiguous, it is because we do not have enough inputs. The existing inputs are not adequate for the program to assume something and then exhibit a new behavior or execute some logic. Most often in such cases, we need to pass these new inputs or parameters right from the user end, through the various layers, to the code that needs to decide what to do. It is difficult for a novice engineer, or even an experienced one, to realize that ambiguity is tackled by removing it, and that one way of doing so is to add more inputs.
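To make this concrete, here is a small Python sketch under invented assumptions: a "remove record" request that could mean either archiving or purging. The names RemovalMode, handle_request, service_remove, and storage_remove are hypothetical. Instead of letting the lowest layer guess, the user-facing layer requires an explicit input and every layer passes it along.

```python
from enum import Enum


class RemovalMode(Enum):
    ARCHIVE = "archive"
    PURGE = "purge"


def storage_remove(record_id: int, mode: RemovalMode) -> str:
    # The lowest layer decides from the explicit input, not from a guess.
    if mode is RemovalMode.ARCHIVE:
        return f"record {record_id} archived"
    return f"record {record_id} purged"


def service_remove(record_id: int, mode: RemovalMode) -> str:
    # The middle layer carries the input through unchanged.
    return storage_remove(record_id, mode)


def handle_request(params: dict) -> str:
    # The user-facing layer refuses to proceed without a stated intent.
    if "mode" not in params:
        raise ValueError("'mode' is required: 'archive' or 'purge'")
    mode = RemovalMode(params["mode"])
    return service_remove(int(params["record_id"]), mode)
```

The extra parameter costs a little plumbing in each layer, but no layer has to assume what the caller meant.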

I was having a conversation with a friend of mine about a very complex problem. I understood only part of the detail, but during the conversation it became apparent that there was an underlying ambiguity about what decision had to be made, and my friend was struggling to come up with an assertion that such and such would mean X and such and such would mean Y. I interjected and said that the situation was ambiguous and that the assertion could not be made. We added one more column which would say concretely what would result in X and what would result in Y.

Sunday, April 10, 2011

Configuration & State in Web Service APIs

I was recently commended on the strength of my API design, with the remark that elsewhere we had not done it the way I implemented my method. So I thought I would add a concrete "modern" example of configuration and state considerations in web service APIs.

Basically, we have an API which lets the customer send us "activate alarm" requests. When the alarm is activated, the device sends back the alarm status. When I designed the API, the thought came up: why not simply use one database column for both considerations, "what the customer asked us to do" and "what the device told us its status was"?

The reasoning was that if we are in fact immediately sending the device an alarm activation message, it will surely go into the alarm state. Why have extra fields to capture this information?

I put my foot down. Nope, nada, we won't do it that way. The "activate alarm" request is user configuration; the state of the alarm activation in the device is a separate entity. The two are different and should not share the same storage space.
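A minimal Python sketch of that separation follows. It is not the actual implementation; AlarmRecord, its fields, and the helper functions are hypothetical names used only to show the idea of one field per concern.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlarmRecord:
    # Configuration: what the customer asked us to do.
    requested_active: bool = False
    # State: what the device last reported; None until it reports anything.
    reported_active: Optional[bool] = None


def on_customer_request(record: AlarmRecord, active: bool) -> None:
    # Only the configuration field changes when the customer calls the API.
    record.requested_active = active


def on_device_report(record: AlarmRecord, active: bool) -> None:
    # Only the state field changes when the device reports back.
    record.reported_active = active


def diagnose(record: AlarmRecord) -> str:
    """Compare the request with the report; possible only with two fields."""
    if record.reported_active is None:
        return "device has not reported yet"
    if record.reported_active == record.requested_active:
        return "device matches request"
    return "device disagrees with request"
```

With a single shared column, the "disagrees" and "has not reported yet" cases would be indistinguishable from a normal activation.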

Eight months after the API was deployed, we had a customer issue that we were able to debug easily because we were storing the values separately. It turned out to be a device issue.

Elsewhere, we shared the storage space, and now everyone there is in a tizzy trying to fix that entire implementation.

So, long story short: keep configuration and state separate. Always.