Wednesday, November 11, 2015

Software Engineering is dual natured: both a particle and a wave

The title of this post may be pretty obtuse and hard to make sense of, so I'll try to explain. This post is meant to disparage everyone who evaluates software engineers only by checking whether they know how to sort an array fast, or by asking algorithm questions.

The fundamental reason this is a bad idea is that, much as in most natural phenomena and in physics in particular, we are making the very wrong assumption that because someone knows everything about the building blocks of something, that person can be a great engineer building applications.

This is completely wrong. And I'll explain why...

In the beginning there were Newton's laws, which could explain most natural phenomena at the physical level (laws of motion, etc.). This is what I call the "wave".

Later, it was found that these laws break down at subatomic levels (the basic building blocks). This is what I call the "particle".

Essentially, physicists realized that two completely different sets of rules govern what happens at the subatomic level and what happens at the larger scale of physical phenomena.

The essence of this is that even with a deep understanding of how something works internally, you cannot use those equations in your daily life. For that, you rely on a much simpler set of laws, like Newton's.

Of course, you could work at the subatomic level all the time, but that would mean spending all your time figuring out how those very complex equations map onto the physical level.

How easy would it then be to do anything practical, fast? Try modelling the path of a ball thrown at 45 degrees using quantum mechanical equations...
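To make the contrast concrete, here is the Newtonian "wave-level" treatment of that same thrown ball. This is a minimal sketch assuming flat ground and no air resistance; the quantum mechanical version of the same calculation would be hopelessly impractical.

```python
import math

def projectile_range(speed_m_s: float, angle_deg: float, g: float = 9.81) -> float:
    """Horizontal range of a projectile on flat ground, no air resistance.

    Classic Newtonian kinematics: range = v^2 * sin(2*theta) / g.
    """
    angle = math.radians(angle_deg)
    return speed_m_s ** 2 * math.sin(2 * angle) / g

# A 45-degree launch maximises range for a given speed.
print(round(projectile_range(20.0, 45.0), 2))  # ~40.77 m for a 20 m/s throw
```

One formula, three lines of code: that is the power of working at the level of abstraction the problem actually lives at.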

This is also why these guys who know the internals very well often cannot produce any tangible results on big problems. They are applying particle physics in situations where they are supposed to use Newton's laws.

I may have mentioned these very thoughts in some previous posts. I am writing this one because I recently ran into a concrete example of it in my own work.

The particle part of software is the underlying data structures and algorithms internal to a high-level framework like .NET and C#.

The wave part of software is when we build complex software that performs some workflow using these building blocks. This is closer to a process in Applied Electronics & Instrumentation than to linked lists and sorting algorithms.

A very distinguished engineer tried to solve an out-of-memory issue using the particle methodology: use a more efficient data structure to reduce memory usage. He tried for months and failed, and the software could never handle more than a certain amount of data.

When I saw the problem, I approached it from the wave perspective. Irrespective of the data structures or the sorting algorithms used, the problem was the way the entire process worked. It was like a factory where pipes feed chemicals into the plant to manufacture something else: the input flow must be regulated to match consumption. If it is not, the process overloads and runs out of memory.
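The flow-regulation idea can be sketched in a few lines. This is a minimal illustration, not the original system's code: a bounded queue blocks the producer whenever the consumer falls behind, so memory stays capped no matter how much data flows through. The names and sizes here are made up for the example.

```python
import queue
import threading

BOUND = 100           # maximum in-flight items; this caps memory usage
TOTAL_ITEMS = 10_000  # far more data than we ever hold in memory at once

buffer = queue.Queue(maxsize=BOUND)
processed = []

def producer():
    for i in range(TOTAL_ITEMS):
        buffer.put(i)   # blocks when the buffer is full: backpressure
    buffer.put(None)    # sentinel: no more input

def consumer():
    while True:
        item = buffer.get()
        if item is None:
            break
        processed.append(item)  # stand-in for the real processing step

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()

print(len(processed))  # 10000: all data handled, never more than 100 buffered
```

No clever data structure is involved; the fix is entirely in how the process regulates itself, which is the point of the factory analogy.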

I fixed the problem, which had remained open for 8 years, using no sorting algorithms or specialized data structures. The problem is now completely resolved.

It is true that understanding the building blocks of software can help in rare situations where higher-level code does not behave the way you expect because of how it is coded internally. However, that situation is extremely rare; most software projects fail long before they reach that stage. Just think about how buggy Android is, and it is built by people from the topmost universities with the most degrees. The reason for this is, firstly, the arrogance of hiring only from a few top colleges to build software that will be used by everyone. It is also because they think too much at the particle level when they should actually be thinking at the wave level.

The best examples of this kind of folly are Google Wave, Android, and even the horrible-to-use Gmail.

More Data

It is always a struggle to provide a clearer explanation of what I am trying to say here. Let me give you a few more examples which clearly show how a system is not the sum of its parts.

  • A factory consists of many parts, mechanical, non-mechanical and electrical, all of which work together for the most part to produce the output. The behavior of the factory cannot be calculated as the sum of its parts, because when parts come together they interact in new ways which may not have been foreseen. Even if they were, as we keep adding more parts we eventually reach a point where we have to consider the entire system holistically rather than try to optimize atomically.
  • For software engineers who love to go back to the building blocks because it lets them understand "better" how things work: why not go back to the hardware level? No software runs in isolation. It always runs on top of some hardware, which sometimes changes the behavior of the software. So, if we want to "really", "comprehensively" understand how best to optimize the software or how it works, perhaps we should first stop at machine language and optimize at that level with compilers, and then go deeper into how the hardware handles the instructions at the chip level and fix that as well.

    If this seems nonsensical to you, then so should the whole talk of data structures, because data structures are static and there is only so much that can be done at that level. Running software consists of rich interactions between live instances of objects with complex behavior. This is best optimized as a whole process, rather than as the sum of its parts.