Tuesday, March 8, 2016

Introducing the Big P Notation P(N)

An excellent article on Big O Notation:

https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/

I have realized that with the lowered cost of computing, and with computers on almost every desk in the world, the focus may have to shift to what I would like to call the Big P notation, P(N).

Back in the old days, when computers, memory and disk were super expensive, it was necessary to determine the complexity of an algorithm and measure it so that we could optimize it. There are definitely cases where we still need to do this, like when scaling up to a million users.

Today, however, the focus might need to shift to the P(N) notation, which has become more and more important over the years.

If O(N) refers to time complexity, based loosely on how much CPU time an algorithm will consume, P(N) refers to people time complexity: how much human time is spent creating, maintaining and optimizing an algorithm.

If 3 people work on the algorithm, it becomes P(3). If the algorithm passes through three stages of the development process, you get P(5) [Dev - 1 person, Code Review - 2 people, Unit Tests - 2 people].

People are the most costly resource in computing today - I am referring to the hundreds of thousands of well-paid programmers. Any algorithm written today needs to consider people costs as well.

If I can write an algorithm which is easy to understand, digest and maintain even by the cheapest programmer I can hire, that might give more bang for the buck than worrying about how much CPU time the algorithm will consume.

Sometimes a very computationally intensive algorithm is actually preferable, because it comes with a much lower P(N) value.

The best case I can think of is an engine I wrote around 2004 which used reflection (very time-consuming on the CPU) to validate DAL code before it hit the database. It took time to run, but it raised many problems before the code ever reached the Oracle database, thereby reducing development time significantly.
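To give a flavour of the idea, here is a minimal sketch in modern C# of what such an engine might look like. This is a reconstruction, not the original code: the ColumnAttribute metadata, the MaxLength rule and the DalValidator name are all hypothetical.

using System;
using System.Reflection;

// Hypothetical mapping metadata - the real 2004 engine used its own.
[AttributeUsage(AttributeTargets.Property)]
public sealed class ColumnAttribute : Attribute
{
    public int MaxLength { get; set; } = int.MaxValue;
}

public static class DalValidator
{
    // Walk every mapped property via reflection (slow on the CPU, but it runs
    // before any SQL is sent) and fail fast if a value cannot fit its column.
    public static void Validate(object entity)
    {
        foreach (PropertyInfo p in entity.GetType().GetProperties())
        {
            var column = p.GetCustomAttribute<ColumnAttribute>();
            if (column == null) continue; // not a mapped column

            if (p.GetValue(entity) is string s && s.Length > column.MaxLength)
                throw new InvalidOperationException(
                    $"{entity.GetType().Name}.{p.Name} exceeds {column.MaxLength} characters.");
        }
    }
}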

I would say it had a very low P(N) value - P(5): I wrote it once, got it code reviewed, got it unit tested, and it "just worked", although it was relatively slow.

Because of some misguided notion of performance, we removed this engine and lost all the validation logic, and each person had to write the validations manually (sometimes missing many of them). Hence the P(N) value became P(5N): five counts of human effort for every person on my team. For a team of 10, that is 50 units of effort where there used to be 5.

So we reduced the O(N) value significantly, but massively increased the P(N) value, leading to a very unstable, though fast, application.

Summary: programs were written for people, not the other way around. Maybe, now that computing costs have gone down significantly and people costs rise with inflation, we need to consider P(N) more than O(N) in many cases.

Monday, March 7, 2016

Lesson 8: No solution can be generalized to be true in all situations. It depends on the system design.

We recently went through a case where code reading a value stored within a ConcurrentDictionary could see a stale value.

We had an interesting discussion about why this is ok - and, more interestingly, about why the update to this value still has to be thread safe.

(1) Why a stale value is OK in some cases

This is ok because, in some cases, the design of the system is like a traffic light: some vehicles may go even when the light is orange, some may flout the "rules", and still the system can function.

If writes are consistent, reads can be stale in some systems. The critical condition for allowing stale reads is that the stale value has no permanent effect on the system, and that the cost of keeping it fresh is too high.

If the stale value caused permanent changes, then it is not ok (persistence, database transactions, etc.).

If the stale value is a signal which is read often, and the effects of it being stale are very temporary in nature, then for performance reasons it is ok to read the stale value even while it is being updated in the ConcurrentDictionary.
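As a rough sketch of the pattern (the signal name and the threshold are made up), the writer keeps the value consistent while the reader happily acts on a value that may already be an update or two behind:

using System;
using System.Collections.Concurrent;

var signals = new ConcurrentDictionary<string, int>();

// Writer: the write itself is always consistent - no torn or lost updates.
signals.AddOrUpdate("pendingWork", 1, (k, old) => old + 1);

// Reader: by the time we act, the value may be slightly stale. That is fine
// here because it only tunes transient behaviour and leaves no permanent
// mark on the system.
if (signals.TryGetValue("pendingWork", out int pending) && pending > 100)
{
    Console.WriteLine("Backing off; a slightly stale count just delays this by one cycle.");
}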

I see this as a case where computers and programmers do not do well with grey areas. Not all answers can be 0 or 1. Sometimes grey areas exist and provide optimal solutions to real-world problems.

(2) Why the update itself should be thread safe

If this ConcurrentDictionary stores a number and we want to update that number, we want to keep the operation thread safe. If it is not, then depending on how many threads write at the same time, the number could careen across wrong values, leading to a permanently bad state.
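A sketch of the failure mode (the names are made up): the dictionary itself is thread safe, but a compound read-modify-write through its indexer is not.

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var counters = new ConcurrentDictionary<string, int>();
counters["hits"] = 0;

// NOT thread safe: each indexer access is safe on its own, but two threads
// can read the same old value and both store old + 1, silently losing an
// increment - a permanently wrong count.
Parallel.For(0, 10_000, _ => counters["hits"] = counters["hits"] + 1);

Console.WriteLine(counters["hits"]); // very likely prints less than 10000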

This is similar to a traffic signal entering a permanent state of malfunction, where only some lanes turn green all the time for no particular reason other than software error, irrespective of the traffic load on each side of the signal.

We want the number to be correct on a permanent basis, but allow the reads to be stale sometimes for performance reasons.

A great way to do this looks to be the AddOrUpdate method. I have not verified this myself, but something like the below should work fine:

// Start the counter at 1 on first insert; increment atomically on later updates.
concurrentDictionary.AddOrUpdate(key, 1, (k, oldValue) => oldValue + 1);
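Re-running the same 10,000-increment experiment with AddOrUpdate (again a sketch, not something I have verified against a real workload), every increment should survive:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var counters = new ConcurrentDictionary<string, int>();

// AddOrUpdate retries internally on contention, so no increment is lost
// and the final value is exactly 10000.
Parallel.For(0, 10_000, _ =>
    counters.AddOrUpdate("hits", 1, (k, old) => old + 1));

Console.WriteLine(counters["hits"]); // 10000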

So making the update thread safe in this case, even while tolerating stale reads, is not really being inconsistent - it follows from the same philosophy.