Tuesday, March 8, 2016

Introducing the Big P Notation P(N)

An excellent article on Big O Notation:

https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/

I have realized that with lowered cost of computing and with computers in almost all desks in the world, the focus may have to shift to the Big P notation P(N) as I would like to refer to it.

Back in the old days when computers, memory and disk were super expensive, it was necessary to determine the maximum complexity of an algorithm and measure it so that we can optimize the algorithm. Definitely, there are cases where we still need to do this like when scaling upto a million users, etc.

However, today focus might need to shift to the P(N) notation which has become more and more important over the years.

If O(N) refers to time complexity, based loosely on how much CPU time an algorithm will consume, P(N) refers to people time complexity - which is based on how much human time is spent on creating, maintaining and optimizing an algorithm.

If 3 people work on the algorithm it becomes P(3). If you work on an algorithm 3 times as part of the development process, again you get P(5) [Dev - 1, Code Review - 2 people, Unit Tests - 2 people]

People are the most costly resource for computing today. I refer to the hundreds and thousands of well paid programmers. Any algorithm written today needs to consider the cost benefits as well.

If I can write an algorithm which is easy to understand, digest & maintain even by the most cheapest programmer I can hire, that might give you more bang for the buck than to worry about how much CPU time the algorithm will consume.

Sometimes, there are cases where a very computationally intensive O(N) algorithm is actually less preferred because of the high P(N) value.

The best case I can think of is an engine I wrote around 2004 which used reflection (very time consuming on the CPU) to validate DAL code before it hit the database. It took time to run, but it would raise many problems before the code would even hit the Oracle database thereby reducing development time significantly.

I would say it has a very low P(N) value. P(5) - I wrote it once, got it code reviewed, got it unit tested, and it "just worked" although it was relatively slow.

Because of some misguided notion of performance, we removed this algorithm and lost all the validation logic, and each person had to write the validations manually (sometimes missing many validations). Hence the P(N) value became P(N raised to 5). 5 counts of human effort for every person in my team.

So we reduced the O(N) value significantly, but massively increased P(N) value leading to a very unstable, though fast application.

Summary: Programs were written for people and not the other way around. Maybe, now that computing costs have gone down significantly, and people costs rise with inflation, we need to consider P(N) more than O(N) for many cases.

No comments:

Post a Comment