Big O notation is a tool we use to estimate how long a program or an algorithm will run. On this blog we’ve covered a lot of algorithms, but we’ve only touched on big O notation. If you have a degree in computer science then you’re probably familiar with big O. If not, in this short article you’ll learn what it is and how to estimate the running times of algorithms yourself.
The Measure of Complexity
We need a way of figuring out how long our algorithm will run. There are a few factors that impact the speed of execution: the number of elements your algorithm is processing, the speed of the machine it’s running on, and the algorithm itself.
Obviously, the speed of the machine is going to vary. Not everyone has the same spec machine, and the speed can vary even when you run the same algorithm on the same machine, because there are a lot of things going on in your computer and a lot of other programs competing for resources.
So, that leaves us with the algorithm and the number of elements. When calculating complexity, we generally want to know how many steps an algorithm will take for an input of size ‘N’. Let’s take this simple example:
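(A minimal Swift sketch of such a nested-loop example; the function name and the values are just illustrative.)

```swift
// A simple doubly nested loop: for every element, compare it with every element.
func countEqualPairs(_ items: [Int]) -> Int {
    var count = 0
    for a in items {            // outer loop runs N times
        for b in items {        // inner loop runs N times for each outer pass
            if a == b {         // so this line executes N * N = N^2 times
                count += 1
            }
        }
    }
    return count
}

print(countEqualPairs([3, 7, 1, 9, 4]))  // 5 – each element equals only itself here
```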
This simple example has two nested ‘for’ loops: the outer loop runs ‘N’ times, and for each of those iterations the inner loop runs ‘N’ times as well. So in total the inner body executes N^2 times. In big O notation we would say that this algorithm has a complexity of O(N^2). If ‘N’ is 100, the inner body would execute 10000 times. Calculating running times is not that difficult once you get the hang of it. If your algorithm has nested loops, you multiply. If you have loops that are not nested, you add. The example above has two nested loops, so the execution time is N*N, or N^2.
You could easily estimate the time this algorithm would take to run. You run it on your computer, note the time it took to execute, increase the number of elements and repeat. After a couple of measurements you would get a factor of increase, and you could estimate the running time on your machine for a larger dataset.
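If you want to try that doubling experiment yourself, here’s a rough sketch. It times the nested-loop example from above; the input sizes are arbitrary.

```swift
import Foundation

// Time the quadratic example for a given input size.
func timeIt(_ n: Int) -> TimeInterval {
    let input = Array(0..<n)
    let start = Date()
    _ = countEqualPairs(input)
    return Date().timeIntervalSince(start)
}

let t1 = timeIt(1_000)
let t2 = timeIt(2_000)
// For a quadratic algorithm, doubling N should roughly quadruple the time.
print("Factor of increase: \(t2 / t1)")
```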
Order of Growth
When estimating the running time of an algorithm, we could go over every single line and calculate how many operations will be executed, how long each operation takes, and so on. This is notoriously difficult to do, and it will differ from system to system.
What we do instead is approximate the complexity by eliminating the non-dominant factors in the equation. This sounds complex, but what it really means is: you keep the fastest-growing term and ignore the rest. Let’s say that you actually went through your algorithm line by line and calculated the formula for the time complexity to be: f(N) = N^3 + N/2 + c, where ‘N’ is the size of the input array and ‘c’ is a constant. If you look at the formula you’ll see that N^3 is the most dominant factor. As ‘N’ increases, N^3 grows rapidly. For example, if N=1000 then the formula becomes: f(1000) = 1000000000 + 500 + c. We can clearly see that N/2 plays an insignificant role here.
After eliminating the non-dominant factors the formula becomes a lot simpler: f(N) = N^3. And this is good enough for us to estimate the running times of our algorithms.
Common Complexities
There are a handful of common complexities that you will run into. Almost all of the algorithms that you’ll encounter during your job interviews (and in real life) will have one of these complexities. We’ll go over them here, ordered from the fastest to the slowest.
Constant – O(1)
An algorithm or a function with constant complexity will always execute in roughly the same time. It doesn’t depend on the size of the collection that the algorithm is operating on. This is the most desirable complexity. Operations like addition or multiplication have this complexity, pushing and popping an item from a stack have constant complexity, and so does getting the first/last element of a Swift array.
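To make that concrete, here are a few constant-time operations in Swift (the values are arbitrary; note that `append` is amortised O(1)):

```swift
var stack = [1, 2, 3, 4, 5]

stack.append(6)             // push – amortised O(1)
let top = stack.popLast()   // pop – O(1)
let first = stack.first     // first element – O(1)
let last = stack.last       // last element – O(1)
let product = 6 * 7         // simple arithmetic – O(1)
```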
Logarithmic – O(log N)
Next on our list is logarithmic complexity. It’s only slightly slower than constant: roughly speaking, doubling the input size adds just one extra step. A good example of logarithmic complexity is binary search.
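Here’s a minimal binary search sketch in Swift, just to show where the log N comes from: every iteration halves the remaining range, so a sorted array of N elements is searched in about log2(N) steps.

```swift
// Binary search over a sorted array of Ints.
func binarySearch(_ items: [Int], for target: Int) -> Int? {
    var low = 0
    var high = items.count - 1
    while low <= high {
        let mid = (low + high) / 2
        if items[mid] == target {
            return mid
        } else if items[mid] < target {
            low = mid + 1       // discard the lower half
        } else {
            high = mid - 1      // discard the upper half
        }
    }
    return nil
}

let sortedNumbers = [1, 3, 5, 7, 9, 11, 13]
print(binarySearch(sortedNumbers, for: 9) ?? "not found")  // 4
```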
Linear – O(N)
The linear complexity is very common. The processing time of functions/algorithms that have linear complexity is proportional to the size of the collection. Usually these algorithms have one loop that’s iterating over the entire collection. A good example of the linear complexity is searching in an unsorted array where you have to iterate over the entire collection. For example, you might be searching for the longest string in an array of strings.
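A quick sketch of that longest-string search in Swift; every element is visited exactly once, so the work grows linearly with the size of the array:

```swift
// Find the longest string by scanning the whole (unsorted) array once.
func longestString(in strings: [String]) -> String? {
    var longest: String? = nil
    for s in strings {                          // runs N times
        if s.count > (longest?.count ?? 0) {
            longest = s
        }
    }
    return longest
}

print(longestString(in: ["swift", "big", "complexity"]) ?? "empty")  // "complexity"
```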
Linearithmic – O(N log N)
Linearithmic complexity is slightly worse than linear, but it’s still acceptable. When you design your algorithms you should try to make them linearithmic or better. One well-known linearithmic algorithm is merge sort. If your algorithm operates on the ‘divide and conquer’ principle, there’s a good chance it has linearithmic complexity.
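Here’s a compact merge sort sketch in Swift. The array is halved about log2(N) times, and each level of merging does O(N) work, which is where the N log N comes from. This is a simple top-down version, not an optimised one.

```swift
// Merge two already-sorted arrays into one sorted array – O(N) work.
func merge(_ left: [Int], _ right: [Int]) -> [Int] {
    var result: [Int] = []
    var i = 0, j = 0
    while i < left.count && j < right.count {
        if left[i] <= right[j] {
            result.append(left[i]); i += 1
        } else {
            result.append(right[j]); j += 1
        }
    }
    result.append(contentsOf: left[i...])
    result.append(contentsOf: right[j...])
    return result
}

// Split the array in half, sort each half recursively, then merge.
func mergeSort(_ items: [Int]) -> [Int] {
    guard items.count > 1 else { return items }
    let middle = items.count / 2
    let left = mergeSort(Array(items[..<middle]))
    let right = mergeSort(Array(items[middle...]))
    return merge(left, right)
}

print(mergeSort([5, 2, 9, 1, 7]))  // [1, 2, 5, 7, 9]
```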
Quadratic – O(N^2)
A typical quadratic algorithm will have two nested loops. The simple code example that we used is a quadratic algorithm. Some well-known quadratic algorithms are bubble sort, selection sort and insertion sort.
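For example, here’s a bubble sort sketch in Swift; the two nested loops are what give it its O(N^2) behaviour:

```swift
// Bubble sort: repeatedly swap adjacent out-of-order elements.
func bubbleSort(_ items: [Int]) -> [Int] {
    var a = items
    for i in 0..<a.count {                 // outer loop: N passes
        for j in 0..<a.count - i - 1 {     // inner loop: up to N comparisons
            if a[j] > a[j + 1] {
                a.swapAt(j, j + 1)
            }
        }
    }
    return a
}

print(bubbleSort([5, 2, 9, 1, 7]))  // [1, 2, 5, 7, 9]
```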
Cubic – O(N^3)
Cubic algorithms are very slow, especially for large datasets. You can usually identify them by their three nested loops. You’ll find them in algorithms that do matrix multiplication.
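A sketch of naive N×N matrix multiplication in Swift; the three nested loops are what make it cubic (there are faster algorithms, but this is the textbook version):

```swift
// Multiply two square N x N matrices the naive way: N * N * N operations.
func multiply(_ a: [[Double]], _ b: [[Double]]) -> [[Double]] {
    let n = a.count
    var result = Array(repeating: Array(repeating: 0.0, count: n), count: n)
    for i in 0..<n {                // N
        for j in 0..<n {            // * N
            for k in 0..<n {        // * N
                result[i][j] += a[i][k] * b[k][j]
            }
        }
    }
    return result
}

let identity: [[Double]] = [[1, 0], [0, 1]]
let m: [[Double]] = [[2, 3], [4, 5]]
print(multiply(m, identity))  // [[2.0, 3.0], [4.0, 5.0]]
```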
Exponential – O(2^N)
Exponential algorithms are extremely slow. If you have a large dataset, an algorithm with exponential time complexity will practically never finish. Brute force algorithms usually have exponential time complexity. If you see an algorithm that’s trying to do an exhaustive search on a dataset, there’s a good chance you’re looking at an exponential algorithm.
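As a sketch of such an exhaustive search, here’s a brute-force subset-sum check in Swift: for every element we branch into “include it” and “skip it”, so in the worst case we try all 2^N subsets. The problem and the numbers are just illustrative.

```swift
// Does any subset of `numbers` add up to `target`? Try every subset.
func hasSubset(of numbers: [Int], summingTo target: Int) -> Bool {
    func search(_ index: Int, _ remaining: Int) -> Bool {
        if remaining == 0 { return true }
        if index == numbers.count { return false }
        // Two branches per element: include the current number, or skip it.
        return search(index + 1, remaining - numbers[index])
            || search(index + 1, remaining)
    }
    return search(0, target)
}

print(hasSubset(of: [3, 7, 1, 8], summingTo: 11))  // true (3 + 8)
```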
These are the most widely used time complexities. In your daily life you’ll most likely encounter the first four. If you’re designing your own algorithms you might see the other three 🙂 You should try to make your own algorithms at least linearithmic in complexity.
Here’s how some of these time complexities look on a graph (horizontal axis – input size, vertical axis – number of operations/time):
Conclusion
Big O notation is not just for experts in computer science. Every developer should be familiar with it. You write code every day, and you can calculate the time complexity of every function that you write. So the next time you write a double (or triple) nested loop, hopefully you’ll remember to check the time complexity and think about redesigning your function to be a bit faster.
That being said, I share the opinion of Donald Knuth that premature optimisation is the root of all evil 🙂 If you know that you’ll be working with a small dataset then you don’t really have to optimise your algorithms. But if you have no control over the input, you should.
I hope you’ve enjoyed this article and that you’ve learned something new today.
Have a nice day 🙂
~D;