Sunday 30 November 2014

Complexity and Big-O

Previously, we talked about algorithm run-time, but not in detail about how it is determined. Run-time is stated as a function relating the size of the input to the number of steps the algorithm takes. How do we describe that complexity? We use Big-O notation.

Best and Worst

In terms of complexity, we must first account for best and worst cases. The best case is the one where the algorithm takes the minimum running time to complete on an input. For example, in linear search, if the first item in the list is the item we are searching for, we have a best case. The worst case, on the other hand, is when the item we are searching for isn't in the list at all, so the algorithm takes the maximum running time to complete. We typically care about the worst case, because that is what causes algorithms to take a long time to run in the real world. Although it seems pessimistic to look at the worst case all the time, it happens often enough to matter.
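
To make this concrete, here is a minimal sketch of linear search in Python (the function name and sample list are just for illustration):

    def linear_search(lst, target):
        """Return the index of target in lst, or -1 if it is not present."""
        for i, item in enumerate(lst):
            if item == target:  # found it, stop immediately
                return i
        return -1  # scanned the entire list without finding it

    data = [7, 3, 9, 5, 1]
    linear_search(data, 7)  # best case: target is first, only 1 comparison
    linear_search(data, 4)  # worst case: target absent, all n items checked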

There is also an average case, which is the expected running time over random inputs drawn from some probability distribution. This is rather difficult to determine, because it depends on the choice of distribution, and it is often unclear what distribution real inputs actually follow.
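
As a rough sketch, if we assume the target is equally likely to be at any of the n positions, linear search averages about (n + 1) / 2 comparisons; a quick simulation (everything here is made up for illustration) agrees:

    import random

    def comparisons_used(lst, target):
        """Count the comparisons linear search makes before stopping."""
        count = 0
        for item in lst:
            count += 1
            if item == target:
                break
        return count

    n = 1000
    data = list(range(n))
    trials = [comparisons_used(data, random.choice(data)) for _ in range(10000)]
    print(sum(trials) / len(trials))  # hovers around (n + 1) / 2 = 500.5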

Big-O notation

What is this Big-O notation that we have been hearing about? When relating complexity and algorithm run-time, what we really care about is GROWTH. How does the run-time grow as the input gets bigger? Big-O notation is a method for characterizing that growth. Let's break it down.
  • running time is proportional to the number of steps as a function of input size (n). 
  • what we really care about is how this number GROWS as a function of n
  • we don't really care about the absolute number of steps, so 2000 and 2004 are the same.
  • if we compare two functions 2n² and 100n², we only care about the n²
  • so constant factors don't matter
  • what about 2n² + n and n²? we only care about the n² term, because when n is really big, the leftover n is negligible compared to 2n²
  • think about it like this: as input grows, do you care if an algorithm takes 100 years or 200 years to run? Not really... both are too long...
  • we end up using a model of asymptotic growth, which describes how the complexity behaves as the input size grows toward infinity
  • typically done using O(f(n))
So, if we see O(n), it means that the algorithm's running time grows no faster than n. More specifically, if we see f(n) in O(n²), it means that the function f(n) is upper-bounded by n², up to a constant factor, for large enough n.
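
One way to convince yourself that constants and lower-order terms drop out is to watch the ratio between two step counts as n grows; a small sketch:

    def f(n):
        return 2 * n ** 2 + n  # the "messy" step count

    def g(n):
        return n ** 2  # the simple bound

    for n in (10, 100, 1000, 10 ** 6):
        print(n, f(n) / g(n))
    # the ratio settles near the constant 2, so f(n) grows no faster
    # than a constant times n², i.e. f(n) is in O(n²)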

There are many classes of functions, such as linear, logarithmic, and exponential. Linear and logarithmic functions grow at a rate that is manageable in practice. If an algorithm's complexity grows faster than these, say exponentially, its run-time blows up very quickly. In real life, we don't want that!
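
To get a feel for how differently these classes grow, here is a small sketch tabulating a few of them (raw values, not measured running times):

    import math

    print(f"{'n':>4} {'log n':>8} {'n log n':>10} {'n^2':>8} {'2^n':>16}")
    for n in (10, 20, 30, 40):
        print(f"{n:>4} {math.log2(n):>8.2f} {n * math.log2(n):>10.1f} "
              f"{n ** 2:>8} {2 ** n:>16}")

Even at n = 40, the exponential column is already over a trillion, while n² is only 1600.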
