Sunday 30 November 2014

Complexity and Big-O

Previously, we talked about algorithm run-time, but we did not talk in detail about how run-time is determined. Run-time is stated as a function relating input size to the number of steps an algorithm takes, which is its complexity. How do we characterize complexity? We use Big-O notation.

Best and Worst

In terms of complexity, we must first account for best and worst cases. The best case is the one where the algorithm takes the minimum running time to complete on an input. For example, in linear search, if the first item in the list is the item we are searching for, then we have a best case. The worst case, on the other hand, is when the item we are searching for isn't in the list at all, in which case the algorithm takes the maximum running time to complete. We typically care about the worst case, because that is what causes algorithms to take a long time to run in the real world. Although it seems pessimistic to look at the worst case all the time, we should be aware that it happens often.
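To make this concrete, here is a minimal sketch of linear search in Python (my own illustration, not code from class); the best case returns after one comparison, while the worst case examines every element and fails:

    def linear_search(items, target):
        """Return the index of target in items, or -1 if absent."""
        for i, item in enumerate(items):
            if item == target:   # one comparison per element examined
                return i
        return -1

    data = [7, 3, 9, 4, 1]
    print(linear_search(data, 7))   # best case: found at index 0, 1 comparison
    print(linear_search(data, 8))   # worst case: absent, n comparisons, -1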

There is also an average case, which is the expected running time under random inputs following some probability distribution. This is rather difficult to determine, because it depends on the chosen distribution, and it is often unclear what distribution real inputs actually follow.

Big-O notation

What is this Big-O notation that we have been hearing about? When relating complexity and algorithm run-time, what we really care about is GROWTH. How does the run-time grow as a function of the input? Big-O notation is a method for characterizing that growth. Let's break it down.
  • running time is proportional to the number of steps, as a function of input size (n)
  • what we really care about is how this number GROWS as a function of n
  • we don't really care about the absolute number of steps, so 2000 and 2004 are the same
  • if we compare two functions 2n² and 100n², we only care about the n²
  • so constant factors don't matter
  • what about 2n² + n and n²? we only care about the n², because when n is really big, the n term is negligible compared to n² (see the sketch below)
  • think about it like this: as input grows, do you care if an algorithm takes 100 years or 200 years to run? Not really... both are too long...
  • we end up using a model of asymptotic growth, which deals with how the complexity grows as the input size grows without bound
  • typically written as O(f(n))
So, if we see O(n), it means that the algorithm's running time grows no faster than n. More specifically, if we see that f(n) is in O(n²), it means that f(n) is upper-bounded by n² (up to a constant factor, for large enough n).
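Here is a quick numerical sketch of my own (in Python) showing why the lower-order term washes out: the ratio of 2n² + n to n² settles toward the constant 2 as n grows, which is exactly the bounded-by-a-constant-multiple relationship that Big-O captures.

    def f(n):
        return 2 * n**2 + n   # 2n^2 + n

    def g(n):
        return n**2           # n^2

    for n in [10, 100, 1000, 10000]:
        print(n, f(n) / g(n))   # ratio approaches 2, so f(n) is O(n^2)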

There are many classes of functions, such as linear, logarithmic, and exponential. Linear and logarithmic functions grow at a rate that stays manageable in practice. If an algorithm's complexity grows faster than these, as exponential complexity does, then its run-time blows up even for modest inputs. In real life, we don't want that!
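To see how differently these classes grow, here is a small table generated by a Python sketch of my own:

    import math

    print(f"{'n':>6} {'log2(n)':>9} {'n^2':>10} {'2^n':>15}")
    for n in [10, 20, 30, 40]:
        print(f"{n:>6} {math.log2(n):>9.2f} {n**2:>10} {2**n:>15}")
    # 2^40 is already over a trillion, while log2(40) is barely 5.3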

Algorithm Run-time

From this point on, the course switches gears a little bit, and will begin to discuss algorithms, efficiency, and complexity. First lets talk a little bit about efficiency and orders of growth.

Although the answer may seem obvious, when first thinking about efficiency one might ask "why is efficiency important?"
    The answer is that efficiency determines how long our programs take to run, which is called run-time.

This is not to be confused with the time in seconds an algorithm takes to run. For instance, if one algorithm took 2 seconds to run and another took 0.5 seconds, can you say that the second is faster than the first? No, because I did not specify many necessary factors that need to be taken into account. For one, we need to think about the speed of the processor, as well as the implementation of Python. It also depends on the amount of input data; on extremely large data sets, some algorithms might take a millennium to run!

More importantly, efficiency is about algorithms, not about coding structure or clever coding.

So the way we normalize across these factors is to talk about efficiency in terms of the steps an algorithm takes to produce a result from an input. A step is an abstraction of a physical operation, and we assume each step takes constant time.
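As an illustration (a sketch of my own, not from the lecture), we can count abstract steps instead of measuring seconds; the count is machine-independent and grows predictably with the input size:

    def sum_to_n(n):
        """Sum 1..n while counting abstract constant-time steps."""
        total = 0
        steps = 1                  # count the initialization as one step
        for i in range(1, n + 1):
            total += i
            steps += 1             # one step per loop iteration
        return total, steps

    for n in [10, 100, 1000]:
        print(n, sum_to_n(n)[1])   # steps grow linearly: n + 1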

Proof by Cases

Some proofs need to be broken down into different cases, each of which can be proved separately; together, the cases establish the whole proof. This makes complex proofs easier to manage. In fact, I think we already used this method for proving equivalence. In class, we showed that an implication "P implies Q" is equivalent to its contrapositive "not Q implies not P" by showing:

P   Q   P implies Q   not Q implies not P
T   T        T                 T
T   F        F                 F
F   T        T                 T
F   F        T                 T
In each of the four cases above, we showed that P implies Q is true if and only if not Q implies not P is true. Even when P is true and Q is false, both the implication and its contrapositive are false, so the equivalence still stands. Also note that proof by cases works in a much broader setting than propositions involving boolean variables.
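We can even have Python brute-force the four cases for us (a small sketch of my own, not something from class):

    from itertools import product

    def implies(a, b):
        return (not a) or b   # truth-functional "a implies b"

    for P, Q in product([True, False], repeat=2):
        assert implies(P, Q) == implies(not Q, not P)
    print("P implies Q matches not Q implies not P in all four cases")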

Proofs and non-Boolean Functions, disproofs, and false proofs!

Non-Boolean Functions

In class, we discussed non-boolean functions such as the floor function, which takes a number and returns the largest integer less than or equal to it. Non-boolean functions can be used to set up more interesting predicates for evaluation. For example: prove that n² + 3n + 127 is a prime number for all natural numbers n. OK, let's take n = 0: yes, 127 is prime. What about n = 1? Then the expression equals 131, and yes, that is prime. n = 2 gives 137, also prime. But n = 3 gives 145 = 5 × 29, which is not prime, so the proposition is false. So some input values satisfy the predicate while others do not. The main takeaway is that just because a predicate seems to be true for some values doesn't mean the whole proposition is true.
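A quick brute-force check in Python (my own sketch) confirms exactly where the claim breaks down:

    def is_prime(k):
        """Trial division up to the square root."""
        if k < 2:
            return False
        return all(k % d != 0 for d in range(2, int(k**0.5) + 1))

    for n in range(6):
        value = n**2 + 3*n + 127
        print(n, value, is_prime(value))
    # n = 0, 1, 2 give primes (127, 131, 137); n = 3 gives 145 = 5 * 29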

Disproving

Now, how do we disprove something? For example, how do we disprove the statement 2 + 2 = 5? We just need to prove that 2 + 2 does not equal 5. So to disprove any statement S, we need to prove that not S is true. This is very useful, because we will not always be handed true propositions; we also need to know how to disprove false ones. The important thing is to notice that a proposition is false in the first place; then we can negate it and prove the negation. If we fail to notice that a proposition is false and try to prove it instead, we won't get anywhere, and it is a very bad thing to "prove" something false to be true.

False Proofs

I think this is the funniest topic in this course. We must remember that the real world exists outside the realm of mathematics, and there are many kinds of proofs out there, some of which do not actually prove anything! I think the most prevalent false proof out there is the proof by gut feeling. Our intuition is good for rough estimates and for predictions from thin evidence, which was useful in the wild when we needed to escape predators lurking in the bushes, but it is not good for establishing truths.

Saturday 29 November 2014

Indirect Proofs - Proof by Contradiction

Proof by contradiction starts by assuming that the proposition in question is false, and follows with a series of logical deductions that lead to a contradiction of something we know to be true. When that happens, we can conclude that the proposition must be true. In simpler terms: if the proposition were false, then something false would be true, and since false cannot be true, our proposition had better not be false.

This all sounds very abstract... so let's break it down.
Let P be a proposition.
To prove P is true,
    we assume P is false (not P is true)
        this leads to some falsehood (such as "2 is not an even number")
    then "not P implies falsehood" is true (how? remember truth tables?)
        since the falsehood is false, the only way the implication can hold is if not P is false, which means P is true.
This is the same as proving the contrapositive of "true implies P".
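A classic worked instance (a standard example, not necessarily the one used in class) is the proof that the square root of 2 is irrational, sketched here in LaTeX:

    % Classic proof by contradiction: sqrt(2) is irrational.
    \begin{proof}
    Suppose, for contradiction, that $\sqrt{2}$ is rational, so
    $\sqrt{2} = a/b$ for integers $a, b$ with no common factor.
    Squaring gives $2b^2 = a^2$, so $a^2$ is even, hence $a$ is even.
    Write $a = 2c$; then $2b^2 = 4c^2$, so $b^2 = 2c^2$ and $b$ is even too.
    Now $a$ and $b$ share the factor 2, contradicting the assumption that
    $a/b$ was in lowest terms. Hence $\sqrt{2}$ is irrational.
    \end{proof}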

Prove using Contrapositive

We already know that an implication "P implies Q" is logically equivalent to its contrapositive "not Q implies not P". Therefore, proving the contrapositive is equivalent to proving the original statement. In fact, this is sometimes easier to do, and should be used when it is. In class, we used the example "prove that if x² is odd then x is odd, for all natural numbers x". This becomes much simpler if we prove the contrapositive. We just need to remember to indicate that we are proving the contrapositive, and to state the contrapositive in the final proof.
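Here is how that contrapositive proof might go (my own write-up, not necessarily the exact one from class), in LaTeX:

    % Claim: for all natural numbers x, if x^2 is odd then x is odd.
    % We prove the contrapositive: if x is even then x^2 is even.
    \begin{proof}
    Suppose $x$ is even, so $x = 2k$ for some natural number $k$.
    Then $x^2 = (2k)^2 = 2(2k^2)$, which is even.
    By contraposition, if $x^2$ is odd then $x$ is odd.
    \end{proof}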

For statements like "if x is odd then x² is odd", the antecedent is actually equivalent to the consequent. To prove that two statements are equivalent, we need to prove that each implies the other: first we prove P implies Q, then we prove Q implies P. Another way to prove equivalence is to use a chain of equivalences, in which P is shown equivalent to a second statement, which links through a chain of equivalent statements until we reach Q.

Proofs

How do we communicate truth?

We sometimes need to convince other people that what we know or understand is true. How do we do that? We need proof.

Direct proof

A direct proof essentially shows that there is a link between an assumption and the result. To form this link, we can use previously proven statements and axioms (e.g. De Morgan's laws). Each assumption in a claim has its own scope. For instance, in the claim "for all x in a set D, if x has property P then x has property Q", we assume x is an element of D (so the assumption says nothing about anything outside D), assume P(x) is true, and then show that Q(x) follows.

Axiomatic Method

Before diving straight into proofs, we should look back at the axiomatic method, a method of establishing truths in mathematics introduced by Euclid around 300 BCE. He began with simple assumptions that seem undeniable, called axioms, and proved many propositions from them. His proofs consisted of a sequence of logical deductions from the axioms and previously proved statements, finally concluding with the proposition in question. Theorems are important propositions proved through this deductive process.

Let's go back to direct proofs. A direct proof can prove implications, which have the form "if P, then Q". One way to prove an implication is to first assume P is true, then show that Q is linked to this assumption. Finding this link is not trivial. To visualize it, a Venn diagram can be helpful. For example, to prove P(x) implies Q(x), we can look for results that P(x) implies, and results that imply Q(x). We can visualize the results that P(x) implies as sets that contain P, while the results that imply Q(x) are contained by Q. Now all we need is to find a path of containment from P to Q.
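As a small illustration of the direct style (my own sketch), here is a direct proof that if x is odd then x² is odd:

    % Direct proof: if x is odd then x^2 is odd.
    \begin{proof}
    Suppose $x$ is odd, so $x = 2k + 1$ for some natural number $k$.
    Then $x^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1$,
    which has the form $2m + 1$ with $m = 2k^2 + 2k$, so $x^2$ is odd.
    \end{proof}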

One thing to note: before a structured proof can be written, one should first do some scratch work to figure out the logical links. Just keep the scratch work separate from the final proof.