The greedy approach of picking every number
that is bigger than the one chosen before it will
produce an increasing subsequence, but likely not the longest one. As an example, consider the sequence
(10, 9, 2, 5, 3, 7, 101, 18)
. If we start by picking 10, we won’t be able to
pick 9, 2, 5, 3, or 7; otherwise, our subsequence would no longer be increasing.
We can, however, pick one last number, either 101 or 18, ending with the
subsequence (10, 101)
which has length 2.
In this example, however, a valid solution to the problem will have 4 elements
in increasing order. Valid solutions would be (2, 3, 7, 18)
or
(2, 5, 7, 101)
, among others with length 4.
As we now know, to arrive at a valid solution, we have to consider more than one greedily built subsequence.
However, considering all possible subsequences would require traversing a
tree with n branches at the first level (one for each number in the sequence),
n − 1 at the second (all numbers in the sequence except for the one already picked), n − 2 at the third, and so on, which would be O(n!).
A few key observations will allow us to drastically reduce the amount of subsequences to consider:

We can stop considering any subsequence as soon as appending an element to it would break the increasing property. For example, say we have
(2, 5)
: appending 3 would cause the subsequence to no longer be increasing, so we can stop right there and ignore that branch. 
For each number we get by iterating over the original sequence, we search our pool of subsequences for the longest one to which the number can be appended, i.e., the longest whose last element is smaller than the current number, and append the current number to it.

At most one subsequence of each length L is needed in our pool of candidates. Say we have
(2,)
and (2, 5)
in our pool and we find a 3. We cannot append 3 to (2, 5)
, as that would cause the subsequence to decrease, but we can append it to (2,)
, resulting in the following pool of candidates: (2,), (2, 3), (2, 5)
. However, any number that can be appended to (2, 5)
can also be appended to (2, 3)
, as 3 < 5. As such, there will never be a subsequence starting with (2, 5)
that is longer than the one starting with (2, 3)
, so (2, 5)
can simply be replaced with (2, 3)
, i.e., our pool would be (2,), (2, 3)
.
Let’s look at how our pool of candidates evolves over time while we iterate over
the sequence (10, 9, 2, 5, 3, 7, 101, 18)
.
[(10,)]
[(9,)]
[(2,)]
[(2,), (2, 5)]
[(2,), (2, 3)]
[(2,), (2, 3), (2, 3, 7)]
[(2,), (2, 3), (2, 3, 7), (2, 3, 7, 101)]
[(2,), (2, 3), (2, 3, 7), (2, 3, 7, 18)]
We start with 10. The next number is 9, which is less than 10 and as such cannot be appended to (10,)
. As the subsequences (9,)
and (10,)
have the same length (1),
we need only keep the one with the smaller last value, i.e., (9,)
. Then we get 2, which cannot be appended to (9,)
. As both (2,)
and (9,)
have the same length we discard (9,)
and keep only (2,)
.
We then find a 5 which can be appended to (2,)
,
producing a new subsequence (2, 5)
. Our pool has
now two subsequences. We cannot get rid of (2,)
, although it is a smaller subsequence it can still produce other promising candidates if we find any numbers between 2 and 5, which we do because the next number in the sequence is 3. It cannot be appended to (2, 5)
, but it can be appended to (2,)
, producing a new subsequence (2, 3)
. Both (2, 3)
and (2, 5)
have length 2 so we keep the one with the smallest last value, i.e., (2, 3)
. Any number that can be appended to (2, 5)
can also be appended to (2, 3)
, so keeping (2, 3) will never
lead to a shorter final result.
When we get 7, our pool has (2,), (2, 3)
so the longest subsequence to
which we can append is (2, 3)
producing a new subsequence (2, 3, 7)
which is added to the pool.
Other iterations follow just the same algorithm, which can be written in Python 3:
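One possible sketch of that implementation (the function and variable names here are illustrative); `candidates[i]` holds the best, i.e., smallest-ending, increasing subsequence of length i + 1 found so far:

```python
def longest_increasing_subsequence(seq):
    if not seq:
        return []
    # candidates[i] is the increasing subsequence of length i + 1
    # with the smallest last element found so far
    candidates = []
    for number in seq:
        # find the longest candidate whose last element is smaller
        # than the current number
        best = []
        for candidate in candidates:
            if candidate[-1] < number and len(candidate) > len(best):
                best = candidate
        new = best + [number]
        if len(new) > len(candidates):
            # a new, longer candidate is born
            candidates.append(new)
        else:
            # replace the same-length candidate: the new one ends
            # in a smaller element, so it is at least as promising
            candidates[len(new) - 1] = new
    return candidates[-1]
```

Calling `longest_increasing_subsequence([10, 9, 2, 5, 3, 7, 101, 18])` yields `[2, 3, 7, 18]`, matching the trace above.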
This algorithm has O(n²) time complexity as we iterate the original sequence once and for each element iterate the list of candidates once.
It can be made a little better if we keep the list of candidates sorted by last element. Notice that we only have one candidate for each given length, and, at the end of each iteration, the candidate with the biggest last element is also the candidate of greatest length. We can then use binary search to find the right candidate to append an element to in O(log n), thus reducing the overall complexity to O(n log n).
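A sketch of that optimization using Python's `bisect` module (the function name `lis_length` is illustrative); here the pool is collapsed into a single sorted list of last elements, one per length, so only the length of the answer is returned:

```python
from bisect import bisect_left

def lis_length(seq):
    # tails[i] is the smallest last element among all increasing
    # subsequences of length i + 1 found so far; it stays sorted
    tails = []
    for number in seq:
        # leftmost position whose element is >= number: that is the
        # candidate whose last element we can lower; if there is none,
        # number extends the longest candidate
        i = bisect_left(tails, number)
        if i == len(tails):
            tails.append(number)
        else:
            tails[i] = number
    return len(tails)
```

Note that this variant keeps only the last elements; recovering the actual subsequence would additionally require tracking, for each appended number, its predecessor in the candidate it extended.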