The Origin of Protein Folds
If we take E. coli as our model, then the average length of a functional protein is around 300 amino acids.
If the protein is longer, or if a number of different proteins are required to act in concert, then the problem is made worse.
The number of possible sequences to be sampled to find functional proteins of this moderate size is vast...
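To get a feel for the scale, here is a rough back-of-envelope sketch. It assumes an alphabet of the 20 standard amino acids and a fixed length of 300 residues. Both the alphabet size and the function names are my own illustrative choices, not something taken from Axe.

```python
import math

AMINO_ACIDS = 20  # assumption: the 20 standard amino acids
LENGTH = 300      # average protein length quoted above


def sequence_space_size(length, alphabet=AMINO_ACIDS):
    """Exact number of distinct sequences of the given length."""
    return alphabet ** length


def log10_sequence_space(length, alphabet=AMINO_ACIDS):
    """Base-10 logarithm of the sequence count, easier to read than the raw integer."""
    return length * math.log10(alphabet)


total = sequence_space_size(LENGTH)
print(f"20^{LENGTH} has {len(str(total))} digits")          # 391 digits
print(f"roughly 10^{log10_sequence_space(LENGTH):.0f}")     # roughly 10^390
```

Each extra residue multiplies the count by 20, which is why longer proteins make the problem worse so quickly.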
There are two ways of reducing the sparse sampling problem.
(a) The number of possible sequences with a particular function is very large. Sequence space is rich in function.
(b) The functional sequences are in some way linked: once you have one function, it is easy to jump to the next.
Axe argues that experimental results indicate that the number of functional sequences in (a) is not large enough to solve the sparse sampling problem....
I will look at what he says about (b) next week!