I sent mail asking for comments and received very few. The current text contains a simple fixed-crossover version; more discussion is needed. Here is an excerpt from the mail I sent.
1. Determining the crossover point from short to tree gather/scatters
There was a fair amount of discussion on how to determine this and
no consensus was reached. The following alternatives represent the
various ideas put forward and are offered up for debate. If I have
missed any, please post them. I have tried to list them in order of
increasing complexity.
a. fixed size specified by the standard
b. each implementation at startup provides its idea of the size and
the crossover is taken to be the min (or max?)
c. the user specifies it via an option to mpirun. In this case
a subsidiary issue is whether this is a required option or
whether we fall back to (a) or (b) when it is not specified.
d. at startup time implementations supply local latency/bw estimates
and a global latency/bw estimate is obtained from a config file
or some other mechanism. The crossover is then calculated from
these via some yet-to-be-determined formula.
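
To make the alternatives easier to compare, here is a minimal sketch of how (a) through (c) might combine into one selection rule, with (c) falling back to (b) and then to (a). It is only an illustration, not a proposal: the names select_crossover and FIXED_CROSSOVER, the value 1024, and the fallback order are all assumptions made for the example. Alternative (d) is left out because its formula is still undetermined.

```python
from typing import Optional, Sequence

# Placeholder for alternative (a). The value a standard would fix is not
# given in this discussion; 1024 is an assumption for illustration only.
FIXED_CROSSOVER = 1024


def select_crossover(
    user_value: Optional[int] = None,
    impl_sizes: Sequence[int] = (),
    use_max: bool = False,
) -> int:
    """Pick the crossover size, trying (c), then (b), then (a)."""
    # (c) A value supplied by the user (for example via an mpirun option) wins.
    if user_value is not None:
        return user_value
    # (b) Otherwise combine the sizes each implementation reported at startup,
    # using min by default or max when use_max is set.
    if impl_sizes:
        return max(impl_sizes) if use_max else min(impl_sizes)
    # (a) Otherwise fall back to the fixed size.
    return FIXED_CROSSOVER
```

For example, select_crossover(impl_sizes=[2048, 512, 1024]) returns 512 under the min rule and 2048 when use_max=True. Making (c) a required option instead would amount to raising an error where this sketch falls through to (b) or (a).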