Large-scale tests - example 3

The data for these figures were generated as follows. We generated 100 data points for each of the 1000 leaves in this data set. Most of these 100 points were generated at random; however, as can be seen in the figures, for each leaf we chose 40 points that were set to -1, in different places in different leaves. With low probability, some of these 40 points were allowed to be 0 or 1 instead. We then computed the correlation coefficients between all pairs of leaves and performed hierarchical clustering. Next, we ordered the resulting tree using our optimal ordering algorithm. The small images are enlargements of the same cluster in both results.
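
The data generation and correlation steps described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code used to produce the figures. Several details are assumptions because the text does not specify them: the random background is drawn uniformly from [-1, 1], the 40 special points form one contiguous block with a random start position, and the "low probability" is set to 0.05. The names make_leaf, generate_leaves, pearson and correlation_matrix are hypothetical.

```python
import math
import random


def make_leaf(rng, n_points=100, block_len=40, noise_prob=0.05):
    """Build one leaf: a random background with a block of -1 values."""
    # Assumption: background values are uniform in [-1, 1].
    leaf = [rng.uniform(-1.0, 1.0) for _ in range(n_points)]
    # Assumption: the special points form one contiguous block whose
    # start position differs from leaf to leaf.
    start = rng.randrange(n_points - block_len + 1)
    for i in range(start, start + block_len):
        # Assumption: with probability noise_prob use 0 or 1 instead of -1.
        if rng.random() < noise_prob:
            leaf[i] = rng.choice([0.0, 1.0])
        else:
            leaf[i] = -1.0
    return leaf


def generate_leaves(n_leaves=1000, seed=0, **kwargs):
    """Generate n_leaves leaves with a reproducible random seed."""
    rng = random.Random(seed)
    return [make_leaf(rng, **kwargs) for _ in range(n_leaves)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def correlation_matrix(leaves):
    """Symmetric matrix of correlation coefficients between all leaves."""
    n = len(leaves)
    corr = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            corr[i][j] = corr[j][i] = pearson(leaves[i], leaves[j])
    return corr


# Small run for illustration; the text uses 1000 leaves, which is slow
# in pure Python.
leaves = generate_leaves(n_leaves=50)
corr = correlation_matrix(leaves)
```

The sketch stops at the correlation matrix; the hierarchical clustering and optimal ordering steps are not reproduced here. For experimentation, SciPy's scipy.cluster.hierarchy.linkage accepts optimal_ordering=True, although that should not be taken as the implementation used for these figures.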