Hi,
It is a good idea, and it will improve the performance of the Bayesian reload.
It is more important to have good-quality mail samples in mailgood and mailspam than to have a large number of them.
I think it is better not to have more than 100,000 docs.
Regards,
Nico