That depends on a lot of factors (number of users, range of company operations, etc.), and there's no formula for it.
In a small-to-medium-sized company with fewer than 500 employees, I found that around 2000 messages was enough, but that's not to say your situation is the same.
Also, there is probably a critical limit beyond which each additional message makes a negligible difference to the Bayesian scores; past that point the cost in processing time would outweigh any advantage gained from more messages. So more is not necessarily better.
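To give a rough feel for why the scores settle down, here is a small Python sketch. It assumes a simple Laplace-smoothed per-token estimate, which is not necessarily what your filter uses, and the function names are made up for illustration. It only shows how much one extra spam message can move a single token's score once that token has already been seen n times.

```python
# Hypothetical illustration: a Laplace-smoothed estimate of the chance that
# a message containing a given token is spam. Real filters use their own
# formulas; this is only an assumption to show the trend.

def token_spam_probability(spam_hits, ham_hits):
    """Smoothed fraction of the token's occurrences that were in spam."""
    return (spam_hits + 1) / (spam_hits + ham_hits + 2)


def shift_from_one_more_spam(spam_hits, ham_hits):
    """How far the estimate moves after training on one more spam message."""
    before = token_spam_probability(spam_hits, ham_hits)
    after = token_spam_probability(spam_hits + 1, ham_hits)
    return after - before


if __name__ == "__main__":
    # Token seen equally often in spam and ham, at growing corpus sizes.
    for seen in (10, 100, 1000, 2000):
        half = seen // 2
        shift = shift_from_one_more_spam(half, half)
        print(f"seen {seen:>4} times: one more spam moves the score by {shift:.5f}")
```

In this toy model the shift shrinks roughly in proportion to 1/n, so each new message matters less as the corpus grows, while the time to process the corpus keeps growing.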
Rgds,
Tom