I have been extracting place names from medieval French texts for about a year, and the project has now reached 100 texts, so I can say with some certainty that a text contains 60-70 place names on average. By “text” I mean a sustained narrative of 1500 words or more, whether poetry or prose. I have excluded chronicles and histories, which contain many more places.
Some particularly geo-dense texts I have worked with so far include:
JoinvMo – La Vie de Saint Louis (Joinville)
RenContrR – Renart le Contrefait
BrunLatC – Le Tresor (Latini)
AquilonW – Aquilon de Baviere
AmbroiseP – L’estoire de la guerre sainte (Ambroise)
A sample visualization, colored to emphasize the imprint of a particular text within the full dataset, is the following (the abbreviations above are taken from the DEAF and do not match those in the key, and the dataset used below is considerably smaller than it is now):
So now some speculations. If there are in fact some 2700 works composed in medieval French, as ARLIMA has asserted, and the average is 60-70 place names per text, then we should expect somewhere between 162,000 and 189,000 place names in total (2700 × 60 and 2700 × 70). Of course, the numbers will be much higher once chronicles are included.
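For anyone who wants to rerun this back-of-the-envelope estimate with other figures, here is a minimal sketch in Python. The function name is my own invention for illustration, and the calculation assumes nothing more than multiplying the number of works by the low and high per-text averages given above.

```python
def estimate_total_place_names(n_works, per_text_low, per_text_high):
    """Return (low, high) corpus-wide estimates by simple multiplication."""
    return n_works * per_text_low, n_works * per_text_high


# Figures from the paragraph above: about 2700 works, 60-70 place names each.
low, high = estimate_total_place_names(2700, 60, 70)
print(f"{low:,} to {high:,}")  # prints "162,000 to 189,000"
```

Changing the first argument shows how sensitive the total is to the size of the corpus, which matters because the 2700 figure is itself an estimate.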
The most recent visual of the entire dataset (some 5000 geo-resolved points extracted from about 100 works), created for the “Interfaces numériques” section of the Frankoromanistentag 2014 (Münster), is the following heat map-style visualization:
Remember that the data visualized here represent place names taken from an index locorum. The numbers can change radically when we actually mine the texts. For example, in a spatial dataset of some 100 works, “France” occurred 51 times, whereas when we mined a textbase of some 200+ texts in medieval French for actual occurrences of “France”, we got 7127.
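The gap between the two kinds of count can be illustrated with a small Python sketch. It is not the procedure used for the project: the function names and the three toy sentences are made up for illustration (they are not quotations from the corpus), and it assumes that an index locorum contributes at most one entry per work for a given place, while mining counts every occurrence.

```python
import re


def count_occurrences(text, name):
    """Count whole-word, case-insensitive occurrences of name in text."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))


def count_index_entries(texts, name):
    """Count the texts that mention name at least once (one entry per text)."""
    return sum(1 for t in texts if count_occurrences(t, name) > 0)


# Hypothetical toy corpus: invented sentences, not real medieval lines.
texts = [
    "Li rois de France vint en France.",
    "En France, puis en France, et encore en France.",
    "Il ala a Paris.",
]

entries = count_index_entries(texts, "France")
total = sum(count_occurrences(t, "France") for t in texts)
print(entries, total)  # prints "2 5"
```

A real textbase would also need to handle spelling variants and inflected forms, which this exact-match sketch ignores.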