NGRAMS is a Python library which analyzes a string or text against the observed frequencies of ngrams (strings of 1, 2, 3, 4, or 5 alphabetic characters).
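As an illustration of the idea, the following is a minimal sketch of counting ngrams in a text and scoring a text against a table of reference counts. It is not the NGRAMS library's own interface: the function names count_ngrams and log_score are hypothetical, and the choices to fold letters to upper case, to keep ngrams within a single run of letters, and to use add-one smoothing are assumptions made for this example.

```python
import math
import re
from collections import Counter


def count_ngrams(text, n):
    """Count the n-grams of alphabetic characters in text, ignoring case.

    Assumption: an n-gram never spans a space, digit, or punctuation mark,
    so n-grams are collected separately within each run of letters.
    """
    counts = Counter()
    for word in re.findall(r"[A-Za-z]+", text):
        word = word.upper()
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


def log_score(text, reference, n):
    """Average log-probability of the text's n-grams under reference counts.

    Assumption: add-one smoothing over the 26**n possible n-grams, so an
    n-gram missing from the reference table still gets a small probability.
    """
    observed = count_ngrams(text, n)
    observed_total = sum(observed.values())
    if observed_total == 0:
        return float("-inf")  # text too short to contain any n-gram

    reference_total = sum(reference.values())
    possible = 26 ** n
    log_sum = 0.0
    for gram, count in observed.items():
        p = (reference.get(gram, 0) + 1) / (reference_total + possible)
        log_sum += count * math.log(p)
    return log_sum / observed_total
```

A higher (less negative) score suggests that the text's ngrams resemble those in the reference table. In practice the reference counts would come from a large body of text, such as a table of observed English ngram frequencies, rather than from a short sample.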
The computer code and data files described and made available on this web page are distributed under the GNU LGPL license.
NGRAMS is available in a Python version.
GERMAN, a dataset directory which contains some short German texts;
NGRAMS, a dataset directory which contains information about the observed frequency of "ngrams" (particular sequences of n letters) in English text.
TEXT, a dataset directory which contains some short texts in English;
WORDS, a dataset directory which contains lists of words;