In this paper we revisit Zipf's law in the context of linguistics. The deviations from the original simple power law are analysed and a dynamic model for text generation is proposed whose parameters can be associated with some structural features of languages. Furthermore, for the case of large corpora a novel phenomenology is disclosed. In this case a quantitative description of all the scaling regimes is possible by considering the family of solutions of a single first order differential equation.
Volume: 4, 09/2002
Pages: 87-99