Zipf's law

  • Zipf's law states that a few words in a language are used very often, while most are used rarely.
  • George Kingsley Zipf (1902-1950) first proposed this law in 1935.
  • It was one of the first academic studies of word frequency.

Formula of Zipf's law

  • The frequency f of certain events is inversely proportional to their rank r.
  • The frequency is given approximately by f(r) ≈ 0.1/r.
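To make f and r concrete, the following is a minimal sketch of how a rank-frequency table could be measured from a text and set beside the 0.1/r estimate. The function name rank_frequency, the tokenizing pattern, and the sample sentence are illustrative assumptions, not part of Zipf's work. A text this short is not expected to match the estimate, since the law describes large bodies of text.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, relative frequency) tuples, most frequent first.

    Hypothetical helper: words are lowercased runs of letters and apostrophes.
    """
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    # most_common() sorts by count, highest first; ties keep first-seen order.
    ranked = Counter(words).most_common()
    return [
        (rank, word, count / total)
        for rank, (word, count) in enumerate(ranked, start=1)
    ]


# Made-up sample sentence, for illustration only.
sample = "the cat and the dog and the bird"
table = rank_frequency(sample)

for rank, word, freq in table:
    # Compare the measured frequency with the approximate value 0.1 / rank.
    print(rank, word, round(freq, 3), "Zipf estimate:", round(0.1 / rank, 3))
```

Running the same function on a long text, such as a whole book, gives a fairer comparison with the estimate.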

Explanation

  • The most common word in English (rank 1) occurs about one-tenth of the time in a typical text.
    • The next most common word (rank 2) occurs about one-twentieth of the time, and so forth.
  • Another way of looking at this is that the word of rank r occurs 1/r times as often as the most frequent word.
    • That is, the rank 2 word occurs half as often as the rank 1 word,
    • The rank 3 word one-third as often,
    • The rank 4 word one-fourth as often, and so forth.
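The arithmetic above can be checked with a short sketch. The function names zipf_frequency and relative_to_top are illustrative assumptions. The constant 0.1 is the approximate value from the formula, not an exact figure.

```python
def zipf_frequency(rank, constant=0.1):
    """Approximate share of a text taken up by the word of the given rank."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return constant / rank


def relative_to_top(rank):
    """How often the word of this rank occurs compared with the rank 1 word."""
    return zipf_frequency(rank) / zipf_frequency(1)


for r in range(1, 5):
    # Columns: rank, approximate share of the text, share relative to rank 1.
    print(r, zipf_frequency(r), relative_to_top(r))
```

The last column reproduces the 1/r pattern: one-half for rank 2, one-third for rank 3, one-fourth for rank 4.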

Uses

  • Zipf's law is useful in schemes for data compression and in the allocation of resources by urban planners.
  • For example, in 1949 Zipf claimed that the largest city in a country is about twice the size of the next largest, three times the size of the third-largest, and so forth.
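The city-size claim follows the same 1/r pattern and can be sketched as below. The function name rank_size_estimate and the population figure are made up for illustration and do not describe any real country.

```python
def rank_size_estimate(largest_population, rank):
    """Estimate the population of the city of a given rank from the largest city."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return largest_population / rank


# Hypothetical population of the largest city (illustrative value only).
largest = 1_200_000

estimates = [rank_size_estimate(largest, r) for r in range(1, 4)]
for r, population in enumerate(estimates, start=1):
    print(f"rank {r}: about {population:,.0f} people")
```

Under this assumption the second city comes out at half the size of the largest and the third at one-third, which matches the claim as stated.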
