WordFrequency.frink

Download or view WordFrequency.frink in plain text format


/** This implements the "Word Frequency" example on Rosetta Code:
     https://rosettacode.org/wiki/Word_frequency

    There are a few interesting things to note:

    * Frink has a Unicode-aware function, wordList[str], which intelligently
      enumerates through the words in a string (and correctly handles compound
      words, accented characters, etc.)  It returns words, spaces, and
      punctuation marks, so this program filters out any results that do not
      contain alphanumeric characters.

    * The file fetched from Project Gutenberg is supposed to be encoded in
      UTF-8, but their servers incorrectly report it as Windows-1252 encoded,
      so this program overrides the encoding by passing "UTF-8" to read.

    * Frink has a Unicode-aware lowercase function, lc[str], which correctly
      handles accented characters.

    * This program uses the dictionary.increment[key, num] method to help
      count words.
*/


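// Count each word case-insensitively.  The select[...] call keeps only the
// results from wordList that contain at least one alphanumeric character.
d = new dict
for w = select[wordList[read["https://www.gutenberg.org/files/135/135-0.txt", "UTF-8"]], %r/[[:alnum:]]/ ]
   d.increment[lc[w], 1]

// Sort the entries by count (column 1), reverse to descending order, and
// print the 10 most frequent words, one per line.
println[joinln[first[reverse[sort[array[d], byColumn[1]]], 10]]]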
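

// A minimal offline sketch of the same counting technique, applied to a short
// literal string instead of the downloaded file.  It may help when checking
// how wordList, select, lc, and dict.increment fit together.  The names
// "sample" and "counts" are hypothetical and are not part of the original
// program; only functions already used above are called.
sample = "The cat and the hat."
counts = new dict
for w = select[wordList[sample], %r/[[:alnum:]]/ ]
   counts.increment[lc[w], 1]

// Look up the tally for a single word.  "The" and "the" are counted together
// because each word is lowercased before it is counted.
println[counts@"the"]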


/** Also see this one-liner that works in newer versions of Frink */

// formatTable[first[countToArray[select[wordList[lc[normalizeUnicode[read["https://www.gutenberg.org/files/135/135-0.txt", "UTF-8"]]]], %r/[[:alnum:]]/ ]], 10], "right"]




This is a program written in the programming language Frink.
For more information, view the Frink Documentation or see More Sample Frink Programs.
