Python for NLP and the Natural Language Toolkit - page 21 / 47

Example

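    from nltk.token import WSTokenizer

    # Extract a list of words from the corpus
    corpus = open('corpus.txt').read()
    # Assumed step, not shown in the extracted slide text: whitespace tokenization
    tokens = WSTokenizer().tokenize(corpus)

    # Count up how many times each word length occurs
    wordlen_count_list = []
    # Loop header assumed from the use of `token` below
    for token in tokens:
        wordlen = len(token.type())
        # Add zeros until wordlen_count_list is long enough
        while wordlen >= len(wordlen_count_list):
            wordlen_count_list.append(0)
        # Increment the count for this word length
        wordlen_count_list[wordlen] += 1

    # Plot is not imported on the slide; it is assumed to be the early NLTK plotting helper
    Plot(wordlen_count_list)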
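The slide relies on an early NLTK interface (`nltk.token.WSTokenizer`, `token.type()`) that current NLTK releases do not provide. As a rough sketch of the same counting idea using only the standard library, the version below assumes that splitting on whitespace with `str.split()` is an acceptable stand-in for the tokenizer. The function name `word_length_counts` is hypothetical and does not come from the slide, and plotting is left out.

```python
from collections import Counter


def word_length_counts(text):
    """Return a list whose index i holds the number of words of length i.

    Words are whatever str.split() yields, i.e. whitespace-separated
    chunks; this is an assumed substitute for the slide's WSTokenizer.
    """
    lengths = Counter(len(word) for word in text.split())
    if not lengths:
        return []
    # Fill in zeros for lengths that never occur, as the slide's while loop does
    return [lengths.get(n, 0) for n in range(max(lengths) + 1)]


if __name__ == "__main__":
    sample = "the cat sat on a mat"
    print(word_length_counts(sample))
```

To run it on a file as the slide does, the text could be read with `open('corpus.txt').read()` and passed to the function. The resulting list has the same shape as `wordlen_count_list`, so it can be handed to any plotting routine.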
