Python for NLP and the Natural Language Toolkit - page 21 / 47

Example

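    from nltk.token import WSTokenizer
    # Assumption: in legacy NLTK 1.x, Plot lives in nltk.draw.plot
    from nltk.draw.plot import Plot

    # Extract a list of words from the corpus
    corpus = open('corpus.txt').read()
    # Reconstructed from context: the tokenizing step is missing from the extracted slide
    tokens = WSTokenizer().tokenize(corpus)

    # Count up how many times each word length occurs
    wordlen_count_list = []
    # Reconstructed from context: the loop header is missing from the extracted slide
    for token in tokens:
        wordlen = len(token.type())
        # Add zeros until wordlen_count_list is long enough
        while wordlen >= len(wordlen_count_list):
            wordlen_count_list.append(0)
        # Increment the count for this word length
        wordlen_count_list[wordlen] += 1

    # Plot the counts (index = word length, value = number of words)
    Plot(wordlen_count_list)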
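
The slide's example relies on the `nltk.token` module from an early NLTK release, which current NLTK versions no longer ship. As a rough sketch of the same counting logic in plain Python, the version below swaps the tokenizer for `str.split()` (whitespace splitting) and leaves out the plotting step. The function name `word_length_counts` and the sample sentence are illustrative assumptions, not part of the original slide.

```python
def word_length_counts(text):
    """Return a list where index i holds the number of words of length i.

    Assumption: words are whitespace-separated, mirroring what a
    whitespace tokenizer would produce.
    """
    counts = []
    for word in text.split():
        wordlen = len(word)
        # Add zeros until the list is long enough to index by wordlen
        while wordlen >= len(counts):
            counts.append(0)
        # Increment the count for this word length
        counts[wordlen] += 1
    return counts


if __name__ == "__main__":
    sample = "the cat sat on the mat"  # hypothetical stand-in for corpus.txt
    print(word_length_counts(sample))
```

To run it on a real file, the sample string could be replaced with the contents of `corpus.txt`, read the same way as in the slide.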
