Hi

In section 6.1.3, a word index is built from the training data of the IMDB dataset.
The tokenizer is created with text_tokenizer(num_words = max_words), where max_words = 10,000, and the word index is then read from the fitted tokenizer.

The resulting word index contains 88,524 words. Why is it not limited to 10,000 entries?
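
For reference, here is a minimal sketch of what I am running, assuming the keras R package used in the book. The toy texts vector and the small max_words value are my own stand-ins for the IMDB reviews and the 10,000 limit, so this is only an illustration of the pattern, not the book's exact code.

```r
library(keras)

# Toy stand-in for the IMDB review texts (my own example data, not the book's)
texts <- c("the cat sat on the mat", "the dog ate my homework")

# Small limit so the effect is visible on toy data; the book uses 10000
max_words <- 3

# Create the tokenizer with a word limit and fit it on the texts
tokenizer <- text_tokenizer(num_words = max_words) %>%
  fit_text_tokenizer(texts)

# Word index as stored on the fitted tokenizer
word_index <- tokenizer$word_index

# Compare the size of the word index against max_words
cat("Found", length(word_index), "unique tokens; max_words =", max_words, "\n")
```

On this toy input the word index also comes out larger than max_words, which is the same thing I see with the IMDB data.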