PentahoFan (6)
Chapter 8 when using Random Forest

from sklearn.ensemble import RandomForestClassifier
model2 = RandomForestClassifier(n_estimators=100), d_train.sentiment)

Traceback (innermost last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/sklearn/ensemble/", line 257, in fit
    check_ccontiguous=True)
  File "/usr/lib/python2.7/dist-packages/sklearn/utils/", line 220, in check_arrays
    raise TypeError('A sparse matrix was passed, but dense '
TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.

However, using the code below raises a MemoryError:
, d_train.sentiment)
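
As a rough illustration of why converting with X.toarray() can run out of memory: a dense array stores every cell, including the zeros, so its size is rows x columns x bytes per value. The sketch below only does that arithmetic. The shape (25,000 documents by 75,000 terms) and the helper names are hypothetical, not taken from the book's data, and it is written for Python 3 rather than the 2.7 shown in the traceback.

```python
def dense_size_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed for a dense n_rows x n_cols array (8 bytes per float64 value)."""
    return n_rows * n_cols * itemsize


def fits_in_memory(n_rows, n_cols, budget_bytes, itemsize=8):
    """True if the dense array alone would fit in the given memory budget."""
    return dense_size_bytes(n_rows, n_cols, itemsize) <= budget_bytes


# Hypothetical feature-matrix shape, for illustration only.
n_rows, n_cols = 25000, 75000
budget = 4 * 1024 ** 3  # an assumed 4 GiB memory budget

needed = dense_size_bytes(n_rows, n_cols)
print("dense size: %.1f GiB" % (needed / 1024.0 ** 3))
print("fits in budget:", fits_in_memory(n_rows, n_cols, budget))
```

If the estimate exceeds the available memory, the dense conversion will fail no matter which classifier is used afterwards.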
henrik.brink (22)

I cannot reproduce your issue... can you find which version of Scikit-Learn you are using?

$ pip search scikit-learn

And how much memory do you have?

$ cat /proc/meminfo
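
The same two checks can also be made from inside Python, which avoids depending on what the pip command reports. This is a minimal sketch assuming Python 3.8+ (for importlib.metadata) and a Linux-style /proc/meminfo; the function names are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="scikit-learn"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def total_memory_kb(path="/proc/meminfo"):
    """Return the MemTotal value (in kB) from a Linux meminfo file, or None."""
    try:
        with open(path) as fh:
            for line in fh:
                if line.startswith("MemTotal:"):
                    # Line looks like: "MemTotal:  4046844 kB"
                    return int(line.split()[1])
    except OSError:
        pass  # file missing or unreadable, e.g. on a non-Linux system
    return None


print("scikit-learn:", installed_version())
print("MemTotal (kB):", total_memory_kb())
```

Both functions return None rather than raising, so the snippet can be pasted into a session even if scikit-learn is not installed or the machine is not running Linux.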


- Henrik
PentahoFan (6)
It is a virtual machine with 4 GB assigned, running scikit-learn 0.14 (not the latest).
henrik.brink (22)
Ok. I'm using 0.16.1 (also not latest, but newer).

I think support for sparse features was added (relatively) recently, so its absence in 0.14 may be what's causing this to fail for you. Do you have a way to test out 0.16 on your box?

- Henrik