PentahoFan (6)
#1
Chapter 8: error when fitting a Random Forest

from sklearn.ensemble import RandomForestClassifier
model2 = RandomForestClassifier(n_estimators=100)
model2.fit(features, d_train.sentiment)

Traceback (innermost last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/sklearn/ensemble/forest.py", line 257, in fit
    check_ccontiguous=True)
  File "/usr/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 220, in check_arrays
    raise TypeError('A sparse matrix was passed, but dense '
TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.


However, converting to a dense array with the code below raises a MemoryError:

model2.fit(features.toarray(), d_train.sentiment)
henrik.brink (22)
#2
Hi,

I cannot reproduce your issue... can you find which version of Scikit-Learn you are using?

$ pip search scikit-learn


And how much memory do you have?

$ cat /proc/meminfo
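
Both values can also be read from inside Python, which helps when pip and the interpreter point at different installations. The following is only a minimal sketch: the helper names installed_version and mem_total_kb are made up for this example, and it assumes a Linux system where /proc/meminfo exists.

```python
import importlib


def installed_version(module_name):
    """Return a module's __version__, "unknown" if it defines none,
    or None if the module cannot be imported at all."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


def mem_total_kb(meminfo_text):
    """Extract the MemTotal value (in kB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    return None


if __name__ == "__main__":
    # scikit-learn is imported under the module name "sklearn"
    print("scikit-learn:", installed_version("sklearn"))
    try:
        with open("/proc/meminfo") as f:
            print("MemTotal (kB):", mem_total_kb(f.read()))
    except OSError:
        print("/proc/meminfo is not available on this system")
```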


Thanks!

- Henrik
PentahoFan (6)
#3
It is a virtual machine with 4 GB of memory assigned, running scikit-learn 0.14 (not the latest version).
henrik.brink (22)
#4
Ok. I'm using 0.16.1 (also not latest, but newer).

I think support for sparse features was added to Scikit-Learn (relatively) recently, so your older version may lack it, which may be what's causing this to fail for you. Do you have a way to test out 0.16 on your box?
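
As a rough illustration of why the toarray() call can exhaust a 4 GB machine: a dense array stores every cell, so it needs rows x columns x bytes-per-value regardless of how many entries are zero. The sketch below is an assumption-laden estimate, not a measurement. The function name dense_size_bytes and the example shape are hypothetical, and the real shape of features is not given in this thread.

```python
def dense_size_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed for a dense n_rows x n_cols array.

    itemsize=8 corresponds to float64 values.
    """
    return n_rows * n_cols * itemsize


# Hypothetical shape for illustration only; check features.shape for the real one.
n_rows, n_cols = 25000, 50000
size_gib = dense_size_bytes(n_rows, n_cols) / 2**30
print("a dense array would need about %.1f GiB" % size_gib)
```

With the actual matrix, the same estimate would be dense_size_bytes(*features.shape, itemsize=features.dtype.itemsize). If the result is well above the available memory, converting to dense is unlikely to work, whatever the library version.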

- Henrik