Hi Jesse,

Here's the notebook you requested: https://drive.google.com/open?id=1NmNysYPV_yGpMu6FfS7eRvdMJIFO3WPW. Please let me know if you need anything else.

I've also tried the example as presented, but I'm still getting errors when dropping columns containing the word 'Violation':

# remove columns whose names meet a certain criterion
violationColumnNames = filter(lambda columnName: 'Violation' in columnName, nyc_data_raw.columns)
 
with ProgressBar():
    print(nyc_data_raw.drop(violationColumnNames, axis=1).head())


Result

[                                        ] | 0% Completed |  2.3s

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-31-14fd89717e56> in <module>
      3 
      4 with ProgressBar():
----> 5     print(nyc_data_raw.drop(violationColumnNames, axis=1).head())
      6 
      7 # Produces the following output:

~/miniconda3/lib/python3.6/site-packages/dask/dataframe/core.py in head(self, n, npartitions, compute)
    874 
    875         if compute:
--> 876             result = result.compute()
    877         return result
    878 

~/miniconda3/lib/python3.6/site-packages/dask/base.py in compute(self, **kwargs)
    154         dask.base.compute
    155         """
--> 156         (result,) = compute(self, traverse=False, **kwargs)
    157         return result
    158 

~/miniconda3/lib/python3.6/site-packages/dask/base.py in compute(*args, **kwargs)
    395     keys = [x.__dask_keys__() for x in collections]
    396     postcomputes = [x.__dask_postcompute__() for x in collections]
--> 397     results = schedule(dsk, keys, **kwargs)
    398     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    399 

~/miniconda3/lib/python3.6/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, **kwargs)
     74     results = get_async(pool.apply_async, len(pool._pool), dsk, result,
     75                         cache=cache, get_id=_thread_get_id,
---> 76                         pack_exception=pack_exception, **kwargs)
     77 
     78     # Cleanup pools associated to dead threads

~/miniconda3/lib/python3.6/site-packages/dask/local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs)
    499                         _execute_task(task, data)  # Re-execute locally
    500                     else:
--> 501                         raise_exception(exc, tb)
    502                 res, worker_id = loads(res_info)
    503                 state['cache'][key] = res

~/miniconda3/lib/python3.6/site-packages/dask/compatibility.py in reraise(exc, tb)
    110         if exc.__traceback__ is not tb:
    111             raise exc.with_traceback(tb)
--> 112         raise exc
    113 
    114 else:

~/miniconda3/lib/python3.6/site-packages/dask/local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
    270     try:
    271         task, data = loads(task_info)
--> 272         result = _execute_task(task, data)
    273         id = get_id()
    274         result = dumps((result, id))

~/miniconda3/lib/python3.6/site-packages/dask/local.py in _execute_task(arg, cache, dsk)
    250     elif istask(arg):
    251         func, args = arg[0], arg[1:]
--> 252         args2 = [_execute_task(a, cache) for a in args]
    253         return func(*args2)
    254     elif not ishashable(arg):

~/miniconda3/lib/python3.6/site-packages/dask/local.py in <listcomp>(.0)
    250     elif istask(arg):
    251         func, args = arg[0], arg[1:]
--> 252         args2 = [_execute_task(a, cache) for a in args]
    253         return func(*args2)
    254     elif not ishashable(arg):

~/miniconda3/lib/python3.6/site-packages/dask/local.py in _execute_task(arg, cache, dsk)
    251         func, args = arg[0], arg[1:]
    252         args2 = [_execute_task(a, cache) for a in args]
--> 253         return func(*args2)
    254     elif not ishashable(arg):
    255         return arg

~/miniconda3/lib/python3.6/site-packages/dask/dataframe/core.py in apply_and_enforce(func, args, kwargs, meta)
   3691             if not np.array_equal(np.nan_to_num(meta.columns),
   3692                                   np.nan_to_num(df.columns)):
-> 3693                 raise ValueError("The columns in the computed data do not match"
   3694                                  " the columns in the provided metadata")
   3695             else:

ValueError: The columns in the computed data do not match the columns in the provided metadata


Please advise. I'm currently using:

dask 0.20.1 py36_0
dask-core 0.20.1 py36_0
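One thing worth checking (a guess, since I can't reproduce your setup): in Python 3, filter() returns a one-shot iterator, and Dask may iterate the column labels more than once, once to build the metadata and again per partition. A later pass then sees an empty sequence, which could produce exactly this kind of metadata mismatch. A minimal sketch of the behavior, with made-up column names:

```python
# made-up column names for illustration
cols = ['Violation Code', 'Violation County', 'Plate ID']

# filter() returns an iterator that can only be consumed once
violation_cols = filter(lambda c: 'Violation' in c, cols)
print(list(violation_cols))  # ['Violation Code', 'Violation County']
print(list(violation_cols))  # [] - already exhausted on the second pass

# materializing into a list makes repeated iteration safe
violation_cols = [c for c in cols if 'Violation' in c]
print(violation_cols)        # ['Violation Code', 'Violation County']
```

If that's the cause, wrapping the filter in list(...) before passing it to drop should make the drop behave consistently.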


Hi,

I tried running this code from Chapter 5 (Cleaning and transforming DataFrames) on the full NYC data. Please help me debug it: a similar pattern worked in the hello-dask notebook on a subset of the data, but it doesn't work on the full DAG over the complete dataset.

# finding the number of missing values per column
missing_values = nyc_data_raw.isnull().sum()

# finding the percentage of missing values in each column
percent_missing = ((missing_values / nyc_data_raw.index.size) * 100)
print(percent_missing.compute()[:10])

# select columns that meet the threshold, i.e., at least 50 percent missing
columns_to_drop = missing_values[percent_missing >= 50].index

# remove columns that meet this threshold
nyc_data_clean_stage1 = nyc_data_raw.drop(columns_to_drop, axis=1)

My output:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/miniconda3/lib/python3.6/site-packages/dask/dataframe/utils.py in raise_on_meta_error(funcname, udf)
    136     try:
--> 137         yield
    138     except Exception as e:

~/miniconda3/lib/python3.6/site-packages/dask/dataframe/core.py in _emulate(func, *args, **kwargs)
   3596     with raise_on_meta_error(funcname(func), udf=kwargs.pop('udf', False)):
-> 3597         return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
   3598 

~/miniconda3/lib/python3.6/site-packages/dask/utils.py in __call__(self, obj, *args, **kwargs)
    693     def __call__(self, obj, *args, **kwargs):
--> 694         return getattr(obj, self.method)(*args, **kwargs)
    695 

~/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   3696                                            level=level, inplace=inplace,
-> 3697                                            errors=errors)
   3698 

~/miniconda3/lib/python3.6/site-packages/pandas/core/generic.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   3110             if labels is not None:
-> 3111                 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
   3112 

~/miniconda3/lib/python3.6/site-packages/pandas/core/generic.py in _drop_axis(self, labels, axis, level, errors)
   3142             else:
-> 3143                 new_axis = axis.drop(labels, errors=errors)
   3144             result = self.reindex(**{axis_name: new_axis})

~/miniconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in drop(self, labels, errors)
   4403                 raise KeyError(
-> 4404                     '{} not found in axis'.format(labels[mask]))
   4405             indexer = indexer[~mask]

KeyError: "['a' 'b'] not found in axis"

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-9-c6cbaf1c3030> in <module>
      3 
      4 # remove columns that meet this threshold
----> 5 nyc_data_clean_stage1 = nyc_data_raw.drop(columns_to_drop, axis=1)
      6 
      7 # df_dropped = df.drop(columns_to_drop, axis=1).persist()

~/miniconda3/lib/python3.6/site-packages/dask/dataframe/core.py in drop(self, labels, axis, errors)
   2824         axis = self._validate_axis(axis)
   2825         if axis == 1:
-> 2826             return self.map_partitions(M.drop, labels, axis=axis, errors=errors)
   2827         raise NotImplementedError("Drop currently only works for axis=1")
   2828 

~/miniconda3/lib/python3.6/site-packages/dask/dataframe/core.py in map_partitions(self, func, *args, **kwargs)
    541         >>> ddf.map_partitions(func).clear_divisions()  # doctest: +SKIP
    542         """
--> 543         return map_partitions(func, self, *args, **kwargs)
    544 
    545     @insert_meta_param_description(pad=12)

~/miniconda3/lib/python3.6/site-packages/dask/dataframe/core.py in map_partitions(func, *args, **kwargs)
   3634 
   3635     if meta is no_default:
-> 3636         meta = _emulate(func, *args, udf=True, **kwargs2)
   3637 
   3638     if all(isinstance(arg, Scalar) for arg in args):

~/miniconda3/lib/python3.6/site-packages/dask/dataframe/core.py in _emulate(func, *args, **kwargs)
   3595     """
   3596     with raise_on_meta_error(funcname(func), udf=kwargs.pop('udf', False)):
-> 3597         return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
   3598 
   3599 

~/miniconda3/lib/python3.6/contextlib.py in __exit__(self, type, value, traceback)
     97                 value = type()
     98             try:
---> 99                 self.gen.throw(type, value, traceback)
    100             except StopIteration as exc:
    101                 # Suppress StopIteration *unless* it's the same exception that

~/miniconda3/lib/python3.6/site-packages/dask/dataframe/utils.py in raise_on_meta_error(funcname, udf)
    152                 "{2}")
    153         msg = msg.format(" in `{0}`".format(funcname) if funcname else "", repr(e), tb)
--> 154         raise ValueError(msg)
    155 
    156 

ValueError: Metadata inference failed in `drop`.

You have supplied a custom function and Dask is unable to 
determine the type of output that that function returns. 

To resolve this please provide a meta= keyword.
The docstring of the Dask function you ran should have more information.

Original error is below:
------------------------
KeyError("['a' 'b'] not found in axis",)

Traceback:
---------
  File "/Users/.../miniconda3/lib/python3.6/site-packages/dask/dataframe/utils.py", line 137, in raise_on_meta_error
    yield
  File "/Users/.../miniconda3/lib/python3.6/site-packages/dask/dataframe/core.py", line 3597, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/Users/.../miniconda3/lib/python3.6/site-packages/dask/utils.py", line 694, in __call__
    return getattr(obj, self.method)(*args, **kwargs)
  File "/Users/.../miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 3697, in drop
    errors=errors)
  File "/Users/.../miniconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 3111, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/Users/.../miniconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 3143, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/Users/.../miniconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 4404, in drop
    '{} not found in axis'.format(labels[mask])


I made the code work by running it in separate cells in Jupyter, with:
dask 0.20.1 py36_0
dask-core 0.20.1 py36_0
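For anyone comparing notes, here is a minimal pandas-only sketch of the same threshold logic on a toy frame (column names are made up; with Dask the same pattern should apply once percent_missing has been computed eagerly and columns_to_drop is materialized as a plain list):

```python
import numpy as np
import pandas as pd

# toy frame: 'b' is 75% missing, 'c' is 25% missing
df = pd.DataFrame({
    'a': [1, 2, 3, 4],
    'b': [np.nan, np.nan, np.nan, 4],
    'c': [np.nan, 2, 3, 4],
})

# percentage of missing values per column
percent_missing = df.isnull().sum() / len(df) * 100

# columns at or above the 50% threshold
columns_to_drop = list(percent_missing[percent_missing >= 50].index)
print(columns_to_drop)        # ['b']

cleaned = df.drop(columns_to_drop, axis=1)
print(list(cleaned.columns))  # ['a', 'c']
```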
Hi Jesse,

Thanks for the response. I agree. I just found it interesting, since I've thought of some IoT applications where I wanted to try implementing it, but I haven't found good resources or tutorials that explain how to do it well, apart from what Matthew demonstrated.
Hi,

I've gone through the book's table of contents, but I haven't seen any coverage of streaming dataframes, which I learned about here: https://matthewrocklin.com/slides/anacondacon-2018.html#/ (click "play along with the live examples"). Are there plans to add that to the full book?
Hey! Try running the cells above the one you're running; that should fix it. The error you've posted means the variable hasn't been defined yet.
Hi,

Thanks for the response. Looking forward to it.
When are the next chapters coming out?
In the exercise after this video, the first question ("Spark provides APIs for which of the following languages?") has an issue: one answer is not accepted, yet the explanation gives it as a correct answer.
I'd like to commend the author for writing a book about Dask. I've been having a lot of fun doing interesting things with this library, and the chapters coming soon make me even more excited about it, especially interactive visualizations with Bokeh and machine learning.
Thanks. I'll just stick to TensorFlow. :)
Hi,

I've just purchased the liveVideo. I was trying to switch to Theano and I'm getting an error. Before I show my error, here's what my keras.json file looks like:

{
    "floatx": "float32",
    "backend": "theano",
    "image_data_format": "channels_first",
    "epsilon": 1e-07
}

Then I changed the backend to "theano" and image_data_format to "channels_last".

This is the error I'm getting:

/Users/brendamainye/anaconda/envs/py35/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using Theano backend.

You can find the C code in this temporary file: /var/folders/rd/xmt7lsxs4j95cdrp1ddl62v00000gn/T/theano_compilation_error_j055tq63
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
~/anaconda/envs/py35/lib/python3.5/site-packages/theano/gof/lazylinker_c.py in <module>()
     74             if version != getattr(lazylinker_ext, '_version', None):
---> 75                 raise ImportError()
     76         except ImportError:

ImportError:

During handling of the above exception, another exception occurred:
ImportError                               Traceback (most recent call last)
~/anaconda/envs/py35/lib/python3.5/site-packages/theano/gof/lazylinker_c.py in <module>()
     91             if version != getattr(lazylinker_ext, '_version', None):
---> 92                 raise ImportError()
     93         except ImportError:

ImportError: 

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
<ipython-input-1-88d96843a926> in <module>()
----> 1 import keras

~/anaconda/envs/py35/lib/python3.5/site-packages/keras/__init__.py in <module>()
      1 from __future__ import absolute_import
      2 
----> 3 from . import utils
      4 from . import activations
      5 from . import applications

~/anaconda/envs/py35/lib/python3.5/site-packages/keras/utils/__init__.py in <module>()
      4 from . import data_utils
      5 from . import io_utils
----> 6 from . import conv_utils
      7 
      8 # Globally-importable utils.

~/anaconda/envs/py35/lib/python3.5/site-packages/keras/utils/conv_utils.py in <module>()
      7 from six.moves import range
      8 import numpy as np
----> 9 from .. import backend as K
     10 
     11 

~/anaconda/envs/py35/lib/python3.5/site-packages/keras/backend/__init__.py in <module>()
     78 elif _BACKEND == 'theano':
     79     sys.stderr.write('Using Theano backend.\n')
---> 80     from .theano_backend import *
     81 elif _BACKEND == 'tensorflow':
     82     sys.stderr.write('Using TensorFlow backend.\n')

~/anaconda/envs/py35/lib/python3.5/site-packages/keras/backend/theano_backend.py in <module>()
      5 from collections import defaultdict
      6 from contextlib import contextmanager
----> 7 import theano
      8 from theano import tensor as T
      9 from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams

~/anaconda/envs/py35/lib/python3.5/site-packages/theano/__init__.py in <module>()
    108     object2, utils)
    109 
--> 110 from theano.compile import (
    111     SymbolicInput, In,
    112     SymbolicOutput, Out,

~/anaconda/envs/py35/lib/python3.5/site-packages/theano/compile/__init__.py in <module>()
     10 from theano.compile.function_module import *
     11 
---> 12 from theano.compile.mode import *
     13 
     14 from theano.compile.io import *

~/anaconda/envs/py35/lib/python3.5/site-packages/theano/compile/mode.py in <module>()
      9 import theano
     10 from theano import gof
---> 11 import theano.gof.vm
     12 from theano import config
     13 from six import string_types

~/anaconda/envs/py35/lib/python3.5/site-packages/theano/gof/vm.py in <module>()
    671     if not theano.config.cxx:
    672         raise theano.gof.cmodule.MissingGXX('lazylinker will not be imported if theano.config.cxx is not set.')
--> 673     from . import lazylinker_c
    674 
    675     class CVM(lazylinker_c.CLazyLinker, VM):

~/anaconda/envs/py35/lib/python3.5/site-packages/theano/gof/lazylinker_c.py in <module>()
    125             args = cmodule.GCC_compiler.compile_args()
    126             cmodule.GCC_compiler.compile_str(dirname, code, location=loc,
--> 127                                              preargs=args)
    128             # Save version into the __init__.py file.
    129             init_py = os.path.join(loc, '__init__.py')

~/anaconda/envs/py35/lib/python3.5/site-packages/theano/gof/cmodule.py in compile_str(module_name, src_code, location, include_dirs, lib_dirs, libs, preargs, py_module, hide_symbols)
   2357             # difficult to read.
   2358             raise Exception('Compilation failed (return status=%s): %s' %
-> 2359                             (status, compile_stderr.replace('\n', '. ')))
   2360         elif config.cmodule.compilation_warning and compile_stderr:
   2361             # Print errors just below the command line.

Exception: Compilation failed (return status=1): In file included from /Users/brendamainye/.theano/compiledir_Darwin-15.6.0-x86_64-i386-64bit-i386-3.5.4-64/lazylinker_ext/mod.cpp:1:. In file included from /Users/brendamainye/anaconda/envs/py35/include/python3.5m/Python.h:25:. /Users/brendamainye/anaconda/envs/py35/include/c++/v1/stdio.h:108:15: fatal error: 'stdio.h' file not found. #include_next <stdio.h>.               ^~~~~~~~~. 1 error generated.. 


My Keras version is 2.1.3.

Please help me fix this. Thanks in advance.
This chapter is far-out! I've been reading about the Nelder-Mead algorithm here [https://livebook.manning.com#!/book/deep-learning-with-python/chapter-7/point-1244-287-287-0]. I've been trying to implement random search first. Then I found the book author's implementation here [https://github.com/fchollet/nelder-mead].

Has anyone used this successfully? If so please provide an example.
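Not from the book, but since random search came up first: here is a toy, self-contained sketch of random search over a discrete hyperparameter space (the space and the objective below are made up purely for illustration):

```python
import random

def random_search(objective, space, n_iter=50, seed=0):
    """Sample n_iter random configurations; keep the one with the lowest score."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float('inf')
    for _ in range(n_iter):
        # draw one value per hyperparameter, uniformly from its choices
        cfg = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# made-up space and objective, standing in for a real validation loss
space = {'lr': [0.1, 0.01, 0.001], 'units': [32, 64, 128]}
objective = lambda cfg: cfg['lr'] * 100 - cfg['units'] * 0.01

best_cfg, best_score = random_search(objective, space, n_iter=30)
print(best_cfg, best_score)
```

In practice the objective would train a model and return its validation loss, which is where all the runtime goes.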
I got version 4 of the book, and it seems to me that the first paragraphs of the chapter are missing. Initially there were typos, and now I can see that it's very different. For instance, I think "Here's what you have learned in this chapter:" should have been placed near the end of the chapter.
I got the source code, which is now available, and ran the Jupyter notebook, for example on the IMDB dataset; this is what I got. I think as long as the dots and lines are labeled, everything's fine. I've attached it.

I hope so. Is the appendix available yet? I can't find it at the moment.
I am. Yes, it works now.
Hi,

I'm using IPython 5.1.0, Python 3.5.2, Keras 2.0.4, TensorFlow 1.1.0, and Theano 0.9.0.

I tried importing the IMDB data with the code provided in the book, but I kept getting a BadZipFile: File is not a zip file error. Is anyone else experiencing this? I checked https://keras.io/datasets/, used the code there, and it downloaded the data correctly. Last time I did this, I had to interrupt the download due to a slow internet connection; maybe that caused the problem.

from keras.datasets import imdb

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(
    path="imdb.zip", num_words=10000)
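A corrupt partial download is a plausible cause: Keras caches downloads (by default under ~/.keras/datasets/) and keeps reusing the cached file. A sketch of clearing it, assuming the default cache location and the imdb.zip filename from the snippet above:

```python
import os

# default Keras download cache; adjust if your setup uses a different KERAS_HOME
cache_path = os.path.expanduser('~/.keras/datasets/imdb.zip')

if os.path.exists(cache_path):
    os.remove(cache_path)  # forces a fresh download on the next load_data call
    print('Removed cached file:', cache_path)
else:
    print('No cached file found at', cache_path)
```

After clearing the cache, rerunning imdb.load_data should download the archive again from scratch.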
Hi, I've been having difficulties with reinforcement learning, and I'd hoped I would see an example in the book going over it. Are there plans to add an example in the final book? Or did I miss it?

Moreover, I'm enjoying the book and discovering how cool Keras is, as well as different ways of solving problems, e.g., with the MNIST dataset.