This is a thread for various minor errata which I have come across.

  • Chapter 6.5 (Viterbi decode) - The exposition for Listing 6.8 says "In listing 7.8, let's define a TensorFlow op to update the viterbi cache..."; the reference should be to listing 6.8.

  • Chapter 6.6.1 (Modeling a video) - The discussion centers on gait; however, in several places the homophone "gate" is used instead.
  • I have good news and bad news for you.

    The Bad News
    The BregmanToolkit library is actually going to break for far more users than just Windows users. The library will fail for anyone, on any OS, who has NumPy >= 1.12 installed. In the second notebook in chapter 5, you got the following warnings:

    /usr/local/lib/python2.7/dist-packages/bregman/ VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
      mxnorm = P.empty(self._cqtN) # Normalization coefficients
    /usr/local/lib/python2.7/dist-packages/bregman/ VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
      for i in P.arange(self._cqtN)])

    This issue is what you're being warned about.

    See my reply to issue: for the potential workarounds.
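For readers who want to see the failure in isolation, here is a minimal sketch of what the warning turns into on NumPy >= 1.12, along with one possible workaround (casting to int before allocating). The names `cqt_n` and `safe_empty` are made up for illustration; this is not BregmanToolkit code and not necessarily the workaround described in the linked reply.

```python
import numpy as np


def safe_empty(n):
    """Allocate an uninitialized 1-D array whose length may arrive as a float.

    Hypothetical helper for illustration only; not part of BregmanToolkit.
    """
    return np.empty(int(n))


# Stand-in for a size that was computed with float arithmetic.
cqt_n = 12.0

try:
    np.empty(cqt_n)  # a float size is rejected on NumPy >= 1.12
    float_size_ok = True
except TypeError:
    float_size_ok = False

# Casting to int first works regardless of NumPy version.
coeffs = safe_empty(cqt_n)
print(float_size_ok, coeffs.shape)
```

The same cast would presumably be needed wherever the library passes a float as an array size or index.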

    The Good News
    I got the library working with Python 3 on Windows using GPU TensorFlow. Here is my fork with the appropriate changes:

    I did need to make two small changes to the IPython notebook code to make this work, however:

        def get_next_chromogram(sess):
     -      audio_file =
     +      audio_file ="utf-8")
            print('Loading {}'.format(audio_file))
            F = Chromagram(audio_file, nfft=16384, wfft=8192, nhop=2205)
            return F.X, audio_file

        with tf.Session() as sess:
            X, names = get_dataset(sess)
            centroids = initial_cluster_centroids(X, k)
            i, converged = 0, False
            while not converged and i < max_iterations:
                i += 1
                Y = assign_cluster(X, centroids)
                centroids =, Y))
            print(zip(, names))

        def get_dataset(sess, audio_file):
            chromo_data = get_chromogram(audio_file)
            print('chromo_data', np.shape(chromo_data))
            chromo_length = np.shape(chromo_data)[1]
            xs = []
     -      for i in range(chromo_length / segment_size):
     +      for i in range(chromo_length // segment_size):
                chromo_segment = chromo_data[:, i*segment_size:(i+1)*segment_size]
                x = extract_feature_vector(sess, chromo_segment)
                if len(xs) == 0:
                    xs = x
                else:
                    xs = np.vstack((xs, x))
            return xs
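Both changes look like the usual Python 2 to Python 3 adjustments. The sketch below shows them in isolation using only the standard library. It assumes the file name arrives as a bytes object under Python 3 (hence the "utf-8" decode) and that the segment count must be an int for `range`. The helper names and the file name are hypothetical and are not taken from the notebooks.

```python
def to_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is a bytes object."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def num_segments(total_length, segment_size):
    """Count whole segments; // keeps the result an int on Python 3."""
    return total_length // segment_size


# Hypothetical file name, standing in for a bytes value from the input pipeline.
audio_file = to_text(b"audio/example.wav")

# 100 columns split into segments of 30 gives 3 whole segments.
segments = list(range(num_segments(100, 30)))

print(audio_file, segments)
```

On Python 3, `range(100 / 30)` raises a TypeError because `/` always returns a float, which is why the loop needs `//`.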
    Wow! Thanks for the quick reply. I didn't expect to hear back for some time.

    I looked into this issue more deeply. There is already an outstanding PR for Python 3 support; however, it has been pending for over a year.

    I have reached out to the coordinator of BregmanStudios to see if they can put me in contact with a maintainer of this library.

    The above-mentioned PR will not work as intended. See below for a fork with the full set of required changes.
    Overall great book. Lots of good content packed into a single text for a very fair price.

    The only issue thus far is that the use of the Bregman toolkit makes the example code unusable for Windows users: the toolkit is written for Python 2, while TensorFlow only supports Python 3.5 and 3.6 on Windows. This won't be a problem for me, as I also have Linux and Mac machines readily available (albeit without GPU acceleration); however, it may cause grief for others.

    There are two things we can do from here:

    1. You might want to add a note in the installation section mentioning this unfortunate incompatibility for chapter 5.

    2. We should try the 2to3 tool to automatically convert the code to be compatible with Python 3. If this works, we would have a good workaround for Windows users running native TensorFlow. Please let me know if you've already tried this -- if not, I'd be happy to give it a shot for you.