I feel that some points used in exercises 2.1 and 2.2 might be worth explaining in the main text:
Multiplying a tensor A by a scalar s returns the tensor obtained by multiplying every component of A by s.
Subtracting a scalar s from a tensor A returns the tensor obtained by subtracting s from every component of A.
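To make the two points above concrete, here is a minimal NumPy sketch (the array values are my own, chosen only for illustration; the same broadcasting behaviour applies to TensorFlow tensors):

```python
import numpy as np

# A small example tensor (values are arbitrary, for illustration only).
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
s = 10.0

# Multiplying by a scalar scales every component of A.
print(A * s)   # [[10. 20.] [30. 40.]]

# Subtracting a scalar subtracts it from every component of A.
print(A - s)   # [[-9. -8.] [-7. -6.]]
```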
Exercise 2.2 also raises some questions:
1. Is there a reason why you use tf.neg(A) rather than just -A?
2. What should you use for x? (Something like x = tf.convert_to_tensor(np.linspace(...)) perhaps?)
3. Why do you use tf.pow(sigma, 2.0) rather than sigma**2.0?
4. Shouldn't you use mean=0.0 and sigma=1.0 ?
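For what it's worth, here is how I understand the Gaussian expression behind questions 3 and 4, sketched in NumPy (I don't have the exercise's exact code, so the formula, mean=0.0, and sigma=1.0 are assumptions on my part; the point is that the two ways of squaring agree):

```python
import numpy as np

# Assumed parameters for a standard Gaussian (see question 4).
mean, sigma = 0.0, 1.0
x = np.linspace(-3.0, 3.0, 61)

# Squaring via a pow-style call, as in the exercise...
gauss_pow = np.exp(-np.power(x - mean, 2.0) / (2.0 * np.power(sigma, 2.0)))

# ...and via the ** operator (see question 3): the results are identical.
gauss_star = np.exp(-(x - mean) ** 2.0 / (2.0 * sigma ** 2.0))

print(np.allclose(gauss_pow, gauss_star))  # True
```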
Finally, I think that referring to Figure 2.3 makes exercise 2.2 harder rather than easier.
