Python for Data Science for Dummies Errata on Page 124

Python for Data Science for Dummies contains an error in the example that appears on the top half of page 124. In the first of the two grey boxes, the code computes the results of four print statements. The bottom-most print statement, print x[1:2, 1:2], is supposed to compute a result based on rows 1 and 2 of columns 1 and 2, and the bottom grey box seems to confirm that interpretation by showing the result as [[[14 15 16] [17 18 19]] [[24 25 26] [27 28 29]]]. However, the answer provided for this example in the downloadable source code is [[[14 15 16]]], which doesn’t agree with the result shown in the text. The reason is that a slice excludes its end index, so 1:2 selects only index 1.

The good news is that the downloadable source contains the correct code. The error appears only in the book, where the last print statement should read print x[1:3, 1:3]. Here is the correct code (with output) for this example:

import numpy as np

x = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
              [[11, 12, 13], [14, 15, 16], [17, 18, 19]],
              [[21, 22, 23], [24, 25, 26], [27, 28, 29]]])

print x[1,1]
print x[:,1,1]
print x[1,:,1]
print
print x[1:3, 1:3]

[14 15 16]
[ 5 15 25]
[12 15 18]

[[[14 15 16]
  [17 18 19]]

 [[24 25 26]
  [27 28 29]]]
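
To make the difference between the two slices concrete, here is a minimal sketch that compares them side by side. It uses the parenthesized print() form, so it should run under either Python 2.7 or Python 3. The names book_slice and fixed_slice are illustrative and do not appear in the book.

```python
import numpy as np

x = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
              [[11, 12, 13], [14, 15, 16], [17, 18, 19]],
              [[21, 22, 23], [24, 25, 26], [27, 28, 29]]])

# A slice excludes its end index, so 1:2 selects only index 1.
book_slice = x[1:2, 1:2]

# 1:3 selects indexes 1 and 2 along each of the first two axes.
fixed_slice = x[1:3, 1:3]

print(book_slice.shape)   # (1, 1, 3)
print(fixed_slice.shape)  # (2, 2, 3)
```

The shapes show the problem at a glance: the statement printed in the book returns a single row, while the corrected statement returns the 2 x 2 block of rows shown in the output above.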

Please let me know if you have any questions about this example at John@JohnMuellerBooks.com. I’m sorry about the error that appears in the book and appreciate the readers who have pointed it out.

 

Author: John

John Mueller is a freelance author and technical editor. He has writing in his blood, having produced 99 books and over 600 articles to date. The topics range from networking to artificial intelligence and from database management to heads-down programming. Some of his current books include a Web security book, discussions of how to manage big data using data science, a Windows command-line reference, and a book that shows how to build your own custom PC. His technical editing skills have helped more than 67 authors refine the content of their manuscripts. John has provided technical editing services to both Data Based Advisor and Coast Compute magazines. He has also contributed articles to magazines such as Software Quality Connection, DevSource, InformIT, SQL Server Professional, Visual C++ Developer, Hard Core Visual Basic, asp.netPRO, Software Test and Performance, and Visual Basic Developer. Be sure to read John’s blog at http://blog.johnmuellerbooks.com/. When John isn’t working at the computer, you can find him outside in the garden, cutting wood, or generally enjoying nature. John also likes making wine and knitting. When not occupied with anything else, he makes glycerin soap and candles, which come in handy for gift baskets. You can reach John on the Internet at John@JohnMuellerBooks.com. John is also setting up a website at http://www.johnmuellerbooks.com/. Feel free to take a look and make suggestions on how he can improve it.

2 thoughts on “Python for Data Science for Dummies Errata on Page 124”

  1. Hi,

    Just got your book and I am enjoying working through it. I have no background in Python or Data Science so I felt this would be a good place to start.

    In Part II, Chapter 6, in the example for Creating categorical variables:

    import pandas as pd

    car_colors = pd.Series(['Blue', 'Red', 'Green'], dtype='category')

    car_data = pd.Series(
        pd.Categorical(['Yellow', 'Green', 'Red', 'Blue', 'Purple'],
                       categories=car_colors, ordered=False))

    find_entries = pd.isnull(car_data)

    print(car_colors)
    print()
    print(car_data)
    print()
    print(find_entries[find_entries == True])

    The line print(car_data) fails with a long stack trace, ending with

    ValueError: object __array__ method not producing an array

    When I check the version of pandas installed in my Anaconda environment it says

    0.17.1

    If I change print(car_data) to print(car_data.cat.categories), it runs OK but with slightly different output than expected.

    I am sure you can tell that I am using Python 3 by the changes to the print statement, but I was hoping you might be able to tell me why the print(car_data) is not working right – is this just a change in the way this version of pandas works?

    Thanks
    Greg

    1. You’re correct in saying that the example will require changes in order to work with the latest version of Anaconda. Luca and I have confirmed this issue. However, that’s why we both encourage readers to use the correct version of Anaconda with the book. The most recent post on this issue, Using Jupyter with Anaconda, discusses how you can use the latest version of Anaconda, but it also tells you how to get the correct version of Anaconda to use with the book. If the book sells well enough, the publisher will ask Luca and me to update the current edition of the book. At that time we’ll also update the version of the software used for the example. In the meantime, please do use the correct version of Anaconda, 2.1.0, which is available in the Continuum archive.
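
For readers who want to see what the categorical example is meant to demonstrate, here is a minimal sketch written for Python 3. It differs from the book's code in one respect, and that difference is an assumption on my part rather than a confirmed fix: the category labels are passed as a plain list (the hypothetical name category_labels) instead of as a Series. Whether this avoids the ValueError on pandas 0.17.1 specifically has not been verified.

```python
import pandas as pd

# Assumption: the allowed categories are supplied as a plain list,
# not as a Series as in the book's listing.
category_labels = ['Blue', 'Red', 'Green']

car_data = pd.Series(
    pd.Categorical(['Yellow', 'Green', 'Red', 'Blue', 'Purple'],
                   categories=category_labels, ordered=False))

# Values that are not among the categories become missing values (NaN).
find_entries = pd.isnull(car_data)

print(car_data)
print()
print(find_entries[find_entries])
```

The last print statement lists only the entries that fall outside the allowed categories, which here are the ones at positions 0 (Yellow) and 4 (Purple).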
