Python for Data Science for Dummies Errata on Page 145

Python for Data Science for Dummies contains two errors on page 145. The first error appears in the second paragraph on that page. You can safely disregard the sentence that reads, “The use_idf controls the use of inverse-document-frequency reweighting, which is turned off in this case.” The code doesn’t contain a reference to the use_idf parameter. However, you can read about it on the Scikit-Learn site. This parameter defaults to being turned on, which is how it’s used for the example.

The second error is also in the second paragraph. The discussion references the tf_transformer.transform() method call. The actual method call is tfidf.transform(), which does appear in the sample code. The discussion about how the method works is correct, just the name of the object is wrong.
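For readers who want a feel for what inverse-document-frequency reweighting does, here is a minimal hand-rolled sketch. It is not the book's code and not Scikit-Learn's implementation. It assumes the smoothed formula idf = ln((1 + n) / (1 + df)) + 1 and skips the normalization step. The function names, the use_idf flag on reweight(), and the sample counts are all made up for illustration.

```python
import math


def idf_weights(doc_term_counts):
    """Return one smoothed IDF weight per term: ln((1 + n) / (1 + df)) + 1."""
    n_docs = len(doc_term_counts)
    n_terms = len(doc_term_counts[0])
    weights = []
    for term in range(n_terms):
        # Document frequency: how many documents contain this term at all.
        df = sum(1 for row in doc_term_counts if row[term] > 0)
        weights.append(math.log((1 + n_docs) / (1 + df)) + 1)
    return weights


def reweight(doc_term_counts, use_idf=True):
    """Scale raw term counts by IDF weights, or leave them alone if use_idf is False."""
    n_terms = len(doc_term_counts[0])
    idf = idf_weights(doc_term_counts) if use_idf else [1.0] * n_terms
    return [[count * weight for count, weight in zip(row, idf)]
            for row in doc_term_counts]


# Hypothetical counts: three documents (rows) by three terms (columns).
counts = [
    [2, 1, 0],
    [1, 0, 1],
    [3, 0, 0],
]

with_idf = reweight(counts)
without_idf = reweight(counts, use_idf=False)

print(with_idf[0])
print(without_idf[0])
```

The first term appears in every document, so its weight stays at 1 and its counts are unchanged. The rarer second and third terms are boosted, which is the whole point of the reweighting.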

Please let me know if you have any questions about either of these changes. I’m sorry about any errors that appear in the book and appreciate the readers who have pointed them out.


Adding Vinegar to the Chicken Water

It’s winter in Wisconsin and the chicken coop isn’t heated. In fact, the coop lacks an electrical connection as well, so except for the pots of heated water I take in on the coldest days, any heat for the coop must come from other sources. The slant of the roof and placement of the window ensure that the coop receives maximum winter heat. The tree that normally shields the coop from the sun during the summer months is bare, letting the sun come through. Even with all these measures, the coop is cold enough to let the chickens’ water freeze.

My goals for various activities on my small farm include doing things in a manner that makes my carbon footprint small and keeps costs low. Consequently, I always look for solutions that don’t involve much in the way of high technology, such as obtaining heated chicken waterers. I did seriously look at a solar powered unit for a while, but decided that the chickens would probably destroy it in short order. The better solution turned out to be adding vinegar to the chicken water.

It turns out that vinegar has both a lower freezing temperature and a higher boiling point than water. The freezing temperature of vinegar is 28 degrees Fahrenheit, but that level increases when you add more water. I tried various levels of vinegar in the chicken water and found that ½ cup per gallon seems to keep the water from freezing for about an hour longer when the outside temperature is in the 15 to 30 degree range. Above 30 degrees, it kept the water from freezing at all.

Adding vinegar to the water also keeps anything from growing inside the waterer, which means that the water is better for the chickens longer. This feature of adding vinegar is especially important during the summer, when all kinds of green gunk grows inside the waterer and is quite hard to keep out.

If you look on other websites, you find that other people attribute all sorts of other benefits to using vinegar. Other websites warn against using vinegar. I haven’t personally tested any of these claims, so I’m not here to tell you that the chickens derive any benefit whatsoever from the vinegar in the water. However, I did try a simple experiment this past summer and found that given two buckets, precisely the same size, color, and make, one with vinegar and one without, the chickens always drank the vinegar water first. My feeling is that they seem to like it. So even if the chickens don’t gain any solid benefits from the vinegar, you can view it as a treat that helps keep the water from freezing longer and keeps their waterer cleaner. Let me know your thoughts on adding vinegar to the chicken water.


Time to Check the Larder

The seed catalogs begin to arrive in the mail and you look upon them as a bit of pure heaven—the announcement that spring is on the way. Your eyes nearly pop out as you see the multicolored carrots, juicy tomatoes, and fragrant herbs. The new kinds of fruit trees immediately attract your attention, and what about that amazing new berry bush that will pack your freezer with sumptuous berries? You go into a mix of information and appetite overload and you consider just how those new offerings will satiate your cravings for all things fresh. However, before you go into a swoon over the latest delights, consider the fact that you probably don’t need them all. Your larder is craving things too! The items you’ve used up have created gaps in the deliciousness that your larder can provide during the winter months when fresh simply isn’t an option.

Of course, everyone loves to experiment. After all, that’s how I found kabocha squash this past summer—that delectable mix of sweet and savory that will likely find its way into a pie this upcoming fall. Had I known then what I know now, I would have planted more and canned the extra as an alternative to using pumpkin for pies. Lesson learned, more kabocha squash will find their way into the mix this year, alongside the butternut and acorn squash I love so well.

Back to the larder though. You probably don’t have any idea of where the holes are right now and you really do need to find out. That’s why you need to perform an inventory of your larder. The inventory will tell you about the items you need most. This year I’ve decided to try canning three bean salad, which means growing green, yellow wax, and kidney beans. However, I already have enough green beans in quarts in the larder, so I won’t make a big planting of green beans.

Your larder inventory should include more than a simple accounting. As you go through your larder, you should also perform these tasks:

  • Ensure all of the canned goods are still sealed
  • Wipe the jars down to remove the dust
  • Verify all of the oldest products are in the front
  • Make a list of products that are more than five years old so you can use them up
  • Place all the empty jars in one area
  • Sort the jars by type (both size and the kind of lid used)

Taking these extra steps will help you get a better handle on your larder. You should have a good idea of what your larder contains at all times and the only way to achieve that goal is to actually look at the containers. Let me know your thoughts about larder management.


Using Jupyter with Anaconda (Updated)

A few readers have recently written to me regarding the use of Jupyter with the downloadable source for Python for Data Science for Dummies. The version of Anaconda recommended for the book, 2.1.0, doesn’t rely on Jupyter, which is why the book doesn’t mention Jupyter. The book relies on IPython Notebook, which is what you should use to obtain the best reading experience. You can obtain the proper version from the Continuum archive. However, if you choose to download the current version of Anaconda, then using Jupyter becomes a possibility, although many of the procedures found in the book will require tweaking and the screenshots won’t match precisely.

In order to use Jupyter, you must still import the downloaded files into your repository. The source code comes in an archive file that you extract to a location on your hard drive. The archive contains a list of .ipynb (IPython Notebook) files containing the source code for this book (see the Introduction for details on downloading the source code). The following steps tell how to import these files into your repository:

  1. Click Upload at the top of the page. What you see depends on your browser. In most cases, you see some type of File Upload dialog box that provides access to the files on your hard drive.
  2. Navigate to the directory containing the files you want to import into Notebook.
  3. Highlight one or more files to import and click the Open (or other, similar) button to begin the upload process. You see the file added to an upload list, as shown here. The file isn’t part of the repository yet—you’ve simply selected it for upload.

    Figure: Upload Source Files to the Repository (click Upload when you want to upload files to the repository).
  4. Click Upload. Notebook places the file in the repository so that you can begin using it.

It’s important to both Luca and me that you have the best possible learning experience with our book. For most people, this means using the right version of Anaconda. Using the latest version shouldn’t cause problems, but we’d like to know if it does. Please feel free to contact me with your book-specific questions.


It has come to our attention since this post was first published that using the latest version of Anaconda with Python for Data Science for Dummies is problematic. Some of the examples won’t work without rewriting because the Pandas Categorical class has changed. This is the only change we’ve confirmed so far, but there are no doubt other changes. In order to get the proper results from the examples in the book, you must use the correct version of Anaconda, version 2.1.0.
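If you aren’t sure which library versions your setup actually contains, a short script can tell you before you start chasing odd results. The following is only a sketch. It relies on importlib.metadata, which needs Python 3.8 or later, so it applies to newer installs rather than to the Anaconda 2.1.0 setup recommended for the book. The list of library names is my assumption about what you might want to check.

```python
from importlib import metadata

# Hypothetical list of libraries to report; adjust it to match your setup.
LIBRARIES = ("pandas", "numpy", "scikit-learn", "notebook")


def installed_versions(names=LIBRARIES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Including this output when you write in about a failing example makes it much easier to tell whether a library change is the cause.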

Please do keep those questions coming. It’s because a reader took time to write that Luca and I became aware of this problem. We truly do want you to have a great learning experience, so these questions are important!


Technology and Child Safety

I recently read an article on ComputerWorld, “Children mine cobalt used in smartphones, other electronics,” that had me thinking yet again about how people in rich countries tend to ignore the needs of those in poor countries. The picture at the beginning of the article says it all, but the details will have you wondering whether a smartphone really is worth some child’s life. That’s right, any smartphone you buy may be killing someone and in a truly horrid manner. Children as young as 7 years old are mining the cobalt needed for the batteries (and other components) in the smartphones that people seem to feel are so necessary for life (they aren’t, you know).

The problem doesn’t stop when someone gets the smartphone. Other children end up dismantling the devices sent for recycling. That’s right, a rich country’s efforts to keep electronics out of its landfills are also killing children because countries like India put these children to work taking the devices apart in unsafe conditions. Recycled wastes go from rich countries to poor countries because the poor countries need the money for necessities, like food. Often, these children are incapable of working by the time they reach 35 or 40 due to health issues induced by their forced labor. In short, the quality of their lives is made horribly low so that it’s possible for people in rich countries to enjoy something that truly isn’t necessary for life.

I’ve written other blog posts about the issues of technology pollution. One of the most recent is More People Noticing that Green Technology Really Isn’t. However, the emphasis of these previous articles has been on the pollution itself. Taking personal responsibility for the pollution you create is important, but we really need to do more. Robotic (autonomous) mining is one way to keep children out of the mines and projects such as The Utah Robotic Mining Project show that it’s entirely possible to use robots in place of people today. The weird thing is that autonomous mining would save up to 80% of the mining costs of today, so you have to wonder why manufacturers aren’t rushing to employ this solution. In addition, off world mining would keep the pollution in space, rather than on planet earth. Of course, off world mining also requires a heavy investment in robots, but it promises to provide a huge financial payback in addition to keeping earth a bit cleaner (some companies are already investing in off world mining, but we need more). The point is that there are alternatives that we’re not using. Robotics presents an opportunity to make things right with technology and I’m excited to be part of that answer in writing books such as Python for Data Science for Dummies and Machine Learning for Dummies (see the posts for this book).

Unfortunately, companies like Apple, Samsung, and many others simply thumb their noses at laws that are in place to protect the children in these countries because they know you’ll buy their products. Yes, they make official statements, but read their statements in that first article and you’ll quickly figure out that they’re excuses and poorly made excuses at that. They don’t have to care because no one is holding them to account. People in rich countries don’t care because their own backyards aren’t sullied and their own children remain safe. So, the next time you think about buying electronics, consider the real price for that product. Let me know what you think about polluting other countries to keep your country clean.


Calcium Nodules on Eggs

At some point during your time of working with chickens, you might encounter eggs that look like they have insect eggs on them. The view can be disquieting at first—all sorts of images could go through your mind. However, it’s more likely that what you’re actually seeing are calcium nodules that merely look like insect eggs. Here is an egg that has such nodules on it.

An egg may have harmless calcium nodules that look like insect eggs deposited on it.
Calcium nodules can look like insect eggs.

These nodules are completely harmless. In fact, you can wash them off the eggs quite easily. When crushed, the nodules feel gritty, much like crushed eggshell would feel. These nodules typically appear for two reasons:

The first reason is the one that occurs most often. Five of my hens are now four years old and one is five years old. The five year old hen (a Black Australorp) laid this egg, so the nodules aren’t unusual at all. (Most factory settings keep laying hens for one or two years after they start laying eggs; I’ve found that four years in optimal settings works well.) This spring I’ll replace two of the hens with new layers (the other four are pets and will die of old age). I also had one hen eaten by hawks and another died of an impacted egg, so I’ll actually get four new layers this spring.

I’m thinking of trying Barred Rocks (a kind of Plymouth Rock) because I’ve never had them before and they’re quite pretty. According to Henderson’s Chicken Chart, they’re cold hardy and produce large eggs. A friend of mine has them in her flock and feels that they’re a good investment. The point is that when you start seeing these nodules on one or two eggs and not on the eggs of your flock as a whole, you may need to start thinking about replacing the bird that laid them. Let me know your thoughts about keeping a healthy flock.


Is Bring Your Own Device (BYOD) Going Away?

The Bring Your Own Device (BYOD) phenomenon has gone on for a number of years now, but no one really knows for certain how it impacts organizations today. If you read surveys, you might get the idea that BYOD is either exploding or fading. The surveys that readers of Security for Web Developers, HTML5 Programming with JavaScript for Dummies, and CSS3 for Dummies are most likely to read say that BYOD is fading. The problem with those surveys is that they’re taken by IT professionals in large organizations that have an official policy of not allowing such devices. Disallowing BYOD doesn’t mean that users actually follow the policy.

In reading other articles, you might get the idea that BYOD is actually exploding. The problem with these articles is that they’re based on supposition, not fact. There is no data to back up the claim that BYOD is becoming more prevalent in the workplace. Therein lies the problem. The only official surveys talk to IT personnel on the record and not to users off the record. No one would admit to using a disallowed device, potentially throwing their job away over the purity of information.

Human nature being what it is, my feeling is that people are probably employing BYOD when they feel they can get away with it. After all, why use multiple devices to perform work when a single device does it all? Users don’t care about hardware, software, data, or anything else for that matter. They care about getting their work done, getting off on time, and getting paid—end of question. Consequently, it makes sense that if users feel that it’s possible to get by using a single device to do everything, they’ll do so. However, I have absolutely no data to back this feeling up and you have to accept my claim for what it is—a feeling.

Something that I’ve been emphasizing in my books is this idea of risk. In order to create applications that work well, yet protect organizational assets, it’s important to assess the risk of every policy and every action. Being overly cautious means that applications will work slowly, lack features, and possibly crash a lot. Users don’t like cautious applications and won’t use them if at all possible. Opening the flood gates is a bad idea too. Yes, the application will run quickly and allow a user to do just about anything, but the user won’t thank you for having to stay extra hours at work to fix problems created by an application that loses data or causes other problems because it doesn’t provide an acceptable level of risk avoidance.

No matter what survey you look at, BYOD is still a presence in the workplace, so you need to write applications in such a manner that they deal with the risks presented by BYOD in a reasonable manner. What this means is checking every bit of data you receive from anywhere for potential risks, but not unnecessarily hobbling the user with policies that really won’t mitigate any risk. Let me know your thoughts on the effects of BYOD in your organization and the actual level of BYOD use.
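As a small illustration of checking the data you receive, here is a sketch of allow-list validation: accept only input that matches a pattern you’ve defined and reject everything else. The field (a username), the pattern, and the function name are all assumptions chosen for the example. A real application would apply the same idea to every field it accepts, with rules that fit that field.

```python
import re

# Assumed rule for this example: 3 to 20 letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_username(raw):
    """Return the cleaned username if it matches the allow list, else None."""
    if not isinstance(raw, str):
        return None
    candidate = raw.strip()
    if USERNAME_PATTERN.fullmatch(candidate):
        return candidate
    return None


print(validate_username("  john_mueller  "))
print(validate_username("robert'); DROP TABLE users;--"))
```

Defining what is allowed, rather than trying to list everything that is dangerous, keeps the check strict without getting in the way of a user who supplies ordinary input.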


Python for Data Science for Dummies Errata on Page 124

Python for Data Science for Dummies contains an error in the example that appears on the top half of page 124. In the first of the two grey boxes, the code computes the results of four print statements. The bottom-most print statement, print x[1:2, 1:2], is supposed to compute a result based on rows 1 and 2 of columns 1 and 2, and the bottom grey box seems to confirm that interpretation by showing the result as [[[14 15 16] [17 18 19]] [[24 25 26] [27 28 29]]]. However, the answer provided for this example in the downloadable source code is [[[14 15 16]]], which doesn’t agree with that in the text.

The good news is that the downloadable source contains the correct code. The error appears only in the book. The last print statement in the book is wrong. Here is the correct code (with output) for this example:

import numpy as np

x = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
              [[11,12,13], [14,15,16], [17,18,19]],
              [[21,22,23], [24,25,26], [27,28,29]]])

# Python 2 print statements, as used in the book
print x[1,1]
print x[:,1,1]
print x[1,:,1]
print x[1:3, 1:3]

[14 15 16]
[ 5 15 25]
[12 15 18]
[[[14 15 16]
  [17 18 19]]

 [[24 25 26]
  [27 28 29]]]

Please let me know if you have any questions about this example. I’m sorry about the error that appears in the book and appreciate the readers who have pointed it out.
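The difference between the two slices comes down to the fact that a slice’s stop index is excluded. The following sketch compares them side by side. It’s written for Python 3 (so print is a function, unlike in the book’s listing), and the variable names narrow and wide are mine.

```python
import numpy as np

x = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
              [[11, 12, 13], [14, 15, 16], [17, 18, 19]],
              [[21, 22, 23], [24, 25, 26], [27, 28, 29]]])

# The stop index is excluded, so 1:2 selects only index 1 on each axis.
narrow = x[1:2, 1:2]

# 1:3 selects indexes 1 and 2 on each of the first two axes.
wide = x[1:3, 1:3]

print(narrow.shape)  # (1, 1, 3)
print(wide.shape)    # (2, 2, 3)
print(narrow.tolist())
print(wide.tolist())
```

Checking the shape of a slice is a quick way to confirm that it covers the rows and columns you intended.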


Getting the Fastest Question Response

I always want to be sure that you get fast, courteous responses to your book-specific questions. Even though I don’t check my e-mail every day, I do check it most days of the week, so that’s the fastest way to contact me regarding issues that you have with my books. Of course, you can make the response even faster by doing a few simple things when sending your email:

  • Be sure to include the name of the book and the book edition in the message subject line.
  • Tell me which page, figure, or listing number to look at in the book.
  • Document the steps you took.
  • Provide me with the exact error message you’re seeing.
  • Tell me about your platform (operating system, the version of any software you’re using, and so on).

If you provide these basic pieces of information, I can usually answer your questions much faster—often without asking for additional information. E-mail communication can be difficult at times because it lacks that in person body language element and you can’t show me what you’re seeing on your machine. Remote diagnostics are harder than you might think.
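For books that use Python, you can gather the platform details with a few lines of code and paste the output into your message. This is just a sketch that uses the standard platform module. The function name and the choice of fields are my own, and you’ll still need to add the versions of any other software by hand.

```python
import platform


def describe_platform():
    """Collect basic platform details worth including in a support email."""
    return {
        "operating_system": platform.platform(),
        "machine": platform.machine(),
        "python_version": platform.python_version(),
        "python_implementation": platform.python_implementation(),
    }


for name, value in describe_platform().items():
    print(f"{name}: {value}")
```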

It’s also important that you understand that I focus on book-specific questions. I’ve discussed this issue before in Sending Comments on My Books and Sending Comments and Asking Questions. The bottom line is that I want you to be happy with your book experience, but I also don’t have time to provide free consulting. Please let me know if you have any questions or concerns about contacting me.


Checking for Mobile Friendliness

I’ve been writing computer books for over 29 years now and some people might think that’s long enough to know everything there is to know about computers. Actually, my involvement with computers spans over 40 years and I haven’t learned everything yet—nor will I. Every day is a new adventure, which is why I keep going. Besides the other projects I’ve been working on (which includes discovering the inner workings of both Near Field Communications, NFC, and machine learning), I’ve also been working through this whole concept of mobile device friendliness. In fact, I’ve discovered that there is actually a difference between sites that are mobile friendly and those that are mobile responsive, in that a mobile responsive design does a lot more for the mobile users (and is always mobile friendly by default).

During the time I wrote HTML5 Programming with JavaScript for Dummies and CSS3 for Dummies, the concept of mobile development was still quite new. There weren’t any good tools for testing mobile friendliness. Consequently, both books do try to address the topic to a small extent (the extent possible at the time), but neither book says anything about testing. Fortunately, vendors such as Google are now making it possible for you to verify that your site is mobile friendly with an easy-to-use check. All you need to do is point your browser to the test page, enter a URL, and click Analyze. You get a quick answer to your question, as shown here, within a few seconds.

Verify that your site will support mobile users by performing a mobile friendly check.
Output from a Successful Mobile Friendly Check
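If you want a rough first check before running a full test, you can look for the viewport meta tag that responsive pages normally declare. The sketch below is only a heuristic of my own and has nothing to do with how Google’s check works. A page can include the tag and still be awkward to use on a phone. The class name, function name, and sample pages are made up for the example.

```python
from html.parser import HTMLParser


class ViewportFinder(HTMLParser):
    """Record the content of the first <meta name="viewport"> tag, if any."""

    def __init__(self):
        super().__init__()
        self.viewport_content = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.viewport_content is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "viewport":
            self.viewport_content = attributes.get("content") or ""


def has_responsive_viewport(html_text):
    """Return True when the page declares a width=device-width viewport."""
    finder = ViewportFinder()
    finder.feed(html_text)
    content = finder.viewport_content
    if content is None:
        return False
    return "width=device-width" in content.replace(" ", "").lower()


responsive_page = """
<html><head>
<meta name="viewport" content="width=device-width, initial-scale=1">
</head><body>Hello</body></html>
"""
fixed_page = "<html><head><title>Old page</title></head><body>Hello</body></html>"

print(has_responsive_viewport(responsive_page))
print(has_responsive_viewport(fixed_page))
```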

The page contains more than just a validation of the mobile friendliness of your site. When you scroll down, you see a simulated output of your site when viewed on a smartphone. The view is important because it helps you understand how a mobile user will see your site, versus the view that you provide to desktop and tablet users. It’s important not to assume that mobile users have the same functionality as other users do. Here’s the simulated view for my site.

Mobile users may see something different than you expect, even when your site is mobile friendly.
Verify the Smartphone View of Your Site

As more and more people rely on mobile devices to access the Internet, you need to become more aware of what they’re seeing and whether they can use your site at all. According to most authorities, more users access the Internet using mobile devices today than other devices, such as laptops, desktops, or tablets. If you don’t support mobile devices correctly, you lose out on the potential audience for your site. This means that you may make less money than you otherwise could from sales and that the influence of your site is far less. Let me know your thoughts about mobile device access.