Missing Machine Learning for Dummies Downloadable Source Files

A number of people have contacted me to tell me that the downloadable source for Machine Learning for Dummies isn’t appearing on the Dummies site as described in the book. I’ve contacted the publisher about the issue and the downloadable source is now available at http://www.dummies.com/extras/machinelearning. Please look on the Downloads tab, which you can also find at http://www.dummies.com/DummiesTitle/productCd-1119245516,descCd-DOWNLOAD.html and navigate to Click to Download to receive the approximately 485 KB source code file.

When you get the file, open the archive on your hard drive and then follow the directions in the book to create the source code repository for each language. The repository instructions appear on Page 60 for the R programming language and on Page 99 for Python. I apologize for any problems that the initial lack of source code may have caused. If you experience any problems whatsoever in using the source code, please feel free to contact me at John@JohnMuellerBooks.com. Luca and I want to be certain that you have a great learning experience, which means being able to download and use the book’s source code, because using hand-typed code often leads to problems.

 

Apathy, Sympathy, and Empathy in Books

I’ve written more than a few times about the role that emotion plays in books, even technical books. Technical books such as Accessibility for Everybody: Understanding the Section 508 Accessibility Requirements are tough to write because they’re packed with emotion. The author must not only convey emotion and evoke emotions in the reader, but also explore the emotion behind the writing. In this case, the author’s emotions may actually cause problems with the book content. The writing is tiring because the author experiences emotions in the creation of the text. The roller-coaster of emotions tends to take a toll. Three common emotions that authors experience in the writing of a book and that authors convey to the reader as part of communicating the content are apathy, sympathy, and empathy. These three emotions can play a significant role in the suitability of the book’s content in helping readers discover something new about the people they support, themselves, and even the author.

It’s a mistake to feel apathy toward any technical topic. Writers need to consider the ramifications of the content and how it affects both the reader and the people that the reader serves. For example, during the writing of both Python for Data Science for Dummies and Machine Learning for Dummies, Luca and I discussed the potential issues that automation creates for the people who use it and those who are replaced by it in the job market. Considering how to approach automation in an ethical manner is essential to creating a positive view of the technology that helps people use it for good. Even though apathy is often associated with no emotion at all, people are emotional creatures and apathy often results in an arrogant or narcissistic attitude. Not caring about a topic isn’t an option.

I once worked with an amazing technical editor who told me more than a few times that people don’t want my sympathy. When you look at sympathy in the dictionary, the result of having sympathy toward someone would seem positive, but after more than a few exercises to demonstrate the effects of sympathy on stakeholders with special needs, I concluded that the technical editor was correct—no one wanted my sympathy. The reason is simple when you think about it. The connotation of sympathy is that you’re on the outside looking in and feel pity for the person struggling to complete a task. Sympathy makes the person who engages in it feel better, but does nothing for the intended recipient except make them feel worse. However, sympathy is still better than apathy because at least you have focused your attention on the person who benefits from the result of your writing efforts.

Empathy is often introduced as a synonym of sympathy, but the connotation and effects of empathy are far different from those of sympathy. When you feel empathy and convey that emotion in your writing, you are on the inside, with the person you’re writing for, looking out. Putting yourself in the position of the people you want to help is potentially the hardest thing you can do and certainly the most tiring. However, it also does the most good. Empathy helps you understand that someone with special needs isn’t looking for a handout and that they don’t want you to perform the task for them. They may, in fact, not feel as if they have a special need at all. What opened new vistas for me was the realization that technology can create a level playing field, so that the people I wanted to help could help themselves and feel empowered by their actions. The experience has colored every book I’ve written since that time and my books all try to convey emotion in a manner that empowers, rather than saps, the strength of my reader and the people my reader serves.

Obviously, a good author has more than three emotions. In fact, the toolbox of emotions that an author carries is nearly limitless and it’s wise to employ them all as needed. However, these three emotions have a particular role to play and are often misunderstood by authors. Let me know your thoughts on these three emotions or about emotions in general at John@JohnMuellerBooks.com.

 

Automation and the Future of Human Employment

It wasn’t long ago that I wrote Robotics and Your Job to consider the role that robots will play in human society in the near future. Of course, robots are already doing mundane chores and that list of chores will increase as robot capabilities increase. The question of what sorts of work humans will do in the future has crossed my mind quite a lot as I’ve written Build Your Own PC on a Budget, Python for Data Science for Dummies, and Machine Learning for Dummies. In fact, both Luca and I have discussed the topic in depth. It isn’t just robotics, but the whole issue of automation that is important. Robots actually fill an incredibly small niche in the much larger topic of automation. Although articles like The end of humans working in service industry? seem to say that robots are the main issue, automation comes in all sorts of guises. When writing A Fuller Understanding of the Internet of Things, I came to the conclusion that the services provided by technologies such as Smart TVs actually take jobs away from someone. In this case, a Smart TV rids us of the need to visit a video store, such as Blockbuster (assuming you were even around to remember these stores). Imagine all the jobs that were lost when Blockbuster closed its doors.

My vision for the future is that people will be able to work in occupations with lower risks, higher rewards, and greater interest. Unfortunately, not everyone wants a job like that. Some people really do want to go to work, clock in, place a tiny cog in a somewhat large wheel all day, clock out, and go home. They want something mindless that doesn’t require much effort, so losing service and assembly line type jobs to automation is a problem for them. In Robots are coming for your job, the author states outright that most Americans think their job will still exist in 50 years, but the reality is that any job that currently pays under $20.00 an hour is likely to become a victim of automation. Many people insist that they’re irreplaceable, but the fact is that automation can easily take their job and employers are looking forward to the change because automation doesn’t require healthcare, pensions, vacation days, sick days, or salaries. Most importantly, automation does as it’s told. In the story The rise of greedy robots, the author lays out the basis for an increase in automation that maximizes business profit at the expense of workers. Articles such as On the Phenomenon of Bullshit Jobs tell why people are still working a 40-hour work week when it truly isn’t necessary to do so. In short, if you really do insist on performing a task that is essentially pointless, the government and industry are perfectly willing to let you do so until a time when technology is so entrenched that it’s no longer possible to do anything about it (no, I’m not making this up). Even some relatively essential jobs, such as security, have a short life expectancy with the way things are changing (see How much security can you turn over to AI? and The eerie math that could predict terrorist attacks for details).

The question of how automation will affect human employment in the future remains. Theoretically, people could work a 15 hour work week even now, but then we’d have to give up some of our consumerism—the purchase of gadgets we really don’t need. In the previous paragraph, I talked about jobs that are safer, more interesting, and more fulfilling. There are also those pointless jobs that the government will doubtless prop up at some point to keep people from rioting. However, there is another occupation that will likely become a major source of employment, but only for the nit-picky, detail person. In The thin line between good and bad automation the author explores the problem of scripts calling scripts. Even though algorithms will eventually create and maintain other algorithms, which in turn means that automation will eventually build itself, someone will still have to monitor the outcomes of all that automation. In addition, the search for better algorithms continues (as described in The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World and More data or better models?). Of course, these occupations still require someone with a great education and a strong desire to do something significant as part of their occupation.

The point of all this speculation is that it isn’t possible to know precisely how the world will change due to the effects of automation, but it will most definitely change. Even though automation currently has limits, scientists are currently working on methods to extend automation even further so that the world science fiction authors have written about for years will finally come into being (perhaps not quite in the way they had envisioned, however). Your current occupation may not exist 10 years from now, much less 50 years from now. The smart thing to do is to assume your job is going to be gone and that you really do need a Plan B in place—a Plan B that may call for an increase in flexibility, training, and desire to do something interesting, rather than the same mundane task you’ve plodded along doing for the last ten years. Let me know your thoughts on the effects of automation at John@JohnMuellerBooks.com.

 

IPython Magic Functions

Both Python for Data Science for Dummies and Machine Learning for Dummies rely on a version of Anaconda that uses IPython as part of its offering. Theoretically, you could also use Anaconda with Beginning Programming with Python For Dummies, but that book is designed to provide you with an experience that relies on the strict Python offerings (without the use of external tools). In other words, the procedures in this third book are designed for use with IDLE, the IDE that comes with Python. IPython extends the development environment in a number of ways, one of which is the use of magic functions. You see the magic functions in the code of the first two books as calls that begin with either one or two percent signs (% or %%). The most common of these magic functions is %matplotlib, which controls how IPython Notebook or Jupyter Notebook displays plot output from the code.

You can find a listing of the most common magic functions in the Python for Data Science for Dummies Cheat Sheet. Neither of the first two books uses any other magic functions, so this is also a complete list of magic functions that you can expect to find in our books. However, you might want to know more. Fortunately, the site at https://damontallen.github.io/IPython-quick-ref-sheets/ provides you with a complete listing of the magic commands (and a wealth of other information about IPython).

Of course, you might choose to use another IDE—one that isn’t quite so magical as Anaconda provides through IPython. In this case, you need to remove those magic commands. Removing the commands won’t affect functionality of the code. The example will still work as explained in the book. However, the way that the IDE presents output could change. For example, instead of being inline, plots could appear in a separate window. Even though using a separate window is less convenient, either method works just fine. If you ever do encounter a magic function-related problem, please be sure to let me know at John@JohnMuellerBooks.com.
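If you have a lot of code to clean up, a few lines of Python can remove the magic commands for you. The following is only a minimal sketch and doesn't come from the books: the strip_magics() function name is made up for this example, and it assumes that every line whose first non-blank character is a percent sign is a magic command. That assumption fails for a cell magic such as %%bash, where the rest of the cell isn't Python at all, so check the result by eye.

```python
def strip_magics(source):
    """Return the source text with IPython magic lines removed.

    Assumption: any line whose first non-blank character is "%" is a
    line magic (%name) or the header of a cell magic (%%name).
    """
    kept = []
    for line in source.splitlines():
        if line.lstrip().startswith("%"):
            continue  # Skip the magic; plain Python can't run it.
        kept.append(line)
    return "\n".join(kept)


# A hypothetical notebook cell that starts with a magic command.
example = "%matplotlib inline\nimport math\nprint(math.sqrt(16))"
cleaned = strip_magics(example)
print(cleaned)
```

The cleaned text contains only the import and print() lines, which any Python IDE can run.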

 

Spaces in Paths

A number of readers have recently written me about an error they see when attempting to compile or execute an application or script in books such as C++ All-In-One for Dummies, 3rd Edition, Beginning Programming with Python For Dummies, Python for Data Science for Dummies, and Machine Learning for Dummies. Development environments often handle spaces differently because they’re designed to perform tasks such as compiling applications and running scripts. I had touched on this issue once before in the Source Code Placement post. When you see an error message that tells you that a file or path isn’t found, you need to start looking at the path and determine whether it contains any spaces. The best option is to create a directory to hold your source code and to place that directory off the root directory of your drive if at all possible. Keeping the path small and simple is your best way to avoid potential problems compiling code or running scripts.
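If you aren't sure whether a path contains spaces, you can let Python check it for you. This is just a quick sketch: the parts_with_spaces() function name and the example paths are hypothetical, and the check looks only at the text of the path you hand it rather than at anything on disk.

```python
from pathlib import PurePath


def parts_with_spaces(path):
    """Return the directory or file names in path that contain a space."""
    return [part for part in PurePath(path).parts if " " in part]


# Hypothetical locations for a book's source code.
print(parts_with_spaces("/Users/John/My Code/ch01.py"))
print(parts_with_spaces("/src/ch01.py"))
```

An empty list means that the path is free of spaces. Anything else tells you which folder or file to rename or move.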

The problem for many readers is that the error message is buried inside a whole bunch of nonsensical looking text. The output from your compiler or interpreter can contain all sorts of useful debugging information, such as a complete listing of calls that the compiler, interpreter, or application made. However, unless you know how to read this information, which is often arcane at best, it looks like gobbledygook. Simply keep scanning through the output until you see something that humans can read and understand. More often than not, you see an error message that helps you understand what went wrong, such as not being able to find a file or path. Please let me know if you ever have problems making the code examples in my books work, but also be sure to save yourself some time and effort by reading those error messages. Let me know if you have any thoughts or concerns about spaces in directory paths at John@JohnMuellerBooks.com.
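In Python, the readable part of the error output is normally the last line of the traceback. The following sketch shows one way to pull that line out. It assumes Python 3, where a missing file raises FileNotFoundError, and both the last_error_line() helper and the file path are made up for the example.

```python
import traceback


def last_error_line(func):
    """Call func and return the final traceback line if it fails."""
    try:
        func()
    except Exception:
        # The full traceback lists every call; the last line names the error.
        return traceback.format_exc().strip().splitlines()[-1]
    return None


def open_missing_file():
    # Hypothetical path that shouldn't exist on your system.
    open("/no/such/dir/data.csv")


print(last_error_line(open_missing_file))
```

The printed line names the exception type and the path that Python couldn't find, which is usually all you need to begin fixing the problem.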

 

Installing Python Packages (Part 2)

In the Installing Python Packages (Part 1) post, you discovered the easiest method of installing new packages when working with Beginning Programming with Python For Dummies, Python for Data Science for Dummies, and Machine Learning for Dummies. Using the pip command is both fast and easy. However, it doesn’t provide much in the way of feedback when things go wrong. To overcome this issue, you can use the conda command in place of pip when you have Anaconda installed on your system. Like pip, conda supports a wide variety of commands. You can find a listing of these commands at http://conda.pydata.org/docs/using/pkgs.html.

You need to know a few things about working with conda. The first is that you need to open an Anaconda prompt to use it. For example, when working with Windows, you use the Start ⇒ All Programs ⇒ Anaconda<Version> ⇒ Anaconda Prompt command to open a window like the one shown here where you can enter commands. (Your Anaconda Prompt may look different than the one shown based on the platform you use and the version of Anaconda you have installed.)

[Figure: The Anaconda Prompt. Use the Anaconda Prompt to gain access to the conda command.]

You can easily discover the features the conda command supports by typing conda -h and pressing Enter. You see a list of command line switches similar to the ones shown here:

[Figure: A Listing of Conda Switches. Use the conda command line switches to perform various tasks.]

As you can see, there are quite a few tasks you can perform. To find out about a package, use the conda search <package name> command. For example, if you want to determine whether you have Pandas installed, you type conda search pandas and press Enter. You see a list of the Pandas versions that conda can find, like this (to list only what is installed in the current environment, you can also type conda list pandas):

[Figure: A Listing of Pandas Information. Use the search switch to locate a particular package installation.]

The information you get from conda is far more in depth than pip provides. To determine what you have installed, just go down the list and determine whether you have the version of Pandas that you need. If you don’t, then type conda update pandas and press Enter (notice the case used). On the other hand, let’s say you want to install BeautifulSoup. Well, the first time through, try typing conda install BeautifulSoup and pressing Enter. You see an error message that tells you what to type like this:

[Figure: Using Error Information. The conda command provides you with helpful error information.]

Since you want to install the latest BeautifulSoup, type conda install beautiful-soup and press Enter. After searching for the required update information, conda will ask if you want to proceed. Type y and press Enter. You’ll see a whole bunch of activity take place, but eventually, you have a new version of BeautifulSoup, plus all the supporting functionality, installed correctly in the correct locations. Here’s how things looked on my system:

[Figure: Viewing the Result of an Installation. Conda provides detailed information about the installation process.]

At this point, you have BeautifulSoup installed. Installing other packages follows the same path. Using conda does require a little more expertise than using pip, but you also gain additional flexibility and garner more information. When everything goes well, either tool does an equally good job of getting the installation or update task done, but conda excels in helping you past troublesome installations. Let me know your thoughts about using conda to install the packages required by my books at John@JohnMuellerBooks.com.

 

Installing Python Packages (Part 1)

My Python-related books, Beginning Programming with Python For Dummies, Python for Data Science for Dummies, and Machine Learning for Dummies, use various libraries to perform book-specific tasks. The books do provide instructions as needed, but, based on reader input, sometimes these instructions aren’t as clear as necessary, located in precisely the right location, or possibly as specific as needed. This post will help you get the packages containing the libraries you need installed in order to get more from the books.

It’s essential to remember that Beginning Programming with Python For Dummies relies on the 3.3.4 version of Python. The other two books rely on Python 2.7.x versions. The reason for using the older version of Python in these two books is that these books rely on libraries that Python 3.x doesn’t support. If you try to install these libraries on Python 3.x, you’ll get an error message of somewhat dubious usefulness.
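Before you install anything, it helps to confirm which version of Python you’re actually running. Here is a small sketch that should work on both Python 2.7 and Python 3.x. The matches_book() function name is made up for this example, and the sketch compares only the major version number.

```python
import sys


def matches_book(required_major, version_info=sys.version_info):
    """Return True when the interpreter's major version is the one needed."""
    return version_info[0] == required_major


# Show the running version, and then check it against Python 2.x.
print("Running Python {}.{}".format(sys.version_info[0], sys.version_info[1]))
print(matches_book(2))
```

Pass 3 instead of 2 when you’re working through Beginning Programming with Python For Dummies.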

In most cases, the easiest way to install a package is to open a command prompt with Administrator privileges and rely on the pip (for Python 2.x) or pip3 (for Python 3.x) command to perform the installation. For example, to install BeautifulSoup, you can type pip install beautifulsoup4 and press Enter. Installing any other package follows about the same route.
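After pip finishes, you can verify that Python is able to find the new package by trying to import it. The following sketch uses a made-up is_installed() helper. Keep in mind that the name you import isn’t always the name you install: the beautifulsoup4 package is imported as bs4.

```python
import importlib


def is_installed(module_name):
    """Return True when Python can import module_name."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# bs4 is the import name for the beautifulsoup4 package.
print(is_installed("bs4"))
```

A result of False means that the installation didn’t succeed for the copy of Python you’re running, which can happen when you have more than one version of Python installed.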

The only problem with the pip utility is that you don’t get it with every version of Python. When using an older version of Python, such as 3.3.4, you actually need to install the pip utility to use it. Fortunately, the installation instructions at https://pip.pypa.io/en/latest/installing/ aren’t difficult to use and you’ll be up and running in a few minutes.

Some readers have also complained that pip doesn’t provide much information when it comes to errors. The lack of information can prove problematic when an installation doesn’t go as planned. Next week I plan to cover the conda utility that comes with Anaconda. This utility isn’t as easy to use in some respects as pip, but it does provide considerably more information. If you have any questions about using the pip utility with my books, please contact me at John@JohnMuellerBooks.com.

 

Mac Gatekeeper Error

A number of my books, such as C++ All-In-One for Dummies, 3rd Edition, Beginning Programming with Python For Dummies, Python for Data Science for Dummies, and Machine Learning for Dummies, ask readers to download an IDE or other code and install it on their Mac systems. The problem is that the Mac system won’t always cooperate. For example, you might see an error dialog like the one shown for Code::Blocks:

[Figure: Your Mac won’t let you install software. The Gatekeeper error tells you that it won’t allow you to install software from unknown publishers.]

The problem is one of permissions. The default permissions set for newer Mac systems restrict you to getting your apps from the Mac App Store or from vendors who have signed their files. Fortunately, you can overcome this problem either temporarily or permanently, depending on how you want to use your Mac. The Fix the “App can’t be opened because it is from an unidentified developer” Error in Mac OS X blog post provides you with illustrated, step-by-step directions to perform the task using either method. Let me know if you encounter any other problems getting your Mac to install the software required to use my books at John@JohnMuellerBooks.com.

 

Working with Code in e-Books

Most of my technical readers now use e-books instead of paper books. There is an obvious convenience factor to storing your entire library on a Kindle, even if it’s a software version of the Kindle, and there are all sorts of e-book formats for your desktop system as well. The point is that electronic format makes a lot of sense when dealing with technical books.

However, e-books can cause some interesting problems and I’ve encountered a few with a number of readers now. The most important consideration is that you can’t cut and paste code from an e-book directly into your IDE and expect it to work. There are all sorts of reasons for this exclusion. For example, cutting and pasting may insert special characters into the output stream or the resulting paste may not have white space in the right places. A common problem is that publishers often convert regular single and double quotes into curly quote equivalents. The two kinds of quotes (both single and double) are completely different and the second type definitely won’t compile.
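To see what the quote problem looks like in practice, consider this short sketch. It swaps the four common curly quote characters for their straight equivalents. The straighten_quotes() function name and the pasted line are made up for the example, and the sketch assumes that you don’t want curly quotes anywhere in the code, even inside strings.

```python
# Map each curly quote character to its straight equivalent.
QUOTE_MAP = {
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
}


def straighten_quotes(code):
    """Return code with curly quotes replaced by straight quotes."""
    for curly, straight in QUOTE_MAP.items():
        code = code.replace(curly, straight)
    return code


# A hypothetical line pasted from an e-book.
pasted = "print(\u201cHello\u201d)"
print(straighten_quotes(pasted))
```

A fix like this handles only the quotes. It does nothing for missing white space or other special characters, so the downloadable source is still the safer choice.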

The best option when working with an e-book is to view the code in the e-book, but still get the downloadable source code for the book from the publisher’s website. I always provide a blog post detailing where to obtain the downloadable source for a book when you need source code to use the book. If you can’t find the downloadable source, always feel free to contact me at John@JohnMuellerBooks.com. I want to be sure you have a great reading experience, which means having source code that actually runs in your development environment.

Another potential problem with e-books is that you may see unfortunate code breaks (despite the efforts of the publisher and myself). When you need to understand how white space works with a programming language, always review the downloadable source. The fact that the downloadable source compiles and runs tells you that all of the white space is in the right place and of the correct type. Typing the source code directly out of your e-book could result in added carriage returns or other white space errors that will cause the code to fail, even though the commands, variables, and other parts of the code are all correct.
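White space errors are hard to spot because you can’t see them. As a rough illustration, the following sketch reports two kinds of suspect white space: indentation that mixes tabs with spaces, and non-breaking spaces, which look like ordinary spaces on screen. The find_suspect_whitespace() function name is made up for this example, and these two checks are nowhere near a complete list of what can go wrong.

```python
def find_suspect_whitespace(code):
    """Return (line number, problem) pairs for suspicious white space."""
    problems = []
    for number, line in enumerate(code.splitlines(), start=1):
        # The indentation is whatever precedes the first visible character.
        indent = line[:len(line) - len(line.lstrip())]
        if "\t" in indent and " " in indent:
            problems.append((number, "mixed tabs and spaces"))
        if "\u00a0" in line:
            problems.append((number, "non-breaking space"))
    return problems


# Hypothetical code copied from an e-book.
copied = "if x:\n \tprint(x)\n\u00a0\u00a0y = 1"
for number, problem in find_suspect_whitespace(copied):
    print(number, problem)
```

An empty result doesn’t prove that the code is correct, but any line the check does report deserves a comparison with the downloadable source.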

As always, I’m open to your questions about my books. If you don’t understand how things work, please contact me—that’s why I’m here.

 

Security Breaches and the Potential Effect on Big Data

There are two interacting forces in big data today that few people are talking about. Perhaps it just hasn’t occurred to anyone that there truly is a serious threat. This particular post is going to talk about big data used for healthcare, but the same issue applies to any use of big data. Organizations, such as Penn Medicine, are using big data to perform real world tasks that really make a difference. For example, it’s now possible to predict the potential for diseases well in advance of any critical fallout, at least for some diseases such as sepsis. The ability to predict an event before it becomes critical is important for all sorts of reasons, but the most important is improving overall health. Of course, it also affects the cost of healthcare and the need to use healthcare in the first place.

However, while writing both Python for Data Science for Dummies and Machine Learning for Dummies, I’ve discovered the fallout of data errors is more critical than anyone can imagine. Ensuring correct data entry is a large part of the solution, but there are other concerns. Yes, algorithms can learn to determine which data is useful and which data isn’t, but the purer the data at the outset, the better.

While writing Security for Web Developers, I reviewed many sorts of security breaches, some of which involve modifying organizational data. What this means is that an outsider could potentially corrupt the big data used to make assumptions about medical conditions. Do you see where I’m going with this? Having bad data, data that is modified by an outsider and therefore not as likely to gain the attention of someone who can fix it, will cause those algorithms to make some invalid assumptions. Humans help correct the assumptions, but humans aren’t perfect and make assumptions about the behavior of the algorithm. The bottom line is that security breaches of the wrong sort could end up costing lives. It’s something to think about anyway.

The potential for error in big data analysis is just one of a whole bunch of reasons that I’m happy to read that the government is finally looking into ways to bolster the devices used to work with medical data. I’m almost positive that medical practitioners will fight tooth and nail against the new security measures, just like users of every persuasion do, but the security measures really are more important than just protecting individual patient data. As data becomes the centerpiece of all sorts of human endeavors, ensuring it remains as pristine as possible becomes ever more important. Security has to take a bigger role in data management in the future. Let me know your thoughts on securing data that could be used for medical analysis at John@JohnMuellerBooks.com.