Getting a Good Anaconda Install

Some people may have misinterpreted the content at the beginning of Chapter 3 in Python for Data Science for Dummies. It isn’t necessary to install the products listed in the Considering the Off-the-Shelf Cross-Platform Scientific Distributions section starting on Page 39. These products are for those of you who would like to try a development environment other than the one used in the book, which is Anaconda 2.1.0. However, unless you’re an advanced user, it’s far better to install Anaconda 2.1.0 so that you can follow the exercises in the book without problem. Installing all of the products listed in Chapter 3 will result in a setup that won’t work at all because the various products will conflict with each other.

Because Continuum has upgraded Anaconda, you need to download the 2.1.0 version from the archive at https://repo.continuum.io/archive/. There are separate downloads for Windows, Mac OS X, and Linux. The chapter tells you precisely which file to download. For example, for Windows you’d download Anaconda-2.1.0-Windows-x86_64.exe. The point is to use the same version of Anaconda as you find in the book. You can find the installation instructions on Page 41 if you have a Windows system, Page 45 if you have a Linux system, or Page 46 if you have a Mac OS X system. Make sure you download the databases for the book by using the procedures that start on Page 47.

Following this process is the best way to ensure you get a good installation for Python for Data Science for Dummies. Luca and I want to make certain that you can use the book to discover the wonders of data science without having to jump through a lot of hoops to do it. Please feel free to contact me at John@JohnMuellerBooks.com if you have any questions about the installation process.
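If you want to confirm which version you ended up with, a short script can help. The following is only a minimal sketch, and it rests on an assumption that isn't stated in the book: that your Anaconda build places a marker such as "Anaconda 2.1.0" in the text Python reports through sys.version. The names anaconda_version and EXPECTED_VERSION are made up for this illustration.

```python
from __future__ import print_function

import re
import sys

# Assumption: the Anaconda build embeds a marker such as
# "|Anaconda 2.1.0 (64-bit)|" in sys.version. Other distributions,
# and some Anaconda releases, may not include it.
_ANACONDA_MARKER = re.compile(r"Anaconda (\d+(?:\.\d+)*)")

EXPECTED_VERSION = "2.1.0"  # the version used in the book


def anaconda_version(version_text=None):
    """Return the Anaconda version found in the banner text, or None."""
    if version_text is None:
        version_text = sys.version
    match = _ANACONDA_MARKER.search(version_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    found = anaconda_version()
    if found is None:
        print("No Anaconda marker found in:", sys.version)
    elif found == EXPECTED_VERSION:
        print("Anaconda", found, "matches the book.")
    else:
        print("Anaconda", found, "found; the book uses", EXPECTED_VERSION)
```

A result of None doesn't prove that Anaconda is missing. It only means the marker this sketch looks for isn't present.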

 

IPython Magic Functions

Both Python for Data Science for Dummies and Machine Learning for Dummies rely on a version of Anaconda that uses IPython as part of its offering. Theoretically, you could also use Anaconda with Beginning Programming with Python For Dummies, but that book is designed to provide you with an experience that relies on the strict Python offerings (without the use of external tools). In other words, the procedures in this third book are designed for use with IDLE, the IDE that comes with Python. IPython extends the development environment in a number of ways, one of which is the use of magic functions. You see the magic functions in the code of the first two books as calls that begin with either one or two percent signs (% or %%). The most common of these magic functions is %matplotlib, which controls how IPython Notebook or Jupyter Notebook displays plot output from the code.

You can find a listing of the most common magic functions in the Python for Data Science for Dummies Cheat Sheet. Neither of the first two books uses any other magic functions, so this is also a complete list of magic functions that you can expect to find in our books. However, you might want to know more. Fortunately, the site at https://damontallen.github.io/IPython-quick-ref-sheets/ provides you with a complete listing of the magic commands (and a wealth of other information about IPython).

Of course, you might choose to use another IDE—one that isn’t quite so magical as Anaconda provides through IPython. In this case, you need to remove those magic commands. Removing the commands won’t affect functionality of the code. The example will still work as explained in the book. However, the way that the IDE presents output could change. For example, instead of being inline, plots could appear in a separate window. Even though using a separate window is less convenient, either method works just fine. If you ever do encounter a magic function-related problem, please be sure to let me know at John@JohnMuellerBooks.com.
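Removing the magic commands by hand is easy enough for one example, but you could also automate it. Here is a minimal sketch that drops every line whose first non-blank character is a percent sign. The function name strip_magics is hypothetical, and the rule is deliberately simple: it would also drop an ordinary continuation line that happens to begin with the % operator, so check the result before relying on it.

```python
from __future__ import print_function


def strip_magics(source):
    """Return source with IPython magic lines removed.

    A line counts as a magic when its first non-blank character is '%',
    which covers both line magics (%) and cell magics (%%).
    """
    kept = []
    for line in source.splitlines(True):  # True keeps the line endings
        if line.lstrip().startswith("%"):
            continue
        kept.append(line)
    return "".join(kept)


if __name__ == "__main__":
    cell = "%matplotlib inline\nimport math\nprint(math.sqrt(16))\n"
    print(strip_magics(cell))
```

Code that uses % in the middle of a line, such as x = 7 % 3, passes through unchanged.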

 

Installing Python Packages (Part 2)

In the Installing Python Packages (Part 1) post, you discovered the easiest method of installing new packages when working with Beginning Programming with Python For Dummies, Python for Data Science for Dummies, and Machine Learning for Dummies. Using the pip command is both fast and easy. However, it doesn’t provide much in the way of feedback when things go wrong. To overcome this issue, you can use the conda command in place of pip when you have Anaconda installed on your system. Like pip, conda supports a wide variety of commands. You can find a listing of these commands at http://conda.pydata.org/docs/using/pkgs.html.

You need to know a few things about working with conda. The first is that you need to open an Anaconda prompt to use it. For example, when working with Windows, you use the Start ⇒ All Programs ⇒ Anaconda<Version> ⇒ Anaconda Prompt command to open a window like the one shown here where you can enter commands. (Your Anaconda Prompt may look different than the one shown based on the platform you use and the version of Anaconda you have installed.)

[Figure: The Anaconda Prompt. Use the Anaconda Prompt to gain access to the conda command.]

You can easily discover the features the conda command supports by typing conda -h and pressing Enter. You see a list of command line switches similar to the ones shown here:

[Figure: A Listing of Conda Switches. Use the conda command line switches to perform various tasks.]

As you can see, there are quite a few tasks you can perform. To determine whether you have a package installed, use the conda search <package name> command. For example, to determine whether you have Pandas installed, type conda search pandas and press Enter. Assuming that Pandas is installed, you see a list of Pandas versions like this:

[Figure: A Listing of Pandas Information. Use the search switch to locate a particular package installation.]

The information you get from conda is far more in-depth than what pip provides. To determine what you have installed, just go down the list and check whether you have the version of Pandas that you need. If you don’t, type conda update pandas and press Enter (notice the case used). On the other hand, let’s say you want to install BeautifulSoup. The first time through, try typing conda install BeautifulSoup and pressing Enter. You see an error message that tells you what to type, like this:

[Figure: Using Error Information. The conda command provides you with helpful error information.]

Because you want to install the latest BeautifulSoup, type conda install beautiful-soup and press Enter. After searching for the required update information, conda asks whether you want to proceed. Type y and press Enter. You’ll see a whole bunch of activity take place, but eventually you have a new version of BeautifulSoup, plus all the supporting functionality, installed in the correct locations. Here’s how things looked on my system:

[Figure: Viewing the Result of an Installation. Conda provides detailed information about the installation process.]

At this point, you have BeautifulSoup installed. Installing other packages follows the same path. Using conda does require a little more expertise than using pip, but you also gain additional flexibility and garner more information. When everything goes well, either tool does an equally good job of getting the installation or update task done, but conda excels in helping you past troublesome installations. Let me know your thoughts about using conda to install the packages required by my books at John@JohnMuellerBooks.com.
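If you would rather drive these commands from a script, something like the following might serve as a starting point. It is a rough sketch rather than a recommended tool: conda_args, run_conda, and can_import are names invented for this illustration, and the sketch assumes that conda can be found on your PATH, which is normally the case inside an Anaconda Prompt. It also assumes that BeautifulSoup 4 is imported under the module name bs4, a detail that comes from the package itself rather than from this post.

```python
from __future__ import print_function

import importlib
import subprocess

# Only the three conda actions discussed in this post are supported.
_ACTIONS = ("search", "update", "install")


def conda_args(action, package):
    """Build the argument list for a conda package command."""
    if action not in _ACTIONS:
        raise ValueError("unsupported conda action: " + action)
    return ["conda", action, package]


def run_conda(action, package):
    """Run conda and return its exit code (0 means success).

    Assumption: conda is on the PATH. On some Windows setups you may
    need to pass shell=True for the command to be found.
    """
    return subprocess.call(conda_args(action, package))


def can_import(module_name):
    """Return True when Python can import the named module."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Show the command without running it, then check the import.
    print(" ".join(conda_args("install", "beautiful-soup")))
    print("bs4 importable:", can_import("bs4"))
```

Note that the name you give conda (beautiful-soup) and the name you import (bs4) can differ, which is why the sketch keeps the two checks separate.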

 

Using Jupyter with Anaconda (Updated)

A few readers have recently written to me regarding the use of Jupyter with the downloadable source for Python for Data Science for Dummies. The version of Anaconda recommended for the book, 2.1.0, doesn’t rely on Jupyter, which is why the book doesn’t mention Jupyter. The book relies on IPython Notebook, which is what you should use to obtain the best reading experience. You can obtain the proper version from the Continuum archive. However, if you choose to download the current version of Anaconda, then using Jupyter becomes a possibility, although many of the procedures found in the book will require tweaking and the screenshots won’t match precisely.

In order to use Jupyter, you must still import the downloaded files into your repository. The source code comes in an archive file that you extract to a location on your hard drive. The archive contains a list of .ipynb (IPython Notebook) files containing the source code for this book (see the Introduction for details on downloading the source code). The following steps tell how to import these files into your repository:

  1. Click Upload at the top of the page. What you see depends on your browser. In most cases, you see some type of File Upload dialog box that provides access to the files on your hard drive.
  2. Navigate to the directory containing the files you want to import into Notebook.
  3. Highlight one or more files to import and click the Open (or other, similar) button to begin the upload process. You see the file added to an upload list, as shown here. The file isn’t part of the repository yet—you’ve simply selected it for upload.

    [Figure: Upload Source Files to the Repository. Click Upload when you want to upload files to the repository.]
  4. Click Upload. Notebook places the file in the repository so that you can begin using it.
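Before you start uploading, it can help to see which notebook files the extracted archive actually contains. The sketch below is an optional aid and not part of the book's procedure. The helper names find_notebooks and is_valid_notebook are hypothetical, and the check relies on the fact that .ipynb files are stored as JSON text.

```python
from __future__ import print_function

import json
import os


def find_notebooks(folder):
    """Return the sorted paths of the .ipynb files found under folder."""
    found = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.lower().endswith(".ipynb"):
                found.append(os.path.join(root, name))
    return sorted(found)


def is_valid_notebook(path):
    """Return True when the file parses as JSON, as notebook files should."""
    try:
        with open(path, "rb") as handle:
            json.loads(handle.read().decode("utf-8"))
    except (IOError, OSError, ValueError):
        return False
    return True


if __name__ == "__main__":
    # "." stands in for the folder where you extracted the archive.
    for path in find_notebooks("."):
        status = "ok" if is_valid_notebook(path) else "damaged?"
        print(status, path)
```

A file flagged as damaged may simply have been cut short during download, so extracting the archive again is a reasonable first step.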

It’s important to both Luca and me that you have the best possible learning experience with our book. For most people, this means using the right version of Anaconda. Using the latest version shouldn’t cause problems, but we’d like to know if it does. Please feel free to contact me at John@JohnMuellerBooks.com with your book-specific questions.


Update

It has come to our attention since this post was first published that using the latest version of Anaconda with Python for Data Science for Dummies is problematic. Some of the examples won’t work without rewriting because the Pandas Categorical class has changed. This is the only change we’ve confirmed so far, but there are no doubt other changes. In order to get the proper results from the examples in the book, you must use the correct version of Anaconda, version 2.1.0.

Please do keep those questions coming. It’s because a reader took time to write that Luca and I became aware of this problem. We truly do want you to have a great learning experience, so these questions are important!

 

Warnings in Python and Anaconda

It seems as if Python developers are having more than a few problems at the moment from a number of sources. I recently wrote about the potential issues for readers of Beginning Programming with Python For Dummies and Python for Data Science for Dummies from Windows 10 (Python and Windows 10). However, some readers have come back afterward to say they’re still seeing warnings. It wasn’t until one of the beta readers for Machine Learning for Dummies also saw some of these warnings that it became apparent that some other problem is at work. A recent upgrade to NumPy 1.10.1 has created these warnings. You can see some message threads about the issue at:

The important thing to remember is that you’ll see warnings, not errors (unless there is a problem Luca, my coauthor for Python for Data Science for Dummies, and I haven’t seen yet). For now, updating all of the Anaconda components is the only way to actually get rid of the warnings, which can prove to be quite a pain. However, the warnings are just that, warnings. The code in the books will still run just fine. The best way to avoid a lot of work and potentially creating yet more problems is to ignore the warnings for now. In order to ignore the warnings, type the following two lines of code:

import warnings
warnings.simplefilter("ignore")

Obviously, the situation is inconvenient for everyone, but the various libraries will get in sync sometime soon and then the warnings will disappear until the next set of updates. Please let me know if you continue to see problems after making this fix at John@JohnMuellerBooks.com.
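If you would rather not silence warnings for an entire session, Python's standard warnings module also lets you limit the filter to one block of code. The following sketch shows the idea. The noisy_sum function is a made-up stand-in for whatever library call produces the warning on your system.

```python
from __future__ import print_function

import warnings


def noisy_sum(values):
    """Stand-in for a library call that emits a warning."""
    warnings.warn("this call is deprecated", DeprecationWarning)
    return sum(values)


# Suppress warnings only inside this block. The filters that were in
# place beforehand are restored when the block ends.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    total = noisy_sum([1, 2, 3])

print(total)
```

The advantage of this approach is that warnings raised elsewhere in your code still appear, so you don't hide a message that actually matters.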

 

Download Site for Python

I recently received an e-mail from a reader who had a bad install with Python 3.3.4 on a laptop with 64-bit Windows 7 installed. No matter what the reader did, the installation wouldn’t work. The application would fail with an error stating that pythonw.exe was unable to start and it included an error of 0xc000007b. He had downloaded the code from https://www.python.org/download/releases/3.3.4/, which is the site mentioned on page 25 of Beginning Programming with Python For Dummies. However, downloading a copy from http://continuum.io/downloads#py34 or https://store.continuum.io/cshop/anaconda/ did provide a copy of Python 3.4.3 (not the version 3.3.4 that is used in the book) that does work on his system.

The problem with this solution is that installing a copy from this second site also installs Anaconda—a product that isn’t covered in the book. In order to work with the IDLE examples in the book, you must open a copy of IDLE in the Anaconda\Scripts folder of the Anaconda installation. You’ll likely find this folder in your personal folder of your system. If you do find that you can’t get the copy of the product from the Python download site to work on your system, try this second solution and please let me know about the issue at John@JohnMuellerBooks.com. I would strongly encourage you to try the setup found in the book, however, because using Anaconda will cause extra work for you and this book is truly meant to help someone who has little or no programming experience discover the joys of working with Python.

As a side note, I have tried the book’s source code with the latest Python release, 3.4.3 (the book was originally written to use version 3.3.4). All of the source code works on my test system, but I’d love to hear if it works on your system as well. You can obtain this updated version of Python at https://www.python.org/downloads/release/python-343/ or http://continuum.io/downloads#py34 (if you don’t mind installing Anaconda as well).

When using the 3.4.3 version of Python, your screenshots may vary somewhat from those found in the book. All version-specific information will change, so you need to take this change into account as you read. Please let me know if you experience any problems using this updated version on your system. In the meantime, happy reading!
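When you do write in about an installation problem, knowing exactly which Python you're running makes the issue much easier to track down. The following sketch gathers the basics. The function name describe_python is hypothetical, and reporting whether the build is 32-bit or 64-bit is my own suggestion: error 0xc000007b is commonly associated with mixing 32-bit and 64-bit binaries, but I haven't confirmed that as the cause in the case described above.

```python
from __future__ import print_function

import platform
import struct
import sys

BOOK_VERSION = (3, 3, 4)    # version the book was written with
TESTED_VERSION = (3, 4, 3)  # later version mentioned in this post


def describe_python():
    """Collect the details that are useful in a problem report."""
    return {
        "version": tuple(sys.version_info[:3]),
        # Size of a pointer in bits: 32 or 64 for this Python build.
        "bits": struct.calcsize("P") * 8,
        "system": platform.system(),
        "executable": sys.executable,
    }


if __name__ == "__main__":
    info = describe_python()
    print("Python version:", ".".join(str(n) for n in info["version"]))
    print("Build:", info["bits"], "bit on", info["system"])
    print("Executable:", info["executable"])
    if info["version"] not in (BOOK_VERSION, TESTED_VERSION):
        print("Note: this version differs from the ones discussed here.")
```

Pasting the output of this script into your e-mail saves a round of questions.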