Machine Learning for Dummies, 2nd Edition, MovieLens Dataset

Updated March 15, 2023 to clarify the usage instructions.

The movies.dat file found in the Trudging through the MovieLens dataset section of Chapter 19 of Machine Learning for Dummies, 2nd Edition has been updated on the source site, so it no longer works with the downloadable source. Luca and I want to be sure you have a great learning experience. Fortunately, we do have a copy of the version of movies.dat found in the book. You can download the entire MovieLens dataset here:

To obtain your copy of the MovieLens dataset for your local Python setup, please follow these steps:

  1. Click the link or the Download button. The ml-1m.zip file will appear on your hard drive.
  2. Remove the files from the archive. You should see four files in a folder named ml-1m: movies.dat, ratings.dat, README, and users.dat.
  3. Place the files in the downloadable source directory for this book on your system.
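If you would rather script steps 2 and 3 than do them by hand, something like the following might work. It is only a sketch: it assumes you have already downloaded ml-1m.zip manually, the function name extract_movielens is made up for this illustration, and the destination folder is whatever directory holds your copy of the book's source.

```python
import zipfile
from pathlib import Path

# The four files the archive is expected to contain, per the steps above.
EXPECTED = {"movies.dat", "ratings.dat", "README", "users.dat"}


def extract_movielens(zip_path, dest_dir):
    """Extract the archive into dest_dir and return any expected files
    that are missing from the resulting ml-1m folder (hypothetical helper)."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    folder = dest / "ml-1m"
    if not folder.is_dir():
        # Nothing was extracted where we expected it.
        return sorted(EXPECTED)
    found = {p.name for p in folder.iterdir() if p.is_file()}
    return sorted(EXPECTED - found)


if __name__ == "__main__":
    archive_path = Path("ml-1m.zip")  # assumed location of the manual download
    if archive_path.exists():
        missing = extract_movielens(archive_path, ".")
        print("Missing files:", missing or "none")
    else:
        print("Download ml-1m.zip manually first.")
```

An empty result means all four files are in place. The sketch deliberately does not download anything.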

Note that you may not be able to use automatic downloads with my site, which is a security measure on my part. Downloading the archive manually also ensures that you get all of the MovieLens dataset files, including the README, which contains licensing, citation, and other information. This solution may not work well if you’re using an online IDE, and I apologize in advance for the inconvenience. Please let me know if you have any other problems with this example at [email protected].

Locating the Machine Learning for Dummies, 2nd Edition Source Code

A reader recently wrote to say that the source code for Machine Learning for Dummies, 2nd Edition on GitHub is incomplete. GitHub wasn’t originally one of the download sources for the book’s code; we had used it as an intermediate code location, so it wasn’t complete. The GitHub site code is complete now, but we’d still prefer that you download the code from one of the two sites listed in the book: my website at http://www.johnmuellerbooks.com/source-code/ (just click the book’s link) or the Wiley site at https://www.wiley.com/en-us/Machine+Learning+For+Dummies%2C+2nd+Edition-p-9781119724056 (just click the Downloads link, then the download link next to the source code you want). The two preferred sites offer the source code in either Python or R form in case you don’t want to download both.

When you get the downloadable source, make sure you remove it from the archive as described in the UnZIPping the Downloadable Source post. Using the downloadable source helps you avoid some of the issues described in the Verifying Your Hand Typed Code post. Please let me know whenever you encounter problems with the downloadable source for a book at [email protected].

C++ All-in-One for Dummies Errata on Page 188

There is a mistake on page 188 of C++ All-in-One for Dummies, 4th Edition that is based on a supposed April Fool’s prank that was actually initiated on March 26, 2018 (see https://www.modernescpp.com/index.php/no-new-new) and spread throughout the Internet to sites such as: https://www.fluentcpp.com/2018/04/01/cpp-will-no-longer-have-pointers/.  The problem with pranks, especially pranks that linger because the people who perpetuate them haven’t removed them, is that other people tend to believe them, as in this post: https://stackoverflow.com/questions/59820879/are-new-and-delete-getting-deprecated-in-c#. Later, much later, as in the note on the Fluent C++ site, people admit that it was a joke, but still leave the errant material in place.

After I discovered that this information was a joke, I meant to remove two sentences from the book, but somehow they stayed intact. The two sentences in question appear in the “Understanding the Changes in Pointers for C++ 20” section:

Readers who already know something about pointers need to be aware of the changes in pointers for C++ 20, which is why it appears first. The essential thing to remember as you move to C++ 20 (where new is deprecated) and then to C++ 23 (where new is removed) is that pointers are going to change.

If you find any other references in the book that state that new is deprecated or removed, they too will be modified or eliminated during the next printing. I apologize for any problems that the error has caused, especially to readers who are new to C++, and I have submitted an erratum to the publisher so that the error is fixed during the next printing. If you have any questions at all about the book, please contact me at [email protected].

Completed! Book Drawing for C++ All-in-One for Dummies, 4th Edition

Five people now have a copy of C++ All-in-One for Dummies, 4th Edition coming their way. Please wait four to six weeks for delivery and let me know when you receive your book. These people are:

  • Eva Beattie
  • Thomas McQuillan
  • Michael Flores
  • Syam Poolla
  • Tom Taylor

I hope that each of you enjoys the book and will provide a review of it on Amazon. Thank you for your support; it’s really important to me. Your reviews will help other readers as well. If you have any questions at all about the book, please contact me at [email protected].

Book Drawing for C++ All-in-One for Dummies, 4th Edition

I’ve just released a new book, C++ All-in-One for Dummies, 4th Edition, and I’d love to give five people in the US a chance to read it for free (I can’t accept requests from other countries due to the amount of postage required to send a book to you). There’s only one catch. In exchange for the free book, I’d appreciate your review of it on Amazon.com. Your reviews are important because they give other people some idea of what the book is like outside of my opinion of it.

This new edition contains an amazing number of changes from the 3rd Edition, many of which you requested. Of course, I started by updating everything, so you see the latest version of Code::Blocks used in this book. Working with Code::Blocks makes C++ coding a lot easier, yet it tends not to hide the details or add any odd background code like some IDEs do. In addition to the updates, you can expect to see these changes:

  • Instructions on how to use your mobile device to write C++ code.
  • Updates on how to work with for loops.
  • Using functional programming techniques.
  • Employing new operators, such as the spaceship operator.
  • Understanding modifications to the Standard Library.

This new edition of the book comes in at a whopping 912 pages, so there is no expectation that you’ll read it cover-to-cover. What I would appreciate is your honest viewpoint on the topics that appeal to you most. If you’d like to participate in this drawing, please email me at [email protected] by 8 March 2021 with a subject of “C++ Book Drawing”. I need your name and address. I’ll post the winners of the contest (sans email addresses) in a future blog post.

Python for Data Science for Dummies Errata on Page 221

The downloadable source for Python for Data Science for Dummies contains a problem that doesn’t actually appear in the book. If you look at page 221, the code block in the middle of the page contains a line saying import numpy as np. This line is essential because the code won’t run without it. The downloadable source for Chapter 12 is missing this line, so the example doesn’t run. This P4DS4D; 12; Stretching Pythons Capabilities link provides you with a .ZIP file that contains the replacement source code. Simply remove the P4DS4D; 12; Stretching Pythons Capabilities.ipynb file from the archive and use it in place of your existing file.

Luca and I always want you to have a great experience with our book, so keep those emails coming. Please let me know if you have any questions about this source code file update at [email protected]. I’m sorry about any errors that appear in the downloadable source and appreciate the readers who have pointed them out.

Python for Data Science for Dummies Errata on Page 145

Python for Data Science for Dummies contains two errors on page 145. The first error appears in the second paragraph on that page. You can safely disregard the sentence that reads, “The use_idf controls the use of inverse-document-frequency reweighting, which is turned off in this case.” The code doesn’t contain a reference to the use_idf parameter. However, you can read about it on the Scikit-Learn site. This parameter defaults to being turned on, which is how it’s used for the example.

The second error is also in the second paragraph. The discussion references the tf_transformer.transform() method call. The actual method call is tfidf.transform(), which does appear in the sample code. The discussion about how the method works is correct; only the name of the object is wrong.
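For readers wondering what inverse-document-frequency reweighting actually does when it is turned on, here is a rough, library-free sketch. It is not the book's code. The function name tfidf_rows and the tiny count matrix are invented for illustration, and the smoothed idf formula is the one I understand the Scikit-Learn documentation to describe for its default settings, so treat it as an assumption and check the Scikit-Learn site for the authoritative definition.

```python
import math


def tfidf_rows(counts, use_idf=True):
    """Reweight rows of term counts (one row per document).

    Hypothetical helper for illustration only.
    """
    n_docs = len(counts)
    n_terms = len(counts[0])
    if use_idf:
        # Document frequency: how many documents contain each term.
        df = [sum(1 for row in counts if row[j] > 0) for j in range(n_terms)]
        # Smoothed idf (assumed to match the documented default formula).
        idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    else:
        # With reweighting off, every term keeps a weight of 1.
        idf = [1.0] * n_terms
    result = []
    for row in counts:
        weighted = [c * w for c, w in zip(row, idf)]
        # Scale each document vector to unit Euclidean length.
        norm = math.sqrt(sum(v * v for v in weighted)) or 1.0
        result.append([v / norm for v in weighted])
    return result


# Term 0 appears in both documents; terms 1 and 2 appear in one each.
counts = [[2, 1, 0],
          [1, 0, 1]]
with_idf = tfidf_rows(counts, use_idf=True)
without_idf = tfidf_rows(counts, use_idf=False)
print(with_idf[0])
print(without_idf[0])
```

With reweighting on, the rarer term 1 gains weight relative to the common term 0 in the first document. That is the effect the use_idf parameter controls.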

Please let me know if you have any questions about either of these changes at [email protected]. I’m sorry about any errors that appear in the book and appreciate the readers who have pointed them out.

Python for Data Science for Dummies Errata on Page 124

Python for Data Science for Dummies contains an error in the example that appears on the top half of page 124. In the first of the two grey boxes, the code computes the results of four print statements. The bottom-most print statement, print x[1:2, 1:2], is supposed to compute a result based on rows 1 and 2 of columns 1 and 2, and the bottom grey box seems to confirm that interpretation by showing the result as [[[14 15 16] [17 18 19]] [[24 25 26] [27 28 29]]]. However, the answer provided for this example in the downloadable source code is [[[14 15 16]]], which doesn’t agree with that in the text.

The good news is that the downloadable source contains the correct code. The error appears only in the book. The last print statement in the book is wrong: because the stop index of a slice is exclusive, x[1:2, 1:2] selects only index 1 on each axis. Here is the correct code (with output) for this example:

x = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9],],
 [[11,12,13], [14,15,16], [17,18,19],],
 [[21,22,23], [24,25,26], [27,28,29]]])

print x[1,1]
print x[:,1,1]
print x[1,:,1]
print
print x[1:3, 1:3]

[14 15 16]
[ 5 15 25]
[12 15 18]

[[[14 15 16]
  [17 18 19]]

 [[24 25 26]
  [27 28 29]]]
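To see the difference between the two slices side by side, you could try something like the following. It is a small sketch written with Python 3 style print() calls (the listing above uses the print statement form), and the variable names narrow and wide are mine, not the book's.

```python
import numpy as np

x = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
              [[11, 12, 13], [14, 15, 16], [17, 18, 19]],
              [[21, 22, 23], [24, 25, 26], [27, 28, 29]]])

# The stop index of a slice is exclusive, so 1:2 covers index 1 only.
narrow = x[1:2, 1:2]

# 1:3 covers indexes 1 and 2 on each of the first two axes.
wide = x[1:3, 1:3]

print(narrow.shape)     # (1, 1, 3)
print(wide.shape)       # (2, 2, 3)
print(narrow.tolist())  # [[[14, 15, 16]]]
print(wide.tolist())
```

The narrow slice reproduces the [[[14 15 16]]] result mentioned earlier, while the wide slice matches the output printed in the book.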

Please let me know if you have any questions about this example at [email protected]. I’m sorry about the error that appears in the book and appreciate the readers who have pointed it out.

Missing XMLData2.xml File

A number of readers have written to report that XMLData2.xml is missing from the downloadable source for Python for Data Science for Dummies. You encounter this file in Chapter 6, on page 108. The publisher has already added the file to the downloadable source, but you might be missing the file from your copy. If so, you can download it by clicking XMLData2.zip. I’m truly sorry about any problems that the missing file might have caused. Please be sure to let me know about your book-specific questions at [email protected].

Missing File from Python for Data Science for Dummies Downloadable Source

A reader recently contacted me regarding a missing file from the downloadable source for Python for Data Science for Dummies. The file is P4DS4D; 01; Quick Overview.ipynb, which you need for the first chapter. Simply click here to download P4DS4D; 01; Quick Overview.ipynb. I’m also asking the publisher to add the missing file to the downloadable source found on the Dummies site at http://www.dummies.com/store/product/Python-for-Data-Science-For-Dummies.productCd-1118844181,descCd-DOWNLOAD.html. If you encounter any other problems with the book, please be sure to let me know at [email protected]. Thank you for your patience!