Finding and Employing Data Science Tools

Python for Data Science for Dummies introduces you to a number of common libraries used for data science experimentation and discovery. Most of these libraries also figure prominently in a data scientist’s toolbox because they provide functionality that nearly every application needs. The book is a great idea for those who are interested in expanding their knowledge of data science and how it can be applied to the field of Artificial Intelligence (AI). You can learn more about some of the basic principles, such as developing, applying, and leveraging data science projects. However, these libraries are only the tip of the data science toolbox. Because data science is such a new technology, you can find all sorts of tools to perform a wide range of tasks, but there is little standardization, and some of these tools are hard to categorize, which makes it hard to know where they fit within your toolbox. That’s why I was excited to see The data science ecosystem, the first of a three-part series of articles that describe some of the tools available for use in data science projects. If you are interested in finding out more about data science, you might want to check out this data science bootcamp for more information. You can also find the other two parts of the article at:

The problem for people who want to explore data science and machine learning today might not be the lack of tools, but the lack of creativity in using them. In order to explore data science, it’s important to understand that the tools only work when you prepare the data properly, employ the correct algorithm, and define reasonable goals. No matter how hard you try, data science and machine learning can’t provide you with the correct numeric sequences for the next five lottery wins. However, data science can help you locate potential sources of fraud in an organization. The article, Machine learning and the strategic snake oil reserve, sums up what may be the biggest problem with data science today: people expect miracles without putting in the required work. Fortunately, there are new tools on the horizon to make languages, such as Python, and products, such as Hadoop, easier for even the less creative mind to use (see Python and Hadoop project puts data scientists first).
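To make the fraud example a bit more concrete, here is a minimal sketch of one way you might flag unusual transaction amounts for a human to review. It uses the modified z-score, a common rule of thumb based on the median, with the conventional 3.5 cutoff. The function name, the cutoff default, and the sample amounts are all hypothetical and chosen only for illustration. A flagged value is a candidate for a closer look, not proof of fraud.

```python
from statistics import median


def flag_unusual_amounts(amounts, threshold=3.5):
    """Return the amounts whose modified z-score exceeds the threshold.

    Hypothetical helper for illustration only; real fraud detection
    needs far more context than a single column of numbers.
    """
    center = median(amounts)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(amount - center) for amount in amounts)
    if mad == 0:
        return []  # no spread to measure against, so nothing can be flagged
    return [
        amount
        for amount in amounts
        if 0.6745 * abs(amount - center) / mad > threshold
    ]


# Made-up expense amounts with one value that is clearly out of line.
sample_amounts = [120, 135, 128, 131, 5000, 126, 133]
print(flag_unusual_amounts(sample_amounts))
```

Notice that the sketch follows the three requirements mentioned above in miniature: the data is a clean list of numbers, the algorithm suits the question, and the goal (find values worth a second look) is a reasonable one.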

Even with a great imagination, the tools available today may not do the job you want as well as they should because the underlying hardware isn’t capable of performing the required tasks. The process is further hampered by a misuse of the skills that data scientists provide (see You’re hiring the wrong data scientists for details). As a result, you need a large number of specialized tools in order to perform tasks that shouldn’t require them. However, that’s the reason why you need to know about the availability of these tools so that you can produce useful results on today’s hardware with a minimum of fuss. Asking the question, “How would Alan Turing fix A.I.?” helps you understand the complexities of the data science and machine learning environments.

Data science, machine learning, data scientists with even greater skills, and better hardware will keep the momentum going well into the future. As the Internet of Things (IoT) continues to move forward and the problem of what to do with all that data becomes even larger, data science will take on a larger role in everyone’s daily life. Count on reading more articles like, Google a step closer to developing machines with human-like intelligence, that describe the proliferation of new hardware and new tools to make the full potential of data science and machine learning a reality. In the meantime, getting the tools you need and exploring the ways in which you can creatively use data science to solve problems is the best way to go for now. Let me know your thoughts on the future of data science at John@JohnMuellerBooks.com.

Missing Python for Data Science for Dummies Companion Files

For all those long-suffering readers who have been missing the companion files for Python for Data Science for Dummies, the files are finally available at http://www.dummies.com/store/product/Python-for-Data-Science-For-Dummies.productCd-1118844181,descCd-DOWNLOAD.html. All you need to do is click the Click to Download link on the page. I’m truly sorry you needed to wait so long. Thank you to everyone who noticed the missing files and also the incorrect link in the book, which now appears in the book errata. If you have any problems locating or downloading the files, please let me know at John@JohnMuellerBooks.com.

 

Getting Your Python for Data Science for Dummies Extras

The process of discovering how to use Python to perform data science tasks begins when you get your copy of Python for Data Science for Dummies. Luca and I spent a good deal of time making your data science learning experience easier and even fun. However, it only starts there. Like many of my other books, you can also find online content for Python for Data Science for Dummies in these forms:

I always want to hear your questions about my books. Be sure to write me about them at John@JohnMuellerBooks.com. In the meantime, I hope you enjoy your Python for Data Science for Dummies reading experience. Thank you for your continued support.


20 July 2015: Updated to show correct link for the companion files.

 

Defining the Need for Desktop Systems

I’ve been working on Build Your Own PC on a Budget for a while now and I’m nearing the end. A number of people have asked me precisely what market my book is for, especially now that smartphones and tablets are becoming the instruments of choice for consumer computing. In fact, someone recently sent me a ComputerWorld article entitled Is your business ready for ‘stick’ PCs? It’s important to understand that I really haven’t been living in a cave somewhere chanting a desktop PC mantra. The fact is that Build Your Own PC on a Budget is designed with the enthusiast in mind. This is the same person who would build a hot rod from scratch, even though they could probably get a nicer, more reliable, more fuel-efficient car right off the lot.

The fact is that there are times when you want the flexibility that a desktop system can provide. If you want a system whose sole purpose is to check e-mail, do a little word processing, and possibly update your Facebook page, then you really don’t want a desktop system for the most part. The exception might be if you need a really large screen to see what you’re doing, although many people now simply plug their computers into the TV to get the larger screen they need. For many people, a notebook, tablet, or smartphone really is all they need. When these stick PCs become popular, you can bet that a large number of people will use them for all their computing needs without any problem at all.

My book is designed around the needs of someone who needs a lot more than a simple computer. Of course, the gamer is the first person that comes to mind. When you read magazines like PC Gamer, you quickly find out that power says it all. These folks are constantly tweaking their systems to get out a little more power. Overclocking is something that these people talk about as casually as what they had for dinner last night.

However, I recently finished a book on data science and must admit that a tablet would never do the job. My desktop has power to spare and even it slowed down on some calculations (as in, I had time to get a cup of coffee while waiting for the processing to complete). A laptop would have a really hard time keeping up with even the minimal needs of the data scientist. In fact, many professional scientists and engineers really do need a super-reliable, high-power system. They can’t afford downtime and they really don’t want to wait days for the results of a calculation. So, this is the second group for my book. They really aren’t looking for a stick PC.

The third group is experimenters. People who are interested in playing just to see what’s possible will love my book because I have all kinds of ideas in it for doing something interesting. Experimenters are those people who somehow manage to have these flashes of insight that result in major innovations. Many of the luxuries you enjoy now were the result of a mistake made by an experimenter. The mistake was turned into a profitable product only after someone looked at it from another angle.

A custom PC is also beneficial for specialized needs such as industrial automation or even for alarm systems. Special use PCs often require more ports than are available on something like a notebook, tablet, or smartphone. Just imagine trying to put enough cameras into the single USB port supplied with many smaller systems. So, I see a number of people who create special use systems buying this book as well.

Is the day of the desktop system as a commodity coming to an end? Yes, I definitely see consumers moving toward laptops, tablets, smartphones, smart watches, and even sticks in the future. If you don’t need the power a desktop can provide, there really isn’t a good reason to pay the price. Let me know your thoughts on the future of the desktop system at John@JohnMuellerBooks.com.

 

Beta Readers Needed for Python for Data Science for Dummies

Data science is the act of extracting knowledge from data. This may seem like a foreign concept at first, but you use data science all the time in your daily life. When you see a pattern in a sequence of numbers, your mind has actually used data science to perform the task. What data science does is quantify what you do normally and make it possible to apply the knowledge to all sorts of different technologies. For example, robots use data science to discover objects in their surroundings.
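As a minimal sketch of what quantifying a pattern can look like, the following Python snippet checks whether a sequence of numbers grows by a constant step and, if it does, predicts the next value. The function name and the sample sequences are hypothetical and exist only for illustration; they don’t come from the book.

```python
def next_in_arithmetic_sequence(values):
    """Return the next value if the sequence has a constant step, else None."""
    if len(values) < 3:
        return None  # too few points to call it a pattern
    # Differences between each pair of neighboring values.
    steps = [later - earlier for earlier, later in zip(values, values[1:])]
    if all(step == steps[0] for step in steps):
        return values[-1] + steps[0]
    return None


print(next_in_arithmetic_sequence([2, 5, 8, 11]))  # constant step of 3
print(next_in_arithmetic_sequence([2, 5, 9, 11]))  # no constant step
```

Your mind does this sort of thing without any effort. Writing the rule down as code is what makes it possible to apply the same check to millions of sequences.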

Of course, data science is used for all sorts of applications. For example, data science is used with big data to perform tasks such as data mining or to predict trends based on various data sources. The fact that your browser predicts what you might buy based on previous purchases rests on data science. Even your doctor relies on data science to predict the outcome of a certain series of medications on an illness you might have.

Even though data science at first appears easy to categorize, it’s actually huge and quite difficult to pin down. It relies on the inputs of three disciplines: computer science, mathematics, and statistics. There are all sorts of sub-disciplines used as well. Because of the depth and breadth of knowledge required, a data scientist often works as part of a team to tease out the meanings behind the data provided to solve a problem.

Python for Data Science for Dummies provides you with a beginning view of data science through the computer science discipline using a specific language, Python. The capabilities of Python as a language make it a perfect choice for this book. While reading this book, you’ll see these topics explained:

  • Part I: Getting Started with Data Science & Python
    • Chapter 1: Discovering the Match between Data Science and Python
    • Chapter 2: Introducing Python Capabilities and Wonders
    • Chapter 3: Setting Up Python for Data Science
    • Chapter 4: Reviewing Basic Python
  • Part II: Getting Your Hands Dirty with Data
    • Chapter 5: Working with Real Data
    • Chapter 6: Getting Your Data in Shape
    • Chapter 7: Shaping Data
    • Chapter 8: Putting What You Know in Action
  • Part III: Visualizing the Invisible
    • Chapter 9: Getting a Crash Course in MatPlotLib
    • Chapter 10: Visualizing the Data
    • Chapter 11: Understanding Interactive Graphical and Computing Practice
  • Part IV: Wrangling Data
    • Chapter 12: Stretching Python’s Capabilities
    • Chapter 13: Exploring Data Analysis
    • Chapter 14: Reducing Dimensionality
    • Chapter 15: Clustering
    • Chapter 16: Detecting Outliers in Data
  • Part V: Learning from Data
    • Chapter 17: Exploring Four Simple and Effective Algorithms
    • Chapter 18: Performing Cross Validation, Selection and Optimization
    • Chapter 19: Increasing Complexity with Linear and Non-linear Tricks
    • Chapter 20: Understanding the Power of the Many
  • Part VI: Parts of Ten
    • Chapter 21: Ten Essential Data Resources
    • Chapter 22: Ten Data Challenges You Should Take

As you can see, this book is going to give you a good start in working with data science. Because of the subject matter, I really want to avoid making any errors in the book, which is where you come into play. I’m looking for beta readers who use math, statistics, or computer science as part of their profession and think they might be able to benefit from the techniques that data science provides. As a beta reader, you get to see the material as Luca and I write it. Your comments will help us improve the text and make it easier to use.

In consideration of your time and effort, your name will appear in the Acknowledgements (unless you specifically request that we not provide it). You also get to read the book free of charge. Being a beta reader is both fun and educational. If you have any interest in reviewing this book, please contact me at John@JohnMuellerBooks.com and I will fill in all the details for you.