Finding and Employing Data Science Tools

Python for Data Science for Dummies introduces you to a number of common libraries used for data science experimentation and discovery. Most of these libraries also figure prominently as part of a data scientist’s toolbox because they provide common functionality needed for every application. However, these libraries are only a small part of the data science toolbox. Because data science is such a new technology, you can find all sorts of tools to perform a wide range of tasks, but there is little standardization, and some of these tools are hard to categorize, which makes it difficult to know where they fit within your toolbox. That’s why I was excited to see The data science ecosystem, the first of a three-part series of articles that describe some of the tools available for use in data science projects. You can find the other two parts of the article at:

The problem for people who want to explore data science and machine learning today might not be the lack of tools, but the lack of creativity in using them. In order to explore data science, it’s important to understand that the tools only work when you prepare the data properly, employ the correct algorithm, and define reasonable goals. No matter how hard you try, data science and machine learning can’t provide you with the correct numeric sequences for the next five lottery wins. However, data science can help you locate potential sources of fraud in an organization. The article, Machine learning and the strategic snake oil reserve, sums up what may be the biggest problem with data science today: people expect miracles without putting in the required work. Fortunately, there are new tools on the horizon to make languages, such as Python, and products, such as Hadoop, easier for even the less creative mind to use (see Python and Hadoop project puts data scientists first).

Even with a great imagination, the tools available today may not do the job you want as well as they should because the underlying hardware isn’t capable of performing the required tasks. The process is further hampered by a misuse of the skills that data scientists provide (see You’re hiring the wrong data scientists for details). As a result, you need a large number of specialized tools in order to perform tasks that shouldn’t require them. That’s precisely why you need to know which tools are available: they let you produce useful results on today’s hardware with a minimum of fuss. Asking the question, “How would Alan Turing fix A.I.?” helps you understand the complexities of the data science and machine learning environments.

Data science, machine learning, data scientists with even greater skills, and better hardware will keep the momentum going well into the future. As the Internet of Things (IoT) continues to move forward and the problem of what to do with all that data becomes even larger, data science will take on a larger role in everyone’s daily life. Count on reading more articles like Google a step closer to developing machines with human-like intelligence, which describe the proliferation of new hardware and new tools to make the full potential of data science and machine learning a reality. In the meantime, the best approach is to get the tools you need and explore the ways in which you can creatively use data science to solve problems. Let me know your thoughts on the future of data science at John@JohnMuellerBooks.com.

 

Selecting a Programming Language Version

Because I have worked with so many programming languages and reported on them in my blog, I get a lot of e-mails from people who wish to know which language they should use. It’s a hard question because I don’t really have inside information about the project, their skills, their organization, or the resources at their disposal. Usually I provide some helpful guidelines and hope that the sender has enough information to make a good selection. Of course, I’ve also discussed the benefits of various programming languages in this blog and direct people here as well. The next question people ask is which version of the language to use.

Choosing the right programming language version is important because a mistake could actually cause a project to fail. I was asked the question often enough that I decided to write an article recently entitled, How to Choose the Right Programming Language Version for Your Needs. This article helps you understand the various issues surrounding programming language version selection. As with choosing a programming language, I can’t actually tell you which version to choose and for the same reasons I can’t select a language for you. At issue are things like your own personal preferences. In many cases, the language version you choose depends as much on how you feel about a specific version as what the version has to offer you as a developer.

An interesting outcome of programming language selection requirements is that I have one book, Beginning Programming with Python For Dummies, that uses Python 3.3 and another book, Python for Data Science for Dummies, that uses Python 2.7. Of course, I’ve had books that cover two different versions of a language before, so there is nothing too odd about the version differences until you consider the fact that Python for Data Science for Dummies is the newer of the two books. The reasons for my selections appear in Where is Python 3?. The point is that even book authors need to make version choices at times and they’re almost never easy.
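When you work through examples written for different Python versions, such as 2.7 and 3.3, it helps to confirm which interpreter you’re actually running before you start. The following is only a minimal sketch of one way to do that using the standard sys module. The function names describe_python_version and meets_minimum are hypothetical names chosen for illustration; they don’t come from either book.

```python
import sys


def describe_python_version(version_info=sys.version_info):
    """Return the major.minor version as a string, such as '2.7'."""
    major, minor = version_info[0], version_info[1]
    return "{0}.{1}".format(major, minor)


def meets_minimum(required, version_info=sys.version_info):
    """Return True when the version is at least the required (major, minor)."""
    return tuple(version_info[:2]) >= tuple(required)


if __name__ == "__main__":
    # Report the interpreter in use (hypothetical helper defined above).
    print("Running Python " + describe_python_version())

    # Assumption for illustration: the example code targets Python 3.3 or later.
    if not meets_minimum((3, 3)):
        print("This example assumes Python 3.3 or later.")
```

The sketch sticks to syntax that works under both Python 2.7 and Python 3.x, so you can run it with either interpreter to see which one you have.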

Precisely how do you choose language versions in your organization? Do these criteria differ from the techniques you use for your own choices (if so, how)? Let me know your thoughts on selecting a programming language version at John@JohnMuellerBooks.com.

 

Getting the Right Visual Studio Add-In

Give me the right tool and I can use it to create just about anything in code! It’s a bold statement and I’m sure that I’d have little trouble finding a lot of other developers who believe precisely the same thing. Unfortunately, finding the right tool is akin to finding a needle in a haystack (feel free to substitute any other worn cliche you can think of). How does someone quantify the usability of a tool or decide whether to even download it in the first place? I recently wrote an article for Software Quality Connection entitled, “Techniques for Finding Useful IDE Add-ins” that answers that question. The article proposes some techniques you can use to make the process of finding just the right add-in easier.

Of course, my great add-in might be a piece of junk for you. Our needs, goals, programming skills, and programming tasks differ. So, what I’d like to know is how you look for add-ins and how you use them to create better applications. It’s nice to get another perspective on a given topic. Over the years I’ve talked with hundreds (perhaps thousands) of readers about the tools they use because the right tool in the right hands can do amazing things. Most of my books contain a healthy number of links to various tools and I often employ add-ins in my books to make tasks easier. Let me know about your favorite add-in and tell me why you think it’s so great at John@JohnMuellerBooks.com. (Please, no vendor e-mails. I already know your tool is great; I really want to hear from people in the trenches on this topic.)

Part of my reason for asking for this sort of information is to improve the quality of my books and articles. Quality is extremely important to me. In fact, it’s the reason I created the beta reader program. A beta reader reviews my manuscript as I write it. You can find out more about the beta reader program in my Errors in Writing post. If you want to become a beta reader, just write me at John@JohnMuellerBooks.com for additional details. In the meantime, try out some new add-ins and have a bit of fun.