In Lesson 1, we learned about data science and what those words mean. As the poet said, “Yea, when a goal is vague, there is no way to succeed; but when I know my goal, that is indeed half the battle.” Thus, while we are a good way down the path, we still have a significant way to go. Unless you can complete the second half of this battle on your own, you need a data scientist. That is the subject of today’s correspondence lesson. Who are the people that work these miracles and how do we recognize them?
There are straightforward numbers that come from studies and polls. For example, a 2018 Burtch Works study found that 85% of data scientists are male, 91% hold an advanced degree (Master’s or Ph.D.), 66% live on the West Coast or in the Northeast, and the median level of experience is six years. While this may be a statistically accurate picture of a data scientist, I would suggest it is more of an advanced workforce population sample. For instance, if this is your guide, you could hire me, or you could hire my brother – who could teach you all about Hemingway with his advanced degree in English. You would also bias yourself against the numerous fantastic data scientists who don’t fit this profile. Data science isn’t about the front-end of a person, the physical features that other people see; data science is about a person’s back-end, the inner workings that make them tick.
Step One: Find someone who understands the tools of the trade... or understands how to learn to use them. Data science is the ability to master new domains and techniques through a critical process, which enables its practitioners to solve problems and to understand how things work. Data scientists use tools such as analytics, statistics, math, plots, and so forth. These are all means to an end and, like most tools, they come with manuals. Though the instructions may look like gibberish to some, they exist. Dictionaries exist to define the terms and help a user grasp the meanings; they are available to all. There is a marvelous democratization of knowledge. In particular, in a business setting, the derivation of new laws of mathematics or the creation of novel statistical tests is rarely necessary. While not everyone may enjoy math, or find it intuitive, the mechanics can be learned and employed by a significant number of people. In other words, given an investment of time, analytics are an open book.
Step Two: Find someone who can think critically about a situation. Unlike the tools, insight and intuition are incredibly difficult to teach. At some level, they rely upon the innate curiosity and thought patterns of an individual. Hiring managers and proto-data scientists alike are oft stymied here. Why? The underlying question they want to answer is essentially: How can I know if this person is able to solve complex problems in a convincing manner and is then able to implement that solution such that others can benefit? I think back to my undergraduate years when taking freshman physics. This class was a prerequisite for many majors. While the material clicked in my mind, I was drafted to be part of a large study group where, despite my best efforts, I never successfully helped some people understand the problems we were working on. These were not unintelligent people; they went on to become successful doctors, chemists, advertisers, and members of a myriad of other professions. They simply did not think in the same manner. They were unable to lay hold of the elusive insights that unraveled the twisted knot of the problems.
All people use data to some extent. Not all people have the wherewithal to rapidly place raw data in a heretofore uncontemplated context and apply it to solve a problem that, to this point, did not have an answer. Insight with the ability to use data to answer questions is a rare talent. That is what a data scientist does and why they are so valuable. Note that this is a very general statement. It does not qualify the type of data. It does not limit the field of inquiry. It does not guide the sort of statistical tests that should be known. As a result, data scientists come in a variety of guises.
One data scientist may be more attuned to the nuances of statistical tests, whereas another may know the intricacies of neural networks. Some data scientists may be better coders and others might be great communicators. The common factor that unites them in a single category of humanity is the tenacious curiosity that leads them to find the insights that solve the problem before them. Wherever an individual data scientist’s strength may lie, they are marked by an ability to quickly learn and grow – but learning, growth, and insight alone do not themselves define a job field.
Now that you know what data science is and how to locate a data scientist, check out Lesson 3: What is Machine Learning? where we shall explore one of the tools data scientists use to accomplish their tasks: machine learning.
Kevin Croxall is Director of Data Science for Expeed Software. He is a data and research scientist with more than a decade of comprehensive experience in data science project design and implementation. He has a broad range of experience in software development geared toward pipeline development, statistical analysis, and data visualization and presentation.