The wheel of data science turns, and algorithms come and pass, leaving solutions that become legend. Solutions fade to data, and even data is long forgotten when the business processes that gave it birth come again. There are neither beginnings nor endings in data science. But this lesson chain was a beginning.*
It may have piqued your interest in what data science actually is or simply reawakened slumbering demons from the depths. Perhaps it taught you how to identify a data scientist or identify people who have aspects of data scientists. We spoke of what types of things machines can learn and how they accomplish this task. Now, this is the end.
But one end leads to new beginnings. The current beginning in which you find yourself is neither good nor bad; it simply is. Through our journey, you may have noticed a recurring theme: data science is about asking the right question and figuring out how to get at a solution, whether that approach be novel or not. Thus, as you begin your journey from here, the proper question is: do you know your place? What is your beginning? Do you know the question you are trying to ask? Do you know if you have the data to address that question or how to verify that the data is actionable? Do you know which algorithm to choose? Do you know how to evaluate the results? Are you driving, or are you asking the driver “where are you taking us?”
Before you can move forward, you have to know which of these questions are knowns and which are unknowns. Having identified that, you need to determine how you will answer the remaining questions. All of this must be done in an unbiased manner, which unfortunately is not where we as humans truly shine our brightest. We long for the ability to be unbiased and logical, to know the path forward with a full accounting of the relevant data at hand. This saudade for the answers to our problems is a human emotion. We can fret over it and mourn its ethereal nature as it slips through our clutching fingers. Or we can stop worrying and embrace the data science. We follow it not blindly but with due diligence and attention to understanding what is being done and why things result as they do, whether expected or not. Truly, the whole point of a data science machine is lost if you keep it secret from those it should inform.
So, give it a try. Answer some questions, for yourself first and foremost, then decide how you can best advance your cause:
1. What is your biggest problem?
2. How are you currently addressing (or planning to address) that problem? (Tools being used, approaches taken, types of data being leveraged)
3. What is the rough size of the effort needed to address this problem? One person? A team? Weeks? Months?
4. What types of data are needed?
5. How feasible is this effort? Easy peasy / I got this / Interesting / Pie in the sky / Hold my beer, watch this!
6. What can I do myself? What can my team do? Where do I need help?
Once you can answer those questions or even some of those questions, you can start to see your path forward. You can start to plan and identify where you will need extra help. Good luck and enjoy the ride!
Kevin Croxall is Director of Data Science for Expeed Software. He is a data and research scientist with more than a decade of comprehensive experience in data science project design and implementation. He has a broad range of experience in software development geared toward pipeline development, statistical analysis, and data visualization and presentation.