One of the most common questions I get asked is, “What skills do I need to be a data scientist?” To help answer this question, I wanted to level set on what I believe are the core competencies of data science and break these dimensions down into a format that can be used as a checklist to evaluate yourself along your learning journey.
I view data science as a multidisciplinary practice that has four dimensions, and to get the most out of your learning journey it is beneficial to learn these dimensions in a particular order.
The four dimensions are Mindset, Data Wrangling, Communicating Insight, and Algorithms.
The first and most important of these four dimensions is Mindset. This is the foundation for being a successful data scientist, and it will help you as you deepen your skills in the other dimensions.
The second area is Data Wrangling. Before you can distill value from data, you must first know how to extract it and work with it to make it meaningful. Once this is mastered, you can move on to the third area, Communicating Insight.
Communicating Insight is all about turning data into information and sharing it in a meaningful way with your audience. This is where data visualization tools, storytelling and soft skills become key.
The last of the four dimensions is Algorithms. This is probably the most well-known aspect of data science and receives the most attention for its “sexiness”. Please keep in mind, though, that there are three prior areas that are important to master before you can build meaningful models with your algorithms.
The dimensions listed above are a high-level overview to help you navigate the disciplines of data science. However, I understand there are multiple layers to these dimensions, and it can become overwhelming when trying to learn them all. To help you navigate the learning process, I like to break the dimensions into three levels, which can be used as a checklist to evaluate your current skills and mindset.
To complete a data science problem from beginning to end, you need at minimum the skills listed in level 1 for each of the four dimensions. Once you have completed a problem with these basic skills, you can expand on them by learning level 2, and so on.
My recommendation is to use this list as a checklist to identify your areas of opportunity for learning so you can create a custom learning path. If you checked “yes” for the skills listed in all four areas in level 1, complete a problem from beginning to end and then move on to the level 2 skills. This will ensure you are getting practical experience as soon as possible, which is how you will truly learn data science.
You will notice I don’t mention any specific tools or languages in the skills; this is intentional. I use a hammer to fix things around my house, but this does not make me a carpenter. The same can be said for data science tools or languages such as Python, R, Tableau, etc. Using a particular tool does not make you a data scientist, but solving a problem through data does. However, if you are just starting out or would like recommendations for tools and resources, please stay tuned for part two of this series.
Remember, though, the goal is to be able to complete a problem from beginning to end and derive meaning from the data, so the more you can work on real problems, the better. Data science is not a skill that is mastered, but rather a journey of continual improvement and learning. I hope this overview provides you with an outline to start your journey.