Ultimate Q&A for Aspiring Data Scientists with Serious Guides
So…you want to become a data scientist? Cool. You’re a self-motivated person who is very passionate about data science and about bringing value to companies by solving complex problems. Great. But you have ZERO experience in data science and no clue how to get started in this field. I get it. I’ve been there and I definitely feel you. That’s why this post is dedicated to you, enthusiastic and aspiring data scientists, to answer the most common questions and challenges that most people face.
If data is ‘the new oil’, then the data scientist functions much like an oil refinery, converting data into insights that can both save money and generate capital
— Eva Short
None of the questions below were created by me; they came from you, the vibrant data science community. Many thanks to all of you for your support on my previous LinkedIn post, and for the questions I received by email and through other channels! Please note that the questions are in no particular order, so feel free to skip to whichever parts you find relevant.
I hope that sharing my experience in this post sheds light on how to pursue a data science career and gives you some general guidance to make your learning journey more enjoyable. Let’s get started!
What Is The Current Trend in Data Science Skills Gap?
The International Data Corporation (IDC) predicts that worldwide revenues for big data and business analytics will reach more than $210bn in 2020.
According to the LinkedIn Workforce Report for the United States in August 2018, there was a national surplus of people with data science skills in 2015. Three years later, the trend has reversed dramatically: more companies are facing shortages of people with data science skills as big data is increasingly used to generate insights and make decisions.
Economically speaking, it is all about SUPPLY and DEMAND.
The good news is: The “tables” are now turned. The bad news is: Despite rising job opportunities in data science, a lot of aspiring data scientists still struggle to get their foot in the door simply because of the gap between their skills and the requirements of the current job market.
In the coming section, you’ll see how to improve your data science skills to close the “gap”, stand out from the pool of other candidates and ultimately increase your chances of landing your dream job.
Questions & Answers
1. What skill sets are required and how can I acquire them?
I’ll be very honest with you. Learning ALL the skill sets in data science is next to impossible as the scope is way too wide. There’ll always be some skills (technical or non-technical) that data scientists don’t know or haven’t learned, as different businesses require different skill sets.
In general — in my opinion based on my experience and learning from other data scientists — there are some CORE skill sets that must be learned to become a data scientist.
Technical skills. Math and statistics, programming, and business knowledge. However proficient we are in programming, and whatever language we use, we as data scientists should be able to explain our model results to stakeholders in business terms, supported by math and statistics.
To learn math and statistics (or to find more comprehensive data science resources), check out this website created by Randy Lao. Randy has been helping aspiring data scientists and his repository on the website is truly a gold mine!
I still remember that when I first started out in data science, I read the textbook An Introduction to Statistical Learning: with Applications in R. I highly recommend it for beginners, as the book focuses on the fundamental concepts of statistical modelling and machine learning with detailed and intuitive explanations. If you want a more mathematically rigorous treatment, you might prefer this book: The Elements of Statistical Learning.
To learn programming skills, especially as a beginner without prior experience, I’d suggest focusing on one language (personally I prefer Python!😊), since the concepts carry over to other languages if needed and Python is easier to learn. Whether to use Python or R has long been a subject of debate in data science. Personally, I think the focus should be on how you can help businesses solve problems, regardless of the language used.
Finally, I can’t stress enough that business knowledge is extremely crucial, a point I have also made in one of my articles (you can refer to it here).
Soft skills. In fact, soft skills are more important than hard skills. Surprised? I hope not.
LinkedIn surveyed 2,000 business leaders, and the soft skills that they’d most like to see their employees have in 2018 are: Leadership, Communication, Collaboration and Time Management. I truly believe these soft skills play an essential part in a data scientist’s day-to-day work. In particular, I learned the importance of communication skills the hard way, which you can read about here.
2. How to choose the right bootcamps and online courses when there are plenty of them out there?
With the hype surrounding AI and data science and many people jumping on the bandwagon, MOOCs, bootcamps, online courses and workshops (free and paid) are mushrooming so as not to “miss the boat”.
There are many resources out there. Be resourceful.
So the question is: How do you choose the learning materials that are suitable for you?
My approach to filter and select the right online courses/workshops for me:
- Understand that there’s no single best course that can cover all the materials you need. Some courses overlap in some areas, and it’s not worth paying for several courses that repeat most of the same material.
- Know what you need to learn in the first place. NEVER dive into a course simply because of a fancy and catchy title. Remember the technical skills mentioned earlier? Check out job descriptions for data scientists online and you’ll notice there are some common skills that companies require. Now you know the skills needed and the skills that you’re lacking. Fantastic. Go search for courses that can help you improve your knowledge (theoretical and hands-on).
- Research online the best courses offered by different platforms. Once you’ve shortlisted a few courses that suit your needs, check out their reviews by others (very important!) before you pull out your wallet and enroll. On the other hand, there are also many FREE courses available on Coursera, Udemy, Lynda, Codecademy, DataCamp, Dataquest and many more. Did I also mention YouTube? Yes, you get it.
- TIP: Some platforms (Coursera, for example) might offer financial aid to subsidize your course fees. Give it a try!
- Some of my personal favourite courses that have helped me tremendously:
- Machine Learning taught by Andrew Ng, the co-founder of Coursera.
- Python for Data Science and Machine Learning Bootcamp taught by Jose Portilla.
- Deep Learning A-Z™: Hands-On Artificial Neural Networks taught by Kirill Eremenko and Hadelin de Ponteves.
- Python for Data Science Essential Training taught by Lillian Pierson.
- The Ultimate Hands-On Hadoop — Tame your Big Data! taught by Frank Kane.
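If you like to learn by doing, the “check out job descriptions and spot the common skills” step above can itself be a tiny first Python exercise. Below is a minimal sketch, not a definitive tool: the skill list, the three sample postings and the function names (count_skills, skills_to_learn) are all made up for illustration, and in practice you would paste in the text of real job postings that interest you.

```python
import re
from collections import Counter

# Hypothetical skill keywords to look for; replace with your own list.
SKILLS = ["python", "r", "sql", "statistics", "machine learning",
          "deep learning", "hadoop", "communication"]


def count_skills(job_descriptions, skills=SKILLS):
    """Count how many job descriptions mention each skill at least once."""
    counts = Counter()
    for text in job_descriptions:
        lowered = text.lower()
        for skill in skills:
            # The lookarounds stop short names such as "r" from matching
            # inside longer words such as "required".
            pattern = r"(?<!\w)" + re.escape(skill) + r"(?!\w)"
            if re.search(pattern, lowered):
                counts[skill] += 1
    return counts


def skills_to_learn(counts, my_skills):
    """Return requested skills that are not in my_skills, most requested first."""
    mine = {skill.lower() for skill in my_skills}
    return [skill for skill, _ in counts.most_common() if skill not in mine]


# Made-up sample postings, for illustration only.
postings = [
    "We need a data scientist with Python, SQL and machine learning experience.",
    "Strong statistics background; Python or R required. Good communication skills.",
    "Experience with SQL, Hadoop and Python. Knowledge of deep learning is a plus.",
]

counts = count_skills(postings)
gaps = skills_to_learn(counts, my_skills=["Python"])
print(counts.most_common(3))
print(gaps)
```

The more postings you feed in, the clearer the picture of which skills keep coming up and which of them you still lack, and that is the list to take with you when you shop for courses.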
3. Is learning from open source sufficient to become a data scientist?
I’d say that learning from open source is sufficient to get yourself started in data science. Anything beyond that serves to develop your career further as a data scientist, again, depending on business needs.
4. Should a beginner (from a totally different background) start with reading materials to understand the basics? What book would you suggest?
There’s no fixed path in learning, as all roads lead to Rome. Reading is definitely a great way to start understanding the fundamentals, and it’s how I started as well!
Just be careful not to try to read and memorize every nitty-gritty detail of the maths and algorithms, because chances are you’ll forget everything unless you apply the concepts to real problems in code.
Just know and understand enough to get yourself started and move on to the next step. Be practical. Don’t try to be perfect in knowing everything, simply because perfectionism is often an unrecognized cause of procrastination and of not moving forward.
Below are some of the books that I’d suggest for understanding the basics of Python, machine learning and deep learning (hope they help!):
- Learning Python
- Python for Data Analysis
- An Introduction to Statistical Learning
- Machine Learning for Absolute Beginners
- Python Machine Learning
- Python Data Science Handbook
- Introduction to Machine Learning with Python
- Deep Learning with Python
- Deep Learning with Keras
5. How to balance between understanding business problems (formulating solutions) and developing technical skills (coding, core math knowledge etc.)?
I started off by developing my technical skills before going into understanding business problems and formulating solutions.
Business problems give you the WHAT and the WHY. To solve a business problem, one first has to know HOW to solve it, and the HOW comes from technical skills. Again, the approach depends on the situation, and my suggestion is mainly based on personal experience.
6. How can we overcome the challenges of starting a career as a data scientist?
One of the major challenges faced by many aspiring data scientists (including me) is that data science is an ocean of information. We can easily lose our focus by getting overwhelmed with all the advice and resources (online courses, workshops, webinars, meetups, you name it…) that come from different directions. Stay focused. Know what you have and what you need, and go ALL IN.
Throughout my data science journey, the challenges have been countless, but they have also shaped who I am today. I’ll try my best to explain the main challenges I faced and how I overcame them:
- I was confused by so many resources when I first started out. I filtered out the noise the hard way: listening to podcasts and watching webinars given by data scientists, reading plenty of data science articles on how to pursue a career in this field, experimenting with various online courses, and engaging with the data science community on LinkedIn and learning from them. Ultimately, I focused only on the helpful resources that I’ve shared in this post.
- There was a point when I almost gave up. The thought of giving up crossed my mind when the learning curve was too steep and I started doubting myself. Am I really capable of doing this? Am I really pursuing the right path? Passion and patience brought me back and kept me on my path. Keep grinding and keep hustling day in, day out.
Your work is going to fill a large part of your life, and the only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do.
— Steve Jobs
- Getting a job as a data scientist (or a similar job scope under a different title). I wish I had read these articles by Favio Vázquez earlier: How to get a job as a Data Scientist? and The two sides of Getting a Job as a Data Scientist. Getting a job was no easy task for me because of how competitive the job market is. I submitted tons of resumes with my job applications, but to no avail. Something had to be wrong, I thought. I revamped my approach and started networking: attending meetups and seminars, sharing my learning experience online, approaching prospective employers at career fairs and sharing sessions in a more systematic way, following up after submitting my resumes, and so on. Things started to change and opportunities started to knock on my door.
7. How to put my work experience in my resume so that I will be hired and my experience will be counted?
I believe there is a misconception here: you won’t be hired solely based on the experience in your resume. In fact, your resume is one of the ways to earn the entrance ticket to the next stage of your application, the interview.
Therefore, learning how to write up your work experience in your resume is truly important for getting that entrance ticket. Studies have shown that the average recruiter scans a resume for six seconds before deciding if the applicant is a good fit for the role. In other words, to pass the resume test, your resume has only six seconds to make the right impression on a prospective employer. Personally, I referred to the following resources to polish my resume:
- Optimize Guide (Personal preference!)
- A Resume Expert Gives Career Advice
- How to Pass the 6-Second Resume Test
- How to tailor your Academic CV for Data Science roles
- What do Hiring Managers Look For in a Data Scientist’s CV?
- The 14 Things You Need On Your Resume To Land Your Dream Job
8. What kind of portfolio can help us to get a first job in data science or machine learning?
In my very first article on Medium, I mentioned the importance of building a portfolio. Having a well-polished resume is not enough to get you an interview without a good portfolio.
After the first glance at your resume, prospective employers want to understand more about your background and this is where your portfolio comes in. While you might wonder how to build a portfolio from scratch, start by documenting your learning journey. Share your learning experience, mistakes, takeaways — technical or non-technical — through social media platforms (LinkedIn, Medium, Facebook, Instagram, Personal blog — it doesn’t matter).
Interested in talking in front of a camera? Then start by making videos (interviews with other aspiring or well-established data scientists) and share them on YouTube. Good at writing? Then start writing about the topics that you’re passionate about on different platforms. If you are not into visuals or writing, then reach out to others and record podcasts with them.
My point here is: The Internet offers seriously abundant opportunities to build your portfolio and gain traction, and potentially the attention of your prospective employers.
One of the best decisions I’ve ever made was to engage with the data science community on LinkedIn and document my learning journey on Medium. I learned the most on LinkedIn, from the close-knit data science community in such a conducive sharing-learning environment.
Gradually, I learned (still learning!) how to build my portfolio on LinkedIn with my experiences from different sources. Along the way, I got messages from different recruiters about job opportunities and I even got the chance to grab some coffee and have a quick chat with some of them!
And now, I’m very excited to share what I’ve learned and contribute back to the community. So I’m giving out some useful guides on how to build your LinkedIn profile with some serious hacks!
Leave your email address in the comments below and I’ll send them straight to your inbox. 😄
— More Resources —
At this point, you might be wondering how the heck I have so many resources to share in this post, all at once. Well, my no-brainer secret: I bookmark websites and articles that I find useful and refer to them from time to time.
This has proved extremely helpful and comes in handy whenever I need some reference and revision. I could share the whole list of bookmarked websites, but that would be too long for this post (perhaps in another post).
Nonetheless, I’ll list some of the useful resources:
- Towards Data Science, Quora, DZone, KDnuggets, Analytics Vidhya, DataTau, fast.ai
- Webinars — Data Science Office Hours, Data Science Connect, Humans of Data Science (HoDS)
- Storytelling with Data: A Data Visualization Guide for Business Professionals
- A Badass’s Guide to Breaking Into Data
- 10 Must Have Data Science Skills
- My Data Science & Machine Learning, Beginner’s Learning Path
- Machine Learning Mastery
- 24 Ultimate Data Science Projects To Boost Your Knowledge and Skills
Follow Inspiring Data Scientists and Professionals
The data science community on LinkedIn is awesome and I strongly encourage you to follow the inspiring data scientists and professionals mentioned below:
- Randy Lao
- Kyle McKiou
- Favio Vázquez
- Vin Vashishta
- Eric Weber
- Sarah Nooravi
- Kate Strachnyi
- Tarry Singh
- Karthikeyan P.T.R.
- Megan Silvey
- Imaad Mohamed Khan
- Andreas Kretz
- Andriy Burkov
- Carla Gentry
- Nic Ryan
- Beau Walker
You made it all the way here?! Thanks for reading.
This is the longest post so far and there is still so much more that I’m eager to share with you (Perhaps in the next post, who knows? 😊).
Don’t let anyone rush you with their timelines
I hope this post has answered your burning questions. Whenever you face any obstacles in your data science journey, remember that you’re not alone and that we’re all here to help as part of the community. Just ping me and I’ll be more than happy to help!
Now that you have the answers to your questions (leave a comment below if you have other questions), it’s time to take massive action towards achieving your goals as an aspiring data scientist. No action is too small to make a difference. Just move forward one step at a time. When you’re on the verge of giving up, PERSISTENCE is key.
I know this was a long post, so please let me know in the comments which parts were most useful for you.
As always, if you have any questions or comments, feel free to leave your feedback below, or you can always reach me on LinkedIn. Till then, see you in the next post! 😄