Dipanjan Sarkar, or DJ as we all like to call him, is one of the leaders and influencers in Data Science who has been doing a lot for the community, be it through live sessions or courses. This interview with DJ is one of the most enriching ones, as he shares what he has learned over a career spanning many kinds of work.
Dipanjan has been extremely generous in answering all the questions in detail, which will help you adjust your focus and know which skills to acquire to make it big in Data Science. He shared several insights, such as:
- What skills set great data scientists apart
- Why there is more to Data Science than just technical skills
- Importance of building your personal brand
- What skills he looks for in a candidate
So, sit back and enjoy!
Platforms like LinkedIn and Twitter, if used right, can definitely help you build a solid network of people who can not just help you get your next job but also help you gain the exposure and experience you need to even make it on your own!-Dipanjan Sarkar
A data scientist should be solving problems with data, period. This can include the ‘science’ as well as ‘engineering’ aspects but there is a certain limit to what they can and should be doing.-Dipanjan Sarkar
Soft skills and effective presentation skills will make you not just sit and develop code day in and day out but be able to actually make an impact by presenting key progress and findings which will help you showcase yourself as someone who can do both – stakeholder management and doing data science.-Dipanjan Sarkar
CK: At what point did you realise that you wanted to pursue a career in data science (data & AI), and how did you get into it?
DJ: I have always been interested in computers, which led me to take up a bachelor’s degree in computer science and engineering. In my coursework, I encountered some interesting electives like data mining, artificial intelligence, neural networks and soft computing, which definitely piqued my interest in the world of data analysis and artificial intelligence. This made me go for a master’s degree specializing in data science and software engineering. I would say that was probably the point that made me realize this is something I definitely want to do going forward.
My master’s degree, coupled with a lot of projects and additional courses I took on massive open online course platforms (Coursera was just getting started then!), enabled me to focus and apply for relevant jobs related to data science and analytics. Getting an internship at Intel also definitely helped me get my foot in the door.
CK: How has being an influencer in the Data Science space helped your career evolve? How important do you think it is to create a strong digital presence on LinkedIn/Twitter and ultimately become an influencer in this field?
DJ: I’ve always wanted to share my knowledge with the community and help out people on a large scale to help them learn from my mistakes because I never had a mentor when I was navigating this field. Having that passion to share your knowledge and communicate effectively can be very useful to becoming an influencer.
I started small by networking with other influencers in the field when I was starting out, attending events, conferences and learning everything there is about the data science landscape. Then gradually as I gained expertise in this field, I started sharing my knowledge in the form of articles, papers, books, webinars, conferences, workshops and various collaborations with the industry and academia.
I would say now that being an influencer has definitely helped me take my career to the next level where I don’t just need to depend on a traditional 9-to-5 job but I can collaborate and work with organizations across the globe doing what I love – consulting, training and building applications in data science.
I am someone who usually stays off most social media platforms because of too much noise and unnecessary distractions. However, platforms like LinkedIn and Twitter, if used right, can definitely help you build a solid network of people who can not just help you get your next job but also help you gain the exposure and experience you need to even make it on your own! Building a digital presence and brand for yourself is very important, and you can leverage platforms like LinkedIn, GitHub, Medium, Kaggle, YouTube and more to do it. However, the most important thing should be your passion to share your knowledge to benefit the community.
CK: You’ve been a Beta Tester for Coursera for a long time. Is there any area where MOOCs are still lacking? If yes, how can they improve?
DJ: I still remember taking courses from Coursera and EdX when they were just starting out. Forums used to be more active then and people genuinely enjoyed having discussions besides doing the course. Today almost all the MOOCs have forums that are pretty much dead. The interaction element is completely missing in MOOCs and sadly peer assignments don’t make up for it. It just ends up with people desperately posting their assignment links for peers to review before the deadline, making it just another thing to complete.
MOOCs can improve if in some way we can enable a good discussion between the students as well as the instructors. I have taken executive education programs where this has been possible to an extent with pre-recorded content and forums being there and also live sessions and office hours held periodically with the instructors and TAs. We also need a proper quality check for many courses in popular MOOC platforms because I have seen courses that could do with a proper revamp.
CK: You’ve been associated with Fortune 500 companies as well as start-ups. Which one, according to you, is better for a beginner to join with respect to opportunities, exposure and overall growth?
DJ: I think this would completely depend on the aspirations of said beginners.
- If you are looking to learn more in a short period of time, you are alright with wearing multiple hats, you don’t like unnecessary bureaucracy and you are OK with pushing yourself to the limit when the work demands it, working in a startup would be ideal.
- If you are more interested in growing your career in a slightly more stable job, with a better work-life balance and usually fixed hours, and you are OK with bureaucracy sometimes slowing things down (which can sometimes be necessary), working in a large organization is more suitable.
- Having said the above, sometimes it can completely change based on how your team is structured. The main thing is to go for a role that suits your interests, on a team with a good, nurturing management ecosystem, where you can grow while doing what you love.
CK: What is one book that you would recommend young data scientists read?
DJ: This is definitely hard to say. I will put in a few based on their specific focus areas.
- Data Science for Business by Foster Provost and Tom Fawcett, to gain that overall big picture of the what, how and why of data science.
- The Hundred-Page Machine Learning Book by Andriy Burkov, for a no-nonsense essential guide to machine learning.
- Hands-on Machine Learning with Scikit-Learn and TensorFlow by Aurélien Géron, to gain the necessary hands-on skills in machine learning.
CK: What has been your role as a Google Developer Expert in Machine Learning?
DJ: Being a Google Developer Expert has definitely been a phenomenal experience for me because of the following:
- Being recognized by Google as an expert in the field helps me gain easy access to engaging with the community as well as organizations to work on AI initiatives.
- Helps me be a part of conferences, events and workshops to share my knowledge on data science and AI at a much larger scale than what I could do previously on my own.
- Being able to engage with Nooglers and even have discussions on how existing and future products and platforms in Google’s suite of offerings can be improved.
- Having a solid network of Nooglers and fellow experts in machine learning helps me learn and, in turn, share my knowledge by collaborating in various global events on machine learning.
CK: What skills and attitudes do you look for when hiring data scientists?
DJ: I would say the following skills matter the most:
- A genuine passion and positive mindset to work with data and derive meaningful insights.
- Having an analytical, problem-solving mindset – No one is going to hand you some clean data and tell you, “build a classification model”. It will be up to you to figure out the best way to frame and solve the problem.
- Solid expertise in understanding and handling data of different types – structured and unstructured.
- Hands-on skills and in-depth conceptual knowledge of the basics of linear algebra, statistics and machine learning.
- Having experience in deep learning can be a plus but not always mandatory unless the role is specific to deep learning problems.
- Expertise in problem identification, formulation and solutioning.
- Data engineering skills are often useful to be self-sufficient and work hand-in-hand with the engineering team during deployments.
CK: What are the aspects that you love and hate about Data Science?
DJ: What I love about data science:
- The ability to take really messy data and build something beautiful out of it – a dashboard, an ML product or even a simple, yet effective visualization.
- Being able to build intelligent systems which can continuously improve over time with more data instead of me breaking my head trying to manually find useful patterns and hard-code necessary logic into the system.
- The ability to bring together people and organizations to solve really complex problems across literally every domain including healthcare, logistics, retail, fashion, oil & gas, infrastructure, finance, … I could go on!
What I hate about data science:
- Data science is not easy and it is not sequential in nature. Organizations or business teams often don’t get that.
- It is often a supportive enabler in most organizations and showing value can be challenging till you have perfect harmony between the business and technical teams – this is a tough nut to crack.
- The term “data science” itself has become hyped up and misused over several years now and we need to ensure people venturing into data science have a clear understanding of their roles and responsibilities and are not misled into doing something they don’t want to do. I am absolutely serious about this when I see articles by “data scientists” trying to justify their title and say that installing and maintaining software packages and databases are a part of being a data scientist. Sorry to burst the bubble but we have system administrators and database administrators for that. A data scientist should be solving problems with data, period. This can include the ‘science’ as well as ‘engineering’ aspects but there is a certain limit to what they can and should be doing.
What I love and hate about data science:
- This field is rapidly changing and we are getting new techniques, models and libraries almost every other week. I am definitely excited about where we are heading; however, it is often very hard to keep up with everything!
CK: You have consulted and mentored people at all stages, from fresh college grads to C-level Executives, VPs, Directors and PhDs. In what aspects are they similar to each other, and in what aspects are they different?
DJ: Similarities in consulting and mentoring people of diverse backgrounds and seniority levels in their careers:
- They have a passion and zeal for learning all there is to data science so that they can get better at their current job, be able to get a new job or start building something of their own.
- While they are experts in their own field, they are humble enough to be mentored by someone (who can sometimes be much younger than them) to gain perspectives in a new field.
Differences in consulting and mentoring people of diverse backgrounds and seniority levels in their careers:
- Fresh grads or people in the early stage of their careers are often more eager to learn a lot of things together to showcase and build themselves up for getting a job in data science.
- People who are in a senior position in their career will be more focused on learning what is essential to either solve a problem with data science or pivot their career because they will also be busy with their daily life as well as current job responsibilities.
- C-level Execs and VPs will be laser-focused in knowing exactly how data science can be leveraged as an effective tool to solve business problems and yield value. These are no-nonsense conversations and consultations where you need to be very well prepared.
- People from academia can sometimes struggle when trying to make a transition to the industry so you need to have the relevant context and mindset to enable them to work towards that instead of telling them to not be in the industry. I have seen a lot of people from academia go on to be excellent data scientists in the industry!
CK: What are the most important aspects of succeeding as a Data Scientist on the job? What factors separate ordinary Data Scientists from extraordinary ones?
DJ: The aspects which can make you grow faster and make you shine as a data scientist would include the following:
- Always know and realize the “big picture” – this will help you contextualize any problem and solve it in the most effective way vs. treating it as a traditional data science problem and solving it mechanically.
- Soft skills and effective presentation skills will make you not just sit and develop code day in and day out but be able to actually make an impact by presenting key progress and findings which will help you showcase yourself as someone who can do both (stakeholder management and doing data science).
- Problem-solving, conflict resolution, stakeholder management and leadership skills are something that every data scientist should inculcate over time. This will help them grow into leadership positions in the long term, often much faster than others.
- Knowing when to stop – perhaps the most important ability a data scientist needs. Based on the business problem, the success metrics and the time at hand for the project, you should know when to stop your analysis and modelling activities and say, “this is good enough” or “this is the best which is possible with the current data and models available”. Do not fall into the trap of over-optimizing or over-tuning.
- Being able to influence people – this can be a very vital skill especially towards stakeholder management because you need to be able to speak their language to convince them what is possible and what may not be possible. It is always better to be upfront about things rather than saying yes to everything.
If you notice the above points carefully, you will observe that I have rarely mentioned anything to do with technical skills. This is because anyone and everyone can pick up technical skills if they spend enough time. But the above skills are something that happens with time, experience, having the right mindset and attitude.
CK: If you could build a team of 5 Data Scientists (including yourself) to solve the greatest Data Science challenges in real life, who would they be and why?
DJ: This is a tough one! Besides me, I would say the following folks would make a pretty good team!
- Raghav Bali – He has been my partner-in-crime in publishing books, consulting projects, conferences, events and can pretty much do anything we might want to work on.
- Sayak Paul – Another excellent gem who has solid expertise in deep learning and can translate almost any research into working code. Very useful if you want to dive into using state-of-the-art models to solve complex problems.
- Sudalai Rajkumar – A close friend and, of course, because you need a Kaggle Grandmaster!
- Srivatsan Srinivasan – Another friend and someone who is so hands-on he can pretty much conceptualize how to solve any problem and make a concrete solution for it.
CK: Tag one or two data science leaders that you would like to see answer these questions.
DJ: I would definitely want to hear from these folks:
- Srivatsan Srinivasan – Very hands-on and skilled and has a wealth of experience in all things pertaining to data science and engineering.
- Vin Vashishta – One of the most down-to-earth people having excellent insights in data science.
- Andriy Burkov – A skilled leader in data science. I am sure he will have some excellent insights to share.
- Cassie Kozyrkov – She proves that you are a true expert and leader if you can make the toughest things simple to understand.