Interview with Rohan Rao, Data Scientist at H2O.ai

Rohan Rao is a data scientist, a Kaggle Grandmaster, and a 16-time National Sudoku/Puzzle Champion [Wikipedia].

We thank Rohan for taking part in the Data Science Interview Series 2020 and sharing several insights on the workflow and attitude of a top performer in data science, including:
1. How he keeps learning in an extremely rapidly advancing field.
2. His views about “staring at data”.
3. His encouraging advice for people who want to get into data science.

At what point did you realize that you wanted to pursue a career in data science, and how did you get into it?

I grew up being fascinated with numbers. Never a day would pass by in my childhood where I wouldn’t share a numerical fact.

There was that one day in my life when I had an admission confirmation for the MBA programme as well as a confirmed seat for the MSc programme. While I was breaking my head over which one to pick, my father asked me, “Do you want to be a specialist or a generalist?” That question cleared out the fog in my thoughts and I’ve never looked back since.

After completing my post-graduation in Applied Statistics, I took up a job at a machine learning consultancy firm and worked with various clients trying to solve data science problems. The work involved a lot of practical applications of mathematics and statistics on real data, and when I saw the impact and value it added to businesses, I knew this was what I wanted to pursue.

How is data science used to create value in your current project(s)?

Data Science is used in a wide variety of ways across our H2O products and platform. Whether it is automating a machine learning model, extracting insights from a dataset, or evaluating business impact through data-driven decisions, we try to let the data speak for itself in as many ways as possible. The H2O ecosystem provides various tools to squeeze the most out of data, and pairing them with some amount of human intuition and intelligence is a powerful combination, which I work on enhancing and improving each day.

What is one of the best investments that has propelled your data science career the most?

Time. And opportunity cost, or sacrifice, goes hand in hand with it.

Data science is vast and there are always new things to learn in this rapidly advancing space. When I look back over the years, the sheer time I spent on understanding data, playing around with ideas, working on Kaggle competitions, and marrying business objectives to data science pipelines has been invaluable in growing my career. I also find this a recurring trend among many of the top data science professionals I know, and it is incredible to work together trying to push the limits of innovation in the field.

How do you keep current with the new developments?

Being active on Kaggle helps me implement many of the latest developments on actual datasets. Following some of the top Machine Learning professionals on Twitter and exploring the internet, research papers, and open source projects helps me stay updated with most developments related to my work.

What are the top challenges you currently face as a professional data scientist, and how do you go about tackling them?

Getting the buy-in of business for a data science solution and maintaining solutions in production would be the two biggest challenges I’ve faced as a data scientist. There are a lot of fancy applications of data science, but the ones that really make an impact are those that align with business. And it is hard to get the two together and on the same page.

Similarly, a lot of work in data science can be done offline without getting into a live production system. The ability to ensure and optimize the models and solutions such that they perform as per expectations and are stable in production environments over a period of time is still an area where a lot more focus and effort needs to be put in. It’s no wonder that a large number of models end up failing when put to their true test.

How important is the domain knowledge of the business/industry you’re in as a data scientist, and how did you acquire it?

Having worked in a lot of different industries, I have built up domain knowledge by reading about them and discussing ideas and approaches with experts in those fields. Its importance does vary, but it helps in understanding the context of a data science problem better, which leads to more useful decisions on how to go about structuring the solution.

While in many industries I haven’t found it to help much with the data and features, it is incredibly useful in other aspects like choosing an appropriate evaluation metric or defining the validation framework.

What unusual or absurd thing do you practice or advocate for in your profession as a data scientist?

Stare at data. I wouldn’t call it absurd considering I have won data science competitions by doing just that.

But what I really mean is to spend a large amount of time looking at the data in as many ways, visuals, dissections and transformations as possible. Sometimes the simplest solutions are right there in front of your eyes, but only if you’re looking.

What inspires you about working in data science?

I am a problem solver at heart. It irks me to see an unoptimized process or an unsolved puzzle. Data Science gives me that outlet to solve them and be at peace.

What advice would you give to someone who wants to get into data science today? What advice should they ignore?

Get your hands dirty. Be willing to put in a lot of time, effort and motivation to learn, practise and build data science skills.

There is no bar or prerequisite to get into data science. It is a wide umbrella with various areas of expertise, and there is always a sweet spot that every Data Scientist can find to fit into.
