Interview with Susan Walsh, Founder/MD, The Classification Guru Ltd

We thank Susan Walsh from The Classification Guru Ltd for taking part in this interview and sharing several insights, including:

  • Tips and tricks for getting into data science
  • Common data issues and the importance of business knowledge when dealing with data
  • Resources and recommendations

#Getting into Data Science

A trend I do see going forward is more focus on data within organisations, and the emergence of more data engineers. Scientists move over! :)))

– Susan Walsh

How did you first get into data science and what kept you driving forward?

As with many people, it was a complete accident. My first business had failed, and I needed a job quickly. I found an ad online for some data classification work for a spend analytics company and thought I’d be OK at the job. Little did I know that I’d love the work and be really good at it.

Over the next five years, I developed my skills and loved learning. I also saw an opportunity in data cleansing and classification: no one seemed to be focussing on it, yet it was the area that needed the most attention when working on projects. And so The Classification Guru was born.


In your opinion, what have been the most relevant breakthroughs in the world of data over the last few years, and what trends do you see emerging going forward?

I’m not sure about the most relevant breakthroughs; there are so many, ranging from the medical and scientific communities to the business community. But a trend I do see going forward is more focus on data within organisations, and the emergence of more data engineers. Scientists move over! :)))


#Dealing with Data and Domain

It’s all about data quality.  Everything starts with that.

– Susan Walsh

What are some of the toughest data issues that you have dealt with?

It’s always data quality. Typos, misspellings, duplicates… the list goes on. The toughest challenge around data issues isn’t necessarily the data itself, but getting those who input the data to be thorough and careful, and making sure they understand the importance of getting it right and the consequences when they don’t.


How important is the domain knowledge of the business/industry in realising the data science use cases to their true potential, and how did you acquire it?

I think it’s important for the person checking the output of the data to know the industry, but not critical for the data scientist. That said, I think it gives data scientists who do have some industry knowledge an edge over their peers.

How did I get mine? Well, I had a number of jobs in different areas before I discovered data, which has helped me build a unique range of experience. For those who don’t have that, I would suggest shadowing your colleagues to understand their roles and what they are doing with the data.


What are the top challenges you or other data science leaders currently face in realising the true potential of data science and AI, and how would you go about tackling them?

For me, it’s all about data quality. Everything starts with that, and if it’s not right, it will affect coding, machine learning, and the outputs, as well as the decisions made based on that information. But right now decision-makers do not view data quality as important or as a priority. Data quality is an investment, not a cost.


#Recommendations & #Resources

LinkedIn is my greatest resource. I see links to all kinds of information, articles and publications through my connections that I might never find otherwise.

– Susan Walsh

What is one book that you would recommend young data scientists to read?

This is tough, as it’s such a broad question. There are so many different areas of data science, so I would suggest reading books specific to your field. For telling your data story there’s Scott Taylor, for data literacy there’s Jordan Morrow, and for organisational data culture there’s Jason Foster. If you want to go more technical, Packt has a great range of books covering every area. And finally, there’s Kate Strachnyi and Kristen Kehrer’s Mothers of Data Science.

And look out for my own book in the next few months if you want to learn about fixing dirty data 🙂


How do you keep current with the new developments in data science? What are the top 3 resources that you use?

LinkedIn is my greatest resource. I see links to all kinds of information, articles, and publications through my connections that I might never find otherwise.


Tag one or two data science leaders that have inspired you to love data.

Kate Strachnyi, Harpreet Sahota
