1) How did you get interested in working with data?
I think it’s a personality defect. I am sure my parents despaired of my ever listening to anything without a sound, logically constructed argument.
I was never one to work on gut feel and was more of a ‘rationalist’ in my college days – I would never accept anything anyone said without proof, or at least without a debate backed by numbers.
Somewhere along the way, I got into analyzing data just for the heck of it. Cost/benefit analysis, the heuristic optimizations that we do on an everyday basis – these fascinated me. And then I discovered microeconomics and finance – there was a whole world out there that discussed rational decision making in terms of utility functions! When I learnt statistics, things suddenly fell into place: the inherent conflicts in my data analysis and methodologies started having a name and a theory behind them. That was a moment of revelation (as much as passing the first stats course was 🙂).
To me, data represents a move towards a single truth – a unified view that just ‘is’. The layers and stories it reveals and hides are simply fascinating. For everything that happens, that bugs us, that needs solving, the tools are there to help us solve it, if we have the data. Data science is a medley of statistics, business and urgent problems that need to be solved, and that calls out to me.
I didn’t set out to be a data scientist, and I didn’t set out to be a geek (honestly!). But when training meets passion, the possibilities are endless. Add belief to the mix – belief in the relevance of data science and its ability to influence policy and business – and I think that’s a winning combination.
2) What are your principal responsibilities as a data scientist?
I lead the Stats team at TEG Analytics. My role is to build the team and to make sure we build TEG’s competence in information storage and retrieval, statistical analysis, visualization and business insights. I get involved in projects; we brainstorm and innovate, and come up with solutions that are state of the art and relevant to the business context and the business issue we are trying to solve.
3) What innovations have you brought into this role?
The way I perceive my role is probably a little different from the traditional data scientist role. I am also here to invite our talent into a world of global innovations in machine learning and AI, and in building the next generation of products and solutions that will solve real-world business problems. I want to inspire them to reach beyond their current projects, and to read and upskill with ravenous hunger. I come from a teaching background: I have been a professor in business studies, and I work together with our teams to build a consulting perspective into our solutions across domains.
4) Can you share examples of any interesting projects where data science played a crucial role?
Some recent ones that have been interesting and challenging:
1. A brand juice sentiment analysis project. This was interesting because of the complexities in the data and in the interpretation of sentiment scores.
2. A Medicare plan competitiveness analysis based on publicly available CMS data, which we used to predict enrollments in Medicare plans by mimicking customer choice models.
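The interview does not say which choice model was used for the Medicare project. As a rough illustration only, here is a minimal sketch of one common approach, a multinomial logit, in which each plan's predicted share is proportional to the exponential of its utility. The plan names, premiums, star ratings, coefficients and the `plan_utility` and `choice_shares` helpers are all hypothetical, invented for the example, and are not taken from the CMS data or from the actual project.

```python
import math


def plan_utility(premium, star_rating, beta_premium=-0.02, beta_rating=0.5):
    """Linear utility of a plan. The coefficients are made-up assumptions."""
    return beta_premium * premium + beta_rating * star_rating


def choice_shares(utilities):
    """Multinomial logit shares: exp(u_i) / sum_j exp(u_j)."""
    peak = max(utilities.values())  # subtract the max for numerical stability
    weights = {plan: math.exp(u - peak) for plan, u in utilities.items()}
    total = sum(weights.values())
    return {plan: w / total for plan, w in weights.items()}


# Hypothetical plans: (monthly premium in dollars, star rating).
plans = {
    "plan_a": (30.0, 4.0),
    "plan_b": (0.0, 3.5),
    "plan_c": (55.0, 4.5),
}

utilities = {name: plan_utility(p, r) for name, (p, r) in plans.items()}
shares = choice_shares(utilities)

# Turn shares into predicted enrollments for an assumed eligible population.
eligible = 10_000
predicted = {name: round(share * eligible) for name, share in shares.items()}

for name in plans:
    print(f"{name}: share={shares[name]:.3f}, enrollment={predicted[name]}")
```

In a real analysis the coefficients would be estimated from observed enrollments rather than assumed, and the utility would include more plan attributes.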
5) Any words of wisdom for Data Science students or practitioners starting out?
More often than not, data science is seen either entirely as a statistical/analytics effort, or as a business problem where numbers are incidental to the story. Data science is cross-disciplinary in nature – we need both the stats acumen and the business insights. Domain knowledge is essential – be willing to invest in it, for as long as it takes. Knowing the right program and package is cool; being able to stitch the story together and influence budgetary allocations is more so.
6) What Data Science methods have you found most helpful?
Common sense, but that’s not really a data science method. I can’t call out a specific method – I personally like to use a judicious mix of parametric and model-free techniques, depending on the case. On a more serious note, irrespective of the method, or the machine learning, or the neural network package, there is merit in covering the basics. A data dictionary, good foundations, EDA and good design of experiments are mandatory. The rest is really going to change based on the task at hand.
7) What are your favorite tools / applications to work with?
I have used a variety of tools. I like Stata quite a lot. I am often asked if R is a better bet than SAS. SAS is a very powerful, accurate tool – its advantage is, if the program runs, the results are pretty much what you are looking for. In R, due to the multitude of packages, it’s easy for beginners to get confused, and the results are more dependent on the programmer’s skill levels.
8) With data science permeating nearly every industry, what are you most excited to see in the future?
IoT and AI are converging in a big way. There is tremendous potential, and it’s an exciting field. Geo-spatial data is already big and will get bigger with drone technology, and geo-spatial visualization is a great field to look forward to.
In the sales and marketing analytics field, AI/NN models for relevant 1:1 personalization, multi-touch attribution in media efficiencies, and hidden Markov models/LSTM for sequence learning in text analysis are some of the things to look out for.
9) What lessons have you learned during your career that you would share with aspiring data scientists entering the field?
Three things I believe are important. First – business trumps statistics, and that’s the natural order of this world. Second – the solution should be as complex as necessary, and no more; it’s important to embrace Occam’s razor. Fast failure is more important than the perfect model.
Third, and most important – there are principles and theories in statistics, information modelling and databases, and there are tools and techniques. It is imperative to keep oneself updated on the tools, programs and applications, but always to relate them back to the fundamentals, the principles and the theory.