Samuel Kamande is a Data Scientist at Nielsen and his presentation will focus on “Paradigm Shift in Research”.
We caught up with him and he shared a lot about his work at Nielsen, some of the projects he has worked on, like the “Digital Divide project in Trinidad and Tobago in 2013”, his thoughts on the future of Data Science, and Baidu’s Deep-Learning System, among other things.
We’d like to hear your story of how you got into data science. What motivated you to work in data science?
I am a statistician by training (MSc. Statistics, University of Nairobi). One person quipped that a Data Scientist is a statistician living in San Francisco or using a Mac. By those definitions, I guess I am still a statistician. Transforming data into information, knowledge and insights has been my key source of motivation. In my three years of working with data, I have seen clients across various industries and disciplines, as well as internal management teams, make important, pivotal decisions based on insights drawn from small and large data sets. It has been fulfilling and has kept me going: solving problems, identifying opportunities and predicting the future in the midst of uncertainty, just to name a few. I have had the pleasure of working in a few positions that exposed me to both practical statistics and programming, necessitated by the huge amounts of data involved. That has also kept my zeal alive. In the wide field that is Data Science, there is always something new to learn and try out every day.
What does a typical day as a Data Scientist at Nielsen look like? What tools and algorithms do you use often and which are your favorite?
A typical day for me involves supporting the Client Service teams by providing technical consultation to solve data-science-related client queries. It also involves providing technical support on product enhancement and improvement for various clients in Africa. In addition to the in-house platforms, VBA, R, SAS and (increasingly) Python are often used. R is my personal favorite tool. Over time I have delved deeper into its amazing capabilities, and I am still exploring, ever since I ran those 15 lines of time-series modeling code in sophomore year.
How did you acquire these skills?
Aside from my university education, I have had to constantly work on my skills through various projects. I had the opportunity to work with amazing programmers on huge projects at mSurvey. These projects needed a lot of statistical rigor and algorithms. Additionally, my current job provides opportunities to work with a very talented pool of Data Scientists from across the globe to solve problems, and I have done my best to leverage that. There is also freedom to explore, innovate and self-disrupt. Needless to say, various online courses and a lot of practice have been, and still are, paramount for me.
What is the most interesting Data Science project you have participated in?
There are many, but my most interesting would have to be the Digital Divide project in Trinidad and Tobago in 2013. To measure the Digital Divide in Trinidad and Tobago, we administered the first survey of its magnitude on mobile, with inbuilt filters to ensure representativity and efficiency. We further automated the calculation of the Digital Divide Indices to give the stakeholders real-time visibility. I was still a very raw statistician, and this gave me exposure to both the design of big research studies and working with algorithms and various tools.
Data science is very hot at the moment. The field has received considerable media attention lately. How, in your opinion, have data science and big data changed the world?
Decisions across companies and governments are increasingly being made based on data and not only gut intuitions, experience notwithstanding. That for me is the biggest stride. It cannot be said enough that we have so much data with us, and it is only right that we use it to better the world. Data Scientists are changing the world. Algorithm by algorithm, model by model.
There are a lot of exciting things happening in the field, like Google open-sourcing its TensorFlow machine learning library, and other companies like IBM doing the same. What is exciting you most at the moment in the field? What problems look most promising to be solved using ML?
Personally, two fronts really excite me. One is the accuracy of ad targeting, seeing as sufficient data is already available. The second is the ability to better predict consumer behavior based on data from across platforms. For these, we’ll also see the move to more prescriptive outputs and more recommendations from the data. There are other fascinating fronts as well, like Baidu’s deep-learning system, which could rival people at speech recognition.
With all this attention and interest, 5 years from now, what do you think the DS field will look like?
SEXIER (quoting Hal Varian, Chief Economist at Google). We’ll take over the world. Seriously though, I think every single critical decision across fields and countries will be made based on data. And that will need Data Scientists. The likes of DJ Patil have already set this in motion, and we’ll reap generously from it going forward.
Has data science impacted your day to day life?
Yes – I now tend to approach problems differently due to the numerous tools at my disposal. The experience accrued from solving problems also ensures that the next problem is solved more efficiently, thus freeing up more time for innovation and skill improvement. Google, through their simple ML algorithms, have also made my life a bit easier.
We are very excited about your presentation. May I ask, what can we expect on 3rd March?
Coming from a statistical background, and still working in a heavily statistical environment, my main interest right now is the evident paradigm shift from traditional sample research and structured data to an approach that integrates these with big data. As this is the first meet-up, I will present my perspective on this, and then open it up for other points of view from across industries. This will set the direction for the meet-ups that follow.
My high school teacher always gave us reading assignments before her next class. Any homework for those that plan to attend to ensure the meeting is very interactive?
I would like attendees to think about the topics and areas they would like the Data Science meet-ups to cover, so that the organizers can source speakers with this in mind going forward. The discussions will get more technical, and we’ll have various subject matter experts making presentations. That is purely contingent on the recommendations of the attendees.
If you could give one piece of advice to your younger self about Machine Learning, what would you tell him?
Extend the statistical modeling concepts (regression, classification, PCA/FA, outlier detection etc.) to bigger data sets and use ML algorithms. That way I would not have to do it in retrospect, as I am doing now. Back then, the knowledge was still fresh and I had more time on my hands.
Advice to anyone who is interested in Data Science?
Learn something new every day – statistics and programming. It does not end there: put it into practice. There is sufficient material online (Coursera, Lynda etc.).
Have you watched Star Wars?
Yes. The Force Awakens was my first. Hopefully I am not too late to the Star Wars party.