Countries Visited India, US, UAE, Australia, Singapore, New Zealand, Japan, Costa Rica, Taiwan, Dominican Republic, Mexico, Ireland
Data Science in Production A lot goes into developing a model: data cleaning, analysis, statistics, modeling, accuracy analysis, and more. Even with all that, developing a model is just the tip of the iceberg when it comes to delivering a machine learning solution in production. I spoke at OSCON 2018
Serving Athletes* with Personalized Workout Recommendations My post on the Workout Recommender system is now up on the Nike Engineering blog: https://medium.com/nikeengineering/serving-athletes-with-personalized-workout-recommendations-285491eabc3d The post provides insight into the science and engineering behind the machine learning system that powers the personalized recommendations in the Nike Training Club app. Check it out and share your
Tuning Spark Jobs on EMR with YARN - Lessons Learnt Apache Spark is a distributed processing system that can process data at a very large scale. Even though Spark's memory model is optimized to handle large amounts of data, it is not magic, and several settings can help you get the most out of your cluster. I
Is She Even a Developer? It has been 6 years since I overheard this question/comment about me. My co-workers were reviewing my pull request. One of them got frustrated and thought that it was appropriate to discuss the authenticity of my Computer Science background. I overheard his comment and also the laughter of the
Cross-Account S3 bucket settings for data transfer on Hadoop based systems While trying to write data from one AWS account to another, I ran into several issues with cross-account S3 settings. My Google searches turned up little, so I am documenting it here in case somebody else runs into this. Problem Account 1 (let's call it Dumbledore) has an S3
Strata + Hadoop 2016 Conference in NY This past September, Nike approved my request to attend the Strata + Hadoop Conference [http://conferences.oreilly.com/strata/hadoop-big-data-ny] in New York. Strata is one of the biggest conferences for engineers interested in Machine Learning and Big Data. The three-day conference covered various workshops and sessions