My Journey to Data Science — Part II

Why the Career Change
Previously, I talked about my experience with data-driven technologies. As an engineer without any machine learning knowledge, these experiences made me realize how important data is. More importantly, they gave me an opportunity to re-evaluate my career plan at the time and the medical device industry in general. This ultimately led me to where I am right now.
I had a love-hate relationship with the medical device industry. First of all, I spent my undergrad in this field and dedicated an entire graduate degree to it. If that wasn’t passion, I don’t know what else it could be. Additionally, a medical device does not have to be life-supporting, but it is truly motivating to know that you are working on something that will enable or augment someone’s life in a very significant way. However, that also implies associated risks that companies need to be mindful of. These risks, depending on severity, need to be thoroughly mitigated and validated before commercialization. In the consumer products and electronics industries, development normally involves an iterative process where a prototype is put together in a reasonably short time window, tested and revised. Agile methodology in software development follows a similar structure to optimize software usability. In contrast, the development lifecycle of a medical device puts safety as the absolute priority and can last for a couple of years. The slow-paced setting eventually became tedious to me despite my enthusiasm for the industry. On top of that, I was dealing with a lot of quality control and regulatory tasks, which were indispensable but not the most interesting.
During a career counselling session with my dad, I recognized the growth limitation on the path that I had chosen, and the harsh truth that I did not enjoy my work. But I was still thankful for the opportunity as well as all the wonderful people I met. I learned many valuable skills that I can bring to my future workplaces. The conversation also made me realize that I had a desire to do more programming and data-oriented work. This self-reflection, in addition to my fascination with data-driven technologies, pointed me to Data Science.
One thing I want to clarify is that my decision to become a Data Scientist was not a farewell to the healthcare industry. To me, Data Science was merely a ‘tool’ that could be applied to any business or domain as long as it’s done ‘properly’. While I’m still interested in health, I would prefer working on projects that involve biometric data rather than conventional electrically or mechanically operated medical devices. And of course, I also want to explore Data Science applications in other industries.
Self-directed Learning
I started this transition in March of 2019 and I had a rough plan. The plan was simple and probably identical to that of most autodidacts: read books, take courses online and build a portfolio. I knew there were quite a few knowledge gaps that needed to be filled. For instance, I learned that coding for data analytics requires a different skillset, even though I had prior experience in Python and MATLAB. So I first dabbled with the book Python for Data Analysis to learn how to use the pandas package for data manipulation. This book was written by Wes McKinney, the creator of pandas. He covered pretty much all the functions that pandas had along with tons of detailed examples, so it is definitely an easy recommendation. I also didn’t feel confident with my knowledge of statistics, so I went for Naked Statistics: Stripping the Dread from the Data by Charles Wheelan. It was a good introduction to key statistical concepts such as hypothesis testing and confidence intervals, and it presented them in an easy and relatable way. However, it was neither theoretical nor applied, which made it not so practical. Another book I really enjoyed was Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking by Foster Provost and Tom Fawcett. This book unveiled the underlying concepts behind data science and delved deep into data-driven mindsets with real-world business problems such as customer churn. The book wasn’t super technical, but there were enough principles and algorithms to keep my brain actively thinking. I highly recommend it to anyone who’s interested in Data Science.
While books are great sources of information, there were certain topics, such as machine learning, that would be better learned from an instructor. That was my reason for taking courses on Coursera alongside everything else I described. Of course, I began with Andrew Ng’s Machine Learning course, which has become the ‘standard’ for any aspiring Data Scientist. The course contained detailed walkthroughs of many classic ML algorithms as well as advice on building an ML system. The hands-on exercises gave me a chance to implement these algorithms from scratch, which further strengthened my understanding. I also took the IBM Data Science Professional Certificate specialization, which is more applied and allowed me to practice using packages like scikit-learn for Exploratory Data Analysis, model fitting and making predictions. This specialization also taught me the basics of relational databases and how to effectively query data from a database. To get myself started in the world of neural nets, I took the Deep Learning Specialization, which was also taught by Dr. Andrew Ng. Andrew did another fantastic job of demystifying CNNs and RNNs. This specialization was more advanced than the Machine Learning course and required a solid foundation in linear algebra. But it gave me a robust understanding of different types of neural networks as well as various optimization strategies. I did take some more courses, such as the TensorFlow Developer specialization, but these to me were the most valuable courses for building my knowledge base. In conclusion, they all offered me skills that are crucial to a Data Scientist, allowed me to secure opportunities that I would otherwise not have been able to get, and gave me a head start in my current graduate study.
It’s hard to condense months of learning into just two paragraphs, but I hope what I have here is useful for anyone who’s interested in the books and courses that I mentioned.
Stay tuned for the next part :)