My Journey to Data Science — Part III

Michael Tang
5 min read · Oct 6, 2020


So far I have talked about my motivation for becoming a Data Scientist, as well as my study plan. In this final part, I will cover my reasons for pursuing another master’s degree.

If you have been following my last two posts in this series, you might think that everything was looking great with this self-guided experience, so why school all of a sudden? Well, it sure had been a fruitful journey for me and I could confidently say that I was more knowledgeable than a non-technical person, but it simply wasn’t enough.

Problems with Self-directed Learning

It was already December of 2019 when I did a gap analysis on the job market to gauge what specific skills companies were looking for. I realized that I had two options. The first option would be to apply for Data Scientist positions directly, which was obviously super competitive, especially for someone without a relevant background. Alternatively, I could go for an analyst position and use it as a stepping stone for further advancement. This route did not require an extensive knowledge base in modelling and potentially had a lower barrier to entry. (P.S. I actually did get offered a Data Analyst position at the beginning of this year.) However, this could be another detour and might not justify the opportunity cost, depending on the position. In either case, I had neither enough experience nor enough work to showcase my ability. Therefore, I started working on a portfolio. And this was the point where I realized some pitfalls of my original strategy:

1. Knowledge Gaps

My self-directed journey definitely filled some gaps, but there were still questions that I didn’t realize I had until I started working on some of these projects. Though these questions mostly revolved around feature engineering, they were closely related to statistics. For instance, one critical step in producing an accurate and interpretable predictive model is to select or derive features that would optimize the result during training. The least scientific way would be trial and error. And while pairwise correlation might give us an idea of which features convey ‘similar’ information, it simply doesn’t tell us which one to keep. Of course I could take another course or just look for some online tutorials, but these resources were not comprehensive enough to give me a solid understanding of the statistical workflows that would allow me to apply them to real-world problems. This was one of the many technical challenges that could be addressed only through systematic training.
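To make the pairwise-correlation point concrete, here is a minimal sketch in plain Python. The feature names, the toy numbers, and the 0.95 threshold are all made up for illustration, and comparing each feature with the target is just one simple heuristic, not a recommended workflow.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: two size measurements of the same homes, plus age.
features = {
    "area_sqft": [800, 950, 1100, 1300, 1500, 1800],
    "area_sqm": [74, 88, 103, 121, 139, 167],
    "age_years": [30, 5, 22, 12, 40, 8],
}
# Hypothetical target (price in thousands).
price = [160, 185, 230, 255, 310, 350]


def redundant_pairs(features, threshold=0.95):
    """Return feature pairs whose absolute correlation meets the threshold."""
    names = list(features)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Step 1: pairwise correlation flags which features overlap...
pairs = redundant_pairs(features)
for a, b, r in pairs:
    print(f"{a} and {b} look redundant (r = {r:.3f})")

# Step 2: ...but choosing between them needs extra information,
# for example how strongly each one relates to the target.
target_corr = {name: pearson(vals, price) for name, vals in features.items()}
for name, r in target_corr.items():
    print(f"{name} vs price: r = {r:.3f}")
```

In this toy data the two area features relate to the price almost equally, so even the second step does not settle which one to keep. That is exactly the kind of question where a more systematic statistical workflow is needed.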

2. Lack of a Professional Credential

Last Christmas, I had an opportunity to speak with some people in the industry. One of them was a Data Scientist at Amazon who had a master’s degree in Data Science from New York University. According to her, because of the recent hype in Data Science, tech companies are pickier about who they hire. Having a professional degree from an accredited school speaks to your capability and differentiates you from other candidates who are just studying on their own. There were successful self-taught cases, but they were very rare. So she highly recommended a master’s degree in data science. I also spoke with my cousin, who’s currently a post-doc in neuroscience. He told me stories of his PhD lab mates who quit the research lab and became Data Scientists instead. But these people had years of experience working with data, conducting statistical analysis and building advanced neural networks, which made them ideal candidates for tech giants like Google. In the end, he gave me the same recommendation: get some training at a good school.

3. Limited Resources

The opportunities and resources that a university can offer are probably among its greatest assets, and I would argue that they are on a par with the knowledge and technical skills offered by lectures. It is true that LinkedIn has made professional networking more accessible to everyone and I could just reach out to people with ‘enthusiastic’ messages, but that still cannot compare to the relationships that I could build with my cohort, professors, supervisors, and alumni in an academic setting. These are the people that I will be interacting with on a regular basis, and those interactions are more likely to turn into mutually beneficial relationships. Last but not least, finding an internship is generally easier than finding a full-time position, and internships are usually open only to students. So having that ‘student’ status really grants more professional opportunities.

These were the 3 main factors that motivated me to go back to school. And I was fortunate enough to get the support of the people I love and be admitted to the MIDS (Master in Interdisciplinary Data Science) program at Duke University.

About MIDS at Duke University

So how’s school so far? Well, it’s only been a little over a month, so I can’t comment much. Plus, the learning experience has completely changed since pretty much everything went virtual due to COVID-19. I can’t speak for other programs or other schools, but MIDS has done a wonderful job in making sure that this transition is as smooth as possible and that we get all the resources we need. One advantage of having a small cohort is that it makes it really easy for us to have 1-on-1 interactions with professors, and for us students to build stronger bonds. So my verdict is that I have been enjoying the program so far and I’ve definitely learned a lot. There are courses such as Data Representation that addressed many problems that I had in the past and taught me how to interpret a model, which is sometimes more important than making predictions. And this goes back to the question of why I started writing on Medium: I want to use it to review some concepts that I have learned so far, and explain them in the simplest way for readers who share the same confusion.


That concludes my journey so far. I have made some sacrifices but I have also gained a lot. This journey will continue and it will not be an easy one, especially with the pandemic, but I will cope with it in the best way possible. As I mentioned, I will use Medium as a way to consolidate my knowledge and share more stories. I hope others will find my posts useful and inspiring as well.




Michael Tang

M.S. in Data Science candidate, 2022 @ Duke University | Biomedical Engineer | Workout Enthusiast