Since 2019, interest in Data Science education and accreditation has skyrocketed. The Google Trends comparison of two search terms, "Data Science course" vs "Data Science certificate", shows that many people are looking to get formal training in Data Science.

Generally speaking, a technical certification will be somewhat attractive on a CV, but a certification alone will not secure you a role. Most Data Science interviews include at least one technical test and multiple discussions. Some interviewers might even question you more closely on the topics covered by the certification.

To boost my confidence in my Data Science skills, I also decided to pursue a Data Science certification. I did my “Google research” and was pleasantly surprised by the results: the Dell EMC program ranked high in searches for top Data Science certifications. For me, as a Dell employee, this meant I could access multiple learning materials to prepare for the exam.

Structure

Dell offers a two-level Data Science certification: Associate and Specialist. The Associate level exam consists of 60 questions, and you have 90 minutes to answer them.
The minimum score to pass the exam is 63, and the topics assessed are:

MapReduce (15%): MapReduce framework and its implementation in Hadoop; Hadoop Distributed File System (HDFS); Yet Another Resource Negotiator (YARN)
Hadoop Ecosystem and NoSQL (15%): Pig; Hive; NoSQL; HBase; Spark
Natural Language Processing (NLP) (20%): NLP and the four main categories of ambiguity; Text Preprocessing; Language Modeling
Social Network Analysis (SNA) (23%): SNA and Graph Theory; Communities; Network Problems and SNA Tools
Data Science Theory and Methods (15%): Simulation; Random Forests; Multinomial Logistic Regression and Maximum Entropy
Data Visualization (12%): Perception and Visualization; Visualization of Multivariate Data

I took my Associate level exam recently (in January 2022) and I am currently studying for the Specialist level, so it is an ideal time to write about my learning and exam experiences.

Learning

The official website page has the exam and course info: details about the On Demand classes they offer, the exam link, and practice tests. You can also find more sample questions and additional online practice tests.

The Data Science and Big Data Analytics course prepares you for the Data Scientist Associate v2 (DCA-DS) certification. Once you pass the exam, you receive a Dell Technologies Certified Associate (DCA-DS) certification.

Why is the Data Scientist Associate v2 (DCA-DS) a good certification for a junior data scientist?

Going through the topics included in the material will give you a good foundation in data science terminology. It gives an intro to what big data is, the most basic algorithms, and an understanding of the responsibilities of a Data Scientist and the data science lifecycle. Learning all of this will enable immediate and effective participation in big data and other analytics projects.
You’ll get hands-on with Hadoop (including Pig, Hive, and HBase), Natural Language Processing, Social Network Analysis, Simulation, Random Forests, Multinomial Logistic Regression, and Data Visualization. The labs will prepare you to do data processing, apply algorithms, and run data visualization in R.

It will empower you to keep studying and move on to the next level certificate, as the DCA-DS certification is a prerequisite for DCS-DS. The Advanced Methods in Data Science and Big Data Analytics course prepares you for the Specialist – Data Scientist, Advanced Analytics Version 1.0 (DCS-DS) certification.

(Note: I participate in the Amazon affiliate program. This post may contain affiliate links from Amazon or other publishers I trust (at no extra cost to you). I may receive a small commission when you buy using my links; this helps to keep the blog alive! See disclosure for details.)

If you can’t sit in a class for a full week (8 hours a day), you can study for the exam at your own pace. Dell EMC published a book to help you prepare for the exam. It is rated very highly and is now discounted on Amazon. When are you getting one of these?

If this is not motivational enough, there is an interesting TED Talk on the influence of social networks (one of the topics of the course / exam) that shows why Data Science is so cool.

This is a personal blog. My opinion on what I share with you is that “All models are wrong, but some are useful”. Improve the accuracy of any model I present and make it useful!
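Coming back to the Associate exam structure described above (60 questions, 90 minutes, and the topic weights as listed), here is a small Python sketch I would use as a rough planning aid. It assumes that questions are spread across topics in proportion to the published weights, which is my assumption and not something the exam guide promises. The function name `study_plan` is just something I made up for illustration.

```python
# Exam figures quoted in the post
TOTAL_QUESTIONS = 60
TOTAL_MINUTES = 90

# Topic weights in percent, as listed in the post
TOPIC_WEIGHTS = {
    "MapReduce": 15,
    "Hadoop Ecosystem and NoSQL": 15,
    "Natural Language Processing": 20,
    "Social Network Analysis": 23,
    "Data Science Theory and Methods": 15,
    "Data Visualization": 12,
}


def study_plan(weights, total_questions, total_minutes):
    """Estimate questions and exam minutes per topic.

    Assumption: questions are distributed proportionally to the weights.
    """
    minutes_per_question = total_minutes / total_questions
    plan = {}
    for topic, pct in weights.items():
        expected_questions = total_questions * pct / 100
        plan[topic] = {
            "expected_questions": round(expected_questions, 1),
            "exam_minutes": round(expected_questions * minutes_per_question, 1),
        }
    return plan


if __name__ == "__main__":
    # Sanity check: the listed weights should cover the whole exam
    assert sum(TOPIC_WEIGHTS.values()) == 100
    print(f"Minutes per question: {TOTAL_MINUTES / TOTAL_QUESTIONS}")
    plan = study_plan(TOPIC_WEIGHTS, TOTAL_QUESTIONS, TOTAL_MINUTES)
    for topic, row in plan.items():
        print(f"{topic}: ~{row['expected_questions']} questions, "
              f"~{row['exam_minutes']} minutes")
```

The takeaway for me was simple: you get about a minute and a half per question, and Social Network Analysis plus NLP together make up a bit over 40% of the exam, so that is where I would spend most of my study time.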
I guess you’re already familiar with R and have a sexy Data Scientist job, but you’ve heard the cool kids in the industry are also using Python. I said "also" because, as you can see from people’s job titles, Reddit, LinkedIn and so on, there are plenty of Data Scientists who use both. You have probably figured out that I’m an R enthusiast myself. I realized a long time ago that I also have to learn Python (I wanna be a cool kid as well). I struggled at first. Why? My current and previous roles were very much R based. Nobody used Python.

1st try: I knew it would be easier to move in this direction together with the team, so I proposed that all 5 of us (Data Scientists) on the team take a Python course. One colleague was on board. In the summer of 2018 we sat in on an in-class Python training. The training wasn’t too bad, but it was general Python programming, not Data Science specific. To get the diploma we had to do an assignment, so we did a Data Science one and started discovering a bit of Python for Data Science. I can tell you that it was a bit painful; using pandas was just weird. A Reddit post I came across describes the experience well. This is pretty much the whole 2018 Python experience that I had. And I stopped.

2nd try: In 2019 I went on maternity leave to take care of Baby B (my firstborn). I felt like it was the perfect time to really learn Python for Data Science and put it on my CV. I bought a bunch of books.

(Note: I participate in affiliate programs. This post may contain affiliate links from Amazon or other publishers I trust (at no extra cost to you). I may receive a small commission when you buy using my links; this helps to keep the blog alive! See disclosure for details.)

I also enrolled in a number of really good Pluralsight courses. Pluralsight is golden for a Data Scientist / ML Engineer etc.:
Doing Data Science with Python
Understanding Machine Learning with Python
Pandas Fundamentals
Exploring Web Scraping with Python

I would say that in 2019 I got a good grasp of how Python works (compared to R):

Python is a general programming language that also knows how to do what R does best 😜 (data wrangling, engineering, feature selection, etc.), and pandas is trying to copy dplyr;
There are many IDEs where you can write and run Python code, and the majority use Jupyter Notebook (yuck 🤢);
It’s a mess if you don’t know how to use a specific Python version and virtual environments;
Python is pretty strong on deep learning, deployment and production of models.

3rd try: In 2021, I went back to work and jumped on the first Python project in the team. In the meantime, another Data Scientist colleague had switched to Python and built a robust neural network project for fraud. It was the perfect opportunity for me to put my new skills to the test and see if my efforts in learning Python were paying off. They were: I was able to validate the Fraud Project and give some recommendations. For every big project we have a Development and a Validation part, so that at least 2 Data Scientists are involved in each project. Was I able to develop my own project in Python? No, because I went on maternity leave (again 😅).

4th try: We’re now in 2022 and I think I found something that is super helpful in switching to Python: RStudio is now home for both R and Python (hurray!!). How cool is this? When I found out, I jumped straight into updating my RStudio to be able to run Python. For the initial setup, just follow the official RStudio guidelines. Do I think this helped? Oh yeah, it helped me a lot. So much so that I even decided to start blogging about my mummy & Data Science experience.
Python does not look so scary anymore. One major difference is that now you have to “pip install” your libraries in the Terminal, not in the Console / Script the way you would do it in R. Not a big deal. So, RStudio as the IDE + a personal project = the way to go. You know what they don’t say: 4th time is a charm :). I can thank RStudio for making me fall in love with Python. Now I’m trying to learn OOP for Data Science. I’ll update you soon!
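Since the “pip install” in the Terminal and the Python version / virtual environment mess were the two things that tripped me up, here is a minimal sketch of how I would check what is going on before installing anything. It only uses the standard library. The helper names (`in_virtual_env`, `missing_packages`, `pip_command`) are made up for illustration, and the package names at the bottom are just examples. It also assumes the Python you run this with is the same interpreter your IDE is configured to use, which is worth double-checking.

```python
import importlib.util
import sys


def in_virtual_env():
    """Return True when running inside a virtual environment.

    In a venv, sys.prefix points at the environment while
    sys.base_prefix still points at the base Python install.
    """
    return sys.prefix != sys.base_prefix


def missing_packages(import_names):
    """Return the top-level import names that cannot be found."""
    return [name for name in import_names
            if importlib.util.find_spec(name) is None]


def pip_command(package_names):
    """Build a pip command tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", *package_names]


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "at", sys.executable)
    print("Inside a virtual environment:", in_virtual_env())

    # Example import names only; swap in whatever your project needs
    missing = missing_packages(["pandas", "numpy"])
    if missing:
        print("Run this in the Terminal:")
        print(" ".join(pip_command(missing)))
    else:
        print("Everything is already installed.")
```

Running pip as `python -m pip` through `sys.executable` means the libraries land in the same Python that printed the message, which avoids the classic "I installed it but it still says module not found" moment. One caveat: the import name and the pip package name are not always the same (for example, you import `sklearn` but install `scikit-learn`), so treat the printed command as a starting point.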