Phase 2 Data Science Training - Microsoft Student Accelerator Program 2020

NZMSA/2020-Phase-2-Data-Science

MSA Data Phase 2

Welcome to Phase 2 - Data Science of the MSA program!

Phase 2 focuses on sentiment analysis. You will train your own classifier and also use prebuilt sentiment models.


What you will learn

  • Data exploration and preparation using the NLTK package
  • Sentiment analysis using the TextBlob library and VADER (available through NLTK)
  • Sentiment analysis using a recurrent neural network (RNN)
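
As a rough illustration of the lexicon-based idea behind prebuilt tools such as VADER, the sketch below scores a sentence by counting words from a tiny hand-made word list. The word lists and the function name `toy_sentiment` are hypothetical and are not part of NLTK, TextBlob, or the course notebooks. The real libraries use much larger scored lexicons plus extra rules, so treat this only as a way to see the basic mechanism.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only);
# real tools ship thousands of words with graded scores.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}


def toy_sentiment(text):
    """Return (positive hits - negative hits) / total hits, in [-1, 1]."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    hits = pos + neg
    if hits == 0:
        return 0.0  # no known sentiment words: treat as neutral
    return (pos - neg) / hits


if __name__ == "__main__":
    examples = [
        "I love this great course",
        "What a terrible, awful day",
        "The sky is blue",
    ]
    for sentence in examples:
        print(sentence, "->", toy_sentiment(sentence))
```

A word-counting scorer like this ignores negation ("not good") and intensity ("very good"), which is part of the motivation for using the prebuilt models or training your own classifier.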

Phase 2 Video Playlist

How to set up the project

For Windows

python -m venv venv
.\venv\Scripts\activate
pip install -r requirements.txt
jupyter notebook

For Mac

python3 -m venv venv
source ./venv/bin/activate
pip install -r requirements.txt
jupyter notebook

Feel free to contact Karim if you run into problems with the code for Mac! :)

If you are new to using a virtual environment for Python (venv), read more about it here.
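
If you are unsure whether the virtual environment is active before running `pip install`, a quick check like the one below may help. It is a minimal sketch that assumes the environment was created with `python -m venv` as in the commands above; the helper name `in_virtual_env` is made up for this example.

```python
import sys


def in_virtual_env():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Virtual environment active:", in_virtual_env())
```

If this prints `False`, activate the environment again with the activate command for your platform and rerun the check.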


Resources

These are OPTIONAL resources that will help you understand the content better.


Assessment (General outline - refer to the assessment specifications document for full details)

Submissions close at 8 AM on 18 September 2020.

  1. Compete in the Kaggle challenge here: https://www.kaggle.com/t/eade3863494042b8b7e051aaa9efabd3
    You will have to build your own model for this challenge.
  2. Develop a business case and extract data using either the Reddit API or web scraping techniques to solve it. You will need to perform general cleaning, exploration, and sentiment analysis in your attempt to solve it.
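
The "general cleaning" step for scraped or Reddit text could start from something like the sketch below. It uses only the Python standard library, and the specific steps (unescaping HTML entities, dropping links, lowercasing, removing punctuation) are assumptions about what a typical pipeline needs rather than requirements from the assessment specification. The name `clean_text` is hypothetical.

```python
import html
import re

URL_PATTERN = re.compile(r"https?://\S+")
NON_WORD_PATTERN = re.compile(r"[^a-z0-9\s']")


def clean_text(raw):
    """Normalise one scraped post or comment before exploration."""
    text = html.unescape(raw)               # turn entities such as &amp; back into characters
    text = URL_PATTERN.sub(" ", text)       # drop links
    text = text.lower()
    text = NON_WORD_PATTERN.sub(" ", text)  # drop punctuation and symbols, keep apostrophes
    return " ".join(text.split())           # collapse repeated whitespace


if __name__ == "__main__":
    sample = "Check THIS out: https://example.com &amp; enjoy!!"
    print(clean_text(sample))
```

Which steps to keep depends on your business case. For example, punctuation and capitalisation can carry sentiment, so you may prefer lighter cleaning before passing text to a prebuilt sentiment model.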

Stuck?

Post your question on our Facebook group or on our Discord server.

Want to contribute?

We welcome all students to help us improve the documentation for other students. If you find a typo or something that is unclear, please open a pull request or an issue and assign it to LindaBot 😀
