SI 618 - Data Manipulation and Analysis

This page contains my coursework from SI 618 (Winter 2022).

  • Programming language: Python
  • Frameworks / Libraries: Pandas, NumPy, Matplotlib, Seaborn, Plotly, SciPy, Scikit-learn, Statsmodels, NLTK, spaCy
  • Topics:
    • Dimension reduction: PCA (Principal Component Analysis), t-SNE
    • Clustering: K-means, Agglomerative clustering
    • Classification: K-NN, Linear SVM (Support Vector Machine), RBF (Radial Basis Function kernel) SVM, Gaussian process classifier, Decision tree, Random forest, Neural network, AdaBoost classifier, Gaussian naive Bayes classifier
    • Others: EDA (Exploratory Data Analysis), Data manipulation, Data visualization, Linear regression, NLP (Natural Language Processing)

Homework

1. Data Manipulation

Topic: Data manipulation, EDA (Exploratory Data Analysis)

Read More

2. More Data Manipulation

Topic: Data manipulation, EDA (Exploratory Data Analysis)

Read More

3. Data Visualization

Topic: Data visualization

Read More

4. Visualization, Correlation, and Linear Models

Topic: Data visualization, Correlation, Linear models

Read More

5. Natural Language Processing

Topic: NLP (Natural Language Processing)

Read More

6. Machine Learning 1: Linear regression, PCA, and Clustering

Topic: Linear Regression, PCA, Agglomerative clustering, K-means clustering, t-SNE

Read More

7. Machine Learning 2: Classification

Topic: K-NN, Linear SVM, RBF SVM, Gaussian process classifier, Decision tree classifier, Random forest classifier, Neural network, AdaBoost classifier, Gaussian naive Bayes classifier, PCA, t-SNE

Read More