SI 618 - Data Manipulation and Analysis
This page contains my coursework from SI 618 (Winter 2022).
- Programming language: Python
- Frameworks / Libraries: Pandas, NumPy, Matplotlib, Seaborn, Plotly, SciPy, Scikit-learn, Statsmodels, NLTK, spaCy
- Topics:
- Dimension reduction: PCA (Principal Component Analysis), t-SNE
- Clustering: K-means, Agglomerative clustering
- Classification: K-NN, Linear SVM (Support Vector Machine), RBF (Radial Basis Function) kernel SVM, Gaussian process classifier, Decision tree, Random forest, Neural network, AdaBoost classifier, Gaussian naive Bayes classifier
- Others: EDA (Exploratory Data Analysis), Data manipulation, Data visualization, Linear regression, NLP (Natural Language Processing)
Homework
4. Visualization, Correlation, and Linear Models
Topic: Data visualization, Correlation, Linear models
6. Machine Learning 1: Linear regression, PCA, and Clustering
Topic: Linear Regression, PCA, Agglomerative clustering, K-means clustering, t-SNE
7. Machine Learning 2: Classification
Topic: K-NN, Linear SVM, RBF SVM, Gaussian process classifier, Decision tree classifier, Random forest classifier, Neural network, AdaBoost classifier, Gaussian naive Bayes classifier, PCA, t-SNE