Python
Jupyter Notebook
Google Colaboratory
Tableau
NumPy for mathematical calculations and operations.
>> https://numpy.org/
Pandas for loading data, manipulating DataFrames, and preprocessing data with its various methods
>> https://pandas.pydata.org/
Matplotlib and Seaborn for various plots, e.g. bar charts, scatter plots, box plots
>> https://matplotlib.org/
>> https://seaborn.pydata.org/
scikit-learn (sklearn) for various algorithms
>> https://scikit-learn.org/
Linear Regression >> sklearn documentation:- https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
Logistic Regression >>sklearn documentation:- https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
Decision Trees
>> sklearn :- https://scikit-learn.org/stable/modules/tree.html
Random Forest
>> sklearn :-https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
AdaBoost
>> documentation:-
XGBoost
>> xgboost documentation:- https://xgboost.readthedocs.io/en/latest/
K-means
>>documentation:- https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
Hierarchical clustering
>> documentation:- https://scikit-learn.org/stable/modules/clustering.html#hierarchical-clustering
Principal Component Analysis >> documentation:- https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html
i. Loading data with the help of Pandas
ii. Preprocessing data
a. Checking for null values and imputing them
b. Checking for duplicates
c. Handling categorical columns that have too many categories
d. Understanding the data through EDA
iii. Splitting data 70/30 into training and test sets
iv. Scaling data wherever required
v. Model building
vi. Model evaluation using various metrics
a. For regression :- RMSE
b. For classification :- accuracy score or AUC-ROC score (for imbalanced data)
vii. Finalizing the model and stating conclusions in business terms
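The steps above can be sketched roughly as follows. This is a minimal sketch under assumptions, not the project's actual code: the synthetic DataFrame and its column names (`age`, `income`, `city`, `target`), the median imputation, the top-3 category grouping, and the choice of `LogisticRegression` are all illustrative stand-ins.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, mean_squared_error, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# i. Load data. A synthetic frame stands in for pd.read_csv(...) (assumption).
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "age": rng.normal(40, 10, n),
    "income": rng.normal(50_000, 15_000, n),
    "city": rng.choice(list("ABCDEF"), n, p=[0.4, 0.3, 0.2, 0.05, 0.03, 0.02]),
})
df["target"] = (df["age"] + rng.normal(0, 5, n) > 40).astype(int)
df.loc[df.index[:5], "income"] = np.nan               # inject missing values
df = pd.concat([df, df.iloc[:3]], ignore_index=True)  # inject duplicate rows

# ii.a Check nulls and impute (median is one possible choice).
print(df.isnull().sum())
df["income"] = df["income"].fillna(df["income"].median())

# ii.b Check and drop duplicates.
print("duplicates:", df.duplicated().sum())
df = df.drop_duplicates().reset_index(drop=True)

# ii.c Keep the 3 most frequent categories, group the rest as "Other".
top = df["city"].value_counts().nlargest(3).index
df["city"] = df["city"].where(df["city"].isin(top), "Other")

# One-hot encode the remaining categories.
X = pd.get_dummies(df.drop(columns="target"), columns=["city"],
                   drop_first=True, dtype=float)
y = df["target"]

# iii. 70/30 split (stratified so both sets keep the class balance).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# iv. Scale numeric columns; fit the scaler on the training set only.
num_cols = ["age", "income"]
scaler = StandardScaler().fit(X_train[num_cols])
X_train, X_test = X_train.copy(), X_test.copy()
X_train[num_cols] = scaler.transform(X_train[num_cols])
X_test[num_cols] = scaler.transform(X_test[num_cols])

# v. Model building.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# vi.b Classification metrics: accuracy and AUC-ROC.
acc = accuracy_score(y_test, model.predict(X_test))
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"accuracy={acc:.3f}  auc={auc:.3f}")

# vi.a Regression metric: RMSE, shown here on toy values.
y_true_reg = np.array([3.0, 5.0, 7.0])
y_pred_reg = np.array([2.0, 5.0, 9.0])
rmse = np.sqrt(mean_squared_error(y_true_reg, y_pred_reg))
print(f"rmse={rmse:.3f}")
```

Fitting the scaler on the training split only keeps test-set statistics out of the model, which is why scaling comes after the split in the list.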
Implementation of various time-series models, i.e. Holt-Winters, ARIMA, SARIMA, SARIMAX