Takeaway Sheet: Solving Tasks Related to Machine Learning
Practice
# visualizing a correlation matrix
import seaborn as sns
cm = df.corr()
sns.heatmap(cm, annot=True, square=True)
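# plotting one feature against another (scatter plot)
# recent seaborn versions require x and y to be passed as keyword arguments
sns.scatterplot(x=df['Feature 1'], y=df['Feature 2'])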
# replacing categories with numeric values (label encoding)
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()  # creating an instance of the LabelEncoder class
df['column'] = encoder.fit_transform(df['column'])  # using the encoder to transform strings into numbers
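As a minimal sketch of what label encoding produces, the example below uses a hypothetical `city` column (the data and column name are assumptions, not part of the sheet). LabelEncoder assigns integer codes to the sorted unique values and keeps the mapping in `classes_`.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data, for illustration only
df = pd.DataFrame({"city": ["Paris", "Rome", "Paris"]})

encoder = LabelEncoder()
codes = encoder.fit_transform(df["city"])  # one integer code per row

# classes_ holds the original labels; the index of each label is its code
labels = list(encoder.classes_)
print(list(codes), labels)

# inverse_transform maps the codes back to the original strings
restored = list(encoder.inverse_transform(codes))
```

The integer codes imply an order between categories, which can mislead linear models; one-hot encoding avoids that.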
# transforming a categorical field into a set of binary ones (one-hot encoding)
import pandas as pd
df = pd.get_dummies(df)
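A small sketch of one-hot encoding on hypothetical data (the `city` and `price` columns are assumptions made for illustration). It shows that `pd.get_dummies` expands string columns into indicator columns and leaves numeric columns as they are.

```python
import pandas as pd

# Hypothetical toy data, for illustration only
df = pd.DataFrame({"city": ["Paris", "Rome", "Paris"], "price": [10, 20, 30]})

# Each distinct value of the string column becomes its own indicator column;
# the numeric "price" column passes through unchanged
encoded = pd.get_dummies(df)

columns = sorted(encoded.columns)
print(columns)
```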
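# setting an optimization criterion for a model
# possible criterion values are listed in the documentation (e.g. 'squared_error', 'absolute_error')
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor(criterion='absolute_error')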
# obtaining the coefficients of a fitted linear regression
feature_weights = model.coef_
# obtaining the intercept (the zeroth coefficient)
weight_0 = model.intercept_
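To see how `coef_` and `intercept_` relate to the fitted line, here is a sketch on hypothetical data generated exactly from y = 2*x1 + 3*x2 + 1 (the data are an assumption chosen so the expected weights are known in advance).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data generated from y = 2*x1 + 3*x2 + 1, with no noise
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]], dtype=float)
y = 2 * X[:, 0] + 3 * X[:, 1] + 1

model = LinearRegression()
model.fit(X, y)

feature_weights = model.coef_  # one weight per feature, close to [2, 3]
weight_0 = model.intercept_    # the constant term, close to 1
print(feature_weights, weight_0)
```

Both attributes exist only after `fit` has been called.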
# getting feature importances for decision trees, random forests, and gradient boosting
importances = model.feature_importances_
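The following sketch illustrates reading feature importances from a random forest. The synthetic data are an assumption: the target depends only on the first feature, so that feature should receive nearly all of the importance.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical synthetic data: the target depends only on the first feature
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = 10 * X[:, 0]

model = RandomForestRegressor(n_estimators=20, random_state=0)
model.fit(X, y)

# One non-negative value per feature; the values sum to 1
importances = model.feature_importances_
print(importances)
```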