Takeaway Sheet: Solving Tasks Related to Machine Learning

Practice

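# visualizing a correlation matrix as a heatmap
import seaborn as sns

cm = df.corr()  # pairwise correlations; df is assumed to contain only numeric columns
sns.heatmap(cm, annot=True, square=True)  # annot=True prints each value in its cell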
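A heatmap is easy to scan by eye, but it can also help to list the most strongly correlated feature pairs directly. The sketch below is one way to do that with pandas and NumPy only; the toy DataFrame and its column names `a`, `b`, `c` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 'b' is an exact multiple of 'a', 'c' is only weakly related
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.0],
    "c": [5.0, 1.0, 4.0, 2.0, 3.0],
})

cm = df.corr()  # Pearson correlation by default

# Keep only the upper triangle (k=1 skips the diagonal) so each pair appears once
mask = np.triu(np.ones(cm.shape, dtype=bool), k=1)
pairs = cm.where(mask).stack().dropna()

# Order the pairs by absolute correlation, strongest first
strongest = pairs.reindex(pairs.abs().sort_values(ascending=False).index)
print(strongest)
```

Sorting by absolute value keeps strong negative correlations near the top, which a sort on the raw values would push to the bottom.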

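# drawing a scatter plot for a pair of features
# (recent seaborn versions require x and y to be passed as keyword arguments)
sns.scatterplot(x=df['Feature 1'], y=df['Feature 2'])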

# replacing categories with numeric values (label encoding)
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()  # creating an instance of the LabelEncoder class
df['column'] = encoder.fit_transform(df['column'])  # using the encoder to transform strings into integer codes
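To see which number each category received, the fitted encoder's `classes_` attribute can be inspected, and `inverse_transform` maps the codes back to the original strings. The following is a minimal sketch; the `city` column and its values are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical example data
df = pd.DataFrame({"city": ["Paris", "Berlin", "Paris", "Rome"]})

encoder = LabelEncoder()
df["city_code"] = encoder.fit_transform(df["city"])

# classes_ holds the sorted unique labels; a label's position is its code
mapping = {str(label): int(code) for code, label in enumerate(encoder.classes_)}
print(mapping)

# Recover the original strings from the integer codes
restored = encoder.inverse_transform(df["city_code"])
print(list(restored))
```

The codes follow the sorted order of the labels, so they imply an ordering that may not be meaningful for the model. For unordered categories, one-hot encoding is often the safer choice.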

# transforming each categorical field into a set of binary ones (one-hot encoding)
import pandas as pd

df = pd.get_dummies(df)  # numeric columns are left unchanged
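As an illustration of what one-hot encoding produces, here is a small sketch on a made-up DataFrame. The column names `color` and `size` are hypothetical; `columns` and `dtype` are regular `pd.get_dummies` parameters, used here to pick the field explicitly and to get 0/1 integers.

```python
import pandas as pd

# Hypothetical example data: one categorical and one numeric column
df = pd.DataFrame({
    "color": ["red", "green", "blue", "green"],
    "size": [1, 2, 3, 4],
})

# Encode only the 'color' column; each category becomes its own 0/1 column
encoded = pd.get_dummies(df, columns=["color"], dtype=int)
print(encoded)
```

Every row has exactly one `1` among the new `color_*` columns, and the numeric `size` column passes through untouched.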

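# setting an optimization criterion for a model
# possible criterion values are listed in the scikit-learn documentation
from sklearn.ensemble import RandomForestRegressor

# 'absolute_error' is the mean absolute error; older scikit-learn versions called it 'mae'
model = RandomForestRegressor(criterion='absolute_error')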

# obtaining the coefficients (feature weights) of a fitted linear regression
feature_weights = model.coef_
# obtaining the intercept (the zeroth coefficient, or bias term)
weight_0 = model.intercept_
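To make the meaning of `coef_` and `intercept_` concrete, the sketch below fits `LinearRegression` to data generated from a known formula, y = 3·x1 − 2·x2 + 5, and checks that the fitted values recover it. The data is made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Noise-free data generated from y = 3*x1 - 2*x2 + 5
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]], dtype=float)
y = 3 * X[:, 0] - 2 * X[:, 1] + 5

model = LinearRegression()
model.fit(X, y)

feature_weights = model.coef_   # one weight per feature, in column order
weight_0 = model.intercept_     # the constant term

# A prediction is the intercept plus the weighted sum of the features
manual_prediction = weight_0 + X @ feature_weights
print(feature_weights, weight_0)
```

Because the data contains no noise, the fitted weights match the generating formula up to floating-point error, and `manual_prediction` agrees with `model.predict(X)`.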

# getting feature importances for decision trees, random forests, and gradient boosting
importances = model.feature_importances_
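The raw importances array is easier to read when paired with feature names and sorted. Below is a minimal sketch on synthetic data; the feature names `signal`, `weak`, and `noise` are hypothetical, and the target is built to depend mostly on the first one.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (assumption): the target is driven almost entirely by 'signal'
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 3)), columns=["signal", "weak", "noise"])
y = 10 * X["signal"] + 0.1 * X["weak"]

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X, y)

# Pair each importance with its column name and sort, most important first
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```

In scikit-learn, these impurity-based importances are normalized to sum to 1, so each value can be read as a share of the total.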