Knowledge Base

Anomaly Detection

Glossary

Anomalies (outliers) are observations with abnormal properties, i.e. those that deviate from the normal trend. Outliers may indicate a problem in the data or that something out of the ordinary has happened.

Practice

# Obtaining the list of outliers from a column
import matplotlib.pyplot as plt
# Fliers are the points drawn beyond the boxplot whiskers
boxplot = plt.boxplot(df['column'].values)
outliers = list(boxplot["fliers"][0].get_data()[1])
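
The fliers returned by the boxplot follow the usual 1.5 × IQR rule (matplotlib's default `whis=1.5`), so the same list can be obtained without drawing a plot. The sketch below is a minimal illustration, assuming NumPy is available; the helper name `iqr_outliers` and the sample values are hypothetical and not part of the original page.

```python
import numpy as np

def iqr_outliers(values, whis=1.5):
    """Return values lying outside [Q1 - whis*IQR, Q3 + whis*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - whis * iqr, q3 + whis * iqr
    # Keep only the observations beyond either fence
    mask = (values < lower) | (values > upper)
    return values[mask].tolist()

# Hypothetical sample: one value is far from the rest
sample = [10, 12, 11, 13, 12, 11, 10, 95]
outliers = iqr_outliers(sample)
print(outliers)
```

For a DataFrame column, the same helper could be called as `iqr_outliers(df['column'].values)`.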

# Training the isolation forest for anomaly detection
from sklearn.ensemble import IsolationForest
isolation_forest = IsolationForest(n_estimators=100)
isolation_forest.fit(data)

# Obtaining anomaly scores.
# With the default settings, values range from -0.5 to 0.5. A lower score indicates a higher chance that the observation is an outlier; negative scores correspond to outliers.
anomaly_scores = isolation_forest.decision_function(data)

# Obtaining anomaly predictions: -1 - outlier, 1 - normal observation
predictions = isolation_forest.predict(data)

# Training and obtaining predictions in one step
predictions = isolation_forest.fit_predict(data)
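
To see how the scores and the predictions relate, the following sketch runs the isolation forest on synthetic data. It is an illustration under assumptions: the data, the two hand-placed extreme points, and `random_state=0` are invented for the example, and the default `contamination="auto"` setting is used.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data: a 2-D Gaussian cloud plus two hand-placed extreme points
rng = np.random.default_rng(0)
inliers = rng.normal(0, 1, size=(200, 2))
extremes = np.array([[8.0, 8.0], [-9.0, 7.5]])
data = np.vstack([inliers, extremes])

forest = IsolationForest(n_estimators=100, random_state=0).fit(data)
scores = forest.decision_function(data)   # lower means more anomalous
labels = forest.predict(data)             # -1 for outliers, 1 for normal

# Observations with a negative score are exactly those labelled -1
flagged = data[labels == -1]
print(len(flagged), scores[-2:])
```

The two extreme points sit at the end of `data`, so their scores and labels are the last two entries of `scores` and `labels`.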

# KNN-based anomaly detection method

from pyod.models.knn import KNN
model = KNN()
model.fit(data)

# Anomaly prediction: 1 - outlier, 0 - normal observation
predictions = model.predict(data)
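
The idea behind a KNN-based detector can be sketched with NumPy alone: score each observation by the distance to its k-th nearest neighbour and flag the highest-scoring fraction. This is a simplified illustration, not PyOD's implementation; the function name `knn_outlier_labels`, the defaults `n_neighbors=5` and `contamination=0.1`, and the sample grid are assumptions made for the example.

```python
import numpy as np

def knn_outlier_labels(data, n_neighbors=5, contamination=0.1):
    """Score points by distance to their k-th nearest neighbour."""
    data = np.asarray(data, dtype=float)
    # Pairwise Euclidean distances between all observations
    diffs = data[:, None, :] - data[None, :, :]
    distances = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero distance to itself
    sorted_distances = np.sort(distances, axis=1)
    scores = sorted_distances[:, n_neighbors]
    # Flag scores above the (1 - contamination) percentile
    threshold = np.percentile(scores, 100 * (1 - contamination))
    labels = (scores > threshold).astype(int)  # 1 - outlier, 0 - normal
    return scores, labels

# Hypothetical sample: a 5x4 grid of points plus one distant point
grid = [[x, y] for x in range(5) for y in range(4)]
points = np.array(grid + [[50, 50]])
scores, labels = knn_outlier_labels(points)
print(labels)
```

The pairwise distance matrix grows quadratically with the number of observations, so this sketch is only suitable for small datasets.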