Time Series Analysis
Glossary
Resampling: changing the interval between the values of the series. It is performed in two steps:
- Choose the new interval length. The values that fall within each new interval are grouped together.
- In each group, an aggregated value of the series is calculated, for example the median, mean, maximum, or minimum.
Rolling mean/moving average: a method of smoothing the data in a time series. Each value is replaced by the arithmetic mean of a fixed-size window of neighboring values, because the mean is less susceptible to fluctuations than the individual values.
Seasonality: cyclically repeating patterns in a time series.
Stochastic process: a random variable whose distribution can change over time.
- A stochastic process is stationary if its distribution does not change over time.
- If the distribution does change, then the stochastic process is nonstationary.
Time series: a sequence of numbers along the time axis. The interval between the values of the series is constant.
Time series differences: a set of differences between neighboring elements of a time series, that is, the previous value is subtracted from each value.
Trend: a smooth change of the mean value of the series without repeating patterns.
Practice
# parsing the dates and using them as the index
# index_col: a list of column numbers or column names
# parse_dates: a list of column numbers or column names
import pandas as pd

data = pd.read_csv('filename.csv', index_col=[0], parse_dates=[0])
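# checking that the index is monotonic (sorted in chronological order)
# is_monotonic was removed in pandas 2.0; is_monotonic_increasing replaces it
print(data.index.is_monotonic_increasing)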
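If the check returns False, the index is usually sorted before any resampling or rolling operation. The sketch below assumes a small hypothetical table (the names `unsorted` and `sorted_data` and the values are made up for illustration) and uses `sort_index()` to restore chronological order.

```python
import pandas as pd

# Hypothetical data with dates out of chronological order
index = pd.to_datetime(['2024-01-03', '2024-01-01', '2024-01-02'])
unsorted = pd.DataFrame({'value': [3, 1, 2]}, index=index)

print(unsorted.index.is_monotonic_increasing)  # False

# Sort the rows by the index so that the dates go in order
sorted_data = unsorted.sort_index()

print(sorted_data.index.is_monotonic_increasing)  # True
```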
# resampling - mean for each hour
data.resample('1H').mean()

# resampling - maximum for each two weeks
data.resample('2W').max()
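To make the two steps of resampling concrete, here is a minimal sketch on hypothetical data (a made-up series with one value every 10 minutes). The values are grouped into 60-minute intervals and each group is then aggregated.

```python
import pandas as pd

# Hypothetical series: values 0..11, one every 10 minutes for two hours
index = pd.date_range('2024-01-01 00:00', periods=12, freq='10min')
series = pd.Series(range(12), index=index)

# Step 1: group the values into 60-minute intervals
# Step 2: aggregate each group, here with the mean and the maximum
hourly_mean = series.resample('60min').mean()
hourly_max = series.resample('60min').max()

print(hourly_mean)  # two values: 2.5 for the first hour, 8.5 for the second
print(hourly_max)   # two values: 5 and 11
```

Each of the two resulting values summarizes six of the original values, so the new series is shorter and less detailed than the original one.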
# rolling mean with window size = 7
data.rolling(7).mean()
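A small worked example may help to see what the rolling mean returns. The sketch below uses a short hypothetical series and a window of 3 instead of 7 so that the arithmetic is easy to follow by hand.

```python
import pandas as pd

# Hypothetical values with visible fluctuations
series = pd.Series([1, 5, 3, 7, 5, 9])

# Each result is the mean of the current value and the two before it
smoothed = series.rolling(3).mean()

# The first two results are NaN because the window is not yet full.
# The rest are (1+5+3)/3 = 3, (5+3+7)/3 = 5, (3+7+5)/3 = 5, (7+5+9)/3 = 7
print(smoothed)
```

The original values jump up and down by several units, while the smoothed values change much more gradually.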
# decomposing the time series into trend, seasonality, and residuals
from statsmodels.tsa.seasonal import seasonal_decompose

decomposed = seasonal_decompose(data)

decomposed.trend # trend
decomposed.seasonal # seasonality
decomposed.resid # residuals
# shifting the values by one step and filling the empty first value with zero
print(data.shift(fill_value=0))
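The shift is also a convenient way to compute the time series differences defined in the glossary. The sketch below assumes a short hypothetical daily series. It subtracts the shifted series from the original one and compares the result with the built-in `diff()` method.

```python
import pandas as pd

# Hypothetical daily series
index = pd.date_range('2024-01-01', periods=5, freq='D')
series = pd.Series([10, 12, 15, 19, 24], index=index)

# Subtract the previous value from each value.
# Without fill_value, the first shifted value is NaN, so the first difference is NaN too.
differences = series - series.shift()

# pandas offers the same calculation as a single method
same_differences = series.diff()

print(differences)  # NaN, 2.0, 3.0, 4.0, 5.0
```

Note that with `fill_value=0` the first difference would equal the first value of the series (10 here) rather than NaN, which is usually not what is wanted.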