Vectors and Vector Operations
Creating vectors
In mathematics, an ordered set of numerical data is called a vector, or an arithmetic vector. In Python, operations on NumPy vectors are typically much faster than the same operations written as loops over lists; for large arrays the difference can reach tens or hundreds of times.
To work with vectors, use the NumPy library.

    import numpy as np

    numbers1 = [2, 3]  # Python list
    vector1 = np.array(numbers1)  # NumPy array
    print(vector1)

Vectors can be created without a temporary variable:

    import numpy as np

    vector2 = np.array([6, 2])
    print(vector2)

Vectors can be converted into lists:

    numbers2 = list(vector2)  # List from vector
    print(numbers2)

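As a small aside on the conversion above, the sketch below compares `list()` with the array's `tolist()` method. The variable names `as_list` and `as_py_list` are illustrative only. The point is that `list()` keeps each element as a NumPy scalar, while `tolist()` produces built-in Python numbers.

```python
import numpy as np

vector2 = np.array([6, 2])

as_list = list(vector2)        # elements stay NumPy integer scalars
as_py_list = vector2.tolist()  # elements become built-in Python ints

print(type(as_list[0]))
print(type(as_py_list[0]))
```

Either form works for printing or iterating; `tolist()` may be the safer choice when the result is passed to code that expects plain Python numbers.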
A column of a pandas DataFrame is converted into a NumPy vector using the `values` attribute:

    import pandas as pd

    data = pd.DataFrame([1, 7, 3])
    print(data[0].values)

Use the `len()` function to determine the vector size (the number of its elements):

    print(len(vector2))

Vector presentation
A two-dimensional vector can be drawn either as a point with coordinates (x, y) or as an arrow that connects the origin to that point. We use arrows when we want to indicate movement.
Vector elements are also called coordinates.

    import numpy as np
    import matplotlib.pyplot as plt

    vector1 = np.array([2, 3])
    vector2 = np.array([6, 2])

    plt.figure(figsize=(7, 7))
    plt.axis([0, 7, 0, 7])
    # 'ro' argument sets graph style
    # 'r' - red
    # 'o' - circle
    plt.plot([vector1[0], vector2[0]], [vector1[1], vector2[1]], 'ro')
    plt.grid(True)
    plt.show()

Let's use arrows to draw the same vectors. Instead of `plt.plot()`, call `plt.arrow()`.

    import numpy as np
    import matplotlib.pyplot as plt

    vector1 = np.array([2, 3])
    vector2 = np.array([6, 2])

    plt.figure(figsize=(7, 7))
    plt.axis([0, 7, 0, 7])
    # length_includes_head takes a boolean, not the string "True"
    plt.arrow(0, 0, vector1[0], vector1[1], head_width=0.3,
              length_includes_head=True, color='b')
    plt.arrow(0, 0, vector2[0], vector2[1], head_width=0.3,
              length_includes_head=True, color='g')
    plt.plot(0, 0, 'ro')
    plt.grid(True)
    plt.show()

Addition and subtraction of vectors
Vectors of the same size have the same number of coordinates. Their sum is a vector in which each coordinate equals the sum of the corresponding coordinates of the summand vectors.
When adding or subtracting vectors, the operation is performed for each element of the vectors:
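Vector | Coordinates |
---|---|
$x$ | $(x_1, x_2, \dots, x_n)$ |
$y$ | $(y_1, y_2, \dots, y_n)$ |
$x + y$ | $(x_1 + y_1, x_2 + y_2, \dots, x_n + y_n)$ |
$x - y$ | $(x_1 - y_1, x_2 - y_2, \dots, x_n - y_n)$ |
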

    import numpy as np

    vector1 = np.array([2, 3])
    vector2 = np.array([6, 2])
    sum_of_vectors = vector1 + vector2
    subtraction_of_vectors = vector2 - vector1

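To make the coordinate-wise rule concrete, here is a minimal sketch that prints the results for the two vectors used in this section and shows what happens when the sizes differ. The three-element array is a hypothetical input added only for the demonstration.

```python
import numpy as np

vector1 = np.array([2, 3])
vector2 = np.array([6, 2])

sum_of_vectors = vector1 + vector2          # (2 + 6, 3 + 2)
subtraction_of_vectors = vector2 - vector1  # (6 - 2, 2 - 3)

print(sum_of_vectors)          # [8 5]
print(subtraction_of_vectors)  # [ 4 -1]

# Vectors of different sizes cannot be added coordinate by coordinate
try:
    vector1 + np.array([1, 2, 3])  # hypothetical three-element vector
except ValueError as error:
    print("Cannot add:", error)
```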
If we draw a vector equal to `vector1` in length and direction, starting from the end of `vector2`, the arrow from the origin to its tip is the sum of the two vectors (`sum_of_vectors`).
If each vector is a movement in a certain direction, then the sum of two vectors is the movement along the first vector followed by the movement along the second one.
The difference of two vectors is a step, for example along `vector2`, followed by a step in the direction opposite to `vector1`.
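The "step, then opposite step" reading of subtraction can be checked numerically. The sketch below uses illustrative names (`step_against_vector1`, `difference`) and confirms that adding the opposite of `vector1` gives the same result as subtracting it, and that the order of the summands does not matter.

```python
import numpy as np

vector1 = np.array([2, 3])
vector2 = np.array([6, 2])

# A step along vector2, then a step opposite to vector1
step_against_vector1 = -1 * vector1
difference = vector2 + step_against_vector1

print(difference)                                      # [ 4 -1]
print(np.array_equal(difference, vector2 - vector1))   # True

# Moving along vector1 then vector2 ends at the same point
# as moving along vector2 then vector1
print(np.array_equal(vector1 + vector2, vector2 + vector1))  # True
```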
Multiplication of a vector by a scalar
Besides addition and subtraction, vectors can also be multiplied by scalars. Each coordinate of the vector is multiplied by the same number $k$:
Vector | Coordinates |
---|---|
$x$ | $(x_1, x_2, \dots, x_n)$ |
$k \cdot x$ | $(k x_1, k x_2, \dots, k x_n)$ |

If the number is negative, all coordinates also change their signs.

    import numpy as np

    vector1 = np.array([2, 3])
    vector3 = 2 * vector1
    vector4 = -1 * vector1

When multiplied by a positive number, vectors on the plane maintain direction, but the arrows change length. When multiplied by a negative number, vectors flip to the opposite direction.
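The effect on length and direction can be checked with a short sketch. It assumes that "length" means the Euclidean length, computed here with `np.linalg.norm`, and compares unit-length versions of the vectors to compare directions. The helper name `unit` is illustrative.

```python
import numpy as np

vector1 = np.array([2, 3])
vector3 = 2 * vector1
vector4 = -1 * vector1

def unit(vector):
    # Scale a vector to length 1 so that only its direction remains
    return vector / np.linalg.norm(vector)

# Multiplying by 2 doubles the length and keeps the direction
print(np.isclose(np.linalg.norm(vector3), 2 * np.linalg.norm(vector1)))
print(np.allclose(unit(vector3), unit(vector1)))

# Multiplying by -1 keeps the length and reverses the direction
print(np.isclose(np.linalg.norm(vector4), np.linalg.norm(vector1)))
print(np.allclose(unit(vector4), -unit(vector1)))
```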
Mean value of vectors
For the set of vectors $a_1, a_2, \dots, a_n$ (where $n$ is the total number of vectors), the mean value of vectors is the sum of all vectors multiplied by $\frac{1}{n}$. This results in a new vector $\bar{a}$:

$$\bar{a} = \frac{1}{n}(a_1 + a_2 + \dots + a_n)$$

If the set consists of only one vector ($n = 1$), it will be equal to the mean: $\bar{a} = a_1$. The mean value of two vectors is $\bar{a} = 0.5(a_1 + a_2)$. The mean value for a pair of two-dimensional vectors is the middle of the segment connecting $a_1$ and $a_2$.

    import numpy as np

    vector1 = np.array([2, 3])
    vector2 = np.array([6, 2])
    vector_mean = 0.5 * (vector1 + vector2)
    print(vector_mean)

The first coordinate of the new vector is the mean value of the first coordinates of `vector1` and `vector2`, and the second coordinate is the mean value of the second coordinates of `vector1` and `vector2`.
To draw the mean on the plane, plot the `vector1 + vector2` vector and then multiply it by 0.5.
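The same idea extends to more than two vectors. The sketch below stacks three vectors as rows of a 2D array (the third vector is a hypothetical addition for illustration) and computes the mean both from the definition and with the array's `mean()` method along `axis=0`.

```python
import numpy as np

# Each row is one vector; the third row is a made-up example
vectors = np.array([[2, 3],
                    [6, 2],
                    [1, 4]])

n = len(vectors)                            # number of vectors
mean_by_formula = vectors.sum(axis=0) / n   # sum of all vectors times 1/n
mean_by_method = vectors.mean(axis=0)       # same result, coordinate by coordinate

print(mean_by_formula)  # [3. 3.]
print(mean_by_method)   # [3. 3.]
```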
Vectorized Functions
If we multiply or divide two arrays of the same size created with `np.array()`, the operation is applied element by element, and we obtain a new vector of the same size:

    import numpy as np

    array1 = np.array([2, -4, 6, -8])
    array2 = np.array([1, 2, 3, 4])
    array_mult = array1 * array2
    array_div = array1 / array2
    print("Product of two arrays: ", array_mult)
    print("Quotient of two arrays: ", array_div)

If an arithmetic operation is performed on an array and a single number, the operation is applied to each element of the array, and an array of the same size is formed.

    import numpy as np

    array2 = np.array([1, 2, 3, 4])
    array2_plus_10 = array2 + 10
    array2_minus_10 = array2 - 10
    array2_div_10 = array2 / 10
    print("Sum: ", array2_plus_10)
    print("Difference: ", array2_minus_10)
    print("Quotient: ", array2_div_10)

The same element-by-element principle works on arrays when we deal with standard mathematical functions like exponentiation or logarithms.
Let's raise an array to the second power:

    import numpy as np

    numbers_from_0 = np.array([0, 1, 2, 3, 4])
    squares = numbers_from_0**2
    print(squares)

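Logarithms and roots follow the same element-by-element principle. As a minimal sketch, the example below applies `np.log10` and `np.sqrt` to whole arrays; the input values are chosen only for illustration.

```python
import numpy as np

powers_of_ten = np.array([1, 10, 100, 1000])
perfect_squares = np.array([0, 1, 4, 9])

logs = np.log10(powers_of_ten)    # base-10 logarithm of each element
roots = np.sqrt(perfect_squares)  # square root of each element

print(logs)
print(roots)
```

Each call returns a new array of the same size as its input, just like the arithmetic operators.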
All of that can be done with lists using loops as well, but operations with vectors in NumPy are much faster.
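One way to see the speed difference for yourself is to time both approaches with the standard `timeit` module. This is only a rough sketch: the array size, the repeat count, and the function names are arbitrary choices, and the measured times depend on the machine, so no particular ratio is promised.

```python
import timeit
import numpy as np

size = 100_000
numbers_list = list(range(size))
numbers_array = np.arange(size, dtype=np.int64)

def square_with_loop():
    # Plain Python: square every element of a list one by one
    return [x**2 for x in numbers_list]

def square_with_numpy():
    # NumPy: one vectorized operation over the whole array
    return numbers_array**2

loop_time = timeit.timeit(square_with_loop, number=20)
numpy_time = timeit.timeit(square_with_numpy, number=20)

print(f"list loop: {loop_time:.4f} s")
print(f"NumPy:     {numpy_time:.4f} s")
```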
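Here's the formula of the `min_max_scale()` function:

$$f(x) = \frac{x - \text{MIN}}{\text{MAX} - \text{MIN}}$$

where MIN and MAX are the minimum and maximum values of the array.
To apply this function to all elements of the `values` array, call the `max()` and `min()` methods, which find its maximum and minimum values. As a result, we get an array of the same length, but with converted elements:

    import numpy as np

    def min_max_scale(values):
        # Use the array's own min() and max() methods
        return (values - values.min()) / (values.max() - values.min())

    our_values = np.array([-20, 0, 0.5, 80, -1])  # example data, assumed for illustration
    print(min_max_scale(our_values))
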
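A quick check of what min-max scaling does to the range of the data may help. The sketch below uses a hypothetical `example_values` array and verifies that the smallest element maps to 0 and the largest to 1.

```python
import numpy as np

def min_max_scale(values):
    return (values - values.min()) / (values.max() - values.min())

example_values = np.array([-20, 0, 0.5, 80, -1])  # made-up data for illustration
scaled = min_max_scale(example_values)

print(scaled)
print(scaled.min(), scaled.max())  # 0.0 1.0
```

Note that if all elements of the array are equal, the denominator becomes zero, so this simple version would need extra handling for constant arrays.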
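The logistic transformation is defined by the formula:

$$f(x) = \frac{1}{1 + \exp(-x)}$$

where `exp()` is the exponential function: it raises the number $e$, which approximately equals 2.718281828, to the power of its argument.

Perform logistic transformation:

    import numpy as np

    def logistic_transform(values):
        return 1 / (1 + np.exp(-values))

    our_values = np.array([-20, 0, 0.5, 80, -1])  # example data, assumed for illustration
    print(logistic_transform(our_values))
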
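A few properties of the logistic transformation can be verified directly. In this sketch the `example_values` array is hypothetical and kept to moderate magnitudes so that floating-point rounding does not push results to exactly 0 or 1.

```python
import numpy as np

def logistic_transform(values):
    return 1 / (1 + np.exp(-values))

example_values = np.array([-2, 0, 0.5, 3, -1])  # made-up data for illustration
transformed = logistic_transform(example_values)

print(transformed)

# Zero maps to 0.5, and every result lies between 0 and 1
print(transformed[1])  # 0.5
print(((transformed > 0) & (transformed < 1)).all())  # True
```

The function is also symmetric around 0.5: the result for `-x` equals one minus the result for `x`.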
Vectorization of metrics
Store a set of actual values in the `target` variable and the predicted values in the `predictions` variable. Both are NumPy arrays created with `np.array`.
Use standard NumPy methods to calculate the evaluation metrics:
- `sum()` (to find the sum of the elements in an array)
- `mean()` (to calculate the mean value)

Call them as follows: `<array name>.sum()` and `<array name>.mean()`.
Here's the formula to calculate the mean squared error (MSE):

$$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

where $n$ is the length of each array and $\sum$ is the summation over all observations of the sample ($i$ varies from $1$ to $n$). The $i$-th elements of the vectors `target` and `predictions` are denoted by $y_i$ and $\hat{y}_i$.

    def mse1(target, predictions):
        n = target.size
        return ((target - predictions) ** 2).sum() / n

Write the MSE formula using `mean()`:

    def mse2(target, predictions):
        return ((target - predictions) ** 2).mean()

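The two MSE versions should agree on any input. As a minimal check, the sketch below runs both on small hypothetical `target` and `predictions` arrays whose errors are easy to follow by hand.

```python
import numpy as np

def mse1(target, predictions):
    n = target.size
    return ((target - predictions) ** 2).sum() / n

def mse2(target, predictions):
    return ((target - predictions) ** 2).mean()

target = np.array([1.0, 2.0, 3.0, 4.0])       # made-up actual values
predictions = np.array([1.5, 2.0, 2.0, 5.0])  # made-up predicted values

# Errors: -0.5, 0, 1, -1; squared: 0.25, 0, 1, 1; sum 2.25; divided by 4
print(mse1(target, predictions))  # 0.5625
print(mse2(target, predictions))  # 0.5625
```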
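Write the function to calculate MAE using `mean()`:

    import numpy as np

    def mae(target, predictions):
        return np.abs(target - predictions).mean()

    # Example data, assumed for illustration
    target = np.array([1.0, 2.0, 3.0, 4.0])
    predictions = np.array([1.5, 2.0, 2.0, 5.0])
    print(mae(target, predictions))
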
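Vectorized functions can be used to calculate RMSE.

    import numpy as np

    def rmse(target, predictions):
        return (((target - predictions) ** 2).mean()) ** 0.5

    # Example data, assumed for illustration
    target = np.array([1.0, 2.0, 3.0, 4.0])
    predictions = np.array([1.5, 2.0, 2.0, 5.0])
    print(rmse(target, predictions))
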
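To tie the metrics together, the sketch below computes MAE and RMSE on the same small hypothetical arrays and confirms that RMSE is the square root of MSE. The input values are made up for illustration; `np.sqrt` is used here only as a cross-check of the `** 0.5` form.

```python
import numpy as np

def mae(target, predictions):
    return np.abs(target - predictions).mean()

def rmse(target, predictions):
    return (((target - predictions) ** 2).mean()) ** 0.5

target = np.array([1.0, 2.0, 3.0, 4.0])       # made-up actual values
predictions = np.array([1.5, 2.0, 2.0, 5.0])  # made-up predicted values

mae_value = mae(target, predictions)    # mean of 0.5, 0, 1, 1
rmse_value = rmse(target, predictions)  # square root of the MSE
mse_value = ((target - predictions) ** 2).mean()

print(mae_value)  # 0.625
print(rmse_value)
print(np.isclose(rmse_value, np.sqrt(mse_value)))  # True
```

For this data the MSE is 0.5625, so the RMSE works out to 0.75, which is larger than the MAE because squaring gives more weight to the larger errors.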