Filtering Data with SQL
Data Slices in SQL
The beginning of the condition used to select data is marked with the WHERE command. The condition is evaluated in each row of the table. In conditions, comparison operators are used:
1SELECT2 column_1,3 column_2 --selecting column names4FROM5 table_name --specifying the table6WHERE7 condition; --defining the condition of row selection
The order of operators is strictly defined:
1)SELECT
2)FROM
3)WHERE
Like Python, SQL uses the logical operators AND
, OR
, NOT
. They let you make a selection with several conditions:
1SELECT2 *3FROM4 table_name5WHERE6 condition_1 AND condition_2;7--Select rows where both conditions are true8SELECT9 *10FROM11 table_name12WHERE13 condition_1 OR condition_2;14--Select rows where either or both conditions are true15SELECT16 *17FROM18 table_name19WHERE20 condition_1 AND NOT condition_2;21--Select rows where condition_1 is true and condition_2 is false
If you need to make a selection of rows for which a field's values fall within a certain range, use the BETWEEN statement. BETWEEN
includes the start and end boundaries in the resulting selection:
1SELECT2 *3FROM4 table_name5WHERE6 field_1 BETWEEN value_1 AND value_2;7-- Selecting rows where field_1 value is between value_1 and value_2 (inclusive)
If you need to make a selection of rows for which a field's values match those in a list, use the IN operator. Indicate the list of values after IN
:
1SELECT2 *3FROM4 table_name5WHERE6 column_name IN ('value_1','value_2','value_3');
If the values are numbers, they are separated from each other by commas: IN (3,7,9)
. If they're strings, they're put in single quotes and, again, separated by commas: IN ('value_1','value_2','value_3')
. Date and time are indicated as follows: IN ('yyyy-mm-dd','yyyy-mm-dd')
Searching for Empty Values
In SQL, empty cells are said to be NULL. The IS NULL operator searches for them:
1SELECT2 *3FROM4 table_name5WHERE6 column_name IS NULL;
Note that IS matters! The following won't work:
1SELECT2 *3FROM4 table_name5WHERE6 column_name = NULL; -- this code won't comply!
To exclude rows with NULL values from the selection, we use the NOT operator:
1SELECT2 *3FROM4 table_name5WHERE6 column_name IS NOT NULL;
The CASE construction is used to carry out actions when certain conditions are met. It's much like if-elif-else
in Python:
1CASE2 WHEN condition_1 THEN result_13 WHEN condition_2 THEN result_24 WHEN condition_3 THEN result_35 ELSE result_46END;
A condition follows the WHEN operator. If a table row matches this condition, the code returns the result indicated in THEN. Otherwise, the same row is tested with the next condition. If the row fails to match any of the conditions stated in WHEN, the code returns the value indicated after ELSE. The CASE construction is then closed with the END operator.
Searching for Data in a Table
The LIKE operator searches a table for values that follow a given pattern. You can search not only for a word, but also for a fragment of it.
Here's the syntax of LIKE statements:
1column_name LIKE 'regular expression'
Indicate the necessary column before LIKE and follow it with a regular expression.
Regular expressions in SQL are a bit different from those in Python. For example, the _
symbol replaces one substitute value (1 character) in a regular expression. The %
symbol replaces any amount of characters. A range or sequence of characters which a string must contain are written within square brackets []
. If the characters are to be excluded, the [^]
construction is used.
Range or sequence of characters
If we need to find a symbol from regular expression as a substring, we use the ESCAPE operator. It's passed a symbol such as an exclamation point. In the regular expression, the exclamation point means the symbol following it is not part of the expression, but rather the substring to be searched for. Here's a fragment of code that will find all substrings ending with the %
symbol (say, "100%") in a table:
1column_name LIKE '%!%' ESCAPE '!'2--finds all substrings ending with %