Chapter Summary: The Basics of Launching Scripts

Accessing the Command Line from Your Local Machine

You can run commands locally, without connecting to cloud services.

Users working in Linux already have access to the command line.

macOS is a Unix-like system rather than a Linux one, but it provides most of the same command line tools. If you work in macOS, go to the folder Applications/Utilities and find Terminal, which acts as the command line on macOS.

If you work in Windows, install a "parallel" fully functional operating system like Ubuntu, which is in the Linux family. To do this:

Press "Start" and open the Microsoft Store.
Find Ubuntu.
You will see the Ubuntu app window; press Install. Ubuntu will be installed on your computer.
Restart the system. Then press "Start" and enter Ubuntu.
Launch Ubuntu: wait till the system finishes the initial setup. Then you'll see the terminal window.

One thing to keep in mind when working with Ubuntu in Windows is that Ubuntu has access to the Windows file system, but arranges file paths in a way that differs from Windows. For example, in Windows the root folder of the C drive is at the path C:\. In Ubuntu the files from this folder have the address /mnt/c/.
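The path mapping can be sketched in Python. This is a minimal illustration, assuming that Windows drives are mounted under /mnt/ followed by the lowercase drive letter, as in the example above; the function name windows_to_ubuntu_path is hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_ubuntu_path(windows_path):
    """Convert a Windows path to the /mnt/ form used by Ubuntu in Windows."""
    path = PureWindowsPath(windows_path)
    drive_letter = path.drive.rstrip(':').lower()   # 'C:' -> 'c'
    # path.parts[0] is the drive root; the rest are folder and file names
    return str(PurePosixPath('/mnt', drive_letter, *path.parts[1:]))


print(windows_to_ubuntu_path(r'C:\Users\me\test.py'))   # /mnt/c/Users/me/test.py
```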

Basic Interface Commands in Linux

Here are the most common Linux commands:

whoami returns your username
cd changes the current folder
mkdir creates a folder
rm deletes a file (to delete an empty folder, use rmdir)
rm -r deletes a folder and the files it contains
cat prints out the contents of the file
echo prints out a text or the contents of a variable.

This list of commands is just enough to set up basic automation.

Launching a Script From the Command Line

To automate processes, you need to teach your computer to launch programs with input parameters on a schedule. That's where Python scripts come in.

Download a text editor (for example, Sublime Text or Notepad++) to edit your Python scripts. Create an empty file in the text editor and save it as test.py.

Insert a minimal Python script in the file (it will run in any operating system):


#!/usr/bin/python

if __name__ == "__main__":
    print('Hello world.')

The system will output the message: 'Hello world.'

Your Python script consists of several components:

#!/usr/bin/python tells the OS which interpreter to use to run the script.
# -*- coding: utf-8 -*- is an optional line, absent from the minimal script above, which communicates that the script uses UTF-8 character encoding. Strictly speaking, you don't need to indicate this for Python 3, where it's the default.
if __name__ == "__main__" is a condition containing the main part of the script to be executed.
In Python, scripts can be called in two ways:
- As the main program, if the script is launched directly from the command line
- As an imported module: via the import command inside another file
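The difference between the two ways can be sketched as follows. This is a minimal illustration; the function greet and the file name greeting.py are hypothetical and only serve the example.

```python
def greet():
    """Return the greeting so that other files can reuse it after import."""
    return 'Hello world.'


if __name__ == "__main__":
    # Runs only when the file is launched directly, e.g. `python greeting.py`.
    # When another file does `import greeting`, __name__ equals 'greeting',
    # so this block is skipped and nothing is printed.
    print(greet())
```

After an import, the other file can still call greet() itself whenever it needs the text.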

Executing a script from the command line is easy:


python /script_path/script_name.py

To pass input parameters to a script, import the getopt and sys libraries. getopt ("get options") reads input parameters, or options. sys gives access to system functions. You'll need the sys.exit() function, which stops script execution, for the case when the input parameters are incorrect. Here's the script code (we'll save it as params_test.py):


#!/usr/bin/python

# Import the necessary libraries
import sys
import getopt

if __name__ == "__main__":

    # Set the format of input parameters
    unixOptions = "s:e:"
    gnuOptions = ["start_dt=", "end_dt="]

    # Obtain the string with input parameters
    fullCmdArguments = sys.argv
    argumentList = fullCmdArguments[1:]

    # Check whether input parameters match the format
    # indicated in unixOptions and gnuOptions
    try:
        arguments, values = getopt.getopt(argumentList, unixOptions, gnuOptions)
    except getopt.error as err:
        print(str(err))
        sys.exit(2)      # Stop execution if input parameters are incorrect

    # Read the values from the string with input parameters
    start_dt = ''
    end_dt = ''
    for currentArgument, currentValue in arguments:
        if currentArgument in ("-s", "--start_dt"):
            start_dt = currentValue
        elif currentArgument in ("-e", "--end_dt"):
            end_dt = currentValue

    # Print the result
    print(start_dt, end_dt)

Now let's study the code in detail. Here's how we import the libraries we need:


import sys
import getopt

Then we define the names of input parameters:


unixOptions = "s:e:"
gnuOptions = ["start_dt=", "end_dt="]

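unixOptions = "s:e:" defines the names of the parameters in classic Unix style (Unix is a family of operating systems developed in the 1970s): single letters, each passed with one hyphen, as in -s. A colon after a letter means that the option takes a value. Although the style is old, it has become a tradition, and getopt.getopt() requires this argument (it can be an empty string).
gnuOptions = ["start_dt=", "end_dt="] defines the names of the input parameters in the style of GNU, a Unix-like operating system: full names, each passed with two hyphens, as in --start_dt. The = at the end of a name means that the option takes a value.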

You can call the script in two different ways, depending on the style of the input parameter names:


python params_test.py -s '2019-01-01' -e '2019-09-01'
# or
python params_test.py --start_dt='2019-01-01' --end_dt='2019-09-01'

We suggest you stick to the second option. It reads better and ensures the compatibility of parameters from program to program, so that you don't mix up the names of variables.
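To check that both styles lead to the same values, the parsing can be wrapped in a function and called with each form. This is a sketch; the function name parse_dates is hypothetical, and the lists stand in for what sys.argv[1:] would contain.

```python
import getopt


def parse_dates(argument_list):
    """Return (start_dt, end_dt) parsed from a list of command line arguments."""
    arguments, _ = getopt.getopt(argument_list, "s:e:", ["start_dt=", "end_dt="])
    start_dt = ''
    end_dt = ''
    for name, value in arguments:
        if name in ("-s", "--start_dt"):
            start_dt = value
        elif name in ("-e", "--end_dt"):
            end_dt = value
    return start_dt, end_dt


# The shell removes the quotes before Python sees the arguments
short_style = parse_dates(['-s', '2019-01-01', '-e', '2019-09-01'])
long_style = parse_dates(['--start_dt=2019-01-01', '--end_dt=2019-09-01'])
print(short_style == long_style)   # True
```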

Then the script stores the set of input parameters in argumentList.

In our sample, the input parameters are start_dt and end_dt. The system automatically reads them into sys.argv, which we store in the variable fullCmdArguments. When storing it in argumentList, we take all the parameters, except the very first one (index 0). This is the name of the script, which we won't need:


fullCmdArguments = sys.argv
argumentList = fullCmdArguments[1:]

Then the script checks whether the input parameters match the format declared in unixOptions and gnuOptions. If they don't (for example, an option was never declared, or an option was passed without its value), getopt raises an error, the script prints the error message, and the sys.exit(2) command stops the execution of the program. (2) means that the error which caused this stop was made in the command line parameters. Note that launching the script without any parameters is not an error for getopt: start_dt and end_dt simply stay empty.


try:
    arguments, values = getopt.getopt(argumentList, unixOptions, gnuOptions)
except getopt.error as err:
    print(str(err))
    sys.exit(2)
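To see which inputs getopt accepts and which it rejects, the check can be tried on a few argument lists. This is a sketch; the helper name check is hypothetical, and it returns the error text instead of calling sys.exit(2) so that all cases can run in one go.

```python
import getopt

unix_options = "s:e:"
gnu_options = ["start_dt=", "end_dt="]


def check(argument_list):
    """Return 'ok' or the getopt error message for the given arguments."""
    try:
        getopt.getopt(argument_list, unix_options, gnu_options)
        return 'ok'
    except getopt.GetoptError as err:
        return str(err)


print(check([]))                          # no parameters at all: accepted
print(check(['--start_dt=2019-01-01']))   # declared option with a value: accepted
print(check(['--unknown=1']))             # option that was never declared: rejected
print(check(['-s']))                      # declared option without its value: rejected
```

getopt.error, used in the script, is another name for getopt.GetoptError.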

Then the script runs through all input parameters and distributes their values among its inner variables:


start_dt = ''
end_dt = ''
for currentArgument, currentValue in arguments:
    if currentArgument in ("-s", "--start_dt"):
        start_dt = currentValue
    elif currentArgument in ("-e", "--end_dt"):
        end_dt = currentValue

Finally, the script prints the start_dt and end_dt values.

Let's launch the script in the command line and pass it test parameters:


python params_test.py --start_dt='2019-01-01' --end_dt='2019-09-01'

Result (when the script is run with Python 3):


2019-01-01 2019-09-01

You can pass any input parameters to the script, including numbers, logical values, dates and times, arrays, and filenames. Note that parameters can only be input as strings. It's entirely up to you to convert them to the correct format.
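A conversion step might look like the sketch below. The parameters limit and verbose are hypothetical additions for illustration; only start_dt comes from the script above, and the date format is assumed to be the one used in the examples.

```python
from datetime import datetime


def convert_parameters(start_dt, limit, verbose):
    """Convert raw string parameters into a datetime, an int, and a bool."""
    start = datetime.strptime(start_dt, '%Y-%m-%d')
    limit_number = int(limit)
    # bool('False') is True because any non-empty string is truthy,
    # so compare against the accepted spellings instead
    verbose_flag = verbose.lower() in ('1', 'true', 'yes')
    return start, limit_number, verbose_flag


print(convert_parameters('2019-01-01', '10', 'False'))
```

If a value doesn't fit the expected format, strptime() and int() raise ValueError, which the script can catch to report the problem.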

Launching a Script From the Command Line in AWS

To make a script work on a virtual machine, you need to transfer it from your local machine.

You need the program scp ("secure copy") to copy your files and use them on the virtual machine. From the command line, it's launched as follows (remember "Independent Task 1: Setting Up an AWS Account"):


scp -i <path_to_private_key> <path_to_a_local_file> ubuntu@<public_dns>:

To transmit the file to the virtual machine, follow these steps:

Create a file called test.py and save it on your computer. Open the file in Sublime Text (or another text editor) and add the following text:


#!/usr/bin/python

if __name__ == "__main__":
    print('Hello world.')

Copy this file onto the virtual machine.

If you're working in Linux, Ubuntu, macOS or Windows 10, use the command:


```bash
scp -i <path_to_private_key> <path_to_a_local_file> ubuntu@<public_dns>:
```

The colon is crucial: it means "the user's home directory on the virtual machine." With other operating systems, the only difference is the way the file path is indicated. Be sure that you know how to find the path of a file in your OS.
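If you ever need to assemble this command from a Python script, the structure can be sketched as below. The function name build_scp_command is hypothetical, and the placeholders in angle brackets are left exactly as in the command above; the sketch only builds the argument list and doesn't copy anything.

```python
def build_scp_command(private_key_path, local_file_path, public_dns):
    """Build the scp argument list for copying a file to the home directory."""
    # The trailing colon marks the target as remote; with nothing after it,
    # the file lands in the home directory of the user 'ubuntu'
    remote_target = 'ubuntu@{}:'.format(public_dns)
    return ['scp', '-i', private_key_path, local_file_path, remote_target]


command = build_scp_command('<path_to_private_key>', 'test.py', '<public_dns>')
print(' '.join(command))
```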

Connect to the virtual machine:


ssh -i <path_to_private_key>/test_pair.pem ubuntu@<public_dns>

In the command line of the virtual machine, run the commands to install pip (answer Yes to all questions):


sudo apt update
sudo apt install python3-pip
pip3 --version

The system will print the following result:

Once you've connected, enter the command dir (directory). You'll be shown the contents of your home directory:

Start the test script with this command:


python3 test.py

Scheduling Scripts

Let's find out how to make a script adhere to a work schedule.

Unix-like operating systems such as Linux (including Ubuntu) and macOS have a special scheduling program called cron. It operates invisibly in the background and reads a special crontab schedule (tab=table).

Here's a sample cron schedule in the terminal:


5 6 * * 1 python -u -W ignore /home/my_user/script_A.py --start_dt=$(date +\%Y-\%m-\%d\ 00:00:00 -d "1 week ago") >> /home/my_user/logs/script_A_$(date +\%Y-\%m-\%d).log 2>&1
#15 7 * * * python -u -W ignore /home/my_user/script_B.py --start_dt=$(date +\%Y-\%m-\%d\ 00:00:00 -d "1 week ago") >> /home/my_user/logs/script_B_$(date +\%Y-\%m-\%d).log 2>&1

Let's see what's happening here. First:

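5 6 * * 1 indicates the time the command should run. It consists of five fields separated by spaces: minute (0-59), hour (0-23), day of the month (1-31), month (1-12), and day of the week (0-6, where 0 is Sunday).

* means any value.

So the first line specifies that the command is to start at 6:05 AM every Monday.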
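To make the time format concrete, here is a simplified Python sketch that checks whether a schedule of five fields (minute, hour, day of the month, month, day of the week) fires at a given moment. It is not how cron itself is implemented: the function names are hypothetical, and the sketch only understands *, */n, and single numbers, without lists, ranges, or names. It also requires all five fields to match, whereas cron has a special rule for lines that restrict both day fields.

```python
from datetime import datetime


def field_matches(field, value):
    """Check one crontab field; supports '*', '*/n', and a single number."""
    if field == '*':
        return True
    if field.startswith('*/'):
        return value % int(field[2:]) == 0
    return int(field) == value


def schedule_matches(schedule, moment):
    """Return True if the five-field schedule fires at the given minute."""
    minute, hour, day, month, weekday = schedule.split()
    # cron counts Sunday as 0 and Monday as 1; isoweekday() gives Sunday as 7
    cron_weekday = moment.isoweekday() % 7
    return (field_matches(minute, moment.minute)
            and field_matches(hour, moment.hour)
            and field_matches(day, moment.day)
            and field_matches(month, moment.month)
            and field_matches(weekday, cron_weekday))


print(schedule_matches('5 6 * * 1', datetime(2019, 1, 7, 6, 5)))   # Monday 6:05 -> True
print(schedule_matches('5 6 * * 1', datetime(2019, 1, 8, 6, 5)))   # Tuesday 6:05 -> False
```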

python -u -W ignore /home/my_user/script_A.py is the command itself. Here:
- The -u flag means that the results of the script's execution won't be buffered (accumulated in the computer's memory). They will be immediately stored in the log file instead. For example, if your script has print() commands, their results won't be stored in memory, but will go straight into the logs.
- -W ignore (ignore warnings) means that any warnings generated when the script runs won't be stored in the log file.
- /home/my_user/script_A.py — the name of the script to be scheduled.
--start_dt=$(date +\%Y-\%m-\%d\ 00:00:00 -d "1 week ago") is the script input parameter. Here:
- $(date +\%Y-\%m-\%d\ 00:00:00 … ) is the command line expression that returns the current date in the format '%Y-%m-%d 00:00:00'. The backslashes before % are needed because cron treats an unescaped % as a line break. To get a better idea of how it works, run the following command in the command line interface:
```
echo $(date +\%Y-\%m-\%d\ 00:00:00)
```

You'll see the current date on the screen.

-d "1 week ago" specifies the time interval by which we want to decrease the current date. script_A.py seems to be designed so that its input is the date on which the previous week started. It probably collects certain data for the seven preceding days. There are also other ways to define time intervals. For example:
- -d "yesterday" or -d "1 day ago"
- -d "N days ago"
- -d "N weeks ago"
- -d "1 month ago"
- -d "N months ago"
- -d "1 year ago"
- -d "N years ago"
>> /home/my_user/logs/script_A_$(date +\%Y-\%m-\%d).log means that everything the script prints will be appended to the file script_A_<current date>.log in /home/my_user/logs/. It's very important to save logs if the scripts run automatically; this helps you detect any errors that occur. Without logs you won't have a clear idea how your automation system is actually working.
2>&1 means that error messages are sent to the same place as the regular output of the script: the log file.
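The same start date can also be computed inside Python rather than in the crontab line. The sketch below mirrors date with -d "1 week ago" and the format '%Y-%m-%d 00:00:00'; the function name start_of_day_one_week_ago is hypothetical.

```python
from datetime import datetime, timedelta


def start_of_day_one_week_ago(now):
    """Return the date one week before `now` as 'YYYY-MM-DD 00:00:00'."""
    week_ago = now - timedelta(weeks=1)
    return week_ago.strftime('%Y-%m-%d 00:00:00')


print(start_of_day_one_week_ago(datetime(2019, 9, 8, 6, 5)))   # 2019-09-01 00:00:00
print(start_of_day_one_week_ago(datetime.now()))
```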

The current time is determined by the timezone on the machine where you are going to set up cron. The majority of servers work with the UTC+0 time zone. This timezone is generally used when recording the date and time of banking operations and transactions. Try to stick to UTC+0 when analyzing and scheduling scripts in order to avoid discrepancies in report results.
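Inside a script, you can make the difference visible by printing both the local time and the UTC+0 time. This is a minimal sketch using only the standard datetime module; the variable names are illustrative.

```python
from datetime import datetime, timezone

local_now = datetime.now()              # depends on the machine's timezone setting
utc_now = datetime.now(timezone.utc)    # does not depend on the timezone setting

print('Local time:', local_now.strftime('%Y-%m-%d %H:%M:%S'))
print('UTC time:  ', utc_now.strftime('%Y-%m-%d %H:%M:%S'))
```

On a server set to UTC+0 the two lines show the same time.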

Let's see how to edit the cron timetable. You can only do it on a local machine with Linux, Ubuntu, or macOS.

First, create a new file in Sublime Text with a script that prints out 'Hello world' and the current date and time:


#!/usr/bin/python

from datetime import datetime

if __name__ == "__main__":
    print('Hello world: {}'.format(datetime.now()))

We'll save it as /home/YOUR_USERNAME/cron_test.py.
Then we'll run the following command in the command line:


python cron_test.py

Result:

In the command line we'll create the directory /home/YOUR_USERNAME/logs to store logs:


mkdir /home/YOUR_USERNAME/logs

Then we open the cron schedule editor:


crontab -e

You'll have to choose a text editor the first time you edit your schedule:

Type "1" and hit Enter to choose nano as the text editor for crontab.

Then you'll see the crontab text:

Add the following line at the end of the file:


*/5 * * * * python -u -W ignore /home/YOUR_USERNAME/cron_test.py >> /home/YOUR_USERNAME/logs/cron_test.log 2>&1

Here */5 * * * * means "running every five minutes."

Press Ctrl+O to save. You'll see this system message:

Press Enter. Then press Ctrl+X to quit the crontab editor. You'll see the message:

This means that the new settings for the cron table have been adopted.