Chapter Summary: The Basics of Launching Scripts
Accessing the Command Line from Your Local Machine
You can run commands locally, without connecting to cloud services.
Users working in Linux already have access to the command line.
macOS is a Unix-like system, so most of the Linux commands in this chapter work there too. If you work in macOS, go to the folder Applications/Utilities and find Terminal; this acts as the command line on macOS.
If you work in Windows, install a "parallel" fully functional operating system like Ubuntu, which is in the Linux family. To do this:
- Press "Start" and open the Microsoft Store.
- Find Ubuntu.
- You will see the Ubuntu app window; press Install. Ubuntu will be installed on your computer.
- Restart the system. Then press "Start" and enter Ubuntu.
- Launch Ubuntu and wait until the system finishes the initial setup. Then you'll see the terminal window.
One thing to keep in mind when working with Ubuntu in Windows: Ubuntu has access to the Windows file system, but arranges file paths differently. For example, the root folder of the C drive has the path `C:\` in Windows, but in Ubuntu the files from this folder are found under `/mnt/c/`.
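The path mapping described above can be sketched in Python 3. This is only an illustration: the function name `windows_to_wsl` and the sample path are made up for this example, and the sketch assumes an ordinary drive-letter path (network paths are not handled).

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl(path):
    """Translate a drive-letter Windows path into its /mnt/<drive>/ form."""
    win = PureWindowsPath(path)
    drive = win.drive.rstrip(":").lower()  # "C:" becomes "c"
    if not drive:
        raise ValueError("expected a path that starts with a drive letter")
    # win.parts[0] is the drive anchor; the rest are folder and file names
    return str(PurePosixPath("/mnt", drive, *win.parts[1:]))


print(windows_to_wsl("C:\\"))
print(windows_to_wsl(r"C:\Users\me\test.py"))  # hypothetical sample path
```

The first call prints the Ubuntu-side location of the C drive root; the second shows how a nested path is carried over folder by folder.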
Basic Interface Commands in Linux
Here are the most common Linux commands:
- `whoami` returns your username
- `cd` changes the current folder
- `mkdir` creates a folder
- `rm` deletes a file (use `rmdir` to delete an empty folder)
- `rm -r` deletes a folder and the files it contains
- `cat` prints out the contents of a file
- `echo` prints out text or the contents of a variable.
This list of commands is just enough to set up basic automation.
Launching a Script From the Command Line
To automate processes, you need to teach your computer to launch programs with input parameters on a schedule. That's where Python scripts come in.
Download a text editor (for example, Sublime Text or Notepad++) to edit your Python scripts. Create an empty file in the text editor and save it as `test.py`.
Insert a minimal Python script in the file (it will run in any operating system):

    #!/usr/bin/python

    if __name__ == "__main__":
        print('Hello world.')

When you run the script, the system will output the message: Hello world.
Your Python script consists of several components:
- `#!/usr/bin/python` tells the OS which interpreter should run the script, and therefore what language it is written in.
- `# -*- coding: utf-8 -*-`, when present, communicates that the script uses UTF-8 character encoding. Strictly speaking, you don't need to indicate this for Python 3: it's the default.
- `if __name__ == "__main__"` is a condition containing the main part of the script to be executed.

In Python, scripts can be called in two ways:
- As the main program, if the script is launched directly from the command line
- As an imported module: via the `import` command inside another file

The condition is true only in the first case, so the code under it does not run when the file is imported.
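The two ways of calling a script can be observed directly. The following Python 3 sketch writes a small throwaway module (the name `greeter.py` and its contents are invented for this illustration), imports it, and then runs it the way the command line would.

```python
import importlib.util
import os
import runpy
import tempfile

# Source of a tiny, hypothetical module used only for this demonstration
source = '''
def greet():
    return "Hello world."

if __name__ == "__main__":
    print("run as a script:", greet())
'''

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "greeter.py")
    with open(path, "w") as f:
        f.write(source)

    # 1) Import it: __name__ is "greeter", so the guarded block is skipped
    spec = importlib.util.spec_from_file_location("greeter", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    imported_name = module.__name__
    print("imported as:", imported_name)

    # 2) Run it as the main program: __name__ is "__main__", so the block runs
    globals_after_run = runpy.run_path(path, run_name="__main__")
    script_name = globals_after_run["__name__"]
```

Only the second step triggers the `print` call inside the guarded block.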
Executing a script from the command line is easy:

    python /script_path/script_name.py

To read input parameters in a script, import the `getopt` and `sys` libraries. `getopt` ("get options") reads input parameters, or options. `sys` gives access to system functions. You'll need the `sys.exit()` function in scripts, since it stops script execution when the input parameters are incorrect. Here's the script code (the examples below assume it is saved as `params_test.py`):

    #!/usr/bin/python

    # Import the necessary libraries
    import sys
    import getopt

    if __name__ == "__main__":

        # Set the format of input parameters
        unixOptions = "s:e:"
        gnuOptions = ["start_dt=", "end_dt="]

        # Obtain the string with input parameters
        fullCmdArguments = sys.argv
        argumentList = fullCmdArguments[1:]

        # Check whether input parameters match the format
        # indicated in unixOptions and gnuOptions
        try:
            arguments, values = getopt.getopt(argumentList, unixOptions, gnuOptions)
        except getopt.error as err:
            print(str(err))
            sys.exit(2)  # Stop execution if input parameters are incorrect

        # Read the values from the string with input parameters
        start_dt = ''
        end_dt = ''
        for currentArgument, currentValue in arguments:
            if currentArgument in ("-s", "--start_dt"):
                start_dt = currentValue
            elif currentArgument in ("-e", "--end_dt"):
                end_dt = currentValue

        # Print the result
        print(start_dt, end_dt)

Now let's study the code in detail. Here's how we import the libraries we need:

    import sys
    import getopt

Then we define the names of input parameters:

    unixOptions = "s:e:"
    gnuOptions = ["start_dt=", "end_dt="]

- `unixOptions = "s:e:"` defines the names of the parameters in classic Unix style (Unix is a family of operating systems developed in the 1970s). The colon after each letter means that the option takes a value. Although the manner itself is outdated, it's become a tradition, and `getopt.getopt()` requires this argument.
- `gnuOptions = ["start_dt=", "end_dt="]` defines the names of the input parameters in the style of GNU, a Unix-like operating system. The trailing `=` means that the option takes a value.
You can call the script in two different ways, depending on the style of the input parameter names:

    python params_test.py -s '2019-01-01' -e '2019-09-01'
    # or
    python params_test.py --start_dt='2019-01-01' --end_dt='2019-09-01'

We suggest you stick to the second option. It reads better and ensures the compatibility of parameters from program to program, so that you don't mix up the names of variables.
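To see that both calling styles carry the same values, `getopt.getopt()` can be fed the two argument lists directly. This is a small Python 3 sketch; the lists stand in for what `sys.argv[1:]` would contain after the shell has removed the quotes.

```python
import getopt

unixOptions = "s:e:"                    # -s VALUE, -e VALUE
gnuOptions = ["start_dt=", "end_dt="]   # --start_dt=VALUE, --end_dt=VALUE

# What sys.argv[1:] would hold for each of the two calls
short_style = ["-s", "2019-01-01", "-e", "2019-09-01"]
long_style = ["--start_dt=2019-01-01", "--end_dt=2019-09-01"]

short_pairs, _ = getopt.getopt(short_style, unixOptions, gnuOptions)
long_pairs, _ = getopt.getopt(long_style, unixOptions, gnuOptions)

print(short_pairs)  # [('-s', '2019-01-01'), ('-e', '2019-09-01')]
print(long_pairs)   # [('--start_dt', '2019-01-01'), ('--end_dt', '2019-09-01')]
```

The option names differ, which is why the script compares each name against both `-s` and `--start_dt`, but the values are identical.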
Then the script stores the set of input parameters in `argumentList`. In our sample, the input parameters are `start_dt` and `end_dt`. The system automatically reads them into `sys.argv`, which we store in the variable `fullCmdArguments`. When storing it in `argumentList`, we take all the parameters except the very first one (index 0). This is the name of the script, which we won't need:

    fullCmdArguments = sys.argv
    argumentList = fullCmdArguments[1:]

Then the script checks whether the input parameters match the declared format. If they don't, `getopt.getopt()` raises an error and the `sys.exit(2)` command stops the execution of the program. The exit code 2 conventionally means that the error which caused the stop was made in the command line parameters, for example an unknown option or an option given without its value. Launching the script with no parameters at all is not an error here: `start_dt` and `end_dt` simply stay empty.

    try:
        arguments, values = getopt.getopt(argumentList, unixOptions, gnuOptions)
    except getopt.error as err:
        print(str(err))
        sys.exit(2)

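The behaviour of this check can be tried out without leaving Python. In the Python 3 sketch below, the helper `check` is a hypothetical name introduced for illustration; it returns the exit code the script would use instead of actually exiting.

```python
import getopt

unixOptions = "s:e:"
gnuOptions = ["start_dt=", "end_dt="]


def check(argumentList):
    """Return 0 if the parameters parse, 2 if getopt rejects them."""
    try:
        getopt.getopt(argumentList, unixOptions, gnuOptions)
    except getopt.GetoptError as err:  # getopt.error is an alias of this class
        print("rejected:", err)
        return 2
    return 0


print(check(["--start_dt=2019-01-01"]))  # a declared option with a value
print(check(["--unknown=1"]))            # an option that was never declared
print(check(["-s"]))                     # a declared option with its value missing
print(check([]))                         # no parameters at all
```

The first and last calls pass, while the two malformed lists are rejected. A script that must refuse to run without parameters would need its own extra check, for example on empty `start_dt`.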
Then the script runs through all input parameters and distributes their values among its inner variables:

    start_dt = ''
    end_dt = ''
    for currentArgument, currentValue in arguments:
        if currentArgument in ("-s", "--start_dt"):
            start_dt = currentValue
        elif currentArgument in ("-e", "--end_dt"):
            end_dt = currentValue

Finally, the script prints the `start_dt` and `end_dt` values.
Let's launch the script in the command line and pass it test parameters:

    python params_test.py --start_dt='2019-01-01' --end_dt='2019-09-01'

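Result:

    ['--start_dt=2019-01-01', '--end_dt=2019-09-01']
    ('2019-01-01', '2019-09-01')

The first line appears only if the script also prints `argumentList`, and the tuple form of the second line is what Python 2 produces. Under Python 3, `print(start_dt, end_dt)` prints `2019-01-01 2019-09-01`.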
You can pass any input parameters to the script, including numbers, logical values, dates and times, arrays, and filenames. Note that parameters can only be input as strings. It's entirely up to you to convert them to the correct format.
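Converting the received strings might look like the Python 3 sketch below. The helper names `parse_date` and `parse_bool`, and the parameters `limit`, `verbose`, and `ids`, are assumptions made for illustration; the script above only defines `start_dt` and `end_dt`.

```python
from datetime import datetime


def parse_date(text):
    """Turn a 'YYYY-MM-DD' string into a date object."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def parse_bool(text):
    """Treat a few common spellings as True; anything else is False."""
    return text.strip().lower() in ("1", "true", "yes")


start_dt = parse_date("2019-01-01")
end_dt = parse_date("2019-09-01")
limit = int("100")                          # hypothetical numeric parameter
verbose = parse_bool("True")                # hypothetical logical parameter
ids = [int(x) for x in "1,2,3".split(",")]  # hypothetical comma-separated array

# Once converted, the values support arithmetic that strings do not
print((end_dt - start_dt).days)  # 243
```

`datetime.strptime` raises `ValueError` for text that does not match the format, which is a convenient place to stop the script with `sys.exit(2)`.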
Launching a Script From the Command Line in AWS
To make a script work on a virtual machine, you need to transfer it from your local machine.
You need the program `scp` ("secure copy") to copy your files and use them on the virtual machine. From the command line, it's launched as follows (remember "Independent Task 1: Setting Up an AWS Account"):

    scp -i <path_to_private_key> <path_to_a_local_file> ubuntu@<public_dns>:

To transmit the file to the virtual machine, follow these steps:
- Create a file called `test.py` and save it on your computer. Open the file in Sublime Text (or another text editor) and add the following text:

  ```python
  #!/usr/bin/python

  if __name__ == "__main__":
      print('Hello world.')
  ```

- Copy this file onto the virtual machine. If you're working in Linux, Ubuntu, macOS or Windows 10, use the command:

  ```bash
  scp -i <path_to_private_key> <path_to_a_local_file> ubuntu@<public_dns>:
  ```

  The colon is crucial: it means "the user's home directory on the virtual machine." With other operating systems, the only difference is the way the file path is indicated. Be sure that you know how to find the path of a file in your OS.

- Connect to the virtual machine:

  ```bash
  ssh -i <path_to_private_key>/test_pair.pem ubuntu@<public_dns>
  ```

- In the command line of the virtual machine, run the commands to install pip (answer Yes to all questions):

  ```bash
  sudo apt update
  sudo apt install python3-pip
  pip3 --version
  ```

  The last command prints the installed pip version.

- Once you've connected, enter the command `dir` (directory). You'll be shown the contents of your home directory.

- Start the test script with this command:

  ```bash
  python3 test.py
  ```
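If the copy step is ever launched from a Python script rather than typed by hand, the `scp` call can be assembled as a list of arguments. This Python 3 sketch only builds and prints the command. The function name `build_scp_command`, the key path `keys/test_pair.pem`, and the host `ec2-host.example.com` are placeholders invented for this illustration.

```python
import shlex


def build_scp_command(private_key, local_file, public_dns, user="ubuntu"):
    """Assemble the scp call described above as an argument list."""
    # The trailing colon means "the user's home directory on the remote machine"
    return ["scp", "-i", private_key, local_file, "{}@{}:".format(user, public_dns)]


# Placeholder values: replace them with your own key path and public DNS
cmd = build_scp_command("keys/test_pair.pem", "test.py", "ec2-host.example.com")
print(shlex.join(cmd))

# To actually copy the file, the list could be passed to
# subprocess.run(cmd, check=True)
```

Keeping the command as a list avoids quoting problems when a path contains spaces.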
Scheduling Scripts
Let's find out how to make a script adhere to a work schedule.
Unix-like operating systems such as Linux, Ubuntu, and macOS have a special scheduling program called `cron`. It operates invisibly in the background and reads a special `crontab` schedule (tab = table).
Here's a sample `cron` schedule in the terminal:

    5 6 * * 1 python -u -W ignore /home/my_user/script_A.py --start_dt=$(date +\%Y-\%m-\%d\ 00:00:00 -d "1 week ago") >> /home/my_user/logs/script_A_$(date +\%Y-\%m-\%d).log 2>&1
    #15 7 * * * python -u -W ignore /home/my_user/script_B.py --start_dt=$(date +\%Y-\%m-\%d\ 00:00:00 -d "1 week ago") >> /home/my_user/logs/script_B_$(date +\%Y-\%m-\%d).log 2>&1

Let's see what's happening here. First:
`5 6 * * 1` indicates the time the command should run. It consists of five fields: minute, hour, day of the month, month, and day of the week (where 1 is Monday). `*` means any value.
So the first line specifies that the command is to start at 6:05 AM every Monday.
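How the five fields are compared with the clock can be sketched in Python 3. This is a deliberately simplified matcher, not how cron itself is implemented: it understands only `*`, `*/N`, and single numbers, the `*/N` handling is only right for fields that count from zero (minute and hour), and it ignores cron's special rule for lines that restrict both the day of the month and the day of the week. The function names are made up for this example.

```python
from datetime import datetime


def field_matches(field, value):
    """Match one schedule field: '*', '*/N', or a single number."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return int(field) == value


def cron_matches(expression, moment):
    """Return True if `moment` falls on the schedule given by `expression`."""
    minute, hour, day, month, weekday = expression.split()
    if weekday == "7":  # cron accepts both 0 and 7 for Sunday
        weekday = "0"
    cron_weekday = moment.isoweekday() % 7  # Sunday -> 0, Monday -> 1, ...
    return (field_matches(minute, moment.minute)
            and field_matches(hour, moment.hour)
            and field_matches(day, moment.day)
            and field_matches(month, moment.month)
            and field_matches(weekday, cron_weekday))


monday = datetime(2019, 1, 7, 6, 5)    # a Monday, 6:05 AM
tuesday = datetime(2019, 1, 8, 6, 5)   # the next day, same time
print(cron_matches("5 6 * * 1", monday))    # True
print(cron_matches("5 6 * * 1", tuesday))   # False
print(cron_matches("*/5 * * * *", datetime(2019, 1, 7, 6, 10)))  # True
```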
`python -u -W ignore /home/my_user/script_A.py` is the command itself. Here:
- The `-u` flag means that the results of the script's execution won't be buffered (accumulated in the computer's memory). They will be immediately stored in the log file instead. For example, if your script has `print()` commands, their results won't be held in memory, but will go straight into the logs.
- `-W ignore` (ignore warnings) means that any warnings generated when the script runs won't be stored in the log file.
- `/home/my_user/script_A.py` is the name of the script to be scheduled.
- `--start_dt=$(date +\%Y-\%m-\%d\ 00:00:00 -d "1 week ago")` is the script input parameter. Here:
  - `$(date +\%Y-\%m-\%d\ 00:00:00 … )` is the command line expression that allows you to get the current date in the format `'%Y-%m-%d 00:00:00'`. The backslashes before `%` are required in a crontab, because cron treats an unescaped `%` as a line break. To get a better idea of how it works, run the following command in the command line interface:

        echo $(date +\%Y-\%m-\%d\ 00:00:00)

    You'll see the current date on the screen.
  - `-d "1 week ago"` specifies the time interval by which we want to decrease the current date. `script_A.py` seems to be designed so that its input is the date on which the previous week started. It probably collects certain data for the seven preceding days. There are also other ways to define time intervals. For example:
    - `-d "yesterday"` or `-d "1 day ago"`
    - `-d "N days ago"`
    - `-d "N weeks ago"`
    - `-d "1 month ago"`
    - `-d "N months ago"`
    - `-d "1 year ago"`
    - `-d "N years ago"`
- `>> /home/my_user/logs/script_A_$(date +\%Y-\%m-\%d).log` means that all the data the script prints will be appended to the file `script_A_<current date>.log` in `/home/my_user/logs/`. It's very important to save logs if the scripts run automatically; this helps you detect any errors that occur. Without logs you won't have a clear idea how your automation system is actually working.
- `2>&1` means that all the results of the execution of the script (including errors) will be printed in the same place: the log file.
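The same date arithmetic can be reproduced inside Python 3, which is handy for checking what value a script will receive. In this sketch the helper names are invented, and a fixed moment is used instead of the real clock so that the result is predictable.

```python
from datetime import datetime, timedelta


def start_of_day_string(moment):
    """Format a moment the way `date +%Y-%m-%d\\ 00:00:00` does."""
    return moment.strftime("%Y-%m-%d 00:00:00")


def log_file_name(moment, script="script_A"):
    """Build the dated log file name used in the schedule line."""
    return "{}_{}.log".format(script, moment.strftime("%Y-%m-%d"))


now = datetime(2019, 9, 8, 6, 5)     # fixed example moment, not the real clock
week_ago = now - timedelta(weeks=1)  # counterpart of -d "1 week ago"

print(start_of_day_string(week_ago))  # 2019-09-01 00:00:00
print(log_file_name(now))             # script_A_2019-09-08.log
```

`timedelta` covers days and weeks only; it has no month or year unit, so intervals such as "1 month ago" need separate handling.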
The current time is determined by the time zone of the machine where cron runs. The majority of servers work with the UTC+0 time zone. This time zone is generally used when recording the date and time of banking operations and transactions. Try to stick to UTC+0 when analyzing and scheduling scripts in order to avoid discrepancies in report results.
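Inside a script, working in UTC+0 could look like the following Python 3 sketch. The UTC+3 offset and the sample date are arbitrary values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# The current moment expressed in UTC and in the machine's own time zone
utc_now = datetime.now(timezone.utc)
local_now = utc_now.astimezone()
print(utc_now.isoformat())
print(local_now.isoformat())

# A fixed example: 6:05 AM on a machine set to UTC+3 (an arbitrary offset)
local_run = datetime(2019, 9, 2, 6, 5, tzinfo=timezone(timedelta(hours=3)))
as_utc = local_run.astimezone(timezone.utc)
print(as_utc.isoformat())  # 2019-09-02T03:05:00+00:00
```

Both printed values of the current moment describe the same instant; only the offset shown differs. Storing and comparing timestamps in UTC removes that ambiguity from reports.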
Let's see how to edit the `cron` timetable. You can only do it on a local machine with Linux, Ubuntu, or macOS.
First, create a file in Sublime Text with a script that prints 'Hello world' and the current date and time:

    #!/usr/bin/python

    from datetime import datetime

    if __name__ == "__main__":
        print('Hello world: {}'.format(datetime.now()))

We'll save it as `/home/YOUR_USERNAME/cron_test.py`. Then we'll run the following command in the command line:

    python cron_test.py

The script prints `Hello world:` followed by the current date and time.
In the command line we'll create the directory `/home/YOUR_USERNAME/logs` to store logs:

    mkdir /home/YOUR_USERNAME/logs

Then we open the `cron` schedule editor:

    crontab -e

You'll have to choose a text editor the first time you edit your schedule. Type "1" and hit Enter to choose nano as the text editor for crontab. Then you'll see the crontab text.
Add the following line at the end of the file:

    */5 * * * * python -u -W ignore /home/YOUR_USERNAME/cron_test.py >> /home/YOUR_USERNAME/logs/cron_test.log 2>&1

Here `*/5 * * * *` means "running every five minutes."
Press Ctrl+O to save. nano will ask you to confirm the file name; press Enter. Then press Ctrl+X to quit the crontab editor. crontab will report that it is installing the new crontab, which means that the new settings for the cron table have been adopted.
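Once the schedule has been running for a while, the log can be checked for gaps. The Python 3 sketch below works on generated sample lines rather than a real log: the lines are produced with the same format string as `cron_test.py`, and the starting time is arbitrary. The function name `run_times` is an assumption for this example.

```python
from datetime import datetime, timedelta

# Sample lines in the format cron_test.py writes (generated, not a real log)
first_run = datetime(2019, 9, 2, 6, 0, 0, 123456)
log_lines = ["Hello world: {}".format(first_run + timedelta(minutes=5 * i))
             for i in range(3)]


def run_times(lines, prefix="Hello world: "):
    """Extract the timestamp from every line the test script wrote."""
    return [datetime.fromisoformat(line[len(prefix):].strip())
            for line in lines if line.startswith(prefix)]


times = run_times(log_lines)
gaps = [later - earlier for earlier, later in zip(times, times[1:])]
print(gaps)
```

With a real log, the list would come from reading `/home/YOUR_USERNAME/logs/cron_test.log` line by line; gaps much longer than five minutes would point to runs that did not happen.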