You may wish to enroll in the course by Ardit Sulce. To be fair to him, I will not reveal the username and password for his ftp.pyclass.com server; to use his resources for practicing and learning data science, you need to enroll in his course.
The code presented here has 5 parts:
- security.crypto, for encrypting and decrypting credentials.
- ftp_cmd, for downloading files from the FTP server.
- init_cred, run first to enter the username and password for the FTP server; it writes a JSON file that stores the encrypted credential filename and the key filename.
- main.py, which executes the code in ftp_cmd.
- network_threads.get_files, which uses threading to download all files.
security.crypto
Contains the functions for encrypting and decrypting. They protect the username and password by storing them in an encrypted text file, which is useful when no vault is available. You will notice I use cryptography.fernet a lot for encryption and decryption because it is easy to use.
Fernet is a recipe that uses AES in CBC mode with a 128-bit key, plus an HMAC for integrity.
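As a quick illustration of why Fernet is convenient, here is a minimal round trip. It is only a sketch and assumes the cryptography package is installed; the secret is a placeholder and is not related to the FTP server.

```python
from cryptography.fernet import Fernet, InvalidToken

# Generate a fresh symmetric key (URL-safe base64 encoding of 32 random bytes).
key = Fernet.generate_key()
cipher = Fernet(key)

# Encrypt a byte string; the result is an opaque token, also bytes.
token = cipher.encrypt(b"example-password")  # placeholder secret

# Decrypting with the same key returns the original bytes.
plain = cipher.decrypt(token)

# A different key cannot decrypt the token: Fernet raises InvalidToken.
other = Fernet(Fernet.generate_key())
try:
    other.decrypt(token)
    wrong_key_rejected = False
except InvalidToken:
    wrong_key_rejected = True
```

The last part is the practical reason to keep the key file safe: without the exact key, the token cannot be read back.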
In this section I break the functions down into bite-size pieces, rather than presenting the entire code with comments, so that the documentation is clearer.
check_key() function
This function checks whether a symmetric key file already exists. If there is no encryption key yet, it creates one and writes it to a file. The caveat is that if the key is lost after it was used to encrypt data, that data is lost forever.
    def check_key(key_file):
        if not exists(key_file):
            key = Fernet.generate_key()
            with open(key_file, "wb") as file:
                file.write(key)
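Because losing or overwriting the key makes the encrypted data unreadable, one defensive variation is to open the key file in exclusive-create mode, so an existing key can never be replaced by accident. The sketch below is not part of the original module: create_key_once and demo.key are hypothetical names, and the key is built with the standard library only (32 random bytes, URL-safe base64 encoded, which is the same shape as a Fernet key) so that it runs without extra packages.

```python
import base64
import os
import tempfile


def create_key_once(key_file):
    """Write a new key to key_file unless the file already exists.

    Returns True if a key was created, False if one was already there.
    """
    # Same shape as a Fernet key: 32 random bytes, URL-safe base64 encoded.
    key = base64.urlsafe_b64encode(os.urandom(32))
    try:
        # Mode "xb" raises FileExistsError instead of overwriting the file.
        with open(key_file, "xb") as file:
            file.write(key)
    except FileExistsError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.key")  # hypothetical file name
    first = create_key_once(path)         # creates the key
    with open(path, "rb") as f:
        original = f.read()
    second = create_key_once(path)        # leaves the existing key alone
    with open(path, "rb") as f:
        unchanged = f.read() == original
```

Compared with an exists() check followed by open(), the exclusive mode also avoids the small window in which two processes could both decide the file is missing.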
prompt_credential()
This function prompts the user for a username and password. getpass() reads the password without echoing it, so nothing is shown on the console while the user types.
The function then returns a dictionary containing the username and password.
    def prompt_credential():
        username = input("Username:")
        password = getpass()
        return {
            "username": username,
            "password": password
        }
use_key(key_file)
This function reads the key bytes from the key file and returns them. It calls check_key() first, so the key file is created if it does not exist yet.
    def use_key(key_file):
        check_key(key_file)
        with open(key_file, "rb") as file:
            key_byte = file.read()
        return key_byte
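encrypt(filename, data, key)
This function encrypts the data and writes it to the specified filename. Note that the key argument is the name of the key file, not the key itself; the key generated by the Fernet recipe is read from that file.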
    def encrypt(filename, data, key):
        key_byte = use_key(key)
        cipher = Fernet(key_byte)
        cipher_text = cipher.encrypt(data.encode('utf-8'))
        with open(filename, "wb") as file:
            file.write(cipher_text)
decrypt(filename, key, convert_to_json=False)
This function decrypts the data read from the encrypted file, which contains the credential dictionary. By default the result is returned as a string; pass convert_to_json=True to parse it into a dictionary.
The decrypted bytes have to be decoded as UTF-8 to become a string.
    def decrypt(filename, key, convert_to_json=False):
        key_byte = use_key(key)
        with open(filename, "rb") as file:
            cipher_text = file.read()
        cipher = Fernet(key_byte)
        plain_text = cipher.decrypt(cipher_text)
        if convert_to_json:
            return dict(json.loads(plain_text.decode('utf-8')))
        else:
            return plain_text.decode('utf-8')
init_credential()
This function gets the username and password, encrypts them, and saves the result as an encrypted file. It is the main function of the module and uses the prompt_credential, check_key and encrypt functions.
init_credential() returns a dictionary with the key filename and the encrypted filename for future use.
    def init_credential():
        cred = prompt_credential()
        key_filename = input("Key file name:")
        cipher_text_filename = input("Filename for encrypted file:")
        print("Creating key {}...\n".format(key_filename))
        check_key(key_filename)
        print("Key {} created...\n".format(key_filename))
        encrypt(cipher_text_filename, json.dumps(cred), key_filename)
        print("Encrypted data into file {}".format(cipher_text_filename))
        return {
            "key_filename": key_filename,
            "encrypted_filename": cipher_text_filename
        }
Entire code in action
    from cryptography.fernet import Fernet
    from os.path import exists
    from getpass import getpass
    import json


    def check_key(key_file):
        if not exists(key_file):
            key = Fernet.generate_key()
            with open(key_file, "wb") as file:
                file.write(key)


    def prompt_credential():
        username = input("Username:")
        password = getpass()
        return {
            "username": username,
            "password": password
        }


    def use_key(key_file):
        check_key(key_file)
        with open(key_file, "rb") as file:
            key_byte = file.read()
        return key_byte


    def encrypt(filename, data, key):
        key_byte = use_key(key)
        cipher = Fernet(key_byte)
        cipher_text = cipher.encrypt(data.encode('utf-8'))
        with open(filename, "wb") as file:
            file.write(cipher_text)


    def decrypt(filename, key, convert_to_json=False):
        key_byte = use_key(key)
        with open(filename, "rb") as file:
            cipher_text = file.read()
        cipher = Fernet(key_byte)
        plain_text = cipher.decrypt(cipher_text)
        if convert_to_json:
            return dict(json.loads(plain_text.decode('utf-8')))
        else:
            return plain_text.decode('utf-8')


    def init_credential():
        cred = prompt_credential()
        key_filename = input("Key file name:")
        cipher_text_filename = input("Filename for encrypted file:")
        print("Creating key {}...\n".format(key_filename))
        check_key(key_filename)
        print("Key {} created...\n".format(key_filename))
        encrypt(cipher_text_filename, json.dumps(cred), key_filename)
        print("Encrypted data into file {}".format(cipher_text_filename))
        return {
            "key_filename": key_filename,
            "encrypted_filename": cipher_text_filename
        }
init_cred.py (run this first)
Run this script to get the username and password from the user.
json.dump serializes a Python object (here, the dictionary returned by init_credential()) to JSON and writes it to a file. It takes the data and a file object, which is j in this script.
    from security.crypto import init_credential
    import json

    crypto_result = init_credential()
    with open("crypto_result.json", "w") as j:
        json.dump(crypto_result, j)
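To make the json.dump step concrete, the sketch below writes a dictionary of the same shape to a temporary file and reads it back with json.load, which is what main.py does later. The file names demo.key and demo.enc are placeholders standing in for whatever the user typed.

```python
import json
import os
import tempfile

# Placeholder values with the same keys that init_credential() returns.
crypto_result = {"key_filename": "demo.key", "encrypted_filename": "demo.enc"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "crypto_result.json")

    # json.dump takes the object first and the open file object second.
    with open(path, "w") as j:
        json.dump(crypto_result, j)

    # json.load is the inverse: it parses the file back into a dictionary.
    with open(path, "r") as j:
        loaded = json.load(j)
```

Only the two file names are stored in this JSON file; the credentials themselves stay in the encrypted file.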
ftp_cmd.py
is the collection of FTP command functions
Currently this script has only one function, which downloads a file.
    from ftplib import FTP
    from os import chdir


    def ftp_downloader(filename, dir, cred_dict, host="ftp.pyclass.com"):
        # Use the with context to automatically close the FTP connection.
        with FTP(host, cred_dict['username'], cred_dict['password']) as ftp_client:
            # FTP command to change the remote working directory.
            ftp_client.cwd(dir)
            # Save the downloads to the specified Windows directory.
            chdir("D:\\temp")
            with open(filename, "wb") as file:
                # retrbinary calls the callback once per received block, so
                # pass the method itself (file.write), not a call (file.write()).
                ftp_client.retrbinary("RETR {}".format(filename), file.write)
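The point about passing file.write rather than file.write() can be shown without an FTP server. In the sketch below, fake_retrbinary is a hypothetical stand-in that only mimics the callback behaviour: it hands each chunk of bytes to whatever callable it receives.

```python
import io


def fake_retrbinary(chunks, callback):
    """Hypothetical stand-in for a download: feed each chunk to callback."""
    for chunk in chunks:
        callback(chunk)


buffer = io.BytesIO()

# Pass the bound method itself. Writing buffer.write() here would call it
# immediately (and fail, because write needs an argument).
fake_retrbinary([b"station", b"-", b"info"], buffer.write)

data = buffer.getvalue()
```

Any callable that accepts a bytes object works as the callback, which is why an open file's write method fits directly.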
Threading to download files
This is a subclass of Thread that downloads one file from the FTP server in its own thread.
    from threading import Thread
    from ftp_cmd import ftp_downloader


    # Subclass of Thread.
    class GetFilesFromFTP(Thread):
        def __init__(self, filename, dir, cred_dict, name=None):
            # Let Thread handle the name, so name=None keeps the default
            # thread name instead of becoming the string "None".
            super().__init__(name=name)
            self.filename = filename
            self.dir = dir
            self.cred_dict = cred_dict

        # Override the run method of the Thread class.
        def run(self):
            print("Downloading from /{}/{}".format(self.dir, self.filename))
            ftp_downloader(self.filename, self.dir, self.cred_dict)
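The same subclass pattern can be tried without a server. In this sketch the Worker class and its sleep are hypothetical stand-ins for the download; what carries over is overriding run(), calling start() rather than run(), and joining each thread before reading its result.

```python
from threading import Thread
from time import sleep


class Worker(Thread):
    """Hypothetical stand-in for GetFilesFromFTP: sleeps instead of downloading."""

    def __init__(self, label, delay, name=None):
        super().__init__(name=name)
        self.label = label
        self.delay = delay
        self.result = None

    def run(self):
        # run() executes in the new thread once start() is called.
        sleep(self.delay)
        self.result = "done {}".format(self.label)


workers = [Worker(label, 0.05, name=label) for label in ("a.txt", "b.txt")]
for w in workers:
    w.start()
for w in workers:
    w.join()  # wait for this specific worker to finish

results = [w.result for w in workers]
names = [w.name for w in workers]
```

Storing the outcome on the instance is one simple way to get a value back from a thread, since run() itself cannot return anything to the caller.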
main.py
runs all scripts together
This script ties everything together: it decrypts the credentials, creates one thread per file from the GetFilesFromFTP subclass, and waits for all of them to finish.
    from network_threads.get_files import GetFilesFromFTP
    from security.crypto import decrypt
    import json
    from time import time
    from os.path import getsize

    # Get the file names so that they can be used for reading the credential.
    with open("crypto_result.json", "r") as j:
        j_data = json.load(j)

    # Grab the credential.
    cred_dict = decrypt(j_data.get("encrypted_filename"), j_data["key_filename"], convert_to_json=True)

    # Files I need to download from the FTP server.
    files = ["data-format.txt", "data-technical-document.txt", "isd-lite-format.pdf",
             "station-info-metadata.txt", "station-info.txt"]

    # For collecting child threads.
    threads = list()

    if __name__ == '__main__':
        start = time()
        # For each file create a thread.
        for file in files:
            t = GetFilesFromFTP(file, "Data", cred_dict, name=file)
            threads.append(t)
            print("Starting thread to download {}".format(file))
            t.start()

        # Join every thread (not just the last one created) before continuing.
        for thread in threads:
            thread.join()
            print("Thread {} is finished, joining back to main thread...".format(thread.name))

        print("All threads finished, total {} seconds...".format(time() - start))
        print("Checking the downloaded file size in bytes...\r")
        for file in files:
            print("{} is {} bytes..\r".format(file, getsize(file)))
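For comparison, the start/join bookkeeping can also be delegated to the standard library's ThreadPoolExecutor. This is an alternative technique, not what main.py uses, and fake_download is a hypothetical stand-in that sleeps instead of contacting the FTP server.

```python
from concurrent.futures import ThreadPoolExecutor
from time import sleep, time


def fake_download(filename):
    """Hypothetical stand-in for ftp_downloader: wait, then report a size."""
    sleep(0.1)
    return filename, len(filename)


files = ["data-format.txt", "station-info.txt", "isd-lite-format.pdf"]

start = time()
# The with block waits for all submitted work before exiting,
# which replaces the explicit join loop.
with ThreadPoolExecutor(max_workers=len(files)) as pool:
    # map returns results in the same order as the input list.
    results = list(pool.map(fake_download, files))
elapsed = time() - start
```

Because the waits overlap, elapsed should be close to one sleep rather than the sum of all three, which is the same effect the threaded download relies on.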
What does the concurrent download look like, and how long does it take?