Python

[Python] Quick sort

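def quick_sort(nums, left=None, right=None):
    """Sort nums[left..right] in place (the whole list by default)."""
    def _partition(nums, left, right, pivot_idx):
        # Lomuto partition: park the pivot at the right end, then sweep left to right
        pivot = nums[pivot_idx]
        nums[pivot_idx], nums[right] = nums[right], nums[pivot_idx]
        tmp_idx = left  # next slot for a value <= pivot
        for idx in range(left, right):
            if nums[idx] <= pivot:
                nums[tmp_idx], nums[idx] = nums[idx], nums[tmp_idx]
                tmp_idx += 1
        # Move the pivot into its final sorted position
        nums[tmp_idx], nums[right] = nums[right], nums[tmp_idx]
        return tmp_idx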

    if left is None:
        left = 0
    if right is None:
        right = len(nums)-1
    if left >= right:
        return

    new_pivot_idx = _partition(nums, left, right, left)
    quick_sort(nums, left, new_pivot_idx-1)
    quick_sort(nums, new_pivot_idx+1, right)


# inputs = input()
# inputs = [int(c) for c in inputs.split()]
inputs = [3, 5, 1, 2, 6, 4, 7]
quick_sort(inputs)
print(inputs)

[Python] Multiprocessing task with tqdm

tqdm can show progress for a multiprocessing pool when it wraps the lazy iterator returned by imap.

import multiprocessing as mp
from tqdm import tqdm

def parallel_work(args):
    # Unpack the argument dict; do_things stands in for your own function
    my_arg_a = args['my_arg_a']
    my_arg_b = args['my_arg_b']
    result = do_things(my_arg_a, my_arg_b)
    return result

if __name__ == '__main__':
    # Build a list of dicts, one per task, holding the desired arguments
    parallel_work_args = [{'my_arg_a': 'a', 'my_arg_b': 'b'}, ...]

    # If you run into deadlocks, try the 'spawn' start method for child processes:
    # with mp.get_context("spawn").Pool(mp.cpu_count()) as pool:
    with mp.Pool(mp.cpu_count()) as pool:
        # Run parallel_work on each argument dict.
        # list() drains the lazy imap iterator so the bar advances;
        # total is required because that iterator has no length.
        results = list(tqdm(pool.imap(parallel_work, parallel_work_args), total=len(parallel_work_args)))

[Python] Key points of the import mechanism

  • https://docs.python.org/3/reference/import.html#package-relative-imports
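  • A script is not the same as a module; a script is usually run from outside the package.
  • Inside a package, sibling modules must import each other with either a relative import or an absolute import. For example:
package/A.py:
import package.B
from . import B
from package.B import func
from .B import func
# Using only "from B import func" fails with a "module not found" error
  • When a script is executed, its directory is added to sys.path, so scripts at the same level can import each other directly.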
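
The points above can be checked with a small experiment. The sketch below is only an illustration under stated assumptions: it writes a throwaway package into a temporary directory (the names demo_pkg, A_relative, and A_bare are hypothetical, not from these notes), puts only the package's parent directory on sys.path, and then imports one module that uses a relative import and one that uses the bare form.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root):
    """Create demo_pkg/ containing B, A_relative, and A_bare under root."""
    pkg = Path(root) / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "B.py").write_text("def func():\n    return 'hello from B'\n")
    # Relative import: resolved against the package that contains the module.
    (pkg / "A_relative.py").write_text("from .B import func\n")
    # Bare import: searches sys.path for a top-level module named B.
    (pkg / "A_bare.py").write_text("from B import func\n")


def try_import(module_name):
    """Import module_name and describe the outcome as a string."""
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        return f"failed: {exc}"
    return f"ok: {module.func()}"


with tempfile.TemporaryDirectory() as root:
    build_demo_package(root)
    sys.path.insert(0, root)  # only the parent of demo_pkg is importable
    importlib.invalidate_caches()
    try:
        relative_result = try_import("demo_pkg.A_relative")
        bare_result = try_import("demo_pkg.A_bare")
    finally:
        sys.path.remove(root)

print(relative_result)
print(bare_result)
```

The bare form fails here because the package directory itself is not on sys.path; it would only succeed if A_bare.py were run directly as a script from inside that directory.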

[Python] Gunicorn

Basic usage

gunicorn -k gevent -t 300 -w 4 -b 0.0.0.0:9500 main:app

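  • -k gevent: gevent worker class, for asynchronous request handling
  • -t 300: worker timeout of 300 seconds
  • -w 4: 4 worker processes
  • -b 0.0.0.0:9500: listen on port 9500 on all interfaces
  • main:app: the app object in the main module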
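
The main:app argument in the command names a module (main) and a WSGI callable inside it (app). These notes do not show that file, so the following is only a minimal sketch of what such a module might contain, written as a plain WSGI function rather than a Flask app; the response body is made up for illustration.

```python
# main.py (hypothetical file matching "main:app" in the command)


def app(environ, start_response):
    """Minimal WSGI callable: answer every request with a plain-text body."""
    body = b"Hello from Gunicorn\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

A Flask application object can be used the same way, since it is also a WSGI callable.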

Deploy with nginx

/etc/systemd/system/myapi.service

[Unit]
Description=Gunicorn instance for flask API
Requires=myapi.socket
After=network.target

[Service]
User=hubert
Group=hubert
WorkingDirectory=/home/hubert/projectdir
ExecStart=/home/hubert/miniconda3/bin/gunicorn -k gevent -t 300 -w 2 myapp:app
ExecReload=/bin/kill -s HUP $MAINPID
KillMode=mixed
TimeoutStopSec=5
PrivateTmp=true

[Install]
WantedBy=multi-user.target

/etc/systemd/system/myapi.socket

[Unit]
Description=Gunicorn socket for my API.

[Socket]
ListenStream=/run/myapi.sock
# Our service won't need permissions for the socket, since it
# inherits the file descriptor by socket activation
# only the nginx daemon will need access to the socket
SocketUser=nginx
# Optionally restrict the socket permissions even more.
# SocketMode=600

[Install]
WantedBy=sockets.target

nginx site conf file

server {
    listen 9800;
    location / {
        proxy_pass http://unix:/run/myapi.sock;
    }
}

[Python] Multiprocessing example

import multiprocessing as mp

def task(x):
    # Do something
    return x

if __name__ == '__main__':
    cpu_count = mp.cpu_count()
    with mp.Pool(cpu_count) as pool:
        results = []
        # Using apply_async: submit tasks one by one, then collect each result
        for i in range(cpu_count):
            result = pool.apply_async(task, args=(i,))
            results.append(result)
        for result in results:
            print(result.get())

        # Using map_async: submit the whole iterable and get a list back
        result = pool.map_async(task, range(cpu_count))
        print(result.get())

[Python] unittest template for python script

import unittest
import subprocess

def execute_command(command):
    """Run a command (given as a list of arguments) and return its stripped stdout."""
    return subprocess.run(command, stdout=subprocess.PIPE).stdout.decode('utf-8').strip()

class TestTemplate(unittest.TestCase):
    def test_case1(self):
        result = 1
        expected_result = 1
        self.assertEqual(result, expected_result)

if __name__ == '__main__':
    unittest.main(verbosity=2)
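
The template defines execute_command but never calls it. Below is a minimal sketch of how it might be used to check what a script prints. The add.py script, its contents, the TestScriptOutput class, and the run_tests helper are all hypothetical, and check=True is an addition so that a failing script raises instead of silently returning empty output.

```python
import subprocess
import sys
import tempfile
import unittest
from pathlib import Path


def execute_command(command):
    """Run a command (list of arguments) and return its stripped stdout."""
    completed = subprocess.run(command, stdout=subprocess.PIPE, check=True)
    return completed.stdout.decode("utf-8").strip()


class TestScriptOutput(unittest.TestCase):
    def test_script_prints_sum(self):
        with tempfile.TemporaryDirectory() as tmp:
            # Hypothetical script under test: prints the sum of two arguments.
            script = Path(tmp) / "add.py"
            script.write_text(
                "import sys\nprint(int(sys.argv[1]) + int(sys.argv[2]))\n"
            )
            # sys.executable runs the script with the current interpreter.
            output = execute_command([sys.executable, str(script), "2", "3"])
        self.assertEqual(output, "5")


def run_tests():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestScriptOutput)
    return unittest.TextTestRunner(verbosity=2).run(suite)


if __name__ == "__main__":
    run_tests()
```

Passing the command as a list avoids shell quoting problems, and using sys.executable keeps the script on the same interpreter as the tests.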

[Python] Send an email

import smtplib

# user settings
email_addr = 'your@email'
email_pass = 'yourpassword' # you might need to use an application password
msg = "your message"

# send simple email
server = smtplib.SMTP('smtp.gmail.com', 587)
server.starttls()
server.login(email_addr, email_pass)
server.sendmail(email_addr, email_addr, msg)
server.quit()
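
The snippet above sends a bare string, so the mail arrives without Subject, From, or To headers. One possible variant, sketched here with the standard-library EmailMessage class, builds those headers first. The helper names build_message and send_message and the subject text are hypothetical, and the actual send is left commented out because it needs real credentials.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a message carrying the headers that mail clients expect."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    return message


def send_message(message, password, host="smtp.gmail.com", port=587):
    """Send the message over STARTTLS; the connection closes on exit."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(message["From"], password)
        server.send_message(message)


message = build_message("your@email", "your@email", "Test subject", "your message")
print(message["Subject"])
# send_message(message, "yourpassword")  # uncomment with real credentials
```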

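[Python] Basic virtual environment usage

Create a virtual environment

python -m venv mytargetdir

Activate the virtual environment

source mytargetdir/bin/activate

Deactivate the virtual environment

deactivate

Export the environment:

pip freeze > requirements.txt

Import the environment: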

pip install -r requirements.txt
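
A virtual environment can also be created from Python itself through the standard-library venv module. This is a minimal sketch, assuming a throwaway temporary directory; the create_env helper is hypothetical, and with_pip=False is chosen only to keep the example fast.

```python
import tempfile
import venv
from pathlib import Path


def create_env(target_dir, with_pip=False):
    """Create a virtual environment in target_dir and return its config path.

    with_pip=False corresponds to `python -m venv --without-pip target_dir`;
    pass with_pip=True to mirror the default command-line behaviour.
    """
    venv.create(target_dir, with_pip=with_pip)
    return Path(target_dir) / "pyvenv.cfg"


with tempfile.TemporaryDirectory() as tmp:
    cfg_path = create_env(Path(tmp) / "mytargetdir")
    env_created = cfg_path.is_file()  # pyvenv.cfg marks a venv directory

print(env_created)
```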

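*For conda, use the following commands:

# Create a new virtual env:
conda create -n yourenvname python=x.x

# Create a virtual environment from an environment file
conda env create -f environment.yml

# List all envs:
conda info -e

# Activate the virtual environment
conda activate yourenvname

# Deactivate the virtual environment
conda deactivate

# Remove the virtual environment:
conda env remove -n yourenvname

# Export the virtual environment:
conda env export > environment.yml

# Install packages from an environment file
conda env update -n yourenvname --file environment.yml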

[Python] Big5 and utf-8

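On Chinese-language Windows, the cmd console defaults to Big5 (cp950), while Python 3 source code defaults to UTF-8 (cp65001). If output fails with "UnicodeEncodeError: 'cp950' codec can't encode character ...", declare the source encoding at the top of the program (UTF-8 is already the Python 3 default, so this line mainly makes it explicit):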

# -*- coding: utf-8 -*-

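The declaration does not change the console's encoding. If the program prints to the console, replace the characters the console cannot encode:

import sys
print(mytext.encode(sys.stdout.encoding, "replace").decode(sys.stdout.encoding))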

If writing to a file, specify the encoding argument in open:

out_file = open("File.txt","w",encoding="utf-8")
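
To see what the encode/decode round trip with "replace" does, here is a small sketch that fixes the target encoding to cp950 instead of reading it from the console. The helper name to_console_safe and the sample string are hypothetical.

```python
def to_console_safe(text, encoding):
    """Replace characters that `encoding` cannot represent with '?'."""
    return text.encode(encoding, "replace").decode(encoding)


# Chinese text (encodable in cp950) followed by an emoji (not encodable).
sample = "中文 \U0001F600"
safe = to_console_safe(sample, "cp950")

# Compare instead of printing the text, so this runs on any console.
print(safe == "中文 ?")
```

The Chinese characters survive because Big5 covers them, while the emoji is swapped for a question mark instead of raising UnicodeEncodeError.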