Best Practices in Python Bots

This document describes the best practices used by Unimate RPA for building RPA and Analytics bots in Python.

Project Structure

Organize the code in a modular, easy-to-maintain layout such as the following:

project_name/
│
├── project_name/        # Main package
│   ├── __init__.py      # Marks the directory as a Python package
│   ├── main.py          # Main entry point
│   ├── logs/            # Folder for the bot's logs
│   ├── tasks/           # Modules for each bot task (if needed)
│   │   ├── task1.py
│   │   ├── task2.py
│   │   └── ...
│
├── .env                 # Environment variables (credentials and other secrets)
├── config.yaml          # Configuration file (paths, parameters)
├── README.md            # Bot documentation
├── requirements.txt     # Project dependencies
├── launcher.py          # Script that manages the environment and runs the bot

Modular Code and Separation of Responsibilities

Each piece of functionality should live in its own module, keeping main.py free of unnecessary code.

The bot's flow must be clear and must delegate responsibilities to the modules in tasks/.

tasks/task1.py (Task-specific module, reusable logic)

"""
Data processing module.
"""

def process_data(data):
    """
    Process the data and return the result.

    Parameters:
    data (list): List of strings to process.

    Returns:
    list: List of strings in uppercase.
    """
    return [d.upper() for d in data]

main.py (Entry point that orchestrates the bot's flow)

"""
Main entry point of the RPA bot.
"""

import logging
from tasks.task1 import process_data

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def main():
    """Run the bot and handle its main flow."""
    logger.info("Starting bot...")
    data = ["rpa", "automation", "bot"]
    resultado = process_data(data)
    print("Processed data:", resultado)

if __name__ == "__main__":
    main()

Use of requirements.txt and Virtual Environment (venv)

A virtual environment should always be used to avoid conflicts between dependencies.

Creating and activating the virtual environment (Windows):

python -m venv env
env\Scripts\activate

Installation of dependencies:

pip install -r requirements.txt

Example of requirements.txt:

python-dotenv
requests
selenium
pyautogui
pandas

Path Management and Configuration in the Bot

To keep RPA bots flexible and scalable, avoid hardcoded paths and manage paths and configuration in a standard way. This is achieved by combining absolute and relative paths with configuration files (config.yaml) and environment variables (.env).

Important:

  • config.yaml stores paths, general parameters and settings that are not sensitive.

  • .env must always be present, as it stores credentials and sensitive data.

  • The bot should not be executed if .env is not present.

Use of Configuration with config.yaml and .env

Example of config.yaml:

rutas:
    logs: "logs/bot.log"
    entrada: "data/input.xlsx"
    salida: "data/output.xlsx"

parametros:
    modo_debug: true
    max_intentos: 3

Example of .env:

API_KEY=my_secret_key
DB_PASSWORD=super_secure_password

Configuration loading in main.py

import os
import sys

import yaml
from dotenv import load_dotenv

# Load environment variables from .env (the script must fail if .env is missing)
if not os.path.exists(".env"):
    print("Error: .env file not found. The bot cannot run without credentials.")
    sys.exit(1)

load_dotenv()

# Load the configuration from config.yaml
with open("config.yaml", "r", encoding="utf-8") as file:
    config = yaml.safe_load(file)

# Read the parameters from the YAML
LOGS_PATH = config["rutas"]["logs"]
INPUT_FILE = config["rutas"]["entrada"]
OUTPUT_FILE = config["rutas"]["salida"]
DEBUG_MODE = config["parametros"]["modo_debug"]

# Read the credentials from .env
API_KEY = os.getenv("API_KEY")
DB_PASSWORD = os.getenv("DB_PASSWORD")

print(f"Log path: {LOGS_PATH}")
print(f"Input file: {INPUT_FILE}")
print(f"Debug mode: {DEBUG_MODE}")
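
Checking that .env exists does not guarantee that every credential is defined: os.getenv returns None for a variable that is missing. The following is a minimal sketch of an early check, assuming a hypothetical helper named require_env (it is not part of python-dotenv) that is called after load_dotenv():

```python
import os


def require_env(names, environ=None):
    """Return the values of the required variables, or raise if any is missing or empty.

    require_env is a hypothetical helper for illustration. `environ`
    defaults to os.environ and can be replaced by a plain dict in tests.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    return {name: environ[name] for name in names}
```

In main.py this could be used as credentials = require_env(["API_KEY", "DB_PASSWORD"]), so the bot stops at startup instead of failing later with a None credential.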

Environment Management: Development and Production

To handle different environments (development and production), config.yaml can hold a separate configuration for each one, so switching environments requires no change to the source code.

Structure of config.yaml with multiple environments

entorno: "desarrollo"  # Can be "desarrollo" or "produccion" (must match a key under configuracion)

configuracion:
    desarrollo:
        rutas:
            logs: "logs/dev_bot.log"
            entrada: "data/dev_input.xlsx"
            salida: "data/dev_output.xlsx"
        parametros:
            modo_debug: true
            max_intentos: 5

    produccion:
        rutas:
            logs: "logs/prod_bot.log"
            entrada: "data/input.xlsx"
            salida: "data/output.xlsx"
        parametros:
            modo_debug: false
            max_intentos: 3

How to load the configuration into the code

The bot should automatically select the environment configuration defined in config.yaml:

import yaml

with open("config.yaml", "r", encoding="utf-8") as file:
    config = yaml.safe_load(file)

# Get the active environment
entorno_activo = config["entorno"]

# Load the configuration that corresponds to the active environment
configuracion = config["configuracion"][entorno_activo]

# Access the paths and parameters
LOGS_PATH = configuracion["rutas"]["logs"]
INPUT_FILE = configuracion["rutas"]["entrada"]
OUTPUT_FILE = configuracion["rutas"]["salida"]
DEBUG_MODE = configuracion["parametros"]["modo_debug"]

print(f"Environment: {entorno_activo}")
print(f"Log path: {LOGS_PATH}")
print(f"Debug mode: {DEBUG_MODE}")

Change of environment without modifying the code

To switch from development to production, change the entorno line in config.yaml to:

entorno: "produccion"

The environment can also be set through a variable in .env:

ENTORNO=produccion

And in the code:

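import os
from dotenv import load_dotenv

load_dotenv()  # Reads ENTORNO from .env into the process environment

entorno_activo = os.getenv("ENTORNO", "desarrollo")  # Falls back to "desarrollo" if it is not defined

print(f"Environment: {entorno_activo}")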
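
The two mechanisms can be combined so that ENTORNO, when set, overrides the entorno key of config.yaml. The following is a minimal sketch, assuming a hypothetical function select_environment that receives the dictionary already loaded with yaml.safe_load. The precedence order is an assumption of this sketch, not something this guide prescribes:

```python
import os


def select_environment(config, environ=None):
    """Return (name, settings) for the active environment.

    Precedence (an assumption of this sketch): the ENTORNO environment
    variable, then the `entorno` key of config.yaml, then "desarrollo".
    """
    environ = os.environ if environ is None else environ
    name = environ.get("ENTORNO") or config.get("entorno", "desarrollo")
    available = config.get("configuracion", {})
    if name not in available:
        # Fail early with a readable message instead of a bare KeyError
        raise ValueError(
            f"Unknown environment {name!r}; expected one of {sorted(available)}"
        )
    return name, available[name]
```

Validating the name up front also catches misspellings such as an accented "producción", which does not match the produccion key.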

Use of Parameters from the Command Line

Bots can receive external parameters when running, instead of relying only on config.yaml.

When to use argparse instead of config.yaml?

  • config.yaml → For fixed parameters in the configuration.

  • argparse → For values that change on each run.

Example of main.py with argparse:

import argparse

def main(fecha):
    print(f"Processing data for date: {fecha}")

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Run the bot for a specific date.")
    parser.add_argument("--fecha", type=str, required=True, help="Date in YYYY-MM-DD format")
    args = parser.parse_args()
    main(args.fecha)
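
Because --fecha arrives as free text, it can help to validate it while parsing. The following is a minimal sketch that passes a converter function to the type parameter of add_argument; the names parse_fecha and build_parser are hypothetical and only illustrate the idea:

```python
import argparse
from datetime import date, datetime


def parse_fecha(value) -> date:
    """Convert a YYYY-MM-DD string to a date, or raise an argparse error."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"invalid date {value!r}, expected YYYY-MM-DD"
        ) from None


def build_parser():
    """Build the command-line parser with a validated --fecha argument."""
    parser = argparse.ArgumentParser(description="Run the bot for a specific date.")
    parser.add_argument(
        "--fecha", type=parse_fecha, required=True, help="Date in YYYY-MM-DD format"
    )
    return parser
```

With this in place, args.fecha is a datetime.date object, and a value that is not a valid date in that format makes argparse print a usage error and exit before main runs.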

Management of Absolute and Relative Paths

Use of relative paths

Whenever possible, keep the paths in config.yaml relative to the project and resolve them against the project's base directory:

import os

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
input_path = os.path.join(BASE_DIR, "data", "input.xlsx")
print(f"Absolute path of the input file: {input_path}")

When to use absolute paths?

Absolute paths may be necessary in some cases:

  1. Interaction with external programs

    • Example: Storing files in a specific folder that another program reads.

  2. Shared network paths

    • Example: Accessing a file on a shared server \\servidor\carpeta\archivo.xlsx.

Example of how to define absolute paths in config.yaml:

rutas:
    reporte: "C:/Users/Usuario/Documentos/reporte.xlsx"

Example in code:

reporte_path = config["rutas"]["reporte"]
print(f"Absolute path of the report: {reporte_path}")
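
Since config.yaml can mix relative and absolute entries, a single helper can treat both uniformly. The following is a minimal sketch using pathlib, assuming a hypothetical function resolve_path; it is one possible approach, not a prescribed part of the bot:

```python
from pathlib import Path


def resolve_path(base_dir, configured):
    """Resolve a path from config.yaml against the project's base directory.

    Absolute paths (for example, shared folders read by other programs)
    are returned unchanged; relative ones are joined to base_dir.
    """
    path = Path(configured)
    return path if path.is_absolute() else Path(base_dir) / path
```

In main.py it could be called as resolve_path(Path(__file__).resolve().parent, config["rutas"]["entrada"]), so the bot finds its files regardless of the working directory it was started from.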

Logging and Error Handling

Properly capturing and logging errors is critical for debugging and maintaining bots.

Logging configuration

The config.yaml file must define the location of the log file and the logging level:

rutas:
    logs: "logs/bot.log"

logging:
    nivel: "INFO"

Validation of the logging level

To avoid errors caused by invalid values in config.yaml, the level must be validated before it is applied:

import logging
import yaml

with open("config.yaml", "r") as file:
    config = yaml.safe_load(file)

LOGS_PATH = config["rutas"]["logs"]
LOG_LEVEL = config["logging"]["nivel"].upper()

if LOG_LEVEL not in ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]:
    LOG_LEVEL = "INFO"

logging.basicConfig(
    filename=LOGS_PATH,
    level=getattr(logging, LOG_LEVEL, logging.INFO),
    format="%(asctime)s - %(levelname)s - %(message)s"
)

logger = logging.getLogger(__name__)
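
One detail the configuration above does not cover is that logging to a file fails with FileNotFoundError when the target folder (logs/ here) does not exist yet. The following is a minimal sketch that creates the folder first, assuming a hypothetical setup_logger function that could live in the logs/logger.py module imported by main.py:

```python
import logging
from pathlib import Path

VALID_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def setup_logger(log_path, level_name="INFO", name="bot"):
    """Return a logger that writes to log_path, creating its folder if needed."""
    level_name = str(level_name).upper()
    if level_name not in VALID_LEVELS:
        level_name = "INFO"  # Fall back to INFO on invalid values

    log_file = Path(log_path)
    # FileHandler does not create missing folders, so create them here
    log_file.parent.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger(name)
    logger.setLevel(getattr(logging, level_name))
    if not logger.handlers:  # Avoid duplicate handlers if called twice
        handler = logging.FileHandler(log_file, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

A call such as logger = setup_logger(config["rutas"]["logs"], config["logging"]["nivel"]) would then replace the logging.basicConfig call.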

Error Handling in main.py

main.py should catch and log errors with traceback so that the bot does not fail silently.

import traceback
from logs.logger import logger
from tasks.task1 import process_data

def main():
    """Run the bot and handle its main flow."""
    logger.info("Starting bot...")

    try:
        data = ["rpa", "automation", "bot"]
        resultado = process_data(data)
        print("Processed data:", resultado)
    except FileNotFoundError as e:
        logger.error(f"File not found: {e}\n{traceback.format_exc()}")
    except ValueError as e:
        logger.warning(f"Value error: {e}\n{traceback.format_exc()}")
    except Exception as e:
        logger.critical(f"Unexpected error: {e}\n{traceback.format_exc()}")

if __name__ == "__main__":
    main()

Best practices in error handling:

  1. Write errors to the log instead of only printing them.

  2. Differentiate recoverable and fatal errors.

  3. Set the logging level in config.yaml to adjust detail without changing code.
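
One way to separate recoverable from fatal errors is to retry only the former, up to the max_intentos value from config.yaml. The following is a minimal sketch, assuming a hypothetical helper run_with_retries; which exception types count as recoverable (ConnectionError and TimeoutError here) is an assumption that each bot should adapt:

```python
import logging
import time

logger = logging.getLogger(__name__)


def run_with_retries(task, max_intentos=3,
                     recoverable=(ConnectionError, TimeoutError), delay=0.0):
    """Run task(), retrying only when it raises a recoverable error.

    Any other exception is treated as fatal and propagates immediately.
    """
    for intento in range(1, max_intentos + 1):
        try:
            return task()
        except recoverable as exc:
            logger.warning("Attempt %d/%d failed: %s", intento, max_intentos, exc)
            if intento == max_intentos:
                raise  # Out of attempts: let the caller handle it as fatal
            time.sleep(delay)
```

The caller (for example main.py) still wraps the call in its own try/except, so an error that exhausts its retries is logged at the appropriate level.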

Use of launcher.py for Manual and Automatic execution

launcher.py is responsible for:

  • Creating and using the virtual environment.

  • Installing dependencies only if necessary.

  • Running main.py.

Example of launcher.py:

import os
import subprocess
import sys

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
VENV_PATH = os.path.join(BASE_DIR, "env")
REQUIREMENTS_FILE = os.path.join(BASE_DIR, "requirements.txt")
# Interpreter inside the virtual environment (Windows layout)
PYTHON_EXEC = os.path.join(VENV_PATH, "Scripts", "python.exe")

def setup_environment():
    """Create the virtual environment if it does not exist and install the dependencies."""
    if not os.path.exists(VENV_PATH):
        print("Creating virtual environment...")
        subprocess.run([sys.executable, "-m", "venv", VENV_PATH], check=True)

    print("Installing dependencies if needed...")
    subprocess.run([PYTHON_EXEC, "-m", "pip", "install", "-r", REQUIREMENTS_FILE], check=True)

def run_bot():
    """Run the main bot."""
    try:
        # The path is relative to the working directory; adjust it to where main.py lives
        subprocess.run([PYTHON_EXEC, "main.py"], check=True)
    except subprocess.CalledProcessError as e:
        print(f"Error while running main.py: {e}")
        sys.exit(1)

if __name__ == "__main__":
    try:
        setup_environment()
        run_bot()
    except Exception as e:
        print(f"Critical error: {e}")
        sys.exit(1)
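
As written, setup_environment calls pip on every run. To install dependencies only when necessary, one option is to remember a hash of requirements.txt from the last successful install. The following is a minimal sketch, assuming two hypothetical helpers and a marker file whose name and location (for example env/.requirements.sha256) are illustrative:

```python
import hashlib
from pathlib import Path


def _digest(requirements_file):
    """Return the SHA-256 hex digest of the requirements file."""
    return hashlib.sha256(Path(requirements_file).read_bytes()).hexdigest()


def requirements_changed(requirements_file, marker_file):
    """Return True if requirements.txt differs from the last recorded install."""
    marker = Path(marker_file)
    if not marker.exists():
        return True
    return marker.read_text(encoding="utf-8").strip() != _digest(requirements_file)


def record_install(requirements_file, marker_file):
    """Store the hash of requirements.txt after a successful install."""
    Path(marker_file).write_text(_digest(requirements_file), encoding="utf-8")
```

setup_environment could then run pip only when requirements_changed(REQUIREMENTS_FILE, marker) is true and call record_install afterwards.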

Run it manually:

python launcher.py

Scheduling it in Windows:

  • Program: C:\ruta\al\proyecto\env\Scripts\python.exe

  • Arguments: C:\ruta\al\proyecto\launcher.py

  • Start in: C:\ruta\al\proyecto\

Comments and Docstrings

To keep the code clear and understandable, it is important to document it correctly with docstrings and inline comments.

General rules:

  • Use docstrings in functions and classes to describe their purpose.

  • Use inline comments only when a specific step needs explaining.

  • Keep comments up to date, avoiding outdated information that may be misleading.

Example of good docstrings and comments in main.py:

"""
Main entry point of the RPA bot.

This module contains the bot's main logic, including environment
initialization and the execution of specific tasks.
"""

import logging
from tasks.task1 import process_data

logger = logging.getLogger(__name__)

def main():
    """Start the bot's execution."""
    logger.info("Bot started.")
    data = ["rpa", "automation", "bot"]
    resultado = process_data(data)
    print("Processed data:", resultado)

if __name__ == "__main__":
    main()

Example of documentation with parameters and return value in tasks/task1.py:

"""
Data processing module of the bot.
"""

def process_data(data):
    """
    Process the data and convert it to uppercase.

    Parameters:
    data (list): List of strings to process.

    Returns:
    list: List of strings in uppercase.
    """
    return [d.upper() for d in data]

Example of inline comments when needed in main.py:

import traceback

def execute_task():
    """Run a task and handle errors with traceback."""
    try:
        logger.info("Running task...")
        result = 10 / 0  # This raises a ZeroDivisionError
        return result
    except Exception as e:
        # Include the full traceback so the failing line can be located
        logger.error(f"Error in execute_task: {e}\n{traceback.format_exc()}")
        return None

Documentation with README.md

Each bot must include a clear and structured README.md file.

Example of README.md:

# RPA Bot with Python

This bot automates repetitive processes using Python.

## Project Structure

project_name/
│
├── project_name/       # Source code
│   ├── main.py         # Entry point
│   ├── tasks/          # Task modules
│
├── logs/               # Log folder
├── .env                # Environment variables
├── requirements.txt    # Dependencies
├── launcher.py         # Execution script

## Installation and Setup

1. Unzip the package to the desired location
    - Extract the contents of the .zip into a working folder.

2. Run launcher.py
   The launcher.py file takes care of:
    - Creating and activating the virtual environment.
    - Installing the required dependencies.
    - Running main.py.

## Usage

To run the bot manually:
```sh
python launcher.py
```

To schedule it in Windows:
- Program: `C:\ruta\al\proyecto\env\Scripts\python.exe`
- Arguments: `C:\ruta\al\proyecto\launcher.py`
- Start in: `C:\ruta\al\proyecto\`

## Configuration

The bot uses a `.env` file to manage credentials and sensitive settings.

Example of `.env`:
```ini
API_KEY=my_secret_key
DB_PASSWORD=super_secure_password
DB_HOST=localhost
LOG_LEVEL=INFO
```

**Loading the variables in Python:**
```python
from dotenv import load_dotenv
import os

load_dotenv()  # Loads environment variables from .env

API_KEY = os.getenv("API_KEY")
DB_PASSWORD = os.getenv("DB_PASSWORD")
DB_HOST = os.getenv("DB_HOST", "default_host")  # Default value
```

Efficiency and Best Practices in Code

  • Use list comprehensions instead of unnecessary loops:

# Traditional form with a for loop
squared_numbers = []
for num in range(10):
    squared_numbers.append(num**2)

# List comprehension (more concise and usually faster)
squared_numbers = [num**2 for num in range(10)]

  • Use context managers to manage files:

# Bad practice: the file is not closed automatically (it stays open if read() raises)
file = open("archivo.txt", "r")
content = file.read()
file.close()

# Good practice: the context manager closes the file automatically
with open("archivo.txt", "r") as file:
    content = file.read()

  • Avoid fixed waits in RPA and use dynamic strategies:

import time

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# `driver` is an already-created Selenium WebDriver instance

# Bad practice: fixed wait, wastes time
time.sleep(5)  # Always waits 5 seconds even if the element is already available

# Good practice: dynamic wait with WebDriverWait (returns as soon as the element is present, up to 10 seconds)
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "boton")))
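
The same idea applies outside the browser, for example when a bot waits for a downloaded file or for a desktop window to appear. The following is a minimal sketch of a generic polling helper, assuming a hypothetical function wait_until; the default timeout and interval are illustrative values:

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or timeout seconds elapse.

    Returns the truthy value, or raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout} seconds")
        time.sleep(interval)
```

A call such as wait_until(lambda: os.path.exists(OUTPUT_FILE), timeout=30) returns as soon as the file exists instead of always sleeping for a fixed time.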

Conclusion

Applying these best practices helps keep the bots developed with Unimate modular, secure, efficient and easy to maintain.

They set standards for code structure, virtual environment management, dependency management, documentation and automation, supporting scalable and sustainable development.