Automated Data Import
A Python-based system that downloads, extracts, processes, and archives files from FTP servers on a daily schedule.
The script connects to multiple FTP endpoints, downloads ZIP files matching specified criteria, extracts and renames text files,
updates modification dates, and organizes them into target directories. After each run, a detailed log is generated, sent
via e-mail, and archived monthly.
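The download and extraction steps described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the glob-style `pattern`, the `prefix` naming scheme, and the choice to set each file's modification time from its ZIP entry timestamp are all assumptions made for the example.

```python
import fnmatch
import os
import time
import zipfile
from ftplib import FTP
from pathlib import Path


def download_matching_zips(host, user, password, pattern, dest_dir):
    """Download every file in the FTP login directory whose name matches
    `pattern` (a glob such as '*.zip'). Hypothetical helper."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    downloaded = []
    with FTP(host) as ftp:
        ftp.login(user, password)
        for name in ftp.nlst():
            if fnmatch.fnmatch(name, pattern):
                local = dest / Path(name).name
                with open(local, "wb") as fh:
                    ftp.retrbinary(f"RETR {name}", fh.write)
                downloaded.append(local)
    return downloaded


def extract_text_files(zip_path, target_dir, prefix):
    """Extract .txt members into `target_dir`, rename them to
    '<prefix>_<original name>', and set each file's modification time
    to the timestamp stored in the ZIP entry (an assumed policy)."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir() or not info.filename.lower().endswith(".txt"):
                continue
            out = target / f"{prefix}_{Path(info.filename).name}"
            out.write_bytes(zf.read(info))
            # ZIP stores local time as a 6-tuple; pad it for time.mktime.
            mtime = time.mktime(info.date_time + (0, 0, -1))
            os.utime(out, (mtime, mtime))
            written.append(out)
    return written
```

A daily run would call `download_matching_zips` once per FTP endpoint and pass each downloaded archive to `extract_text_files`. Logging, e-mail delivery, and monthly archiving are left out of the sketch.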
AI Chatbot for PDF Manuals
A Telegram chatbot that answers user questions about the content of technical PDF manuals. The bot processes the PDF,
splits it into sections and chunks, and uses semantic search (sentence-transformers, scikit-learn) to find the most relevant fragments.
Answers are generated step by step using the Google Gemini API and delivered via Telegram. Built with Python, langchain, PyMuPDF, and
asyncio for efficient, asynchronous user interaction.
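The chunk-and-retrieve idea behind the semantic search can be illustrated with a small, dependency-free sketch. It is only an approximation of the approach described above: the bot itself uses sentence-transformers embeddings, whereas this example substitutes a toy bag-of-words embedding so it runs without downloading a model. The function names, the chunk sizes, and the `embed` parameter are assumptions for illustration.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, max_words=50, overlap=10):
    """Split text into word windows of `max_words`, each sharing
    `overlap` words with the previous window."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + max_words]) for i in range(0, last_start, step)]


def bag_of_words(text):
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question, chunks, embed=bag_of_words, k=3):
    """Return the k chunks most similar to the question, best first."""
    q_vec = embed(question)
    scored = [(cosine(q_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]
```

With real embeddings, `bag_of_words` and `cosine` would give way to dense vectors from a sentence-transformers model and a similarity routine such as scikit-learn's `cosine_similarity`. The retrieved chunks would then be passed to the language model as context for the answer.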
OCR Protocol Automation
A tool for automated text extraction from scanned protocol documents (PDFs) using the Google Cloud Vision API.
The script classifies document types based on headers, extracts employee names with regular expressions,
and renames files accordingly. This solution supports efficient digital archiving and data extraction from large
batches of scanned documents. Developed in Python with pdf2image and robust error handling.
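Once the OCR text is available, the classification and renaming logic might look something like the sketch below. The header keywords, the `Employee:` label, the name pattern, and the output filename scheme are all hypothetical, since the actual protocol layouts are not described here. The Cloud Vision call itself is omitted, and the OCR result is assumed to arrive as a plain string.

```python
import re

# Hypothetical header keywords mapped to document types.
DOC_TYPES = {
    "TRAINING PROTOCOL": "training",
    "INSPECTION PROTOCOL": "inspection",
}

# Assumed layout: a line such as "Employee: Anna Kowalska".
NAME_RE = re.compile(r"Employee:\s*([A-Z][\w'-]+)\s+([A-Z][\w'-]+)")


def classify(text, header_lines=5):
    """Guess the document type from keywords in the first few lines."""
    header = " ".join(text.splitlines()[:header_lines]).upper()
    for keyword, doc_type in DOC_TYPES.items():
        if keyword in header:
            return doc_type
    return "unknown"


def extract_employee(text):
    """Return (first_name, last_name), or None if no name is found."""
    match = NAME_RE.search(text)
    return (match.group(1), match.group(2)) if match else None


def build_filename(text, fallback_stem):
    """Build a new PDF filename from the OCR text. If no name is found,
    fall back to the original stem so the file is never left unnamed."""
    doc_type = classify(text)
    name = extract_employee(text)
    if name is None:
        return f"{doc_type}_{fallback_stem}.pdf"
    first, last = name
    return f"{doc_type}_{last}_{first}.pdf"
```

Falling back to the original stem keeps files with unreadable headers or names identifiable, so they can be reviewed by hand instead of being silently skipped.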