I enrolled in this program to build a stronger technical foundation for working with AI products — to understand how models are built and how they can be applied to real problems.
During the program I built several projects:
Computer Vision Model for Retinopathy Detection
Together with my team, I built a CNN trained to detect retinopathy of prematurity from fundus images. It was a joint research project with a regional ophthalmology center, which provided domain expertise and helped with data labeling.
I personally worked across the full pipeline: setting up the labeling process in CVAT, building the dataset, applying data augmentation, fine-tuning EfficientNet for binary classification, and using Grad-CAM for decision visualization.
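To make the Grad-CAM step concrete, here is a toy sketch of its core arithmetic in plain Python. It is not the project's code. In practice the activations and gradients come from a forward and backward pass through the network's last convolutional layer, whereas here they are passed in directly as nested lists, and the function name is hypothetical.

```python
def grad_cam(activations, gradients):
    """Combine feature maps into one heatmap, weighting each by its mean gradient.

    activations, gradients: lists of K feature maps, each an H x W nested list.
    (Toy inputs; a real pipeline would take these from the CNN itself.)
    """
    height, width = len(activations[0]), len(activations[0][0])
    # One weight per channel: the global average of that channel's gradients
    weights = [sum(map(sum, g)) / (height * width) for g in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for w, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w * fmap[i][j]
    # ReLU keeps only the regions that push the target-class score up
    return [[max(0.0, v) for v in row] for row in cam]
```

The resulting heatmap is usually upsampled to the image size and overlaid on the fundus image, which is what makes the model's decision inspectable.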
The model reached 90% recall and 90% accuracy, and we published the work in a peer-reviewed journal. A follow-up study by our colleagues, building on our findings, reached 97% recall and 97% accuracy on the same task.
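For readers less familiar with these metrics, the sketch below shows how recall and accuracy are derived from confusion-matrix counts. The counts in the usage note are made up to illustrate the arithmetic and are not data from the study.

```python
def recall(tp: int, fn: int) -> float:
    """Share of actual positive cases that the model caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all predictions, positive and negative, that were correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0
```

With hypothetical counts of 45 true positives, 5 false negatives, 45 true negatives, and 5 false positives, both functions return 0.9. For a screening task, recall is the more important of the two, because a missed case costs more than a false alarm.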
Solar Energy Forecasting Model
A research project aimed at predicting power output for photovoltaic stations three days ahead. Since energy generation depends on Global Horizontal Irradiance (GHI), the core task was forecasting GHI using historical weather data, the current date, and prior-week weather readings.
The team explored three architectures — LSTM, MLP, and XGBoost — to find the best fit; I built the MLP model. We achieved multi-day forecasts, but the work also surfaced limitations: the models struggled with weather anomalies, and relying on historical data instead of satellite imagery limited overall accuracy.
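The inputs described above (the date plus the prior week of readings) can be turned into one training sample roughly as follows. This is a minimal sketch under assumptions, not the project's code: it uses daily GHI values only, a 7-day lookback, a 3-day horizon, and a cyclic encoding of the day of year. All function and parameter names are hypothetical.

```python
import math
from datetime import date


def encode_day_of_year(d: date) -> tuple[float, float]:
    """Map the date onto a circle so 31 Dec and 1 Jan end up close together."""
    angle = 2 * math.pi * (d.timetuple().tm_yday - 1) / 365.25
    return math.sin(angle), math.cos(angle)


def build_sample(history: list[float], index: int, d: date,
                 lookback: int = 7, horizon: int = 3):
    """Return (features, targets) for one forecast origin, or None if out of range.

    history: daily GHI values, oldest first (assumed input format)
    index:   position in history of the day the forecast is made
    d:       calendar date of that day
    """
    if index + 1 < lookback or index + horizon >= len(history):
        return None
    window = history[index - lookback + 1 : index + 1]   # prior-week readings
    targets = history[index + 1 : index + 1 + horizon]   # next three days
    return list(encode_day_of_year(d)) + window, targets
```

A feature vector of this shape could feed any of the three architectures. An LSTM would typically consume the window as a sequence rather than as a flat list.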
Legal Document Checker Web App
A chatbot that analyzes contracts, highlights risky clauses in plain language, and answers questions about the document. It supports PDF, DOCX, and scanned JPG uploads.
Pipeline: OCR for JPGs and PDFs → Yandex LLM with a prompt → structured risk summary + context window management for follow-up Q&A.
The main challenges were connecting several models into one pipeline, writing the right prompts, and storing context so users could ask multiple questions about the same document.
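One way to handle the context-storage problem is to always keep the document's risk summary in the prompt and add as many recent question-answer turns as fit within a budget. The sketch below illustrates that idea only. It is not the app's code: the class name, the budget value, and the word-count stand-in for a tokenizer are all assumptions.

```python
from dataclasses import dataclass, field


def rough_token_count(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


@dataclass
class DocumentChat:
    summary: str                                 # risk summary, always kept
    budget: int = 200                            # hypothetical prompt budget
    turns: list = field(default_factory=list)    # (question, answer) pairs

    def add_turn(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))

    def build_prompt(self, new_question: str) -> str:
        """Keep the summary plus the newest turns that still fit in the budget."""
        used = rough_token_count(self.summary) + rough_token_count(new_question)
        kept = []
        for q, a in reversed(self.turns):        # walk from newest to oldest
            cost = rough_token_count(q) + rough_token_count(a)
            if used + cost > self.budget:
                break
            kept.append((q, a))
            used += cost
        lines = [f"Document summary: {self.summary}"]
        for q, a in reversed(kept):              # restore chronological order
            lines += [f"User: {q}", f"Assistant: {a}"]
        lines.append(f"User: {new_question}")
        return "\n".join(lines)
```

Dropping the oldest turns first keeps follow-up questions coherent while preventing the prompt from growing without limit. A production version would count tokens with the model's own tokenizer.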
Speech-to-Text Web App
Upload any audio file and get a clean transcript back. Built on OpenAI Whisper for multilingual transcription and deployed on Streamlit. The app still works.