Oleh O, AI/ML Engineer and Data Scientist

Data Science (7.0 yr.), AI and Machine Learning (4.0 yr.)
English: B2 (Upper-Intermediate)
Seniority: Senior (5-10 years)
Location: Ukraine (UTC+02:00)

Summary

* Data Scientist with a Master’s Degree in Computer Science and extensive experience in machine learning, deep learning, and cloud services (Azure, AWS, and GCP).
* 7 years of experience, proficient in building ML pipelines and deploying scalable solutions.
* Developed and deployed a voice-to-voice pipeline for a call center VoIP system using Whisper, Llama, and MMS-TTS, including API integrations and Docker deployment.
* Implemented a Google Doc AI Wrapper, improving document recognition accuracy through advanced preprocessing techniques.
* Designed a deforestation detection system using clustering and Azure-based services to monitor forest areas.

Main Skills

Python

Azure

TensorFlow

PyTorch

AI & Machine Learning

Amazon Lex, AWS ML (Amazon Machine Learning services), Google Document AI, Keras, Kubeflow, Linear models, LLaMA, ollama, OpenCV, OpenVoice, PCA, Prompt Engineering, PyTorch, spaCy, TensorFlow, Transformer models, vLLM, Whisper

Programming Languages

Python Libraries and Tools

Keras, Matplotlib, NLTK, Pandas, Pillow, Pvporcupine, PyTorch, Seaborn, TensorFlow

Mobile Frameworks and Libraries

KVO

JavaScript Libraries and Tools

p5.js

Data Analysis and Visualization Technologies

Decision Tree, ETL, Jupyter Notebook, Pandas

Cloud Platforms, Services & Computing

AWS, Azure

Amazon Web Services

AWS Boto3, AWS LightSail, AWS ML (Amazon Machine Learning services)

Azure Cloud Services

Azure

Google Cloud Platform

GCE GCP BigQuery Google Docs

SDK / API and Integrations

API, Google API

Third Party Tools / IDEs / SDK / Services

Asterisk, Sublime Text

iOS Libraries and Tools

Core Audio

Virtualization, Containers and Orchestration

Docker, GCE

Web/App Servers, Middleware

MSCS (Microsoft Cluster Server)

Platforms

Mail / Network Protocols / Data transfer

VoIP

Other Technical Skills

MIP, MIPS, MLP, MMS, MMS-TTS, PELCO-D, Rhino Grasshopper
ID: 400-264-500
Last Updated: 2025-02-27

Work Experience

Caller - VoIP server for call center

Implemented a Hebrew voice-to-voice pipeline for a call center VoIP server.

Responsibilities:

  • Implemented a voice-to-voice pipeline: Whisper (STT) → Llama (text-only LLM) → MMS-TTS (TTS) → OpenVoice (voice cloning),
  • Implemented an API to interact with the models and integrate them with the VoIP server,
  • Prepared datasets for training text-to-speech and speech-to-text models,
  • Trained a Whisper model,
  • Deployed the models using Docker and created Dockerfiles for them,
  • Prompt-engineered Llama,
  • Configured the Asterisk VoIP server.

Tools and technologies: Docker, AWS, ollama, vLLM, Llama, Whisper, MMS-TTS, OpenVoice, Prompt Engineering, Asterisk

Google Doc AI Wrapper

Implemented a wrapper around the Google Document AI service to improve its accuracy in recognizing freeform documents.

Responsibilities:

  • Used image preprocessing techniques to improve document recognition by the Google service,
  • Parsed recognized document data and searched for the desired form fields, tables, and other valuable data,
  • Implemented processing and filtering of recognized table data,
  • Implemented an optimization function that combines different image preprocessing methods to find the combination that yields the most recognized target data.

Tools and technologies: Google Document AI, OpenCV, Pillow, Pandas
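
The search over preprocessing combinations mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the project's actual code: the function name `best_preprocessing_combo`, the `steps` mapping, and the `score` callback are all hypothetical. In a real setup each step would be an image transform (for example an OpenCV or Pillow operation) and `score` would send the processed image to the recognition service and count how many of the target fields were found.

```python
from itertools import combinations
from typing import Any, Callable, Dict, Optional, Tuple


def best_preprocessing_combo(
    image: Any,
    steps: Dict[str, Callable[[Any], Any]],
    score: Callable[[Any], float],
    max_steps: Optional[int] = None,
) -> Tuple[Tuple[str, ...], float]:
    """Exhaustively try subsets of preprocessing steps and keep the best one.

    Steps inside a subset are applied in the order they appear in `steps`.
    The unprocessed image is the baseline, so an empty tuple is returned
    when no combination beats it.
    """
    names = list(steps)
    limit = len(names) if max_steps is None else min(max_steps, len(names))

    best_names: Tuple[str, ...] = ()
    best_score = score(image)  # baseline: no preprocessing at all

    for size in range(1, limit + 1):
        for combo in combinations(names, size):
            processed = image
            for name in combo:
                processed = steps[name](processed)
            current = score(processed)
            if current > best_score:  # strict: keep the earliest, smallest winner
                best_names, best_score = combo, current

    return best_names, best_score
```

The search is exhaustive, so its cost grows as 2^n in the number of steps; `max_steps` caps the subset size when each scoring call is an expensive (and possibly billed) API request.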

Asset tracking with camera (POC)

Tracking asset location: real-time positioning and collecting information about cargo and handling equipment using machine vision.

Responsibilities:

  • Trained a container segmentation model and added OCR,
  • Created a pipeline for localizing an object on a map using a depth camera,
  • Implemented an optimization function that combines different image preprocessing methods.

Tools and technologies: Google Vision API, OAK-D (OpenCV AI Kit), TensorFlow, Keras, Unity3D (synthetic data generation, visualization)

LiveScetchScanner

Implemented a camera image and mask processing module and a voice command interface.

Responsibilities:

  • Extracted the relevant image region based on a mask from a neural network and a Hough line detector,
  • Image transformation, processing, and filtering,
  • Implemented a voice command interface using the Amazon Lex bot API and the Rhino model.

Tools and technologies: OpenCV, PyAudio, Pvporcupine, Rhino, Amazon Lex, boto3

Meowtalk

App that translates cats’ meows.

Responsibilities:

  • Created Docker containers with the code needed for model training,
  • Set up a training pipeline for the Meowtalk model using Kubeflow Pipelines on GCP,
  • Modified data preprocessing to improve the model's accuracy.

Tools and technologies: Docker, Kubeflow Pipelines, GCP, Keras

Creating a model to predict the playoff results of the World Cup

Responsibilities:

  • Data analysis, preprocessing, data cleaning
  • Search for additional data
  • Feature development based on all available data
  • Dataset formation for training models
  • Selection of models (Linear models, Decision trees, MLP) for prediction and analysis of results

Tools and technologies used: scikit-learn, PCA, Linear models, Decision trees, MLP

Meeting summarizer

Implemented a model for creating summaries of audio meetings, based on an article.

Responsibilities and achievements:

  • Researched existing abstractive text summarization methods
  • Implemented a tree-based method

Tools and technologies used: Python, Azure, spaCy, NLP (abstractive text summarization), MIP

Deforestation detection

Developed a model for deforestation detection on the territory of Ukraine.

Achievements:

  • Collected a dataset of map fragments: wrote scripts to work with the API, searched for the necessary fragments by coordinates and date, filtered them by cloud cover and fragment type, and downloaded them
  • Created a dataset for network training by clustering images (k-means, image clustering), searching for outliers / anomalies, and then selecting images from different clusters
  • Implemented a forest watch service based on Azure Functions
  • Implemented a model for deforestation detection.

Tools and technologies used: Rasterio, MS Azure, image clustering
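
The "cluster, then pick images from different clusters" step above can be sketched as follows. This is a minimal, standard-library illustration and not the project's code: the names `kmeans` and `sample_across_clusters` are hypothetical, each image is assumed to have already been reduced to a small numeric feature vector, and the outlier / anomaly search is left out.

```python
import random
from typing import Dict, List, Sequence

Vector = Sequence[float]


def _sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Vector], k: int, iters: int = 50, seed: int = 0) -> List[int]:
    """Plain k-means (Lloyd's algorithm); returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(list(points), k)]
    labels: List[int] = []

    for _ in range(iters):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: _sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    return labels


def sample_across_clusters(labels: Sequence[int], per_cluster: int) -> List[int]:
    """Return indices of up to `per_cluster` items from every cluster."""
    taken: Dict[int, int] = {}
    picked: List[int] = []
    for index, label in enumerate(labels):
        if taken.get(label, 0) < per_cluster:
            picked.append(index)
            taken[label] = taken.get(label, 0) + 1
    return picked
```

Taking a fixed number of items per cluster keeps visually similar fragments from dominating the training set; a library implementation such as scikit-learn's `KMeans` would normally replace the hand-written loop.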

Education

  • Master’s Degree in Computer Science, National University