Oleh O, AI/ML Engineer and Data Scientist

Data Science (7.0 yr.), AI and Machine Learning (4.0 yr.)

Summary

* Data Scientist with a Master’s Degree in Computer Science and extensive experience in machine learning, deep learning, and cloud services (Azure, AWS, and GCP).
* 7 years of experience, proficient in building ML pipelines and deploying scalable solutions.
* Developed and deployed a voice-to-voice pipeline for a call center VoIP system using Whisper, Llama, and MMS-TTS, including API integrations and Docker deployment.
* Implemented a Google Doc AI Wrapper, improving document recognition accuracy through advanced preprocessing techniques.
* Designed a deforestation detection system using clustering and Azure-based services to monitor forest areas.

Work Experience

Caller - VoIP server for call center

Implemented a Hebrew voice-to-voice pipeline for a call center VoIP server.

Responsibilities:

Implemented a voice-to-voice pipeline: Whisper (STT) - Llama (text-only) - MMS-TTS (TTS) - OpenVoice (voice cloning),
Implemented an API to interact with the models and integrate them with the VoIP server,
Prepared datasets for training text-to-speech and speech-to-text models,
Trained a Whisper model,
Deployed the models using Docker and created Dockerfiles for them,
Performed prompt engineering for Llama,
Configured the Asterisk VoIP server.

Tools and technologies: Docker, AWS, Ollama, vLLM, Llama, Whisper, MMS-TTS, OpenVoice, Prompt Engineering, Asterisk
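
The stage order listed above (STT, text-only LLM, TTS, voice cloning) can be illustrated with a minimal sketch. The `VoicePipeline` class and its four stage callables are hypothetical names introduced here, not the project's actual code; the stub stages only stand in for Whisper, Llama, MMS-TTS, and OpenVoice so the example runs without any model installed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains four stages; every callable is a hypothetical placeholder."""
    transcribe: Callable[[bytes], str]     # speech-to-text (e.g. a Whisper wrapper)
    respond: Callable[[str], str]          # text-only LLM reply (e.g. a Llama wrapper)
    synthesize: Callable[[str], bytes]     # text-to-speech (e.g. an MMS-TTS wrapper)
    clone_voice: Callable[[bytes], bytes]  # voice conversion (e.g. an OpenVoice wrapper)

    def run(self, caller_audio: bytes) -> bytes:
        text = self.transcribe(caller_audio)
        reply = self.respond(text)
        speech = self.synthesize(reply)
        return self.clone_voice(speech)


# Stub stages: audio is treated as UTF-8 text purely for demonstration.
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    respond=lambda text: f"You said: {text}",
    synthesize=lambda text: text.encode("utf-8"),
    clone_voice=lambda audio: audio,
)
result = pipeline.run("shalom".encode("utf-8"))
```

Keeping each stage behind a plain callable means a model can be swapped (for example, a different TTS engine) without touching the code that talks to the VoIP server.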

Google Doc AI Wrapper

Implemented a wrapper for the Google Document AI service to improve its accuracy in recognizing freeform documents.

Responsibilities:

Used image preprocessing techniques to improve document recognition by the Google service,
Parsed recognized document data and searched for the desired form fields, tables, and other valuable data,
Implemented processing and filtering of recognized table data,
Implemented an optimization function that combines different image preprocessing methods to find the combination that yields the most of the target data.

Tools and technologies: Google Document AI, OpenCV, Pillow, pandas
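
One way such an optimization function could work is an exhaustive search over combinations of preprocessing steps, scored by how many of the target fields the recognizer returns. The sketch below assumes this approach; `best_preprocessing`, the step names, and the toy `recognize` function are all hypothetical, and the "image" is a set of defect labels rather than real pixel data.

```python
from itertools import combinations
from typing import Any, Callable


def best_preprocessing(
    image: Any,
    steps: dict[str, Callable[[Any], Any]],
    recognize: Callable[[Any], set[str]],
    wanted: set[str],
    max_steps: int = 3,
) -> tuple[tuple[str, ...], int]:
    """Try every combination of up to max_steps steps (in dict order) and
    return the one that recovers the most wanted fields."""
    best_names: tuple[str, ...] = ()
    best_score = len(wanted & recognize(image))  # baseline: no preprocessing
    for size in range(1, max_steps + 1):
        for names in combinations(steps, size):
            processed = image
            for name in names:
                processed = steps[name](processed)
            score = len(wanted & recognize(processed))
            if score > best_score:  # strict: prefer the shorter combination on ties
                best_names, best_score = names, score
    return best_names, best_score


# Toy stand-ins: an "image" is a set of defects; each step removes one defect.
toy_steps = {
    "deskew": lambda img: img - {"skewed"},
    "denoise": lambda img: img - {"noisy"},
    "binarize": lambda img: img,
}


def toy_recognize(img: frozenset) -> set[str]:
    found = set()
    if "skewed" not in img:
        found.add("invoice_number")
    if "noisy" not in img:
        found.add("total")
    return found


combo, score = best_preprocessing(
    frozenset({"skewed", "noisy"}),
    toy_steps,
    toy_recognize,
    wanted={"invoice_number", "total", "date"},
)
```

In a real setting each call to `recognize` would be a paid, slow API request, so the number of candidate steps and `max_steps` would need to stay small or results would need caching.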

Asset tracking with camera (POC)

Tracking asset locations: real-time positioning and collection of information about cargo and handling equipment using machine vision.

Responsibilities:

Trained a container segmentation model and added OCR,
Created a pipeline for localizing an object on a map using a depth camera,
Implemented an optimization function that combines different image preprocessing methods.

Tools and technologies: Google Vision API, OAK-D (OpenCV AI Kit), TensorFlow, Keras, Unity3D (synthetic data generation, visualization)

LiveScetchScanner

Implemented a camera image and mask processing module and a voice command interface.

Responsibilities:

Extracted the image region based on a neural network mask and a Hough line detector,
Performed image transformation, processing, and filtering,
Implemented a voice command interface using the Amazon Lex bot API and the Rhino model.

Tools and technologies: OpenCV, PyAudio, pvporcupine, Rhino, Amazon Lex, boto3

Meowtalk

App that translates cats’ meows

Responsibilities:

Created Docker containers with the code components needed for model training,
Set up a training pipeline for the Meowtalk model using Kubeflow Pipelines on GCP,
Modified data preprocessing to improve model accuracy.

Tools and technologies: Docker, Kubeflow Pipelines, GCP, Keras

Model for predicting World Cup playoff results

Responsibilities:

Data analysis, preprocessing, data cleaning
Search for additional data
Feature development based on all available data
Dataset formation for training models
Selection of models (Linear models, Decision trees, MLP) for prediction and analysis of results

Tools and technologies used: scikit-learn, PCA, Linear models, Decision trees, MLP

Meeting summarizer

Implemented a model for creating summaries of audio meetings, based on an article

Responsibilities and achievements:

Researched existing abstractive text summarization methods
Implemented a tree-based method

Tools and technologies used: Python, Azure, spaCy, NLP (abstractive text summarization), MIP

Deforestation detection

Developed a model for detecting deforestation in Ukraine

Achievements:

Collected a dataset of map fragments: wrote scripts to work with the API, searched for the required fragments by coordinates and date, filtered them by cloud cover and fragment type, and downloaded them
Created a training dataset by clustering images (k-means), detecting outliers and anomalies, and then selecting images from different clusters
Implemented a forest watch service based on Azure Functions
Implemented a model for deforestation detection.

Tools and technologies used: Rasterio, MS Azure, image clustering
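
The dataset-building step (cluster, drop outliers, pick images from different clusters) can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: images are represented by small hypothetical feature vectors, k-means is a plain standard-library implementation with a naive evenly spaced initialization, and an outlier is assumed to be any point farther from its centroid than `outlier_factor` times the cluster's median distance.

```python
import math
from statistics import median

Point = tuple[float, ...]


def kmeans(points: list[Point], k: int, iterations: int = 20):
    """Plain k-means; initial centroids are evenly spaced input points."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for each point.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


def select_diverse(points, labels, centroids, per_cluster=1, outlier_factor=3.0):
    """Return indices of the most central non-outlier points in each cluster."""
    selected = []
    for c, centroid in enumerate(centroids):
        members = [i for i, label in enumerate(labels) if label == c]
        if not members:
            continue
        dist = {i: math.dist(points[i], centroid) for i in members}
        cutoff = outlier_factor * median(dist.values())
        kept = [i for i in members if dist[i] <= cutoff]
        selected.extend(sorted(kept, key=dist.get)[:per_cluster])
    return selected


# Hypothetical 2-D features for six image tiles forming two obvious groups.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels, centroids = kmeans(features, k=2)
selected = select_diverse(features, labels, centroids)
```

Sampling from every cluster keeps the training set from being dominated by the most common kind of tile, which is the apparent purpose of the clustering step described above.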

Education

Master’s Degree in Computer Science, National University
