Borys, Data Science Engineer
Summary
Certified Data Scientist with 4 years of commercial experience and a strong focus on NLP, CV, and recommender systems. Proficient in Python and its data science toolset, including Pandas, NumPy, TensorFlow, and Keras. Has a solid track record of building products from scratch and devising solutions based on machine learning and data processing. Hands-on experience deploying scalable solutions with Kubeflow, Docker, and CI/CD practices, along with proficiency in databases such as MySQL and BigQuery. Holds Bachelor’s and Master’s degrees in Cybersecurity Engineering and is pursuing a PhD in Computer Science, which provides a firm grounding in computer science fundamentals and current data science trends. This technical expertise, combined with domain knowledge in e-commerce and network security, makes the candidate a strong fit for technology-driven teams.
Work Experience
Data Scientist (CV), E-commerce Data Analysis and Visualization
Duration: July 2021 - present
Summary: Conducted in-depth data analysis for e-commerce services, created interactive dashboards in Tableau, and implemented a custom ETL solution to improve the data infrastructure.
Responsibilities: Comprehensive data analysis, ETL solution development, interactive dashboard creation
Technologies: Python, ChurnZero, MySQL, AWS Redshift, Requests, SQLAlchemy, Tableau
Data Scientist (CV), NLP Chatbot and Product Recommendation
Duration: July 2021 - present
Summary:
- Implemented a chatbot for user interaction and product recommendation using OpenAI GPT models, including GPT-3 and GPT-4
- Engineered a vector search and an API within a Docker container
Technologies: Python, OpenAI, GPT-3, GPT-4, ChatGPT, LangChain, Docker
Data Scientist (CV), OpenAI Model Fine-Tuning for Text Classification
Duration: July 2021 - present
Summary: Implemented OpenAI model fine-tuning for text classification and developed an API for model updating and serving.
Responsibilities: Training data creation, dataset preparation, model fine-tuning, API development
Technologies: Python, OpenAI, GPT-3 (davinci, curie, babbage, ada)
Data Scientist (CV), OpenAI Chat-like Interaction Research
Duration: July 2021 - present
Summary: Researched the chat-like conversation capabilities of OpenAI models and built proofs of concept using few-shot learning and fine-tuning approaches.
Responsibilities: Training data engineering, model fine-tuning, inference testing
Technologies: Python, OpenAI, GPT-3, GPT-4, ChatGPT, LangChain
Data Scientist (CV), Logistics Route Optimization
Duration: July 2021 - present
Summary: Conducted exploratory data analysis and deployed statistical and ML models for optimizing logistics routes.
Responsibilities: Data engineering, EDA, segmentation of target data into groups, data visualization research
Technologies: Python, Pandas, SciPy, scikit-learn, Matplotlib, Plotly, Prophet, BigQuery
Data Scientist (CV), Real Estate Image Classification
Duration: July 2021 - present
Summary: Solved multi-label image classification tasks to identify room types and features in real estate photos.
Technologies: Python, TensorFlow, Google Vision AutoML, Image multi-label classification, Docker, Flask
Data Scientist (NLP), Semantic Text Similarity Service
Duration: July 2021 - present
Summary: Developed a Semantic Text Similarity service that helps reduce localization costs by grouping semantically similar strings prior to translation.
Technologies: Python, TensorFlow, Docker, USE, FastAPI, SQL, TensorFlow Serving
Data Scientist (NLP), Morphology Analysis Service
Duration: July 2021 - present
Summary: Developed a morphology analysis service covering POS-tagging, NER, lemmatization, and glossary extraction to maintain consistency in translations.
Technologies: Python, spaCy, NLTK, Docker, Flask
Data Scientist (NLP), Translation Alignment Service
Duration: July 2021 - present
Summary: Created a service that aligns translations lacking identifiers with their source strings in localization management platforms.
Technologies: Python, TensorFlow, USE, Clustering, Flask, Docker, TensorFlow Serving
Data Scientist (Recommender systems), Online Retail Product Recommendations
Duration: July 2021 - present
Summary: Developed product recommendation systems for online retail based on implicit feedback, performed A/B testing, and built a model serving application.
Responsibilities: Recommendation system development, offline and online model evaluation, model serving app development
Technologies: Python, GCP, TensorFlow, Kubeflow
Data Scientist, SIEM System Enhancement
Duration: January 2020 - July 2021
Summary: Enhanced the company's SIEM system with scalable and supportable software, including feature engineering and the development of ML models for network attack detection.
Responsibilities: Unsupervised and supervised ML model development, feature engineering, model training and evaluation
Technologies: Python, ELK stack, scikit-learn, XGBoost
Education
- Bachelor’s Degree in Cybersecurity Engineering
Ternopil Ivan Puluj National Technical University
- Master’s Degree in Cybersecurity Engineering
Ternopil Ivan Puluj National Technical University
- Ongoing PhD in Computer Science
Ternopil Ivan Puluj National Technical University
Certification
- Kyivstar Big Data School 4.0
The program introduces students to the major concepts and techniques of the data science process: predictive analytics and machine learning at scale, big data tools and technologies, and the basics of business analytics
2019
- Data Science Camp Offline ML course at SmartInsight
2021
- Introduction to Recommender Systems: Non-Personalized and Content-Based
Coursera
2022
- Nearest Neighbor Collaborative Filtering
Coursera
2022
- Convolutional Neural Networks in TensorFlow
Coursera
2022
- Natural Language Processing with Classification and Vector Spaces
Coursera
2023
- Natural Language Processing with Probabilistic Models
Coursera
2023