Ihor K, Big Data & Data Science Engineer with BI & DevOps skills

Data Engineer, Data Extraction and ETL, Data Science

Summary

- Data Engineer with a Ph.D. in measurement methods and a Master's degree in industrial automation
- 16+ years of experience with data-driven projects
- Strong background in statistics, machine learning, AI, and predictive modeling of big data sets.
- AWS Certified Data Analytics. AWS Certified Cloud Practitioner. Microsoft Azure services.
- Experience in ETL operations and data curation
- PostgreSQL, SQL, Microsoft SQL, MySQL, Snowflake
- Big Data Fundamentals via PySpark, Google Cloud, AWS.
- Python, Scala, C#, C++
- Skills and knowledge to design and build analytics reports, from data preparation to visualization in BI systems.

WORK EXPERIENCE

Data Engineer

Apr-2011 to present

Project: AWS ELT data pipeline and AWS cloud deployment architecture

(2022-06 – current)

Project Description: Creation of ELT pipelines deployed on AWS to collect data from e-commerce platforms

Responsibilities:

Designing the architecture of an ELT pipeline that gathers data from e-commerce clients into a single data warehouse
Using DBT to process customer data and identify similar attributes
Setting up Airbyte connections, developing custom Airbyte connectors, deploying AWS architecture, and writing Terraform scripts
Building custom data management tools and creating data flow security solutions

Tools & Technologies: Python, Airbyte, Kubernetes, AWS EC2, CI/CD, OpenVPN server, AWS Lambda, AWS SQS, Fargate, BigQuery, DBT, Airflow, AWS CloudWatch, REST API, AWS ECR

Project: Batch and Streaming Data Ingest into DataLake

(2021-10 – 2022-05)

Responsibilities:

Design of data processing pipelines for medical, marketing, and e-commerce applications
Data modelling
Database design
Database development
Processing patient data with DBT
Big data processing with Spark in Scala
Distributed platform development
ETL data transformation
ETL architecture and solution design

Tools & Technologies: Python, Scala, DB (SQL, PostgreSQL), DBT, Spark, Hadoop, Terraform, Kubernetes, Helm, GitLab CI/CD, AWS, Keycloak, Swagger, Airflow.

Project: Audience Segmentation

(2018 - 2021)

Built a custom customer data platform for a marketing company. Built an ETL pipeline that retrieves data from multiple sources and stores it in a private data warehouse on Hadoop. Created a CloudFormation "infrastructure as code" description of the pipeline and a CI/CD process to deploy it into the desired environment. Worked with streaming data in Amazon Kinesis. Designed data sources for BI reports in AWS.

Responsibilities:

Design and implement batch and event-driven workflows for big data processing
Automated tests for distributed applications
Data analysis and visualization
Develop applications for data ingestion and selection
Develop a recommendation system
Build reporting dashboards in QuickSight from Athena sources

Tools & Technologies: Python, Scala, SQL, Kubernetes, Spark, Hadoop framework, Docker, AWS (Storage, Database, DocumentDB, Athena, Lambda, Glue, API Gateway, Kinesis, QuickSight, CI/CD AWS CloudFormation and CodePipeline), Grafana, Git.

Data scientist and Data/software engineer

(Jan-2011 to 2018)

Data analysis
Applying machine learning algorithms
Image analysis
Image recognition
Neural network development and tuning
Database development
ETL operations engineering
Development of backend services for data curation
Automated tests for CI/CD workflows

Tools & Technologies: C#, Python, Keras, TensorFlow, Theano, OpenCV, Pandas, Microsoft SQL Server, SQL, .NET Framework

Associate Professor

(09/1999–Present)

Department of industrial automation

Taught courses:

EDUCATION AND TRAINING

Ph.D. in Measurement Methods and Devices, EQF level 8
Master of industrial automation, EQF level 7

COMMUNICATION SKILLS

Oral and written communication skills gained as a university professor and R&D project participant
Presentation skills gained as a scientific conference speaker

COURSES & CERTIFICATES:

AWS Certified Data Analytics
HDP Overview: Apache Hadoop Essentials (SPLL)
Feature Engineering with PySpark
Big Data Fundamentals via PySpark
Deep Learning in Python
Intermediate Python for Data Science
Linear Classifiers in Python
Machine Learning with the Experts
Python Data Science Toolbox
