Vadym S., Data Engineer
Summary
- 4+ years of experience as a Data Engineer, focused on ETL automation, data pipeline development, and optimization;
- Strong skills in SQL, DBT, Airflow (Python), and experience with SAS, PostgreSQL, and BigQuery for building and optimizing ETL processes;
- Experience working with Google Cloud (GCP) and AWS: utilizing GCP Storage, Pub/Sub, BigQuery, AWS S3, Glue, and Lambda for data processing and storage;
- Built and automated ETL processes using DBT Cloud, integrated external APIs, and managed microservice deployments;
- Optimized SDKs for data collection and transmission through Google Cloud Pub/Sub, and used MongoDB for storing unstructured data;
- Designed data pipelines for e-commerce: orchestrated complex processes with Druid, MinIO, Superset, and AWS for data analytics and processing;
- Worked with big data and stream processing, using Apache Spark, Kafka, and Databricks for efficient transformation and analysis;
- Built Amazon sales forecasts using ClickHouse and Vertex AI, and integrated analytical models into business processes;
- Experience in Data Lake migration and optimization of data storage, deploying cloud infrastructure and serverless solutions on AWS Lambda, Glue, and S3.
Main Skills
Python
PySpark
Docker
Apache Airflow
Kubernetes
AI & Machine Learning
Programming Languages
C++ Libraries and Tools
Mobile Frameworks and Libraries
Python Libraries and Tools
Data Analysis and Visualization Technologies
Databases & Management Systems / ORM
Amazon Web Services
Azure Cloud Services
SDK / API and Integrations
Virtualization, Containers and Orchestration
Other Technical Skills
Work Experience
Data Engineer, NDA
(July 2021 - Present)
Data Engineer, ETL Automation
Summary: Building and automating ETL data pipelines with a focus on optimizing PostgreSQL models in DBT Cloud, integrating with third-party APIs using Python, and refactoring Zeppelin notebooks.
Responsibilities: ETL Automation, designing and implementing storage systems, managing API integrations, developing PostgreSQL models in DBT Cloud, establishing microservice deployment jobs, and refactoring notebooks.
Technologies: Python, PySpark, Zeppelin, Docker, Airflow, Kubernetes, Minikube, S3, Athena, ECR, Pub/Sub, DBT Cloud, Airbyte, API, BigQuery, PostgreSQL, HiveDB, GitHub, GitLab, Miro, Jira, Teams
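Illustrative sketch only (not project code): a minimal Airflow DAG that triggers a DBT Cloud job run over its public API, showing the kind of ETL automation described above. The account ID, job ID, and token variables are hypothetical placeholders.
```python
# Hypothetical sketch of an Airflow DAG that kicks off a dbt Cloud job.
# Account ID, job ID, and token are illustrative placeholders, not project values.
import os
from datetime import datetime

import requests
from airflow import DAG
from airflow.operators.python import PythonOperator

DBT_ACCOUNT_ID = os.environ.get("DBT_ACCOUNT_ID", "12345")   # placeholder
DBT_JOB_ID = os.environ.get("DBT_JOB_ID", "67890")           # placeholder
DBT_TOKEN = os.environ.get("DBT_CLOUD_TOKEN", "")            # placeholder


def trigger_dbt_cloud_job() -> None:
    """Call the dbt Cloud v2 'run job' endpoint (URL per public docs)."""
    url = (
        f"https://cloud.getdbt.com/api/v2/accounts/"
        f"{DBT_ACCOUNT_ID}/jobs/{DBT_JOB_ID}/run/"
    )
    resp = requests.post(
        url,
        headers={"Authorization": f"Token {DBT_TOKEN}"},
        json={"cause": "Triggered by Airflow"},
        timeout=30,
    )
    resp.raise_for_status()


with DAG(
    dag_id="dbt_cloud_etl",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_dbt = PythonOperator(
        task_id="trigger_dbt_cloud_job",
        python_callable=trigger_dbt_cloud_job,
    )
```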
Data Engineer, Analytic Platform
Summary: Developed and optimized an SDK for receiving data from API endpoints, transforming it into a dedicated event format, and transmitting it efficiently to Google Cloud Pub/Sub, incorporating MongoDB and Google Cloud Storage.
Responsibilities: Optimizing SDK data reception, transforming data into event formats, data transmission with Pub/Sub, integrating MongoDB, and using Google Cloud storage.
Technologies: Python, GCP Storage, Pub/Sub, API, MongoDB, GitHub, BitWarden, Jira, Confluence
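Illustrative sketch only (not the project SDK): wrapping an API payload into an event and publishing it to Google Cloud Pub/Sub. The project ID, topic name, and event shape are hypothetical placeholders.
```python
# Hypothetical sketch: serialize an API payload as an event and publish it to
# Google Cloud Pub/Sub. Project, topic, and payload shape are placeholders.
import json

from google.cloud import pubsub_v1

PROJECT_ID = "my-gcp-project"   # placeholder
TOPIC_ID = "ingest-events"      # placeholder

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT_ID, TOPIC_ID)


def publish_event(payload: dict) -> str:
    """Serialize a payload as JSON, publish it, and return the message ID."""
    event = {"type": "api.record", "data": payload}   # assumed event shape
    future = publisher.publish(topic_path, json.dumps(event).encode("utf-8"))
    return future.result()  # blocks until Pub/Sub acknowledges the message


if __name__ == "__main__":
    print(publish_event({"user_id": 42, "action": "view"}))
```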
Data Engineer, E-commerce platform
Summary: Orchestrated a complex data pipeline for an e-commerce platform, focusing on data transfer, ingestion, processing, and optimization with Airflow, Druid, MinIO, and Superset for visualization, and built architectures with AWS services.
Responsibilities: Data pipeline orchestration, architectural planning and visualization, workflow optimization, data processing with Spark and Kafka, and implementation with AWS services.
Technologies: Python, Airflow, Druid, MinIO, MongoDB, Spark, Kafka, AppFlow, Glue, Athena, QuickSight, PostgreSQL, GitLab, Superset, InSight, Draw.io, Jira, Confluence
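Illustrative sketch only (not project code): reading an order stream from Kafka with PySpark Structured Streaming and landing it as Parquet, one representative step of the pipeline described above. The broker address, topic name, schema, and storage paths are hypothetical placeholders; Druid, MinIO, and Superset are not shown.
```python
# Hypothetical sketch: consume an order stream from Kafka and land it as
# Parquet. Broker, topic, schema, and paths are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, StringType, StructField, StructType

spark = SparkSession.builder.appName("ecommerce-orders").getOrCreate()

order_schema = StructType([
    StructField("order_id", StringType()),
    StructField("sku", StringType()),
    StructField("amount", DoubleType()),
])

orders = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "orders")                       # placeholder topic
    .load()
    .select(from_json(col("value").cast("string"), order_schema).alias("o"))
    .select("o.*")
)

query = (
    orders.writeStream.format("parquet")
    .option("path", "s3a://data-lake/orders/")                        # placeholder
    .option("checkpointLocation", "s3a://data-lake/_checkpoints/orders/")
    .start()
)
query.awaitTermination()
```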
Data Engineer, Retail Platform
Summary: Developed a technical pipeline for a retail platform, emphasizing economic efficiency, integrating key technologies like AWS, Firebase, and Stripe, and utilizing no-code solutions with Xano.
Responsibilities: Technical pipeline development, data transfer optimization, authenticating users with Firebase, payment integration with Stripe, enhancing data processing with AWS IoT, and utilizing Xano's no-code solution.
Technologies: Python, AWS, Xano, Firebase, API, Stripe
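Illustrative sketch only (not project code): creating a Stripe PaymentIntent for an order, as one example of the payment integration mentioned above. The API key, amount, and metadata are hypothetical placeholders.
```python
# Hypothetical sketch: create a Stripe PaymentIntent for an order. The key,
# amount, and metadata are placeholders, not values from the project.
import os

import stripe

stripe.api_key = os.environ.get("STRIPE_SECRET_KEY", "sk_test_placeholder")


def charge_order(amount_cents: int, order_id: str) -> str:
    """Create a PaymentIntent and return its client secret for the frontend."""
    intent = stripe.PaymentIntent.create(
        amount=amount_cents,
        currency="usd",
        metadata={"order_id": order_id},
    )
    return intent.client_secret


if __name__ == "__main__":
    print(charge_order(1999, "order-123"))
```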
Data Scientist, E-commerce Analytic Platform
Summary: Big data analysis and sales forecasting for Amazon product sales, utilizing advanced statistical and programming skills.
Responsibilities: Collecting historical data, preparing sales forecasts, big data analysis, and predictive modeling.
Technologies: Sales Prediction, ClickHouse, Vertex AI, Airflow, Jenkins, Kibana, Keepa
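Illustrative sketch only (not project code): pulling daily sales history for one ASIN from ClickHouse as input to a forecasting model, reflecting the data-collection step described above. The host, table, and column names are hypothetical placeholders.
```python
# Hypothetical sketch: load daily sales history for one ASIN from ClickHouse.
# Host, table, and column names are illustrative placeholders.
import pandas as pd
from clickhouse_driver import Client

client = Client(host="clickhouse.internal")   # placeholder host


def load_daily_sales(asin: str) -> pd.DataFrame:
    """Return a date-indexed DataFrame of units sold for the given ASIN."""
    rows = client.execute(
        "SELECT toDate(sold_at) AS day, sum(units) AS units "
        "FROM sales WHERE asin = %(asin)s GROUP BY day ORDER BY day",
        {"asin": asin},
    )
    return pd.DataFrame(rows, columns=["day", "units"]).set_index("day")


if __name__ == "__main__":
    history = load_daily_sales("B000000000")   # placeholder ASIN
    print(history.tail())
```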
Data Engineer, Worker Traffic Simulation and Automation
Summary: Created a simulation and automation framework for worker traffic, automated EC2 instance management, and deployed solutions using containerization and AWS cloud services.
Responsibilities: Developing simulation and automation framework, managing EC2 instances, and deploying containerized solutions.
Technologies: Python, EC2, ECR, Docker, Windows
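Illustrative sketch only (not project code): starting and stopping a fleet of EC2 worker instances by tag with boto3, as the automation framework described above might do. The region, tag key, and tag value are hypothetical placeholders.
```python
# Hypothetical sketch: start or stop EC2 worker instances selected by tag.
# Region and tag values are illustrative placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")   # placeholder region


def worker_instance_ids(role: str = "traffic-worker") -> list[str]:
    """Find instance IDs carrying the (assumed) worker role tag."""
    resp = ec2.describe_instances(
        Filters=[{"Name": "tag:Role", "Values": [role]}]
    )
    return [
        inst["InstanceId"]
        for reservation in resp["Reservations"]
        for inst in reservation["Instances"]
    ]


def set_workers(running: bool) -> None:
    """Start or stop all tagged worker instances."""
    ids = worker_instance_ids()
    if not ids:
        return
    if running:
        ec2.start_instances(InstanceIds=ids)
    else:
        ec2.stop_instances(InstanceIds=ids)
```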
Data Engineer, Hotel & Restaurant
Summary: Led the optimization of cloud infrastructure for the hotel industry using AWS services, improving performance, scalability, and cost-effectiveness.
Responsibilities: Cloud infrastructure review and optimization, code refactoring, enhancement of AWS services, and pipeline setup for room price prediction.
Technologies: AWS Lambda, Glue, ECR, DMS, EventBridge, SNS, API Gateway, S3, Python
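Illustrative sketch only (not project code): a Lambda handler on an EventBridge schedule that finds the latest bookings file in S3 and queues a price-prediction request via SNS, echoing the pipeline setup described above. The bucket, key prefix, and topic ARN are hypothetical placeholders.
```python
# Hypothetical sketch of a scheduled Lambda handler: locate the newest bookings
# object in S3 and publish a prediction request to SNS. Bucket, prefix, and
# topic ARN are illustrative placeholders, not project values.
import json

import boto3

s3 = boto3.client("s3")
sns = boto3.client("sns")

BUCKET = "hotel-data-lake"                                  # placeholder
TOPIC_ARN = "arn:aws:sns:us-east-1:111111111111:predict"    # placeholder


def handler(event, context):
    listing = s3.list_objects_v2(Bucket=BUCKET, Prefix="bookings/")
    contents = listing.get("Contents", [])
    if not contents:
        return {"status": "no-data"}
    latest = max(contents, key=lambda obj: obj["LastModified"])
    sns.publish(
        TopicArn=TOPIC_ARN,
        Message=json.dumps({"bucket": BUCKET, "key": latest["Key"]}),
    )
    return {"status": "queued", "key": latest["Key"]}
```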
Data Engineer / Big Data Engineer, Scalable ETL Pipeline with Databricks and AWS
Summary: Designed and implemented an ETL pipeline with Databricks and AWS, processing large-scale data with Apache Spark and integrating with AWS services for schema management, data governance, and real-time processing.
Responsibilities: Designing and implementing end-to-end ETL pipeline, data transformation, and cleaning, metadata and schema management, querying and dashboard integration, and maintaining cost efficiency.
Technologies: Databricks, AWS (S3, Glue, Lambda, EMR), Apache Spark, Delta Lake, Python, SQL
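Illustrative sketch only (not project code): one ETL step of the kind described above on Databricks, reading raw JSON from S3, cleaning it, and writing a partitioned Delta table. The paths and column names are hypothetical placeholders.
```python
# Hypothetical sketch of one ETL step: read raw JSON from S3, clean it, and
# write a Delta table. Paths and column names are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date

spark = SparkSession.builder.appName("etl-delta").getOrCreate()

raw = spark.read.json("s3://raw-zone/events/")              # placeholder path

clean = (
    raw.dropDuplicates(["event_id"])
    .withColumn("event_date", to_date(col("event_ts")))
    .filter(col("event_type").isNotNull())
)

(
    clean.write.format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .save("s3://curated-zone/events_delta/")                # placeholder path
)
```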
Education
Bachelor's Degree in Software Engineering
West Ukrainian National University (WUNU), a classical university in Ternopil, Ukraine.
2020 - Present
Certification
- Programming for Everybody (Getting Started with Python)
Coursera certificate
- What is Data Science?
Coursera certificate
- Introduction to Data Science in Python
Coursera certificate
- Applied Machine Learning in Python
Coursera certificate
- Amazinum Data Science Camp
Amazinum certificate
- Machine Learning with Python
Coursera certificate
- Google Cloud Big Data and Machine Learning Fundamentals
Coursera certificate