Yurii, Data Scientist

Vetted expert in Data Science (4.0 yr.), AI and Machine Learning (4.0 yr.)
English: B2 (Upper-Intermediate)
Seniority: Middle (3-5 years)
Location: Lviv, Ukraine (UTC+02:00)

Summary

- Data Scientist with 4+ years of experience in AI and machine learning;
- Specialized in NLP, time series forecasting, and generative AI;
- Built RAG systems using OpenAI, LangChain, and custom pipelines;
- Developed multi-agent systems for sales enablement, education, and virtual assistants;
- Proficient in Python, SQL, and ML libraries such as Pandas, scikit-learn, Keras, and PyTorch;
- Created legal assistants with Hugging Face models and citation-based RAG responses;
- Built voice-based chatbots using OpenAI Whisper and ElevenLabs for voice cloning and audio processing;
- Designed pipelines for text-to-speech and image generation in mobile and cloud environments;
- Extracted and analyzed financial data using AWS Textract and OpenAI VLMs;
- Built production-ready support bots using Dialogflow CX, Twilio, and Google Firestore;
- Experienced with AWS and GCP for scalable model deployment.

Main Skills

Python, SQL, NLP, Gen AI, RAG

AI & Machine Learning

AWS Textract, Claude, Dialogflow, Docling, ElevenLabs, Gemini, Gemini 3, Gen AI, Grok, Hugging Face, Keras, LangChain, LangGraph, NLP, NumPy, OpenAI, PyTorch, RAG, Scikit-learn, spaCy, Whisper

Programming Languages

Python

Python Libraries and Tools

Gensim, Keras, Matplotlib, NLTK, NumPy, Pandas, PyTorch, Scikit-learn, Seaborn

Data Analysis and Visualization Technologies

Pandas

Databases & Management Systems / ORM

MySQL, Oracle Database, PostgreSQL, Sphinx, SQL, SQL queries

Cloud Platforms, Services & Computing

GCP (Google Cloud Platform), Google Cloud Pub/Sub

Deployment, CI/CD & Administration

Active Directory, Flux

SDK / API and Integrations

API, GraphQL, Twilio

Operating Systems

Unix

Web/App Servers, Middleware

WildFly

Other Technical Skills

Currency Cloud, Loco Translate
ID: 600-265-507
Last Updated: 2025-04-17

Work Experience

Data Scientist, Multi-Agent Assistant for Sales Enablement

Duration: 6 months

Summary: Developed a multi-agent assistant for sales enablement, capable of autonomously preparing lead summaries, company/industry reports, and meeting agendas.

Responsibilities:

  • Designed and implemented a multi-agent system that used Tavily API for intelligent web search and synthesis;
  • Automated generation of sales research reports (for person, company, industry) and meeting agendas based on lead data and found information;
  • Built a knowledge base system allowing users to upload files or provide website links for ingestion and querying;
  • Developed a RAG interface to let users interact with their documents and websites in natural language.

Technologies: Python, OpenAI, Tavily, Docling, pgvector, SQL, LangChain.

Data Scientist, Legal Assistant using RAG System

Duration: 3 months

Summary: Built a legal assistant powered by a RAG system covering U.S. constitutional and federal law. It enabled users to ask complex legal questions and receive accurate, citation-based responses grounded in the U.S. Constitution, federal law, and New York State regulations.

Responsibilities:

  • Prepared the dataset by parsing legal documents and embedding them with a text embedding model;
  • Researched and experimented with different text embedding models from Hugging Face to optimize performance and quality;
  • Implemented an intelligent query reformulation step in the RAG pipeline to improve retrieval accuracy based on the user's intent.

Technologies: Python, Hugging Face, LangChain, OpenAI, AWS.

Data Scientist, Virtual Agent for FAQs and Ticket Logging

Duration: 1 year

Summary: Created a virtual agent that answers FAQs and logs tickets in the TechSupport system. It extracts names, email addresses, phone numbers, issues, and details during the conversation, then either fills in a ticket in the system or sends a self-help guide.

Responsibilities:

  • Designed the conversational flow and developed a chatbot for realistic conversation;
  • Created a system for managing the connection to the TechSupport system for ticket creation;
  • Extracted detailed information from conversations and post-processed conversation records and text;
  • Implemented an issue classification system for smarter ticket assignment on the TechSupport side;
  • Integrated OpenAI Whisper to improve audio processing.

Technologies: Python, Dialogflow CX, spaCy, NLTK, Twilio, Google Firestore, OpenAI API, GCP.

Data Scientist, DriveED

Duration: 6 months

Summary: Developed a multi-agent generative AI system to automate the creation of lesson plans aligned with U.S. educational standards.

Responsibilities:

  • Designed a modular pipeline for generating lesson plans, integrating curriculum standards, success criteria, and textbook-based task generation;
  • Implemented multi-agent orchestration for concept-based task design, visual asset generation, and final lesson assembly;
  • Tuned agents to adapt lessons by grade level and complexity;
  • Integrated RAG to access relevant textbook content and success criteria dynamically;
  • Contributed to task formatting and refinement to meet pedagogical goals and improve classroom usability.

Technologies: Python, OpenAI, LangChain.

Data Scientist, Interpretr AI 

Duration: 6 months

Summary: A mobile app designed for recording and interpreting users' dreams through a psychoanalytic lens. The bot engages users in interactive discussions to gather insights into their dreams and takes their life context into account, aiming to provide meaningful interpretations.

Responsibilities:

  • Crafted a natural dialogue flow to mimic friendly and insightful real-time conversations;
  • Built a system for collecting and analyzing detailed information on dreams, and providing interpretations rooted in Jungian theory;
  • Set up ElevenLabs with voice cloning and tuning;
  • Upgraded the pipeline for real-time text-to-speech;
  • Integrated an image generation pipeline with Flux to visualize dreams.

Technologies: OpenAI, Flux, Python, ElevenLabs, Grok, Claude, LangChain, AWS.

Data Scientist, Tech Cargo System

Duration: 1 year

Summary: The project focused on analyzing and extracting financial information from documents to assess the financial health of companies. Supported multilingual and multi-format document processing.

Responsibilities:

  • Leveraged AWS Textract for automated data extraction from financial documents, including balance sheets, income statements, and cash flow statements;
  • Designed workflows for financial health analysis using custom formulas and algorithms;
  • Integrated OpenAI Visual Language Models for testing advanced data extraction capabilities;
  • Built multilingual pipelines using AWS APIs;
  • Enhanced support for diverse document types, ensuring scalability and robustness.

Technologies: Python, AWS Textract, AWS Textract Queries, OpenAI API (Vision and Text), AWS Translate, AWS Currency Converter.

Education

  • Master of Computer Science
  • Bachelor of Computer Science
