Oleg B. — ML Engineer/Big Data Architect
Expertise in Data Engineering.
Last verified on August 05, 2023
Core Skills
Bio Summary
- Over 15 years of experience leading the design, development, and delivery of complex IT projects and high-performance solutions, including 10+ years in business intelligence and data analytics
- Advanced hands-on experience in the design and development of reactive, microservices-based, distributed systems, including stream application platforms for advanced analytics, machine learning, and data science
- Proficient Data Engineer and researcher focused on immediate business benefits, using Big Data tools (AWS Glue, AWS Greengrass, AWS EMR, AWS Data Lake) with advanced analytical and visualization APIs (graph DBs: Titan, Neo4j, TinkerPop; software development: Scala, Python) and CI/CD pipelines (Jenkins, CircleCI, GitLab actions)
- Generative AI: Q&A with multiple choices, pre-trained models (Hugging Face ecosystem, T5, BERT, GPT), ChatBot for an online gambling platform (LangChain, Pinecone, Cohere, Faiss, Hugging Face Hub)
- Generative AI in NLP: information retrieval to 1) generate personalized recommendations for products or services based on a user's preferences and past behavior; 2) summarize legal documents and contracts, making it easier for lawyers and legal professionals to review and analyze large volumes of legal documents; 3) create content such as product descriptions, blog posts, and social media posts
- Recommendation platforms: mobile games platform (game recommendations based on player history, promo offers, AWS Personalize), self-learning algorithms for data-based risk management in agriculture (Monte Carlo tree and Markov chains)
- Upper-intermediate English
- Availability: ASAP
Technical Skills
| Programming Languages | Python, R, Scala |
| Scala Frameworks | Akka, Apache Spark |
| Scala Libraries and Tools | Akka |
| Java Frameworks | Apache Spark |
| AI & Machine Learning | AWS SageMaker (Amazon SageMaker), Keras, Kubeflow, MLflow, PyTorch, TensorFlow |
| .NET Platform | Azure |
| Python Libraries and Tools | BentoML, Dask, Keras, Matplotlib, Metaflow, Pandas, PyTorch, Seaborn, TensorFlow |
| Python Frameworks | Django |
| Data Analysis and Visualization Technologies | Apache Airflow, Apache Hive, Apache Spark, HBase, Jupyter Notebook, ML, Pandas, Power BI, Sqoop |
| Databases & Management Systems / ORM | Apache Hadoop, Apache Hive, Apache Kylin, Apache Spark, AWS ElasticSearch, AWS Redshift, Cassandra, ELK stack (Elasticsearch, Logstash, Kibana), Microsoft SQL Server, MongoDB, MySQL, Neo4j, Oracle Database, PostgreSQL, Redis, Snowflake, SQL |
| Cloud Platforms, Services & Computing | AWS, Azure, Azure ML, GCP |
| Amazon Web Services | AWS EC2, AWS ElasticSearch, AWS Glue, AWS Kinesis, AWS Lambda, AWS RDS (Amazon Relational Database Service), AWS Redshift, AWS S3, AWS SageMaker (Amazon SageMaker), AWS SAM, AWS VPC |
| Deployment, CI/CD & Administration | Ansible, CI/CD, Helm |
| Web/App Servers, Middleware | Apache HTTP Server |
| Platforms | Apache Mesos |
| SDK / API and Integrations | API |
| Mail / Network Protocols / Data transfer | Consul |
| Operating Systems | Debian, Linux, Ubuntu, Windows |
| Virtualization, Containers and Orchestration | Docker, Kubernetes, Terraform |
| Version Control | Git |
| Collaboration, Task & Issue Tracking | Jira, Redmine |
| Message/Queue/Task Brokers | Kafka |
| Other Technical Skills | Hashicorp, Pachyderm, Raspberry |