Amit
Upstaffer since March 2022

Amit — Expert Data Engineer

Expertise in Data Engineering.

Last verified in July 2023

Core Skills

Apache Hadoop
Kafka
GCP
AWS

Bio Summary

- 8+ years of experience building data engineering and analytics products (big data, BI, and cloud products)
- Expertise in building artificial intelligence and machine learning applications
- Extensive design and development experience in Azure, Google, and AWS clouds
- Extensive experience loading and analyzing large datasets with the Hadoop framework (MapReduce, HDFS, Pig, Hive, Flume, Sqoop, Spark, Impala) and NoSQL databases such as Cassandra
- Extensive experience migrating on-premises infrastructure to AWS and GCP clouds
- Intermediate English
- Available ASAP

Technical Skills

Programming Languages: JavaScript, PL, Python, Scala
AI & Machine Learning: artificial intelligence, AWS ML (Amazon Machine Learning services), Machine Learning
.NET Platform: Azure
Java Libraries and Tools: JSON
Data Analysis and Visualization Technologies: Apache Hive, Apache Pig, Attunity, AWS Athena, Databricks, Domo, Flume, Hunk, Impala, MapReduce, Oozie, Presto, S3, Snaplogic, Sqoop
Databases & Management Systems / ORM: Apache Hadoop, Apache Hive, AWS Redshift, Cassandra, MySQL, Netezza, Oracle Database, Snowflake, SQL
Cloud Platforms, Services & Computing: AWS, Azure, GCP
Amazon Web Services: AWS Athena, AWS EMR, AWS Kinesis, AWS ML (Amazon Machine Learning services), AWS QuickSight, AWS Redshift, AWS SQS
Azure Cloud Services: Databricks
Google Cloud Platform: Google BigQuery, Google Cloud Pub/Sub
Platforms: Apache Solr
Deployment, CI/CD & Administration: Bamboo
Version Control: Bitbucket, Git
Collaboration, Task & Issue Tracking: IBM Rational ClearCase
Message/Queue/Task Brokers: Kafka
Operating Systems: Linux, Windows
Scripting and Command Line Interfaces: *nix Shell Scripts
Logging and Monitoring: Splunk
Other Technical Skills: Cloudera Search, Lex, Polly, VSS

Projects

Architect, EMPLOYER – RAIDON CLOUD SOLUTIONS

JAN ’19 – PRESENT
Responsibilities:

  • Advanced analytics platform for one of the largest US banks on Google Cloud infrastructure using BigQuery / Dataproc (Hive, Spark).
  • Analytical pipeline on AWS: EMR, Kinesis, Athena, Amazon ML, Lex, Polly.
  • Data lake design and development for an Australian insurance company based on GCP.
  • Data lake strategy and implementation for a European AMC based on AWS EMR (Spark, Hive, Sqoop, QuickSight).
  • Contribution to Wipro BDRE, an open-source platform for blockchain analytics.
  • Real-time streaming platform on Kafka Streams / KSQL.
  • Real-time ingestion pipeline with Kafka and Spark Streaming (illustrated in the sketch after the technology list below).
  • Design and development of a Snowflake data warehouse and integration with AWS Lambda; development of stored procedures, tasks, and a Python client for data ingestion into Snowflake.
  • PySpark design and development using Databricks.
  • Stakeholder management.
  • Strategic directives for advanced analytics capabilities.

Technologies: Azure, GCP, and AWS; advanced analytics (AI and ML); big data and Hadoop
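
As a rough illustration of the Kafka-to-Spark Streaming ingestion mentioned above, the following minimal PySpark Structured Streaming sketch reads JSON events from a Kafka topic and lands them as Parquet. The broker address, topic name, event schema, and S3 paths are placeholders rather than details from the actual project, and running it assumes the spark-sql-kafka connector package is available.

# Minimal sketch, not project code: Kafka -> Spark Structured Streaming -> Parquet.
# Assumes the spark-sql-kafka connector package is on the Spark classpath.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("kafka-ingestion-sketch").getOrCreate()

# Placeholder event schema; the real payload layout is project-specific.
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_ts", TimestampType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
       .option("subscribe", "events")                      # placeholder topic
       .load())

# Kafka values arrive as bytes; cast to string and parse the JSON payload.
events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), event_schema).alias("e"))
          .select("e.*"))

# Continuously append parsed events to a Parquet data lake location.
query = (events.writeStream
         .format("parquet")
         .option("path", "s3a://example-bucket/events/")                     # placeholder path
         .option("checkpointLocation", "s3a://example-bucket/checkpoints/")  # placeholder path
         .outputMode("append")
         .start())
query.awaitTermination()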

Senior Data Developer, Lead Online Business for Tesco, EMPLOYER – TESCO

MAR ’16 – DEC ’18
Responsibilities:

  • Design and development of a machine learning platform for the marketplace using Kafka, Spark Streaming, and Spark ML modules.
  • Design and development of batch and real-time analytics systems based on Hive, Spark, and Kafka on AWS and Azure clouds.
  • Design and development of a visualization layer based on Domo.
  • Created an Athena pipeline in sync with the Hive Metastore to read S3 buckets directly.
  • PySpark development on Databricks.
  • Migration of Pig scripts from Cloudera Hadoop to a Databricks PySpark layer.
  • Built an automation framework for Spark and Hive jobs.
  • Developed the JSIVE utility to automatically create Hive DDL for complex JSON schemas (see the sketch after this list).
  • Developed a JSON data generator for creating test data on research clusters to help data scientists.
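
JSIVE itself is an internal utility whose implementation is not public; the sketch below only illustrates the general idea of deriving Hive DDL from a sample JSON record by mapping JSON value types to Hive types (nested objects to STRUCT, arrays to ARRAY). The table name, location, and sample record are made up for the example.

# Illustrative sketch only: derive Hive DDL from one sample JSON record by
# mapping Python value types to Hive types (dict -> STRUCT, list -> ARRAY).
# This is not the internal JSIVE utility; names and the sample are made up.
import json

def hive_type(value):
    if isinstance(value, bool):  # check bool before int (bool is an int subclass)
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, list):
        element = value[0] if value else ""
        return f"ARRAY<{hive_type(element)}>"
    if isinstance(value, dict):
        fields = ", ".join(f"{k}: {hive_type(v)}" for k, v in value.items())
        return f"STRUCT<{fields}>"
    return "STRING"

def json_to_hive_ddl(sample_json, table_name, location):
    record = json.loads(sample_json)
    columns = ",\n  ".join(f"`{k}` {hive_type(v)}" for k, v in record.items())
    return (
        f"CREATE EXTERNAL TABLE {table_name} (\n  {columns}\n)\n"
        "ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'\n"
        f"LOCATION '{location}';"
    )

sample = '{"order_id": 1, "customer": {"id": 7, "name": "a"}, "items": [{"sku": "x", "qty": 2}]}'
print(json_to_hive_ddl(sample, "orders_raw", "s3://example-bucket/orders/"))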

Senior Big Data Developer, EMPLOYER – BLACKROCK INC.

APR ’14 – MAR ’16
Description: Worked as a senior big data developer in the web product research team at BlackRock in Gurgaon/Bangalore, designing and developing big data applications.
Environment: Hadoop Streaming (Python), Hive, Sqoop, shell scripting, TWS scheduler, Impala
The purpose of the project was to replace existing RUFF calculations on conventional ETL platforms with Hadoop, ensuring faster and on-time delivery of Loss and Policy models to clients.
Responsibilities:

  • Involved in project design and creation of technical specifications.
  • Developed Sqoop-based ETL systems to bring data in from the EDW data warehouse.
  • Created Hive tables to store and transform files from the ADW data warehouse.
  • Wrote MapReduce streaming jobs in Python (see the sketch after this list).
  • Involved in creating TWS workflows to automate data transformation and presentation processes.
  • Developed processes for downstream systems using Impala.
  • Participated in deployment, system testing, and UAT.
  • Prepared implementation plans for moving code to production.
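
To illustrate the Hadoop Streaming work in Python, here is a minimal mapper/reducer pair that sums a loss amount per policy. The field layout, file names, and paths are assumptions for the example, not the actual RUFF jobs.

# Illustrative Hadoop Streaming pair in Python (not the actual RUFF jobs):
# the mapper emits (policy_id, loss_amount) pairs; the reducer sums losses
# per policy. Field positions, file names, and paths are assumptions.
#
# Example launch command (paths are placeholders):
#   hadoop jar hadoop-streaming.jar \
#     -input /data/losses -output /data/losses_by_policy \
#     -mapper mapper.py -reducer reducer.py -file mapper.py -file reducer.py

# ---- mapper.py ----
import sys

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    if len(fields) >= 2:
        policy_id, loss_amount = fields[0], fields[1]
        print(f"{policy_id}\t{loss_amount}")

# ---- reducer.py ----
import sys

current_policy, total = None, 0.0
for line in sys.stdin:
    policy_id, loss_amount = line.rstrip("\n").split("\t")
    # Input arrives grouped and sorted by key, so a change of key means the
    # previous policy's records are complete.
    if policy_id != current_policy:
        if current_policy is not None:
            print(f"{current_policy}\t{total}")
        current_policy, total = policy_id, 0.0
    total += float(loss_amount)

if current_policy is not None:
    print(f"{current_policy}\t{total}")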

How to hire with Upstaff

1

Talk to Our Talent Expert

Our journey starts with a 30-min discovery call to explore your project challenges, technical needs and team diversity.

2

Meet Carefully Matched Talents

Within 1-3 days, we’ll share profiles and connect you with the right talents for your project. Schedule a call to meet engineers in person.

3

Validate Your Choice

Bring new talent on board with a trial period to confirm you have hired the right person. There are no termination fees or hidden costs.

Why Upstaff

Upstaff is a technology partner with expertise in AI, Web3, Software, and Data. We help businesses gain a competitive edge by optimizing existing systems and utilizing modern technology to fuel business growth.

Real-time project team launch

<24h

Interview First Engineers

Upstaff's network enables clients to access specialists within hours or days, streamlining the hiring process to 24-48 hours so engineers can start ASAP.

x10

Faster Talent Acquisition

Upstaff's network and platform enable clients to scale up and down blazing fast. Every hire is typically 10x faster compared to a regular recruitment workflow.

Vetted and Trusted Engineers

100%

Security And Vetting-First

AI tools and expert human reviewers in the vetting process are combined with track records and historically collected feedback from clients and teammates.

~50h

Save Time For Deep Vetting

On average, we save the client team over 50 hours of candidate interviews for each job position. We are fueled by a passion for tech expertise, drawn from our deep understanding of the industry.

Flexible Engagement Models

Custom Engagement Models

Flexible staffing solutions accommodating both short-term projects and longer-term engagements, full-time and part-time.

Unique Talent Ecosystem

The Candidate Staffing Platform stores data about past and present candidates, enabling fast work and scalability and providing clients with valuable insights into their talent pipeline.

Transparent

$0

No Hidden Costs

The price quoted is the total price to you. There are no hidden or unexpected costs for candidate placement.

x1

One Consolidated Invoice

No matter how many engineers you employ, there is only one monthly consolidated invoice.

Ready to hire Amit
or someone with similar skills?
Looking for someone else? Join Upstaff for access to all profiles and individual matching.
Start Hiring