CHAITANYA PRIYA SURUKONTI
🇺🇸United States
Upstaffer since February 2026

CHAITANYA PRIYA SURUKONTI — Senior Data Engineer

Expertise in Data Engineering, AI, and Machine Learning.

Last verified in February 2026

Bio Summary

  • Senior Data Engineer with 5+ years designing scalable lakehouse architectures and distributed data pipelines using Databricks, Snowflake (Snowpark), PySpark, Kafka, and Airflow across healthcare, life sciences, and finance domains.
  • Expertise in building AI/ML-ready feature engineering pipelines, embedding datasets, and integrating ML workflows with MLflow and Databricks Feature Store for clinical risk modeling and forecasting.
  • Proficient with the AWS, Azure, and GCP cloud platforms; implements CI/CD, Terraform IaC, and DataOps frameworks that support HIPAA-compliant governance and a 99.5% data-freshness SLA.
  • Strong background in performance optimization, including Snowflake query tuning (28% improvement), Spark resource tuning (18% cost reduction), and streaming ingestion reducing latency by 45%.
  • Master of Science in Computer Science with hands-on experience in REST API development (FastAPI), containerization (Docker, Kubernetes), and modular data contracts using dbt, enabling robust, scalable data engineering solutions.

Technical Skills

Programming Languages: Python
Java Frameworks: Apache Spark
Scala Frameworks: Apache Spark
Python Frameworks: FastAPI
AI & Machine Learning: MLflow, Vertex AI
Python Libraries and Tools: PySpark
Data Analysis and Visualization Technologies: Apache Airflow, Apache Spark, Apache Spark Streaming, Databricks, Looker Studio, Power BI, Tableau
Databases & Management Systems / ORM: Apache Spark, Apache Spark Streaming, AWS Redshift, dbt, Oracle Database, PostgreSQL, Snowflake, SQL
Cloud Platforms, Services & Computing: AWS, GCP
Amazon Web Services: AWS Lambda, AWS Redshift
Azure Cloud Services: Databricks
Google Cloud Platform: Google BigQuery
UI/UX/Wireframing: 3D Modelling
Deployment, CI/CD & Administration: CI/CD
QA, Test Automation, Security: Data Validation
Virtualization, Containers and Orchestration: Docker, Kubernetes, Terraform
SDK / API and Integrations: FastAPI, RESTful API
Message/Queue/Task Brokers: Kafka
Methodologies, Paradigms and Patterns: Publish/Subscribe Architectural Pattern
Other Technical Skills: DataOps, Delta Lake, Snowpark API, Spark EMR

Work Experience

Senior Data Engineer - CVS Health (Healthcare Data Lakehouse and AI-Ready Pipelines)

Duration: Jun 2025 – Present
Summary:
  • Development and architecture of scalable batch and near-real-time data pipelines processing multi-terabyte healthcare datasets daily
  • The project supports clinical risk modeling, operational forecasting, and AI retrieval workflows by delivering ML-ready and embedding-ready datasets
  • It includes integration of ML workflows and deployment automation across cloud environments
Responsibilities:
  • Architected scalable data pipelines using PySpark, Databricks, Delta Live Tables, and Kafka.
  • Built AI-ready feature engineering pipelines for ML training and inference.
  • Designed and optimized Snowflake pipelines with Snowpark, Streams & Tasks.
  • Implemented ingestion pipelines for structured and semi-structured healthcare data supporting AI retrieval workflows.
  • Developed embedding-ready datasets for AI experimentation.
  • Integrated ML workflows with MLflow, Databricks Feature Store, and model versioning.
  • Designed REST-based ingestion services using FastAPI.
  • Orchestrated pipelines using Apache Airflow and Terraform for deployment automation on AWS and Azure.
  • Implemented dbt transformation layers for modular data contracts and semantic models.
  • Built DataOps validation frameworks for schema drift detection, anomaly checks, and observability monitoring.
  • Enforced HIPAA-compliant governance with RBAC policies and audit-ready lineage.
  • Optimized compute costs via Spark resource tuning and cluster auto-scaling.
Technologies: Databricks, Delta Lake, Snowflake (Snowpark, Streams, Tasks), PySpark, Kafka, dbt, MLflow, FastAPI, AWS (S3, Glue, EMR), Azure Databricks, Terraform, Docker, Kubernetes
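
For readers unfamiliar with the schema drift detection mentioned in the responsibilities above, the idea can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the framework built for this project: the function name `detect_schema_drift`, the `SchemaDrift` class, and the column names are all hypothetical. Schemas are assumed to be simple column-to-type mappings; in PySpark, a mapping like this could be obtained with `dict(df.dtypes)`.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaDrift:
    """Differences between an expected schema and an observed one."""
    added: dict = field(default_factory=dict)    # column -> observed type
    removed: dict = field(default_factory=dict)  # column -> expected type
    retyped: dict = field(default_factory=dict)  # column -> (expected, observed)

    @property
    def has_drift(self) -> bool:
        return bool(self.added or self.removed or self.retyped)


def detect_schema_drift(expected: dict, observed: dict) -> SchemaDrift:
    """Compare two column -> type mappings and report what changed."""
    drift = SchemaDrift()
    for column, expected_type in expected.items():
        if column not in observed:
            drift.removed[column] = expected_type
        elif observed[column] != expected_type:
            drift.retyped[column] = (expected_type, observed[column])
    for column, observed_type in observed.items():
        if column not in expected:
            drift.added[column] = observed_type
    return drift


# Hypothetical schemas, for illustration only.
expected = {"member_id": "string", "claim_amount": "double", "service_date": "date"}
observed = {"member_id": "string", "claim_amount": "string", "diagnosis_code": "string"}

drift = detect_schema_drift(expected, observed)
if drift.has_drift:
    print("added:", drift.added)
    print("removed:", drift.removed)
    print("retyped:", drift.retyped)
```

A pipeline might run a check like this before each load and alert or quarantine the batch when `has_drift` is true; how drift is actually handled is a policy choice the profile does not describe.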

Data Engineer – ML & Analytics - Dr. Reddy’s Laboratories (Supply Chain and Forecasting Data Pipelines)

Duration: Mar 2021 – Jul 2023
Summary:
  • Designed and implemented Spark-based ELT pipelines supporting enterprise analytics and machine learning initiatives across supply chain and forecasting systems
  • Migrated on-premises workflows to AWS S3 Data Lake architecture to improve scalability and reduce costs
  • Developed AI-ready datasets and optimized Snowflake performance for secure data sharing and efficient transformations
Responsibilities:
  • Designed Spark-based ELT pipelines for analytics and ML initiatives.
  • Migrated on-prem workflows to AWS S3 Data Lake architecture.
  • Built scalable feature engineering pipelines for ML training.
  • Designed curated AI-ready datasets using dimensional modeling and Snowflake transformations.
  • Implemented Snowflake performance tuning, partition optimization, and secure data sharing.
  • Developed data ingestion workflows from REST APIs and third-party sources.
  • Integrated batch and streaming ingestion using Kafka to reduce reporting latency.
  • Implemented data validation pipelines with profiling and reconciliation logic.
  • Orchestrated workflows with Apache Airflow maintaining high data freshness SLAs.
  • Collaborated with data scientists on feature refresh schedules and schema evolution for model retraining.
Technologies: AWS (S3, EMR, Lambda), Apache Spark, PySpark, Snowflake, Airflow, PostgreSQL, Tableau, Python, Kafka

Data Engineer - Hexaware Technologies (Financial Data ETL and Fraud Analytics Support)

Duration: Mar 2019 – Feb 2021
Summary:
  • Developed SQL-based ETL pipelines ingesting financial datasets for risk and fraud analytics in regulatory environments
  • Automated data cleansing workflows and designed relational data models to support reporting and risk analysis
  • Delivered curated datasets for predictive risk scoring and built dashboards tracking operational KPIs and compliance metrics
Responsibilities:
  • Developed SQL-based ETL pipelines for financial data ingestion.
  • Automated data cleansing workflows using Python.
  • Designed relational data models in Oracle and PostgreSQL.
  • Supported fraud analytics teams with curated datasets.
  • Implemented validation and reconciliation checks for data accuracy.
  • Optimized SQL queries and ETL jobs to improve reporting performance.
  • Built Power BI dashboards for operational KPIs and compliance metrics.
Technologies: SQL, Python, Oracle, PostgreSQL, Power BI
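
As a rough illustration of the validation and reconciliation checks listed above, the sketch below compares a source extract with a loaded target by key and by total amount. It is a minimal example under stated assumptions, not the checks used on this project: the `reconcile` function, the `txn_id` and `amount` field names, and the sample rows are hypothetical, and keys are assumed to be unique within each row set.

```python
from decimal import Decimal


def reconcile(source_rows, target_rows, key, amount_field):
    """Compare source and target rows by key and by the total of one amount field."""
    # Decimal avoids floating-point rounding when comparing monetary amounts.
    source = {row[key]: Decimal(str(row[amount_field])) for row in source_rows}
    target = {row[key]: Decimal(str(row[amount_field])) for row in target_rows}

    shared_keys = source.keys() & target.keys()
    return {
        "missing_in_target": sorted(source.keys() - target.keys()),
        "unexpected_in_target": sorted(target.keys() - source.keys()),
        "amount_mismatches": sorted(k for k in shared_keys if source[k] != target[k]),
        "total_difference": sum(target.values(), Decimal("0"))
        - sum(source.values(), Decimal("0")),
    }


# Hypothetical sample data, for illustration only.
source_rows = [
    {"txn_id": 1, "amount": "100.50"},
    {"txn_id": 2, "amount": "20.00"},
    {"txn_id": 3, "amount": "7.25"},
]
target_rows = [
    {"txn_id": 1, "amount": "100.50"},
    {"txn_id": 2, "amount": "25.00"},
    {"txn_id": 4, "amount": "1.00"},
]

report = reconcile(source_rows, target_rows, key="txn_id", amount_field="amount")
for check, result in report.items():
    print(check, result)
```

In a SQL-based ETL setting, the same comparison would more likely be expressed as count and sum queries against both systems; the Python form is used here only to keep the example self-contained.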

Education

  • Master of Science in Computer Science
    University of Bridgeport — Bridgeport, Connecticut, USA
    Sept 2023 - May 2025

How to hire with Upstaff

1

Talk to Our Talent Expert

Our journey starts with a 30-min discovery call to explore your project challenges, technical needs and team diversity.

2

Meet Carefully Matched Talents

Within 1-3 days, we’ll share profiles and connect you with the right talent for your project. Schedule a call to meet engineers in person.

3

Validate Your Choice

Bring new talent on board with a trial period to confirm you hire the right one. There are no termination fees or hidden costs.

Why Upstaff

Upstaff is a technology partner with expertise in AI, Web3, Software, and Data. We help businesses gain a competitive edge by optimizing existing systems and using modern technology to fuel business growth.

Real-time project team launch

<24h

Interview First Engineers

Upstaff's network gives clients access to specialists within hours or days, shortening the hiring process to 24-48 hours so work can start as soon as possible.

x10

Faster Talent Acquisition

Upstaff's network and platform enable clients to scale up and down quickly. A hire is typically 10x faster than with a regular recruitment workflow.

Vetted and Trusted Engineers

100%

Security And Vetting-First

The vetting process combines AI tools and expert human reviewers with each engineer's track record and feedback collected over time from clients and teammates.

~50h

Save Time For Deep Vetting

On average, we save a client's team more than 50 hours of candidate interviews for each job position. We are fueled by a passion for tech expertise, drawn from our deep understanding of the industry.

Flexible Engagement Models

Custom Engagement Models

Flexible staffing solutions that accommodate both short-term projects and longer-term engagements, full-time or part-time.

Unique Talent Ecosystem

The Candidate Staffing Platform stores data about past and present candidates, which enables fast work and scalability and gives clients valuable insight into their talent pipeline.

Transparent

$0

No Hidden Costs

The price quoted is the total price to you. There are no hidden or unexpected costs for candidate placement.

x1

One Consolidated Invoice

No matter how many engineers you employ, there is only one monthly consolidated invoice.

Confidential (C) UPSTAFF LTD, England and Wales, #12727246 17 Montgomery Drive, Tavistock, United Kingdom PL19 8KX