Web Scraping and Automation Engineer

Scripting and Automation, Data Scraping

Are you a talented developer looking for a remote job that lets you show your skills and get decent compensation? Join Upstaff.com, a platform that connects you with hand-picked startups and scale-ups in the US and Europe.

Summary

- Proficiency in web scraping libraries and frameworks such as Beautiful Soup, Scrapy, Selenium, or Playwright;
- Handling anti-scraping mechanisms (e.g., CAPTCHAs, IP rotation, user-agent management);
- Tools: AWS Cloud (S3, Lambda, EC2, IAM, CloudWatch), Git;
- Focus: Scalable web scraping, configuration-driven data extraction, AWS-based data pipelines;
- Required experience: Web scraping in Python, AWS S3 integration, anti-scraping countermeasures, data parsing and cleaning, configuration frameworks;
- Nice to have: Experience working with scraping platforms such as apify.com;
- Start Date: ASAP (preferably June 16th or 24th);
- Duration: approx. 2 months;
- Type: Full-time.

Job Description

About the company

The company is a technology organization that delivers software solutions for the construction sector, with a focus on digital project coordination, procurement workflows, and business intelligence. Its platform offers modular capabilities to support operational efficiency and strategic planning. The company recently revised its brand and product structure and operates with a distributed international workforce.

About the role

We are seeking a skilled and motivated Web Automation Developer to join our dynamic team. The ideal candidate will be proficient in Python and experienced in developing and maintaining robust web scraping solutions. This role will focus on utilizing a configuration-based approach to extract data from a diverse range of websites. The developed solutions will be hosted on Amazon Web Services (AWS), with data stored in Amazon S3 buckets.
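For candidates unfamiliar with the term, a configuration-based approach usually means that per-site extraction rules live in data rather than in code. The sketch below is only an illustration under stated assumptions, not the company's actual framework: the `SITE_CONFIG` layout, the `ConfigExtractor` class, and the `extract` helper are all hypothetical names, and it uses only the Python standard library rather than any of the scraping libraries listed above.

```python
from html.parser import HTMLParser

# Hypothetical per-site configuration: field name -> tag/class rule.
# Supporting a new site would mean adding a new dict, not new code.
SITE_CONFIG = {
    "title": {"tag": "h1", "class": "product-title"},
    "price": {"tag": "span", "class": "price"},
}


class ConfigExtractor(HTMLParser):
    """Collects the text of the first element matching each configured rule."""

    def __init__(self, config):
        super().__init__()
        self.config = config
        self.result = {}
        self._active = None  # name of the field currently being captured
        self._depth = 0      # nesting depth of the captured tag

    def handle_starttag(self, tag, attrs):
        if self._active is not None:
            # Track nested tags of the same name so we stop at the right close tag.
            if tag == self.config[self._active]["tag"]:
                self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for field, rule in self.config.items():
            if field in self.result:
                continue  # keep only the first match per field
            if tag == rule["tag"] and rule["class"] in classes:
                self._active = field
                self._depth = 1
                self.result[field] = ""
                break

    def handle_endtag(self, tag):
        if self._active is not None and tag == self.config[self._active]["tag"]:
            self._depth -= 1
            if self._depth == 0:
                self.result[self._active] = self.result[self._active].strip()
                self._active = None

    def handle_data(self, data):
        if self._active is not None:
            self.result[self._active] += data


def extract(html, config):
    """Run the configured rules over an HTML string and return a dict of fields."""
    parser = ConfigExtractor(config)
    parser.feed(html)
    parser.close()
    return parser.result
```

In a pipeline like the one described here, the returned dictionary would typically be serialized (for example as JSON) and written to an S3 bucket; that step is omitted because it depends on credentials and bucket names not given in this posting.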

Key Responsibilities:

  • Develop, deploy, and maintain efficient and scalable web scraping scripts and applications using Python;
  • Implement and manage a configuration-based framework for web scraping, allowing for flexible and adaptable data extraction from various website structures;
  • Utilize AWS services (primarily S3, and potentially Lambda, EC2, CloudWatch, and IAM) for hosting, managing, and monitoring web scraping solutions and data storage;
  • Design and implement data pipelines to ensure reliable and timely data extraction and storage into Amazon S3;
  • Troubleshoot and resolve issues related to web scraping, including changes in website structures, anti-scraping measures, and data quality;
  • Collaborate with data analysts and other stakeholders to understand data requirements and ensure the delivery of accurate and relevant information;
  • Develop and maintain documentation for web scraping processes, configurations, and codebase;
  • Stay up to date with the latest web scraping techniques, tools, and best practices, as well as relevant AWS services;
  • Ensure compliance with legal and ethical standards for web scraping and data privacy;
  • Monitor the performance and reliability of scraping jobs, implementing alerts and recovery mechanisms to ensure optimal operation.

Required Skills and Qualifications:

  • Proven experience as a Python Developer with a strong focus on web scraping within the last one to two years (experience with scraping platforms is also nice to have);
  • Experience scraping popular resources such as X, Facebook, and LinkedIn that have high-grade protection against scraping;
  • Proficiency in web scraping libraries and frameworks such as Beautiful Soup, Scrapy, Selenium, or Playwright;
  • Familiarity with handling anti-scraping mechanisms (e.g., CAPTCHAs, IP rotation, user-agent management);
  • Solid understanding of web technologies including HTML, CSS, JavaScript, and DOM manipulation;
  • Experience with APIs and data formats like JSON and XML;
  • Demonstrable experience with AWS services, particularly S3 for data storage;
  • Familiarity with other services like Lambda for serverless execution, EC2 for hosting, IAM for security, and CloudWatch for monitoring is highly desirable;
  • Experience in designing and implementing configuration-driven systems/applications;
  • Strong understanding of data extraction, data parsing, and data cleaning techniques;
  • Familiarity with version control systems, preferably Git;
  • Excellent problem-solving and analytical skills with strong attention to detail;
  • Ability to work independently and as part of a collaborative team;
  • Good communication skills, both written and verbal;
  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.

Desired Qualifications (Nice to Have):

  • Experience working with scraping platforms such as apify.com;
  • Experience with containerization technologies like Docker;
  • Knowledge of database technologies (SQL or NoSQL);
  • Experience with CI/CD pipelines;
  • Understanding of data warehousing concepts;
  • AWS Certifications (e.g., AWS Certified Developer, AWS Certified Solutions Architect).

Confidential (C) UPSTAFF LTD, England and Wales, #12727246 17 Montgomery Drive, Tavistock, United Kingdom PL19 8KX
