Job Description
Join DataFlow Solutions as a Senior Scraper Operator and become the backbone of our data intelligence ecosystem. We're seeking a meticulous technical expert to architect and maintain web scraping systems that fuel AI models, market research, and competitive intelligence. In this pivotal role, you'll navigate complex digital landscapes while adhering to ethical data practices and adapting to evolving anti-bot technologies. Our startup culture offers flexible hours, remote-first work, and the chance to shape the future of data extraction.
Responsibilities
- Design, implement, and optimize scalable scraping pipelines using Python and the Scrapy framework
- Develop anti-detection strategies including proxy rotation, CAPTCHA solving, and header spoofing
- Monitor scraping performance, troubleshoot API failures, and implement rate-limiting protocols
- Transform raw scraped data into clean, structured formats for analytics teams
- Document scraping methodologies and maintain ethical compliance guidelines
- Collaborate with data scientists to validate data quality and extraction accuracy
- Research emerging anti-bot technologies and adapt scraping strategies accordingly
Qualifications
- 3+ years of professional web scraping experience with Python (Scrapy/BeautifulSoup)
- Proficiency in JavaScript/Node.js for dynamic content extraction
- Expertise with proxy services (Bright Data, formerly Luminati; Oxylabs) and CAPTCHA solvers
- Familiarity with cloud platforms (AWS/GCP) and containerization (Docker)
- Strong understanding of HTML, CSS selectors, and RESTful APIs
- Experience with database integration (PostgreSQL/MongoDB)
- Ability to write clean, maintainable code with version control (Git)
- Knowledge of data privacy regulations (GDPR, CCPA)