Web Skills

Skills for web scraping, data extraction, and web automation

/api/skills/web

1 Skill

web_scraping

Extract structured data from websites using Python and BeautifulSoup

/api/skills/web/web_scraping
Content Type: Static (Markdown)
Language: python
Widgets: n/a

Trigger Keywords

scrape, web, extract, html, parse, crawl, data extraction, website

Required Packages

beautifulsoup4, requests, lxml, pandas
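
As a rough illustration of what the skill's core package is for, the sketch below parses an HTML string with BeautifulSoup and pulls out headings and link targets. It is not the skill's own code: the sample HTML and the function name `extract_headings_and_links` are hypothetical, and the built-in `html.parser` backend is used so that only `beautifulsoup4` is needed. In practice the HTML would presumably come from a `requests` response rather than an inline string.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML so the sketch runs without network access.
SAMPLE_HTML = """
<html><body>
  <h1>Example Domain</h1>
  <h2>Links</h2>
  <a href="https://example.com/a">First</a>
  <a href="/b">Second</a>
  <a>No href</a>
</body></html>
"""


def extract_headings_and_links(html):
    """Return heading texts and link targets found in an HTML string."""
    # "html.parser" ships with Python; "lxml" could be swapped in if installed.
    soup = BeautifulSoup(html, "html.parser")
    # CSS selector matching h1-h3, returned in document order.
    headings = [h.get_text(strip=True) for h in soup.select("h1,h2,h3")]
    # href=True skips anchors that have no href attribute.
    links = [a["href"] for a in soup.find_all("a", href=True)]
    return {"headings": headings, "links": links}


result = extract_headings_and_links(SAMPLE_HTML)
print(result)
```

Relative targets such as `/b` are returned as written; resolving them against the page URL (for example with `urllib.parse.urljoin`) would be a separate step.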

Instructions Preview

# Web Scraping Skill

Extract structured data from websites using Python. Use the helper script or write custom code.

## Using the Helper Script

```bash
# Scrape a single page and extract all links
python /home/daytona/skills/web-scraping/scraper.py https://example.com --links

# Extract text content
python /home/daytona/skills/web-scraping/scraper.py https://example.com --text

# Extract data using CSS selectors
python /home/daytona/skills/web-scraping/scraper.py https://example.com --selector "h1,h2,h3" --output /home/daytona/out/headings.json

# Extract table data to CSV
python /home/daytona/skills/web-scraping/scraper.py https://example.com --tables --output /home/daytona/out/tables.csv

# Full page analysis
python /home/daytona/skills/web-scraping/scraper.py https://example.com --analyze
```

## Custom Scraping Code

### Basic Page Fetching

```python
import requests
from bs4 import BeautifulSoup

def fetch_page(url):
    headers = {
        'User-Agent': 'Mozilla/5.0 (compatibl...