author | filip <filip.rabiega@gmail.com> | 2025-04-26 13:26:34 +0200
committer | filip <filip.rabiega@gmail.com> | 2025-04-26 13:26:34 +0200
commit | 87a282a37b951d650525e43bc5607e4297192d3a (patch)
tree | 67826ede1be1bbb12c64f833df6cfa541b50cc67
parent | cd08dcdd71b846c1639bdb5647cdfdc4eded49fd (diff)
-rw-r--r-- | README.md | 32 |
1 file changed, 32 insertions(+), 0 deletions(-)
@@ -0,0 +1,32 @@

# Chadscraper

Chadcrawler & Chadscraper: an automated toolset for crawling and scraping websites and storing the extracted data in CSV format.

## Features
- Web crawling to discover URLs and links from a given website.
- Web scraping to extract specific data (e.g., text, prices, headlines).
- User-agent rotation to avoid detection.
- Optional handling of dynamic content.
- Extracts the page title, meta description, and meta keywords.
- Fetches H1 to H6 headings for keyword analysis.
- Collects internal and external links.
- Saves all extracted data in a CSV file for easy analysis.

## Installation
- Clone the repository.
- Install the dependencies:

```
pip install -r requirements.txt
```

## Usage
Chadcrawler:

```
python chadcrawler.py --start-url "https://example.com"
```

Chadscraper:

```
python scraper.py --url "https://example.com" --output "data.csv"
```

## Requirements
- Python 3.x
- requests and BeautifulSoup (bs4)
- csv (Python standard library)
- Scrapy (optional, if needed)
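Neither chadcrawler.py nor scraper.py appears on this page, so what follows is only a minimal sketch of the kind of extraction the feature list describes, built on the requests and BeautifulSoup packages named under Requirements. The `scrape` function, its parameters, the static User-Agent header, and the CSV column layout are all illustrative assumptions, not the project's actual API:

```python
import csv
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

# A single static header stands in for the README's user-agent rotation.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; example-scraper/0.1)"}


def scrape(url: str, output: str) -> None:
    # Fetch the page and parse it with BeautifulSoup.
    resp = requests.get(url, headers=HEADERS, timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")

    # Page title plus meta description and meta keywords.
    title = soup.title.get_text(strip=True) if soup.title else ""

    def meta(name: str) -> str:
        tag = soup.find("meta", attrs={"name": name})
        return tag.get("content", "") if tag else ""

    # H1 to H6 headings for keyword analysis, as the feature list describes.
    headings = [h.get_text(strip=True)
                for level in range(1, 7)
                for h in soup.find_all(f"h{level}")]

    # Classify links as internal or external by comparing hostnames.
    base_host = urlparse(url).netloc
    internal, external = [], []
    for a in soup.find_all("a", href=True):
        link = urljoin(url, a["href"])
        (internal if urlparse(link).netloc == base_host else external).append(link)

    # Write one CSV row per scraped page (column layout is an assumption).
    with open(output, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["url", "title", "meta_description", "meta_keywords",
                         "headings", "internal_links", "external_links"])
        writer.writerow([url, title, meta("description"), meta("keywords"),
                         "; ".join(headings), "; ".join(internal),
                         "; ".join(external)])


if __name__ == "__main__":
    scrape("https://example.com", "data.csv")
```

Run as-is this writes a single CSV row for one page; the actual scraper presumably loops over the URLs that the crawler discovers and appends one row per page.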