author filip <“filip.rabiega@gmail.com”> 2025-04-26 13:26:34 +0200
committer filip <“filip.rabiega@gmail.com”> 2025-04-26 13:26:34 +0200
commit 87a282a37b951d650525e43bc5607e4297192d3a
tree 67826ede1be1bbb12c64f833df6cfa541b50cc67
parent cd08dcdd71b846c1639bdb5647cdfdc4eded49fd
updated README
-rw-r--r-- README.md 32
1 file changed, 32 insertions, 0 deletions
diff --git a/README.md b/README.md
index e69de29..89385db 100644
--- a/README.md
+++ b/README.md
@@ -0,0 +1,32 @@
+# Chadscraper
+Chadcrawler & Chadscraper
+Automated tool for crawling and scraping websites, storing extracted data in CSV format.
+
+## Features
+- Web crawling to discover URLs and links from a given website.
+- Web scraping to extract specific data (e.g., text, prices, headlines).
+- Stores data in CSV format for easy analysis.
+- User-agent rotation to avoid detection.
+- Supports handling dynamic content (optional).
+- Extracts title, meta description, and meta keywords.
+- Fetches H1 to H6 headings for keyword analysis.
+- Collects internal and external links.
+- Saves data in a CSV file.
+
+## Installation
+- Clone the repository
+- Install the dependencies
+
+pip install -r requirements.txt
+
+
+## Usage
+Chadcrawler
+python chadcrawler.py --start-url "https://example.com"
+
+Chadscraper
+python scraper.py --url "https://example.com" --output "data.csv"
+
+## Requirements
+Python 3.x
+requests, beautifulsoup4 (BeautifulSoup), Scrapy (if needed); csv ships with the Python standard library
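
The README added in this commit lists the fields the scraper extracts (title, meta description, meta keywords, H1 to H6 headings, internal and external links) and says they are stored as CSV, but the commit contains no code. The following is a minimal sketch of that extraction step, not the project's actual implementation. To stay self-contained it uses the standard-library `html.parser` instead of BeautifulSoup, and it parses an inline HTML string instead of fetching a page. The names `PageDataParser`, `extract_page_data`, `write_rows`, the column names, and the ` | ` separator are all assumptions made for illustration.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class PageDataParser(HTMLParser):
    """Hypothetical helper: collects title, meta tags, headings, and links."""

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.base_host = urlparse(base_url).netloc
        self.title = ""
        self.meta = {"description": "", "keywords": ""}
        self.headings = []        # list of (tag, text) tuples
        self.internal_links = []
        self.external_links = []
        self._current = None      # tag whose text is currently being captured
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" or tag in HEADING_TAGS:
            self._current = tag
            self._buffer = []
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in self.meta:
                self.meta[name] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links, then classify by host name.
            url = urljoin(self.base_url, attrs["href"])
            same_host = urlparse(url).netloc == self.base_host
            (self.internal_links if same_host else self.external_links).append(url)

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace inside the captured text.
            text = " ".join("".join(self._buffer).split())
            if tag == "title":
                self.title = text
            else:
                self.headings.append((tag, text))
            self._current = None


def extract_page_data(html, base_url):
    """Return one flat dict (one CSV row) for a single HTML page."""
    parser = PageDataParser(base_url)
    parser.feed(html)
    parser.close()
    return {
        "url": base_url,
        "title": parser.title,
        "meta_description": parser.meta["description"],
        "meta_keywords": parser.meta["keywords"],
        "headings": " | ".join(f"{tag}:{text}" for tag, text in parser.headings),
        "internal_links": " | ".join(parser.internal_links),
        "external_links": " | ".join(parser.external_links),
    }


def write_rows(rows, fileobj):
    """Write a non-empty list of row dicts as CSV with a header line."""
    writer = csv.DictWriter(fileobj, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


SAMPLE_HTML = """<html><head><title>Example  Shop</title>
<meta name="description" content="Cheap widgets">
<meta name="keywords" content="widgets, shop"></head>
<body><h1>Widgets</h1><h2>On <em>sale</em></h2>
<a href="/cart">Cart</a><a href="https://other.example.org/x">Partner</a>
</body></html>"""

if __name__ == "__main__":
    buffer = io.StringIO()
    write_rows([extract_page_data(SAMPLE_HTML, "https://example.com/")], buffer)
    print(buffer.getvalue())
```

In a real run, the HTML would come from an HTTP response (for example via `requests`, as listed under Requirements) and `fileobj` would be a file opened with `newline=""`, as the `csv` module documentation recommends. Multi-valued fields are joined into one cell here only to keep one row per page; a separate links file would be an equally reasonable design.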