# Chadscraper
Chadcrawler & Chadscraper: an automated tool for crawling and scraping websites and storing the extracted data in CSV format.

## Features
- Crawls a given website to discover its URLs and links.
- Scrapes specific data from pages (e.g., text, prices, headlines).
- Stores the extracted data in a CSV file for easy analysis.
- Rotates the user agent to reduce the chance of detection.
- Optionally handles dynamic content.
- Extracts the title, meta description, and meta keywords.
- Fetches H1 to H6 headings for keyword analysis.
- Collects internal and external links.
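
The extraction features above (title, meta tags, headings, internal and external links) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without dependencies, and the `PageParser` class and its attribute names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageParser(HTMLParser):
    """Hypothetical helper: collects title, meta tags, headings, and links."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.meta = {"description": "", "keywords": ""}
        self.headings = []        # list of (tag, text) pairs
        self.internal_links = []
        self.external_links = []
        self._current = None      # tag whose text is being captured
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" or tag in self.HEADINGS:
            self._current, self._buffer = tag, []
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in self.meta:
                self.meta[name] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links, then compare hosts to classify them.
            url = urljoin(self.base_url, attrs["href"])
            same_host = urlparse(url).netloc == urlparse(self.base_url).netloc
            (self.internal_links if same_host else self.external_links).append(url)

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace in the captured text.
            text = " ".join("".join(self._buffer).split())
            if tag == "title":
                self.title = text
            else:
                self.headings.append((tag, text))
            self._current = None
```

After `parser = PageParser("https://example.com/")` and `parser.feed(html)`, the collected values are available as plain attributes. Links are treated as internal when their host matches the base URL's host, which is one reasonable definition among several.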

## Installation
- Clone the repository
- Install the dependencies: `pip install -r requirements.txt`


## Usage
Chadcrawler: `python chadcrawler.py --start-url "https://example.com"`
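
For readers curious what a crawl from a start URL might involve, here is a rough sketch of a breadth-first crawl with user-agent rotation. It is not taken from `chadcrawler.py`: the `crawl` function, the `fetch` callback, the `max_pages` limit, and the user-agent strings are all assumptions made for illustration, and the regex-based link extraction is deliberately crude.

```python
import re
from collections import deque
from itertools import cycle
from urllib.parse import urljoin, urlparse

# Hypothetical pool of User-Agent strings, used in round-robin order.
USER_AGENTS = cycle([
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/2.0",
])


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl restricted to the start URL's host.

    `fetch(url, headers)` is supplied by the caller and must return
    the page HTML as a string.
    """
    host = urlparse(start_url).netloc
    queue, seen, visited = deque([start_url]), {start_url}, []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url, {"User-Agent": next(USER_AGENTS)})
        visited.append(url)
        # Crude href extraction; a real crawler would use an HTML parser.
        for href in re.findall(r'href="([^"]+)"', html):
            link = urljoin(url, href).split("#")[0]  # drop fragments
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

With `requests` installed, `fetch` could be something like `lambda url, headers: requests.get(url, headers=headers, timeout=10).text`. Passing the fetcher in as a parameter keeps the traversal logic testable without network access.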

Chadscraper: `python scraper.py --url "https://example.com" --output "data.csv"`
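
As an illustration of how the `--url` and `--output` options shown above might be parsed and how results could end up in a CSV file, here is a minimal sketch. The column names in `FIELDS`, the `data.csv` default, and the `parse_args` and `write_rows` helpers are assumptions; the real `scraper.py` may differ.

```python
import argparse
import csv

# Assumed column names; the actual CSV layout is not documented here.
FIELDS = ["url", "title", "description", "keywords"]


def parse_args(argv=None):
    """Parse the two options used in the usage example."""
    parser = argparse.ArgumentParser(description="Scrape one page into a CSV file.")
    parser.add_argument("--url", required=True, help="page to scrape")
    parser.add_argument("--output", default="data.csv", help="CSV file to write")
    return parser.parse_args(argv)


def write_rows(path, rows):
    """Write a list of dicts to `path`, one CSV row per dict."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

A script built this way would call `parse_args()`, scrape `args.url`, and pass the resulting dicts to `write_rows(args.output, rows)`.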

## Requirements
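- Python 3.x
- requests and BeautifulSoup (installed from PyPI as `beautifulsoup4`)
- csv (part of the Python standard library, no installation needed)
- Scrapy (if needed)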