A Python application that downloads technology stack data from the BuiltWith API, with automatic pagination support.
- Clone the repository
- Install requirements:
    pip install -r requirements.txt
- Create a .env file and add your BuiltWith API key:
    BUILTWITH_API_KEY=your_api_key_here
- Ensure you have the data/technologies.json file with your target technologies:
{
"technologies": [
{
"RequestName": "Shopify"
},
{
"RequestName": "Magento"
}
]
}

Run the main script:
    python main.py
The script will:
- List available technologies from your technologies.json
- Let you select a technology to download data for
- Automatically fetch all pages of data using the API's pagination
- Save results to CSV files in the data/csv directory
- Store raw API responses in the data/raw directory for backup
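
The first two steps above (listing the technologies and picking one) can be sketched as follows. This is a minimal illustration, not the code in main.py: the helper names `load_technology_names` and `choose_technology` are hypothetical, and the only thing taken from this README is the layout of data/technologies.json.

```python
import json
from pathlib import Path


def load_technology_names(path="data/technologies.json"):
    """Return the RequestName of every entry in the technologies file."""
    with Path(path).open(encoding="utf-8") as fh:
        config = json.load(fh)
    return [entry["RequestName"] for entry in config["technologies"]]


def choose_technology(names, choice):
    """Map a 1-based menu choice (as typed by the user) to a technology name."""
    index = int(choice) - 1
    if not 0 <= index < len(names):
        raise ValueError(f"choice must be between 1 and {len(names)}")
    return names[index]


if __name__ == "__main__":
    names = load_technology_names()
    for number, name in enumerate(names, start=1):
        print(f"{number}. {name}")
    print("Selected:", choose_technology(names, input("Technology number: ")))
```

Keeping the selection logic separate from `input()` makes it easy to check without a terminal.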
- Automatic pagination handling
- Retry mechanism for failed requests
- CTRL+C support for graceful stopping
- CSV output with detailed website information
- Raw JSON backups of all API responses
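
As an illustration of the raw JSON backup feature, the sketch below writes one API response per file. The function name `save_raw_page` and the `<technology>_page_<n>.json` naming scheme are assumptions made for this example; the actual file names produced by the script may differ.

```python
import json
from pathlib import Path


def save_raw_page(payload, technology, page_number, raw_dir="data/raw"):
    """Write one decoded API response to its own JSON file and return the path."""
    directory = Path(raw_dir)
    directory.mkdir(parents=True, exist_ok=True)  # create data/raw on first use
    # Hypothetical naming scheme: Shopify_page_0001.json, Shopify_page_0002.json, ...
    target = directory / f"{technology}_page_{page_number:04d}.json"
    target.write_text(
        json.dumps(payload, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return target
```

Writing each page as soon as it arrives means an interrupted download still leaves the pages fetched so far on disk.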
The CSV files include the following columns:
- Domain
- Social Links
- Company Name
- Telephone Numbers
- Email Addresses
- City
- State
- Postcode
- Country
- Vertical
- Titles
- First/Last Detection Dates
- First/Last Index Dates
- Score
- Rank
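
A CSV with these columns could be produced along the following lines. This is a sketch under stated assumptions, not the script's actual writer: the exact header strings, the split of the detection and index dates into separate first/last columns, and the "; " separator for multi-valued fields are all choices made for this example.

```python
import csv
import io

# Header names are assumed from the column list above; the real headers may differ.
COLUMNS = [
    "Domain", "Social Links", "Company Name", "Telephone Numbers",
    "Email Addresses", "City", "State", "Postcode", "Country", "Vertical",
    "Titles", "First Detected", "Last Detected", "First Indexed",
    "Last Indexed", "Score", "Rank",
]


def flatten(value):
    """Join list values into one cell; map None to an empty string."""
    if value is None:
        return ""
    if isinstance(value, (list, tuple)):
        return "; ".join(str(item) for item in value)
    return value


def write_rows(rows, handle):
    """Write dict rows to an open text handle, ignoring unknown keys."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow({col: flatten(row.get(col)) for col in COLUMNS})


if __name__ == "__main__":
    buffer = io.StringIO()
    write_rows([{"Domain": "example.com", "Email Addresses": ["a@example.com"]}], buffer)
    print(buffer.getvalue())
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends.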
builtwith-data-downloader/
├── data/
│ ├── csv/ # CSV output files
│ ├── raw/ # Raw JSON responses
│ └── technologies.json
├── client.py # API client implementation
├── main.py # Main script
└── .env # API key configuration
- Automatic retry on failed requests (up to 3 attempts)
- 5-second delay between retries
- Clear error messages and progress reporting
- Graceful exit support with CTRL+C
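
The retry behaviour described above (up to 3 attempts, 5 seconds apart) can be sketched as a small wrapper. The name `call_with_retries` and its parameters are hypothetical; the real logic lives in client.py and may be structured differently.

```python
import time


def call_with_retries(func, attempts=3, delay=5.0, sleep=time.sleep):
    """Call func() until it succeeds or the attempts are used up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:  # KeyboardInterrupt is deliberately not caught
            last_error = exc
            print(f"Attempt {attempt}/{attempts} failed: {exc}")
            if attempt < attempts:
                sleep(delay)  # wait before the next attempt, not after the last
    raise last_error
```

Because `KeyboardInterrupt` does not derive from `Exception`, pressing CTRL+C during a retry still stops the program instead of triggering another attempt.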
- The script rate-limits itself, waiting 2 seconds between successful API requests
- Large datasets may take significant time to download completely
- Use CTRL+C to stop the download process at any time
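
The notes above suggest a download loop roughly like the one below. It is a sketch only: `fetch_page` is a hypothetical callback that takes the current offset and returns a `(rows, next_offset)` pair, with `next_offset` set to `None` on the last page. How the real BuiltWith response signals the next page is not shown in this README, so that detail is left to the callback.

```python
import time


def fetch_all_pages(fetch_page, delay=2.0, sleep=time.sleep):
    """Collect rows from every page, pausing between successful requests."""
    results = []
    offset = None  # None means "start from the first page"
    try:
        while True:
            page, offset = fetch_page(offset)
            results.extend(page)
            if offset is None:  # no further pages
                break
            sleep(delay)  # 2-second pause between successful requests
    except KeyboardInterrupt:
        # CTRL+C: stop cleanly and keep whatever was downloaded so far
        print(f"Stopped by user; keeping {len(results)} rows fetched so far.")
    return results
```

Returning the partial results on CTRL+C is one way to make stopping graceful: the caller can still write them to CSV before exiting.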