Google Maps Business Scraper

The Google Maps Business Scraper is a Python-based tool for extracting detailed business information from Google Maps. It’s designed for efficiency, scalability, and ease of use, making it perfect for businesses, researchers, and analysts.

Key Features

Proxy Rotation: Minimizes the risk of IP blocks by Google Maps, ensuring continuous scraping.
Keyword Generator: Creates detailed and specific keywords to maximize data collection efficiency.
Browser Refresh: Automatically refreshes the browser, proxy, and appends data to the CSV after each batch to optimize RAM usage and system performance.
Batch Processing: Runs multiple scripts simultaneously, drastically reducing processing time.
Progress Tracking: Each process displays an individual progress bar for clear and real-time updates.
Easy Resume: Automatically resumes from the leftover keywords.txt and continues appending new data to the CSV

Installation

Clone the Repository

git clone https://github.com/xgino/Google-Map-Scraper.git  
cd Google-Map-Scraper

Install Dependencies

pip install -r requirements.txt  
playwright install

Usage

Generate Keywords
```
cd generate_keywords  
python generate_keywords.py
```
Edit country, places, and keywords in the script to suit your needs.

Run the Scraper

cd ..  # Go back to the root directory  
python main.py

Scraper Output

The scraped data is saved in ./data/data.csv with the following columns:

name: Business name
address: Business address
website: Business website
phone: Phone number
reviews_count: Number of reviews
reviews_average: Average review rating
latitude: Latitude of the business
longitude: Longitude of the business
search_keyword: Keyword used for search

Considerations and Potential Challenges

Google Maps Updates: Changes to Google Maps' HTML structure might require script updates to maintain functionality.
Proxy Maintenance: The socks5.txt file must be manually updated with valid and working proxies to ensure uninterrupted scraping.
Hardware Limits: Scaling to higher processor counts may be difficult due to hardware constraints like RAM and CPU capacity.
I noticed a minor issue: Duplicates may occur due to eg. (stores) being found under multiple keywords or overlapping processes in multiprocessing. Consider implementing duplicate removal to further use this effectively.
Another notice: It doesn't seem to work on a MacBook with a silicon chip. I wasn't able to dive deeper, perhaps a driver issue for Playwright.

Credits

Proxy sourced from TheSpeedX/PROXY-List.
All other code was written from scratch over two weeks.
Developed with support from ChatGPT for debugging and brainstorming.

Support

What began as a simple 2-hour project quickly escalated. The first version of the scraper took over 48 hours to complete, which was far from ideal. After weeks of trial and error, optimizations brought it down to 24 hours, but it still wasn’t good enough. Countless tweaks and revisions followed, pushing the script to its limits until it now runs in just 30 minutes, an improvement that feels like a major victory.

A project I thought would take only a couple of hours turned into weeks of refining and learning. But I’m proud to have an optimized and fast version that finally works as intended. If you find this tool useful, consider supporting me with a coffee on . Every sip helps fuel a new line of code. Thank you for your support, and keep coding!

Legal Disclaimer

Educational & Research Use Only. This tool is provided "as-is" by xgino. By using this software, you agree to the following:

Google Terms of Service: Automated scraping is explicitly forbidden under Google Maps Platform Terms (Section 3.2.3). Google requires the use of their official API for commercial data retrieval.
GDPR/AVG Compliance: While business listings are public, they often contain Personal Data (e.g., Sole Proprietors/ZZP). Scraping this data without consent is a violation of the GDPR in the EU. Check your local laws before use.
Ethical Usage: This tool is designed to minimize impact. Users must ensure their scraping frequency does not disrupt Google’s services or violate their policies.
No Liability: The author assumes zero responsibility for misuse, API bans, or legal consequences. You use this tool at your own risk and agree to indemnify the author against all claims.

Name		Name	Last commit message	Last commit date
Latest commit History 64 Commits
__pycache__		__pycache__
data		data
env		env
generate_keywords		generate_keywords
.DS_Store		.DS_Store
.gitattributes		.gitattributes
flow.png		flow.png
main.py		main.py
readme.md		readme.md
requirements.txt		requirements.txt
scraper.py		scraper.py
socks5.txt		socks5.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Google Maps Business Scraper

Key Features

Installation

Usage

Scraper Output

Considerations and Potential Challenges

Credits

Support

Legal Disclaimer

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Google Maps Business Scraper

Key Features

Installation

Usage

Scraper Output

Considerations and Potential Challenges

Credits

Support

Legal Disclaimer

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages