Automated daily scraping of earthquake data from the Philippine Institute of Volcanology and Seismology (PHIVOLCS).
Obtain the scraped data by using the following links:
This repository automatically collects and archives earthquake data from PHIVOLCS, providing:
All earthquake data is stored in the data/ folder:
| File | Description |
|---|---|
| phivolcs_earthquake_2023.csv | All earthquakes from 2023 |
| phivolcs_earthquake_2024.csv | All earthquakes from 2024 |
| phivolcs_earthquake_2025.csv | All earthquakes from 2025 (current year) |
| phivolcs_earthquake_all_years.csv | Combined data from all years |
Each CSV file contains the following columns:
Simply browse to the data/ folder to see the latest earthquake data.
Click on any CSV file in the data/ folder, then click "Download" or "Raw" to get the data.
You can directly link to the raw CSV files in your applications:
https://raw.githubusercontent.com/zekejulia/phivolcs-earthquake-scraper/main/data/phivolcs_earthquake_all_years.csv
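As a minimal sketch of linking to the raw file from an application, the snippet below reads the combined CSV from the raw URL with pandas. The helper name `load_earthquakes` is hypothetical and not part of this repository; only the URL comes from this README.

```python
import pandas as pd

# Raw URL quoted in this README.
RAW_URL = (
    "https://raw.githubusercontent.com/zekejulia/"
    "phivolcs-earthquake-scraper/main/data/phivolcs_earthquake_all_years.csv"
)


def load_earthquakes(source=RAW_URL):
    """Read earthquake records from a URL, file path, or file-like object.

    Hypothetical helper for illustration; it only wraps pandas.read_csv.
    """
    return pd.read_csv(source)


# Example call (fetches from GitHub, so it needs network access):
# df = load_earthquakes()
# print(f"Loaded {len(df)} records")
```

Because `pandas.read_csv` accepts URLs as well as local paths, the same helper also works on a cloned copy of the data/ folder.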
pip install requests pandas lxml html5lib
python scrape_phivolcs.py
The script will automatically:
The scraper provides automatic statistics including:
Example output:
Summary by Year:
• 2023: 15,234 earthquakes
• 2024: 16,789 earthquakes
• 2025: 8,456 earthquakes
Total Records: 40,479
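A summary like the one above could be produced along these lines. This is a sketch under assumptions, not the scraper's actual code: the function name `summarize_by_year` is hypothetical, and the two small frames are made-up sample data standing in for the per-year CSV files.

```python
import pandas as pd


def summarize_by_year(frames):
    """Build summary lines from a mapping of year -> DataFrame (hypothetical helper)."""
    lines = ["Summary by Year:"]
    for year in sorted(frames):
        # The ',' format spec adds thousands separators, e.g. 15,234.
        lines.append(f"• {year}: {len(frames[year]):,} earthquakes")
    total = sum(len(df) for df in frames.values())
    lines.append(f"Total Records: {total:,}")
    return lines


# Made-up sample data; real usage would load data/phivolcs_earthquake_<year>.csv.
sample = {
    2024: pd.DataFrame({"Magnitude": [4.2, 5.0]}),
    2025: pd.DataFrame({"Magnitude": [3.1]}),
}
print("\n".join(summarize_by_year(sample)))
```

Counting rows per yearly file avoids depending on any particular date column name.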
Edit scrape_phivolcs.py and modify:
YEARS_TO_SCRAPE = 3 # Change to scrape more/fewer years
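One plausible reading of this setting, consistent with the 2023 to 2025 files in data/, is that the scraper covers the N most recent calendar years. The sketch below illustrates that reading only; the function `years_to_scrape` is hypothetical and the script's real logic may differ.

```python
from datetime import date

YEARS_TO_SCRAPE = 3  # same setting name as in scrape_phivolcs.py


def years_to_scrape(n_years=YEARS_TO_SCRAPE, current_year=None):
    """Return the n most recent calendar years, oldest first (assumed behaviour)."""
    if current_year is None:
        current_year = date.today().year
    return list(range(current_year - n_years + 1, current_year + 1))


print(years_to_scrape(3, current_year=2025))  # [2023, 2024, 2025]
```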
Edit .github/workflows/scrape-earthquake-data.yml:
on:
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM UTC (10 AM PHT)
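GitHub Actions evaluates cron schedules in UTC, so a local Philippine time has to be converted when picking an hour. The sketch below checks the conversion used in the comment above; the helper `utc_hour_to_pht` is hypothetical and assumes a fixed UTC+8 offset, since the Philippines does not observe daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Philippine Time as a fixed UTC+8 offset (assumption: no daylight saving time).
PHT = timezone(timedelta(hours=8), name="PHT")


def utc_hour_to_pht(hour_utc):
    """Return the PHT hour (0-23) that corresponds to a UTC hour."""
    run_utc = datetime(2025, 1, 1, hour_utc, tzinfo=timezone.utc)
    return run_utc.astimezone(PHT).hour


print(utc_hour_to_pht(2))  # 10, matching the "10 AM PHT" comment
```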
Common schedules:
'0 */6 * * *'  # Every 6 hours
'0 2 * * 1'  # Every Monday at 2 AM UTC
'0 2,14 * * *'  # Twice daily, at 2 AM and 2 PM UTC
phivolcs-earthquake-scraper/
├── .github/
│   └── workflows/
│       └── scrape-earthquake-data.yml    # GitHub Actions workflow
├── data/
│   ├── phivolcs_earthquake_2023.csv      # 2023 data
│   ├── phivolcs_earthquake_2024.csv      # 2024 data
│   ├── phivolcs_earthquake_2025.csv      # 2025 data
│   └── phivolcs_earthquake_all_years.csv # Combined data
├── scrape_phivolcs.py                    # Main scraper script
└── README.md                             # This file
This dataset can be used for:
You can manually trigger the scraper from GitHub:
import pandas as pd
# Load the latest year's data
df = pd.read_csv('data/phivolcs_earthquake_2025.csv')
# Display basic info
print(f"Total earthquakes: {len(df)}")
print(f"Average magnitude: {df['Magnitude'].mean():.2f}")
# Filter for strong earthquakes (magnitude >= 5.0)
strong_quakes = df[df['Magnitude'] >= 5.0]
print(f"Strong earthquakes: {len(strong_quakes)}")
library(tidyverse)
# Load the data
df <- read_csv('data/phivolcs_earthquake_2025.csv')
# Summary statistics
summary(df$Magnitude)
# Plot magnitude distribution
ggplot(df, aes(x = Magnitude)) +
  geom_histogram(binwidth = 0.5, fill = "steelblue") +
  theme_minimal() +
  labs(title = "Earthquake Magnitude Distribution")
Contributions are welcome! Feel free to:
All earthquake data is sourced from:
This project is for educational and research purposes. Please cite PHIVOLCS as the original data source when using this data.
This project is open source and available under the MIT License.
The earthquake data itself belongs to PHIVOLCS and is subject to their terms of use.
For questions or suggestions, please open an issue.
This README was last updated: October 2025
Check the commit history or GitHub Actions runs for the latest data update timestamp.
Made with ❤️ for the Philippine data science community
If you find this project useful, please ⭐ star the repository!