Overview
Instead of manually adding URLs to monitor, import them directly from your site’s XML sitemap. ValidGraph parses the sitemap, extracts all URLs, and adds them to your monitoring list or bulk validation queue.
How It Works
1. User provides the sitemap URL (e.g., https://example.com/sitemap.xml)
2. ValidGraph fetches and parses the sitemap (supports sitemap index files)
3. Preview mode shows discovered URLs before import
4. User confirms and URLs are added to monitoring or queued for validation
Imports can also be filtered by URL pattern, so only matching URLs are added.
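The pattern syntax is not specified here, so the following is only a sketch of what pattern filtering could look like on the client side. It assumes shell-style glob patterns matched against the URL path; `filter_urls` is a hypothetical helper, not part of ValidGraph.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def filter_urls(urls, pattern):
    """Keep URLs whose path matches a shell-style glob (assumed syntax)."""
    return [u for u in urls if fnmatchcase(urlsplit(u).path, pattern)]


discovered = [
    "https://example.com/products/widget",
    "https://example.com/blog/launch",
    "https://example.com/products/gadget",
]
# Only the two /products/ URLs survive the filter.
selected = filter_urls(discovered, "/products/*")
print(selected)
```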
Tier Availability
| Tier | Available |
|------|-----------|
| Free | No |
| Pro | No |
| Agency | Yes |
| Enterprise | Yes |
Related Features
– URL Monitoring: Where imported URLs are added
– Bulk CSV Validation: Alternative bulk import method
– Site Crawler: Discover URLs beyond the sitemap
Quick Start: Import URLs from Sitemap
Step 1: Provide your sitemap URL
curl -X POST https://api.validgraph.io/wp-json/validgraph/v1/import-sitemap \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sitemap_url": "https://example.com/sitemap.xml",
    "action": "preview"
  }'
Step 2: Review the preview to see how many URLs will be imported
Step 3: Confirm the import (change action to "import")
Step 4: The URLs are added to monitoring automatically
Example:
Your site has 150 product pages listed in sitemap.xml. The import adds all 150 to monitoring (within the Agency tier limit of 500). The next day, all 150 are re-scanned automatically; a score drop is detected on 12 of them, and an email alert fires.
Technical Details
Sitemap Parsing
ValidGraph supports:
– Single sitemaps: sitemap.xml with URL entries
– Sitemap indexes: sitemap_index.xml linking to multiple sitemaps
– Recursive depth: Follows sitemap chains up to 3 levels deep
– URL extraction: Parses <loc> tags; respects <lastmod> and <priority> if present
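The depth-limited traversal can be sketched with the Python standard library. This is an illustration, not ValidGraph's implementation: it assumes the root sitemap counts as level 1, and `collect_urls` and the caller-supplied `fetch` function are hypothetical names.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_DEPTH = 3  # documented limit; counting the root as level 1 is an assumption


def collect_urls(sitemap_url, fetch, depth=1, seen=None):
    """Return page URLs reachable from sitemap_url within MAX_DEPTH levels.

    `fetch` maps a sitemap URL to its XML text (hypothetical callback).
    """
    seen = set() if seen is None else seen
    if depth > MAX_DEPTH or sitemap_url in seen:
        return []  # too deep, or already visited (guards against cycles)
    seen.add(sitemap_url)
    root = ET.fromstring(fetch(sitemap_url))
    if root.tag.endswith("sitemapindex"):
        urls = []
        for loc in root.findall("sm:sitemap/sm:loc", NS):
            urls += collect_urls(loc.text.strip(), fetch, depth + 1, seen)
        return urls
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


# In-memory stand-in for HTTP fetching, so the sketch runs offline.
XMLNS = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"'
fake_site = {
    "https://example.com/sitemap.xml":
        f"<sitemapindex {XMLNS}><sitemap>"
        "<loc>https://example.com/sitemap-pages.xml</loc>"
        "</sitemap></sitemapindex>",
    "https://example.com/sitemap-pages.xml":
        f"<urlset {XMLNS}><url><loc>https://example.com/about</loc></url></urlset>",
}
pages = collect_urls("https://example.com/sitemap.xml", fake_site.__getitem__)
print(pages)
```

Passing `fetch` as a parameter keeps the traversal logic separate from networking, which makes the depth limit easy to test.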
Import Endpoints
Preview import (dry-run):
POST /wp-json/validgraph/v1/import-sitemap/validate
{
"sitemap_url": "https://example.com/sitemap.xml"
}
Execute import:
POST /wp-json/validgraph/v1/import-sitemap
{
"sitemap_url": "https://example.com/sitemap.xml",
"frequency": "daily",
"monitor": true
}
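As a rough illustration, the execute-import request could be assembled in Python as follows. The host is taken from the Quick Start command; `build_import_request` is a hypothetical helper, and the sketch only builds the request without sending it.

```python
import json
import urllib.request

# Host as shown in the Quick Start curl command.
API_BASE = "https://api.validgraph.io/wp-json/validgraph/v1"


def build_import_request(api_key, sitemap_url, frequency="daily", monitor=True):
    """Build (but do not send) the POST request that executes an import."""
    body = json.dumps({
        "sitemap_url": sitemap_url,
        "frequency": frequency,
        "monitor": monitor,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/import-sitemap",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_import_request("YOUR_API_KEY", "https://example.com/sitemap.xml")
print(req.get_method(), req.full_url)
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```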
Preview Response
Request:
{
"sitemap_url": "https://example.com/sitemap.xml"
}
Response:
{
"success": true,
"data": {
"preview": true,
"sitemap_url": "https://example.com/sitemap.xml",
"sitemap_type": "sitemap_index",
"sitemaps_found": 3,
"total_urls_discovered": 285,
"urls_preview": [
{
"url": "https://example.com/article-1",
"lastmod": "2024-03-20",
"priority": 0.8
},
{
"url": "https://example.com/article-2",
"lastmod": "2024-03-19",
"priority": 0.7
}
],
"import_estimation": {
"urls_to_import": 285,
"tier_limit": 500,
"warning": null,
"can_import": true
},
"parsing_details": {
"child_sitemaps": [
"https://example.com/sitemap-articles.xml",
"https://example.com/sitemap-products.xml",
"https://example.com/sitemap-pages.xml"
],
"depth_traversed": 2,
"parsing_time_ms": 340
}
}
}
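A client might read the `import_estimation` block before confirming an import. The sketch below relies only on the fields shown in the sample response; `summarize_preview` is a hypothetical helper, and the exact conditions under which `can_import` is false are not documented here.

```python
def summarize_preview(response):
    """Return a one-line summary of a decoded preview response."""
    if not response.get("success"):
        return "Preview failed"
    estimate = response["data"]["import_estimation"]
    count, limit = estimate["urls_to_import"], estimate["tier_limit"]
    if not estimate["can_import"]:
        return f"Blocked: {count} URLs, tier limit {limit}"
    note = f" (warning: {estimate['warning']})" if estimate["warning"] else ""
    return f"Ready: {count} URLs, tier limit {limit}{note}"


preview = {
    "success": True,
    "data": {
        "import_estimation": {
            "urls_to_import": 285,
            "tier_limit": 500,
            "warning": None,
            "can_import": True,
        }
    },
}
print(summarize_preview(preview))  # Ready: 285 URLs, tier limit 500
```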
Import Execution Response
Request:
{
"sitemap_url": "https://example.com/sitemap.xml",
"frequency": "daily",
"monitor": true,
"alert_on_score_drop": true,
"alert_threshold": 5
}
Response (202 Accepted):
{
"success": true,
"data": {
"import_id": "import_xyz789",
"sitemap_url": "https://example.com/sitemap.xml",
"status": "processing",
"urls_found": 285,
"urls_added_to_monitoring": 285,
"urls_already_monitored": 12,
"urls_duplicates": 3,
"frequency": "daily",
"created_at": "2024-03-21T10:00:00Z",
"estimated_completion": "2024-03-21T10:05:00Z",
"import_progress": {
"processed": 0,
"total": 285,
"percent": 0
},
"monitoring_batch_id": "batch_import_xyz789"
}
}
Check Import Status
Request:
GET /wp-json/validgraph/v1/import-sitemap/import_xyz789/status
Response (in progress):
{
"import_id": "import_xyz789",
"status": "processing",
"processed": 145,
"total": 285,
"percent_complete": 50.9,
"urls_added": 145,
"errors": 2,
"error_details": [
{
"url": "https://example.com/invalid-page",
"error": "Invalid URL format"
}
],
"estimated_completion": "2024-03-21T10:04:00Z"
}
Response (complete):
{
"import_id": "import_xyz789",
"status": "completed",
"processed": 285,
"total": 285,
"percent_complete": 100,
"urls_added": 283,
"errors": 2,
"summary": {
"new_urls_monitored": 283,
"already_existing": 2,
"skipped_invalid": 2,
"duplicate_entries": 0
},
"monitoring_details": {
"batch_id": "batch_import_xyz789",
"frequency": "daily",
"next_scan": "2024-03-22T06:00:00Z",
"alert_rule_applied": true
},
"completed_at": "2024-03-21T10:03:45Z"
}
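Because the import is asynchronous, a client typically polls the status endpoint until the job finishes. The following is a minimal polling sketch: `wait_for_import` and the caller-supplied `get_status` function are hypothetical, and treating any status other than "processing" as final is an assumption, since only "processing" and "completed" appear in the samples.

```python
import time


def wait_for_import(get_status, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll until the import leaves the "processing" state; return the last status.

    `get_status` performs the GET request and returns the decoded JSON body.
    """
    for _ in range(max_polls):
        status = get_status()
        if status["status"] != "processing":
            return status
        sleep(poll_seconds)
    raise TimeoutError("import still processing after max_polls attempts")


# Canned responses stand in for the HTTP call so the sketch runs offline.
canned = iter([
    {"status": "processing", "percent_complete": 50.9},
    {"status": "completed", "percent_complete": 100},
])
final = wait_for_import(lambda: next(canned), sleep=lambda seconds: None)
print(final["status"])
```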
Tier-Specific Limits
| Tier | Available | Max URLs per Import | Recursive Depth |
|------|-----------|---------------------|-----------------|
| Free | No | — | — |
| Pro | No | — | — |
| Agency | Yes | 500 | 3 levels |
| Enterprise | Yes | 2,000 | 3 levels |
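The limits in the table can be expressed as a simple pre-flight check. This is an illustrative sketch with a hypothetical `check_import_size` helper; whether the service rejects or truncates an oversized sitemap is not stated here.

```python
# Per-import URL limits from the table; Free and Pro have no sitemap import.
MAX_URLS_PER_IMPORT = {"Agency": 500, "Enterprise": 2000}


def check_import_size(tier, url_count):
    """Return (allowed, reason) for importing url_count URLs on a tier."""
    limit = MAX_URLS_PER_IMPORT.get(tier)
    if limit is None:
        return False, f"sitemap import is not available on the {tier} tier"
    if url_count > limit:
        return False, f"{url_count} URLs exceeds the {tier} limit of {limit}"
    return True, f"{url_count} of {limit} URLs"


print(check_import_size("Agency", 285))
print(check_import_size("Pro", 10))
```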
Sitemap Format Example
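Single sitemap:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/article-1</loc>
    <lastmod>2024-03-20</lastmod>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/article-2</loc>
    <lastmod>2024-03-19</lastmod>
    <priority>0.7</priority>
  </url>
</urlset>
Sitemap index:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-articles.xml</loc>
    <lastmod>2024-03-21</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2024-03-20</lastmod>
  </sitemap>
</sitemapindex>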
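Both sitemap formats can be read with Python's standard XML parser. The sketch below is not ValidGraph's parser: `parse_sitemap` is a hypothetical helper, and the "urlset" label for a single sitemap is an assumption (only "sitemap_index" appears in the sample preview response).

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text):
    """Return (sitemap_type, entries) for a urlset or sitemapindex document."""
    root = ET.fromstring(xml_text)
    if root.tag == f"{{{NS['sm']}}}sitemapindex":
        kind, item = "sitemap_index", "sm:sitemap"
    else:
        kind, item = "urlset", "sm:url"
    entries = []
    for node in root.findall(item, NS):
        priority = node.findtext("sm:priority", default=None, namespaces=NS)
        entries.append({
            "url": node.findtext("sm:loc", namespaces=NS).strip(),
            "lastmod": node.findtext("sm:lastmod", default=None, namespaces=NS),
            # <priority> is optional; keep None when it is absent.
            "priority": float(priority) if priority else None,
        })
    return kind, entries


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/article-1</loc>
    <lastmod>2024-03-20</lastmod>
    <priority>0.8</priority>
  </url>
</urlset>"""
kind, entries = parse_sitemap(sample)
print(kind, entries)
```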
Import Behavior
– Deduplication: Duplicate URLs in sitemaps are imported only once
– Already-monitored: URLs already in monitoring are skipped (not re-added)
– Invalid URLs: Malformed URLs are skipped with error logged
– Frequency uniform: All imported URLs receive the same scan frequency
– Async processing: Import runs asynchronously; check status endpoint
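The deduplication and skip rules above can be sketched as a single classification pass. This is only an illustration: `plan_import` and `is_valid` are hypothetical names, URLs are compared as exact strings, and the validity check (http/https scheme plus a host) is an assumption about what counts as malformed.

```python
from urllib.parse import urlsplit


def is_valid(url):
    """Assumed validity rule: http(s) scheme and a non-empty host."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def plan_import(sitemap_urls, already_monitored):
    """Sort sitemap URLs into add / already_monitored / duplicates / invalid."""
    plan = {"add": [], "already_monitored": [], "duplicates": [], "invalid": []}
    seen = set()
    for url in sitemap_urls:
        if not is_valid(url):
            plan["invalid"].append(url)      # skipped, reported as an error
        elif url in seen:
            plan["duplicates"].append(url)   # imported only once
        elif url in already_monitored:
            seen.add(url)
            plan["already_monitored"].append(url)  # skipped, not re-added
        else:
            seen.add(url)
            plan["add"].append(url)
    return plan


plan = plan_import(
    ["https://example.com/a", "https://example.com/a",
     "https://example.com/b", "not a url"],
    already_monitored={"https://example.com/b"},
)
print(plan)
```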
References
– Sitemaps XML protocol (sitemaps.org): https://www.sitemaps.org/
– Sitemap XML Schema: https://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd
– robots.txt sitemap declaration: https://www.robotstxt.org/robotstxt.html
– URI generic syntax, used for URL validation (RFC 3986): https://tools.ietf.org/html/rfc3986
– HTTP Range Requests: https://tools.ietf.org/html/rfc7233