Comprehensive Analysis of Search Engine Crawling Compliance and Page Quality

Analyze your website's robots.txt and sitemap.xml to verify crawling compliance, and evaluate the accessibility and quality of the pages listed in the sitemap.

📋 Test Process:
1. Check that robots.txt exists and review its rules
2. Locate sitemap.xml files and collect their URLs
3. Keep only the URLs that robots.txt allows to be crawled
4. Sample up to 50 pages and test them sequentially
5. Measure HTTP status, metadata, and quality score for each page
6. Calculate the ratio of duplicate content (titles and descriptions)
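
Steps 3 and 4 above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function names (parseRobots, isAllowed, selectTestUrls) are hypothetical, only the "User-agent: *" group and plain path prefixes are handled (no wildcards or "$" anchors), and "sampling" is assumed to mean taking the first 50 allowed URLs.

```javascript
// Sketch: keep only URLs that robots.txt allows, then take up to 50 of them.
// Assumption: only the "User-agent: *" group and prefix rules are supported.

function parseRobots(robotsTxt) {
  const rules = []; // each rule: { allow: boolean, path: string }
  let inStarGroup = false;
  let prevWasAgent = false;
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim(); // strip comments
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === 'user-agent') {
      // Consecutive User-agent lines belong to the same group.
      inStarGroup = (prevWasAgent && inStarGroup) || value === '*';
      prevWasAgent = true;
      continue;
    }
    prevWasAgent = false;
    if (inStarGroup && (field === 'allow' || field === 'disallow') && value !== '') {
      rules.push({ allow: field === 'allow', path: value });
    }
  }
  return rules;
}

function isAllowed(url, rules) {
  const { pathname, search } = new URL(url);
  const target = pathname + search;
  let best = null;
  for (const rule of rules) {
    if (!target.startsWith(rule.path)) continue;
    // The longest matching path wins; on a tie, Allow wins.
    if (
      best === null ||
      rule.path.length > best.path.length ||
      (rule.path.length === best.path.length && rule.allow)
    ) {
      best = rule;
    }
  }
  return best === null ? true : best.allow;
}

function selectTestUrls(urls, robotsTxt, limit = 50) {
  const rules = parseRobots(robotsTxt);
  // Assumption: the sample is simply the first `limit` allowed URLs.
  return urls.filter((u) => isAllowed(u, rules)).slice(0, limit);
}
```

With a robots.txt that contains "Disallow: /admin" and "Allow: /admin/public", a URL under /admin/public/ is kept while other /admin URLs are dropped, because the longer matching rule takes precedence.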

🎯 Measurement Tools:
• Custom Node.js-based crawler (robots.txt compliant)
• sitemap.xml parser (supports recursive index file processing)
• HTML parser for metadata extraction
• Quality scoring algorithm (100-point scale)
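
The recursive handling of sitemap index files mentioned above might look something like the sketch below. It is an assumption-laden illustration: loadXml is a hypothetical helper that returns the XML text for a sitemap URL (injected so the traversal can be shown without network code), the loc values are extracted with a simple regular expression rather than a full XML parser, and CDATA sections and XML entities are not handled.

```javascript
// Sketch: collect page URLs from a sitemap, following sitemap index files.
// Assumption: `loadXml(url)` is a hypothetical helper returning XML text
// (or undefined when the file is unavailable).

function extractLocs(xml) {
  const locs = [];
  const re = /<loc>\s*([^<]+?)\s*<\/loc>/g;
  let match;
  while ((match = re.exec(xml)) !== null) {
    locs.push(match[1]);
  }
  return locs;
}

function collectSitemapUrls(sitemapUrl, loadXml, seen = new Set(), maxDepth = 5) {
  // Guard against loops (an index listing itself) and runaway nesting.
  if (seen.has(sitemapUrl) || maxDepth < 0) return [];
  seen.add(sitemapUrl);

  const xml = loadXml(sitemapUrl);
  if (!xml) return [];

  const locs = extractLocs(xml);
  if (/<sitemapindex[\s>]/.test(xml)) {
    // Index file: each <loc> points at another sitemap, so recurse into it.
    return locs.flatMap((child) =>
      collectSitemapUrls(child, loadXml, seen, maxDepth - 1)
    );
  }
  // Regular sitemap: each <loc> is a page URL.
  return locs;
}
```

The seen set and maxDepth limit are design choices for this sketch, so that a misconfigured index file cannot cause endless recursion.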

💯 Quality Score Calculation Criteria:
• Title tag length (under 5 characters: -15 points)
• Description meta tag (under 20 characters: -10 points)
• Missing canonical URL (-5 points)
• Missing H1 tag (-10 points) / Excessive H1 tags (-5 points)
• Insufficient content (under 1000 characters: -10 points)
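
The deductions above can be expressed as a small scoring function. This is a sketch under stated assumptions rather than the tool's real algorithm: every page is assumed to start at 100 points, a missing title or description is treated as zero length, "excessive" H1 use is assumed to mean more than one H1 tag, and the page object shape (title, description, canonical, h1Count, textLength) is hypothetical.

```javascript
// Sketch: apply the listed deductions to a starting score of 100.
// Assumption: `page` has the hypothetical shape
// { title, description, canonical, h1Count, textLength }.

function qualityScore(page) {
  let score = 100;

  const title = (page.title || '').trim();
  const description = (page.description || '').trim();
  const h1Count = page.h1Count ?? 0;
  const textLength = page.textLength ?? 0;

  if (title.length < 5) score -= 15; // title under 5 characters
  if (description.length < 20) score -= 10; // description under 20 characters
  if (!page.canonical) score -= 5; // missing canonical URL

  if (h1Count === 0) {
    score -= 10; // missing H1 tag
  } else if (h1Count > 1) {
    score -= 5; // assumption: "excessive" means more than one H1
  }

  if (textLength < 1000) score -= 10; // content under 1000 characters
  return score;
}

// A page that fails every check loses 15 + 10 + 5 + 10 + 10 points.
console.log(qualityScore({})); // 50
```

Under these assumptions the lowest possible page score is 50, since the listed deductions total 50 points when the missing-H1 penalty applies.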

🚀 Test Purpose:
• Verify that search engines can properly crawl your site
• Validate that every page listed in the sitemap is accessible
• Diagnose SEO penalty risks from duplicate content
• Identify areas for improvement from per-page quality scores

This test takes approximately 30 seconds to 2 minutes.

Grade and Score Criteria:
A+ (90~100)
• robots.txt properly applied
• sitemap.xml exists with no missing/404 errors
• All test pages return 2xx status
• Overall page quality average ≥ 85 points
• Duplicate content ≤ 30%
A (80~89)
• robots.txt properly applied
• sitemap.xml exists with integrity maintained
• All test pages return 2xx status
• Overall page quality average ≥ 85 points
B (70~79)
• robots.txt and sitemap.xml exist
• All test pages return 2xx status
• Overall page quality average is not considered
C (55~69)
• robots.txt and sitemap.xml exist
• Test list includes some 4xx/5xx errors
D (35~54)
• robots.txt and sitemap.xml exist
• Test URL list can be generated
• However, the rate of successful page access is low or the quality check cannot be performed
F (0~34)
• Missing robots.txt or sitemap.xml
• Cannot generate test list
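
As a rough illustration of the score bands and the duplicate-content threshold above, the sketch below maps a numeric score to its grade and computes a duplicate ratio. Both functions are hypothetical. In particular, the duplicate ratio is assumed to be the share of non-empty values (for example page titles) that also appear on at least one other page, compared case-insensitively; the tool may define it differently.

```javascript
// Sketch: score bands from the grading criteria, plus one possible
// definition of the duplicate-content ratio (an assumption, see above).

const GRADE_BANDS = [
  { grade: 'A+', min: 90 },
  { grade: 'A', min: 80 },
  { grade: 'B', min: 70 },
  { grade: 'C', min: 55 },
  { grade: 'D', min: 35 },
  { grade: 'F', min: 0 },
];

function gradeForScore(score) {
  // Expects a finite number; values outside 0..100 are clamped.
  const clamped = Math.max(0, Math.min(100, score));
  return GRADE_BANDS.find((band) => clamped >= band.min).grade;
}

function duplicateRatio(values) {
  const keys = values
    .map((v) => (v || '').trim().toLowerCase())
    .filter((k) => k !== ''); // assumption: empty values are ignored
  if (keys.length === 0) return 0;

  const counts = new Map();
  for (const key of keys) {
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const duplicated = keys.filter((key) => counts.get(key) > 1).length;
  return duplicated / keys.length;
}

const titles = ['Home', 'home', 'About', 'Contact'];
console.log(gradeForScore(92)); // A+
console.log(duplicateRatio(titles) <= 0.3); // false: 2 of 4 titles are duplicated
```

Note that the grade also depends on the non-numeric conditions listed for each band, so a lookup by score alone only covers part of the criteria.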