Security Overview
DocWeb implements multiple layers of security to protect your data and ensure safe operation.
Architecture Security
Infrastructure
DocWeb runs on Google Cloud Platform:
- Firebase Auth: Handles authentication securely
- Cloud Firestore: Encrypted database with security rules
- Cloud Functions: Serverless, isolated execution environment
- HTTPS: All traffic encrypted with TLS
Authentication
We support two secure authentication methods:
| Method | Provider | Features |
|---|---|---|
| Email/Password | Firebase Auth | Password hashing, email verification |
| Google OAuth | Google Identity | 2FA support, secure token exchange |
Authorization
Firestore security rules enforce data access:
users can only access their own data:
- artifacts/{userId}/** → User-specific sessions and URLs
- users/{userId} → User profile and credits
global data is read-only for authenticated users:
- artifacts/global/** → Shared cache and embeddings
Application Security
Input Validation
- URLs are validated before discovery
- User inputs are sanitized
- Request payloads are type-checked
Rate Limiting
- Built-in concurrency control (p-limit)
- Credit system limits discovery and chat
- Per-user session limits
Error Handling
- Errors are logged securely
- Sensitive information is not exposed to clients
- Stack traces are not returned in production
Crawling Security
Respectful Crawling
DocWeb's bot (DocWeb-Bot/1.0) follows web standards:
- robots.txt: Strictly respected
- crawl-delay: Honored
- noindex/nofollow: Respected where applicable
Limitations
- Max crawl depth: 3 levels
- Max sitemap depth: 5 levels
- Request timeout: 15 seconds
- Max URLs per source: 10,000
No Bypass
DocWeb does NOT:
- Bypass authentication or paywalls
- Circumvent anti-bot measures
- Access restricted content
- Ignore robots.txt rules
Third-Party Security
Google Cloud
- SOC 2 Type II certified
- ISO 27001 certified
- GDPR compliant
Stripe
- PCI-DSS Level 1 compliant
- We never see or store full card numbers
- Tokenized payment processing
Google Gemini
- Enterprise-grade AI service
- Data not used for model training
- Secure API connections
Incident Response
Monitoring
- Real-time error logging
- Performance monitoring
- Unusual activity detection
Response Process
- Detection and assessment
- Containment and mitigation
- User notification (if required)
- Root cause analysis
- Prevention measures
Contact
Report security issues to: [email protected]
Compliance
Data Protection
- User data segregated by userId
- Data deletion available on request
- Data export available on request
Privacy
- No selling of personal data
- Minimal data collection
- Clear privacy policy
See our Privacy Policy for complete details.
Security Best Practices for Users
Account Security
- Use a strong, unique password
- Enable Google OAuth for added security
- Don't share your account credentials
- Log out from shared devices
Safe Usage
- Only discover sites you have permission to analyze
- Review scraped content for sensitive information
- Be cautious with third-party site terms of service
- Report suspicious activity