Skip to main content

Privacy Policy

Effective Date: January 22, 2026 Last Updated: January 22, 2026

DocWeb ("we", "our", or "us") is committed to protecting your privacy. This Privacy Policy explains how we collect, use, disclose, and safeguard your information when you use our documentation visualization platform at docweb.net.

Information We Collect

Account Information

When you create an account, we collect:

  • Email address
  • Password (hashed via Firebase Auth, never stored in plaintext)
  • Google account information (if using Google OAuth)
  • Display name (optional)

Usage Data

We automatically collect:

  • URLs you discover and analyze
  • Chat conversations with Dex (our AI assistant)
  • Session metadata (domain, URL count, timestamps)
  • Credit usage and billing information
  • Discovery statistics (sitemap URLs, navigation URLs, sources)

Technical Data

  • Browser type and version
  • Device information
  • IP address (for security and rate limiting)
  • Access timestamps
  • Error logs and debugging information

Scraped Content

When you discover documentation sites, we collect:

  • Page titles and descriptions
  • Main content converted to Markdown (max 50KB per page)
  • URL structure and hierarchy
  • Cluster assignments and priority scores

How We Use Your Information

To Provide Our Services

  • Authenticate your account via Firebase Auth
  • Process URL discoveries using our Waterfall Discovery algorithm
  • Generate vector embeddings (768-dimensional) for semantic search
  • Power AI chat responses via Google Gemini 2.5 Flash
  • Track and manage your credits and sessions

To Improve Our Services

  • Analyze usage patterns to improve features
  • Debug and fix issues using error logs
  • Develop new features based on usage data
  • Optimize caching and performance

For Billing

  • Process payments via Stripe
  • Manage subscriptions (Pro: $9.99/mo, Max: $29.99/mo)
  • Send billing notifications
  • Handle subscription lifecycle events

Data Storage

Firestore Collections

Your data is stored in Google Cloud Firestore:

CollectionData StoredRetention
users/{userId}Account profile, tier, creditsUntil deletion
artifacts/{userId}/public/data/sessionsSession metadataUntil you delete
artifacts/{userId}/public/data/urlsDiscovered URLs, contentUntil session deletion
artifacts/global/domainCacheCached domain data24-hour TTL
artifacts/global/pageCacheCached page content24-hour TTL
artifacts/global/embeddingsVector embeddingsPermanent

Global Cache

Scraped content and embeddings are stored in a global cache that benefits all users:

  • Domain cache: Discovered URLs for entire domains (24-hour TTL)
  • Page cache: Scraped Markdown content (24-hour TTL)
  • Embeddings: 768-dimensional vectors (permanent, deduplicated by URL hash)

Your discoveries contribute to this shared cache, and you benefit from others' discoveries. Cached data is anonymized and not linked to individual users.

Content Hashing

We use content hashing to track changes:

  • URL hash: SHA256 (16 characters) for deduplication
  • Content hash: MD5 (12 characters) for freshness detection

Data Sharing

We Do NOT Sell Your Data

We never sell your personal information to third parties.

Third-Party Services

ServiceData SharedPurpose
Firebase AuthEmail, password hashAuthentication
Google FirestoreAll application dataDatabase storage
Google GeminiChat queries, content chunksAI responses, embeddings
StripeEmail, payment methodBilling

AI Processing

When you chat with Dex:

  • Your query and relevant content chunks are sent to Google Gemini
  • Queries are not used to train Google's models
  • Responses are generated in real-time and not stored by Google

We may disclose information if required by law or to:

  • Comply with legal process
  • Protect our rights and safety
  • Prevent fraud or abuse

Data Retention

Data TypeRetention PeriodNotes
Account infoUntil account deletionIncludes email, tier, credits
SessionsUntil you delete themIncludes all URLs and chat
Chat historyUntil session deletionStored per session
Domain cache24 hoursAuto-expiring TTL
Page cache24 hoursAuto-expiring TTL
EmbeddingsPermanentShared across users
Payment records7 yearsLegal requirement
Error logs30 daysDebugging purposes

Your Rights

Access and Export

You can view your data through the DocWeb interface:

  • All saved sessions with metadata
  • Discovered URLs and scraped content
  • Chat history with Dex
  • Credit usage statistics

Deletion

You can delete:

  • Individual sessions (cascades to URLs and chat)
  • Your entire account (contact support at [email protected])

Note: Global cache entries (embeddings, page cache) are shared and cannot be individually deleted, but they are not linked to your identity.

Data Portability

Contact us to request an export of your data in machine-readable format.

Opt-Out

  • Unsubscribe from marketing emails (if any)
  • Disable cookies (may affect authentication)
  • Delete your account to remove all personal data

Your Privacy Rights by Region

For California Residents (CCPA)

If you are a California resident, you have the right to:

  • Know what personal information we collect, use, and disclose
  • Delete your personal information (subject to certain exceptions)
  • Opt-out of the sale of personal information (we do not sell your data)
  • Non-discrimination for exercising your privacy rights

To exercise these rights, contact us at [email protected].

For European Users (GDPR)

If you are located in the European Economic Area, you have the right to:

  • Access your personal data
  • Rectification of inaccurate data
  • Erasure ("right to be forgotten")
  • Restrict processing of your data
  • Data portability in a machine-readable format
  • Object to processing based on legitimate interests
  • Withdraw consent at any time (where processing is based on consent)

Legal Basis for Processing: We process your data based on:

  • Contract performance (providing our services)
  • Legitimate interests (improving services, security)
  • Consent (marketing communications, if applicable)

Data Controller: DocWeb (Sean Seo) Contact: [email protected]

You may also lodge a complaint with your local data protection authority.

Cookies and Local Storage

We use cookies and local storage for:

  • Authentication state: Firebase Auth session tokens
  • Theme preferences: Light/dark mode setting
  • Session management: Current session ID

We do not use third-party tracking cookies.

You can manage cookie preferences through your browser settings. Note that disabling cookies may affect authentication functionality.

Children's Privacy

DocWeb is not intended for users under 18 years of age. We do not knowingly collect personal information from minors. If we discover we have collected data from a user under 18, we will delete it promptly.

Security Measures

We implement comprehensive security measures:

MeasureImplementation
Encryption in transitHTTPS/TLS for all connections
Encryption at restFirestore encryption
Password securityFirebase Auth with hashing
Access controlFirestore security rules per user
Rate limitingp-limit concurrency control
Bot protectionrobots.txt compliance, crawl-delay
Payment securityStripe PCI compliance
Incident responseUsers notified within 72 hours of confirmed breach

See our Security page for more details.

Third-Party Websites

When you discover and scrape third-party websites:

  • We access only publicly available content
  • We strictly respect robots.txt directives
  • We honor crawl-delay specifications
  • We do not bypass authentication or paywalls
  • We identify ourselves as DocWeb-Bot/1.0

You are responsible for ensuring your use complies with third-party terms of service. See our Acceptable Use Policy.

International Data Transfers

DocWeb uses Google Cloud infrastructure. Your data may be processed in:

  • United States (primary)
  • Other Google Cloud regions for redundancy

We rely on Google's data processing agreements and standard contractual clauses for international transfers.

Changes to This Policy

We may update this Privacy Policy periodically. Changes will be:

  • Posted on this page with updated "Last Updated" date
  • Announced via email for significant changes
  • Effective immediately upon posting

Continued use of DocWeb after changes constitutes acceptance of the updated policy.

Contact Us

For privacy-related questions, requests, or concerns:

Email: [email protected]

Data Protection Contact: Sean Seo

Response Time: We aim to respond within 7 business days

Mailing Address: Available upon request