Methods & sources
Daylight is a public, observational watchdog for federal .gov infrastructure. We practice on ourselves the transparency we ask of the sites we watch: every source is named here, our bot identifies itself honestly, and our own uptime is public on /status.
The one line we never cross
Noting that a certificate, subdomain, or login page exists is fine. We never authenticate past any access wall, guess credentials, probe, port-scan, fuzz, or brute-force. We observe the front door; we never try the handle. This keeps the project legally clean and press-credible — that is the whole strategy.
- Public data only. Every source is reachable without authentication.
- Observational only. We record that things exist and how they change; we never interact past any access control.
- Reproducible & timestamped. Every observation stores its timestamp, source URL, and content hash. Every claim is independently checkable by a stranger.
- Neutral presentation. We state what was observed, linked to its source — never an accusation the data does not support.
- Rate-limit & ToS respect. We send an honest User-Agent, cache what we fetch, and use backoff. We never hammer a source.
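As a rough illustration of the "reproducible & timestamped" principle, an observation can be stored as a timestamp, a source URL, and a hash of the fetched bytes. This is only a sketch: the names `Observation`, `make_observation`, and `verify` are hypothetical, and SHA-256 is an assumed choice, since this page does not say which hash is used.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """One recorded observation (hypothetical shape, not Daylight's actual schema)."""
    observed_at: str   # ISO 8601 timestamp in UTC
    source_url: str    # where the content was fetched from
    content_hash: str  # hex SHA-256 of the raw bytes (assumed algorithm)


def make_observation(source_url: str, content: bytes,
                     now: Optional[datetime] = None) -> Observation:
    """Build a record that anyone can re-check by hashing the same bytes."""
    now = now or datetime.now(timezone.utc)
    return Observation(
        observed_at=now.isoformat(),
        source_url=source_url,
        content_hash=hashlib.sha256(content).hexdigest(),
    )


def verify(obs: Observation, content: bytes) -> bool:
    """Return True if the given bytes hash to the stored value."""
    return hashlib.sha256(content).hexdigest() == obs.content_hash
```

A stranger holding the same bytes (for example, from an independent archive) could call `verify` to confirm that the stored hash matches what was observed.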
Our bot
Requests Daylight makes to public sources carry this User-Agent, so anyone can see it is us and reach a contact:
DaylightBot/0.4 (+https://daylight.watch/methods; observational; public-data-only)
Contact: contact@daylight.watch
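For illustration, a request carrying this User-Agent could be prepared as below with Python's standard library. This is a sketch under assumptions, not Daylight's actual client: the helper names `build_request` and `backoff_delays` are hypothetical, and the doubling retry schedule is an assumed example of backoff, since this page does not specify one.

```python
import urllib.request

USER_AGENT = (
    "DaylightBot/0.4 (+https://daylight.watch/methods; "
    "observational; public-data-only)"
)


def build_request(url: str) -> urllib.request.Request:
    """Prepare a plain GET that identifies the bot and carries no credentials."""
    return urllib.request.Request(
        url, headers={"User-Agent": USER_AGENT}, method="GET"
    )


def backoff_delays(retries: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Assumed schedule in seconds: base, 2*base, 4*base, ... capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]
```

The request object is only built here, not sent; a caller would pass it to `urllib.request.urlopen` and sleep for the next delay in the schedule before any retry.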
What we watch closely
Everything Daylight records is public regardless, but a curated watchlist decides which identities — organizations, domains, and security contacts — get the loudest, highest-severity flag. It’s open to suggestions; the watchlist page has a submission link.
Data sources
Public federal .gov ownership registry — who owns each apex domain and its security contact. Diffed daily.
Public append-only logs of every issued TLS certificate — used to notice new subdomains appearing. Existence-only: we record that a cert exists; we never connect to the host.
Public page HTML and the network requests the page makes, observed by passively loading the page (no auth, no form submission, no crawling). Used to fingerprint trackers, including those disguised behind a reverse proxy.
Open dataset of tracker hosts + categories — seeds Floodlight's fingerprints, alongside EasyPrivacy and a session-replay vendor list.
An independent third-party archive of snapshots, so the record of what a page showed is not one we control. Powers the removal ledger.
Public SORN (System of Records Notice) search — used to check for required privacy filings. Redtape's gap findings are human-reviewed before publication and never assert illegality.
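The daily registry diff and the detection of new subdomains from certificate logs both come down to comparing one snapshot with the next. A minimal sketch of that comparison follows; the function name `diff_snapshots`, the dictionary layout, and the `.example` hostnames are all hypothetical placeholders rather than Daylight's actual data model.

```python
def diff_snapshots(previous: dict[str, str],
                   current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots keyed by hostname.

    Each value is any comparable summary of the record, such as a content hash.
    """
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    changed = sorted(
        name for name in current.keys() & previous.keys()
        if current[name] != previous[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Made-up example data: two consecutive daily snapshots.
yesterday = {"www.agency.example": "h1", "old.agency.example": "h2"}
today = {"www.agency.example": "h1-new", "new.agency.example": "h3"}
result = diff_snapshots(yesterday, today)
```

Here `result["added"]` lists hostnames seen for the first time, `result["removed"]` lists those that disappeared, and `result["changed"]` lists those whose record differs from the previous day.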
PII restraint
We surface official public registrant records (which name agency security contacts by design) and information about public officials acting in an official capacity. We do not enrich, cross-reference, or aggregate personal data about individuals beyond that official capacity. A redaction pass runs on ingest; anything flagged is withheld pending human review.
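A redaction pass of this kind might look roughly like the sketch below, where any text matching a pattern is held back for a human reviewer. The patterns, the names `flag_pii` and `triage`, and the status strings are illustrative assumptions only; this page does not describe the actual rules Daylight applies.

```python
import re

# Hypothetical patterns for illustration; real rules would be broader.
PATTERNS = {
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone_like": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def flag_pii(text: str) -> list[str]:
    """Return the names of all patterns that match somewhere in the text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


def triage(text: str) -> str:
    """Withhold anything flagged; a human decides what happens next."""
    return "withheld_for_review" if flag_pii(text) else "accepted"
```

The design choice this illustrates is that the automated pass never publishes or rewrites flagged content on its own; it only routes it to review.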
Credit
Built with Claude Code. Research assisted by Claude (Anthropic).