observational · public data only · every record timestamped & source-linked

Methods & sources

Daylight is a public, observational watchdog for federal .gov infrastructure. We practice on ourselves the transparency we ask of the sites we watch: every source is named here, our bot identifies itself honestly, and our own uptime is public on /status.

The one line we never cross

Noting that a certificate, subdomain, or login page exists is fine. We never authenticate past any access wall, guess credentials, probe, port-scan, fuzz, or brute-force. We observe the front door; we never try the handle. This keeps the project legally clean and press-credible — that is the whole strategy.

  • Public data only. Every source is reachable without authentication.
  • Observational only. We record that things exist and how they change; we never interact past any access control.
  • Reproducible & timestamped. Every observation stores its timestamp, source URL, and content hash. Every claim is independently checkable by a stranger.
  • Neutral presentation. We state what was observed, linked to its source — never an accusation the data does not support.
  • Rate-limit & ToS respect. Honest User-Agent, caching, backoff. We never hammer a source.
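
As an illustration of the "reproducible & timestamped" principle above, here is a minimal sketch of what a single observation record could look like. The `Observation` class, its field names, and the helper functions are hypothetical and not Daylight's actual schema; the sketch only assumes that the content hash is SHA-256 over the exact bytes fetched.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """One timestamped, source-linked record (hypothetical schema)."""
    observed_at: str      # ISO 8601 timestamp in UTC
    source_url: str       # where the content was fetched from
    content_sha256: str   # hex digest of the exact bytes observed


def make_observation(source_url: str, content: bytes,
                     now: Optional[datetime] = None) -> Observation:
    """Build a record that a stranger can re-check by re-hashing the content."""
    moment = now or datetime.now(timezone.utc)
    return Observation(
        observed_at=moment.isoformat(),
        source_url=source_url,
        content_sha256=hashlib.sha256(content).hexdigest(),
    )


def verify(obs: Observation, content: bytes) -> bool:
    """Return True if `content` hashes to the digest stored in `obs`."""
    return hashlib.sha256(content).hexdigest() == obs.content_sha256
```

Anyone holding the same bytes (for example from an independent archive snapshot) can call `verify` and confirm the record without trusting us.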

Our bot

Requests Daylight makes to public sources carry this User-Agent, so anyone can see it is us and reach a contact:

DaylightBot/0.4 (+https://daylight.watch/methods; observational; public-data-only)

Contact: contact@daylight.watch
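
For readers who want to see what "honest User-Agent, caching, backoff" can mean in practice, a minimal sketch follows. It builds a request carrying the User-Agent string quoted above and computes an exponential backoff schedule. The function names, the 1-second base, and the 60-second cap are assumptions for illustration, not Daylight's real settings; nothing is sent over the network.

```python
import urllib.request
from typing import List

# The User-Agent string published on this page.
USER_AGENT = "DaylightBot/0.4 (+https://daylight.watch/methods; observational; public-data-only)"


def build_request(url: str) -> urllib.request.Request:
    """Create a GET request that identifies itself. Nothing is sent here."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT}, method="GET")


def backoff_delays(retries: int, base: float = 1.0, cap: float = 60.0) -> List[float]:
    """Exponential backoff in seconds: base, 2*base, 4*base, ... never above cap.

    `base` and `cap` are illustrative defaults (assumptions).
    """
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]
```

A caller would wait `backoff_delays(n)[i]` seconds before retry `i`, and stop after `n` attempts rather than keep pressing a struggling source.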

What we watch closely

Everything Daylight records is public regardless, but a curated watchlist decides which identities — organizations, domains, and security contacts — get the loudest, highest-severity flag. It’s open to suggestions; the watchlist page has a submission link.

Data sources

CISA dotgov-data · Ledger (live)

Public federal .gov ownership registry — who owns each apex domain and its security contact. Diffed daily.
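
A daily diff of the registry can be sketched as a comparison of two snapshots. The snapshot shape below (apex domain mapped to an owner and a security contact) and the function name are assumptions for illustration; the real dataset has more columns and our actual diff logic may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical shape: apex domain -> (owning organization, security contact).
Snapshot = Dict[str, Tuple[str, str]]


def diff_snapshots(old: Snapshot, new: Snapshot) -> Dict[str, List[str]]:
    """Compare yesterday's and today's registry and list what moved."""
    added = sorted(new.keys() - old.keys())       # domains that appeared
    removed = sorted(old.keys() - new.keys())     # domains that disappeared
    changed = sorted(                             # same domain, new owner or contact
        d for d in old.keys() & new.keys() if old[d] != new[d]
    )
    return {"added": added, "removed": removed, "changed": changed}
```

Each entry in the result would then be stored as its own timestamped observation, linked to the registry file it came from.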

Certificate Transparency logs: public append-only logs of issued TLS certificates, used to notice new subdomains appearing. Existence-only. We record that a cert exists; we never connect to the host.
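
Noticing a new subdomain from certificate log entries needs no contact with the host at all. The sketch below takes the DNS names listed on logged certificates and returns those under a watched apex that have not been seen before. The function name, and the choice to treat a wildcard entry as evidence of its parent name, are assumptions for illustration.

```python
from typing import Iterable, Set


def new_subdomains(cert_names: Iterable[str], apex: str, known: Set[str]) -> Set[str]:
    """Return hostnames at or under `apex` that are not already in `known`.

    Works purely on names read from public log entries; no host is contacted.
    """
    apex = apex.lower().rstrip(".")
    found: Set[str] = set()
    for name in cert_names:
        host = name.lower().rstrip(".")
        if host.startswith("*."):
            host = host[2:]  # assumption: record a wildcard under its parent name
        # Require an exact match or a dot boundary, so "notexample.gov"
        # is never mistaken for a subdomain of "example.gov".
        if host == apex or host.endswith("." + apex):
            if host not in known:
                found.add(host)
    return found
```

The dot-boundary check matters for neutral presentation: a lookalike name that merely ends in the same letters is not attributed to the watched domain.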

Live public page source (Playwright) · Floodlight (engine live; capture pending)

Public page HTML + network requests, passive load-only (no auth, no form submit, no crawling). Used to fingerprint trackers, including those disguised behind a first-party reverse proxy.

DuckDuckGo Tracker Radar · Floodlight (engine live)

Open dataset of tracker hosts + categories — seeds Floodlight's fingerprints, alongside EasyPrivacy and a session-replay vendor list.
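
Fingerprinting by host can be sketched as a lookup of each observed request's hostname against the tracker list. The `trackers` mapping (host to category) and both function names are hypothetical; the real datasets carry richer metadata. This simple host match would not by itself catch a tracker served through a first-party reverse proxy, which needs other signals.

```python
from typing import Dict, Iterable, Optional
from urllib.parse import urlsplit


def match_tracker(request_url: str, trackers: Dict[str, str]) -> Optional[str]:
    """Return the category of the closest matching tracker host, or None."""
    host = (urlsplit(request_url).hostname or "").lower()
    labels = host.split(".")
    # Try the full host first, then each parent domain: a.b.c, then b.c, then c.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in trackers:
            return trackers[candidate]
    return None


def fingerprint(request_urls: Iterable[str], trackers: Dict[str, str]) -> Dict[str, str]:
    """Map each request URL that hit a known tracker host to its category."""
    hits: Dict[str, str] = {}
    for url in request_urls:
        category = match_tracker(url, trackers)
        if category is not None:
            hits[url] = category
    return hits
```

The input is only the list of requests a page made while loading passively; the sketch never issues a request of its own.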

Wayback Save Page Now (SPN2) · Receipts (removal ledger live; capture pending)

An independent third-party archive of snapshots, so the record of what a page showed is not one we control. Powers the removal ledger.

Federal Register API · Redtape (gap-finder + human gate live)

Public SORN (System of Records Notice) search — used to check for required privacy filings. Redtape's gap findings are human-reviewed before publication and never assert illegality.

PII restraint

We surface official public registrant records — which name agency security contacts by design — and public officials acting in an official capacity. We do not enrich, cross-reference, or aggregate personal data about individuals beyond that official capacity. A redaction pass runs on ingest; anything flagged is withheld pending human review.

Credit

Built with Claude Code. Research assisted by Claude (Anthropic).