Data Methodology

How BillTaylorBass.com turns decades of live performance history into a searchable, verifiable archive, using a Sheets-first data model, lightweight static pages, and an AI-assisted workflow. For the full build history, see Project Journey. Also see What is 6 Degrees of Bill Taylor?.

Sheets-first · Static site · Verifiable sources · AI-assisted (human-verified) · Relationship layers (v2–v5)

1) Source of truth

Google Sheets is the single source of truth. The site reads published CSV endpoints at runtime; there is no separate database and no hidden admin layer.

Shows: one row per performance (date, band, venue, city/state, links, notes)
Bands: canonical band IDs, SetlistFM URLs, and band metadata
Venues: canonical venue keys + geodata (lat/lon, confidence, source)
Songs: derived plays/counts plus any curated song metadata
Press: structured press links and clippings metadata
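
As a rough illustration of the Sheets-first model, the sketch below parses published CSV text into show rows. The column headers, the sample rows, and the parse_shows helper are assumptions made for this example, not the site's actual schema or code. Fetching the published endpoint (for instance with urllib.request.urlopen) is left out so the snippet runs offline.

```python
import csv
import io

# Hypothetical headers and placeholder rows; the real Shows tab may differ.
SAMPLE_SHOWS_CSV = """date,band,venue,city,state,notes
1998-05-02,Example Band,The Example Room,Springfield,IL,
1999-07-10,Example Band,Sample Hall,Springfield,IL,benefit show
"""


def parse_shows(csv_text):
    """Parse CSV text into dict rows, storing blank cells as None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for raw in reader:
        row = {}
        for key, value in raw.items():
            if key is None:
                continue  # ignore cells beyond the header width
            cleaned = (value or "").strip()
            row[key] = cleaned or None
        rows.append(row)
    return rows


shows = parse_shows(SAMPLE_SHOWS_CSV)
print(len(shows), shows[0]["venue"])
```

Storing blank cells as None keeps "missing" distinct from any real value, which matters once views are derived from these rows.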

Public site note: we do not expose edit links to the underlying Sheets.

2) What is authored vs derived

Most fields are authored directly in Sheets (especially show metadata). Many insight views are derived programmatically from the authored data.

Authored: show date, band, venue/city/state, notes, recording links, artwork URLs
Derived: counts, top collaborators, first/last appearance, pairs, tenure lanes

Rule: derived views never invent facts. If a field is missing in the source row, the derived view treats it as missing.
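
One way to picture the rule above is a small rollup that computes counts and first/last appearance per band. This is a sketch only: the band_summary function and the "band" and "date" field names are hypothetical, and dates are assumed to be ISO strings (YYYY-MM-DD) so that plain string comparison orders them correctly.

```python
from collections import Counter


def band_summary(shows):
    """Derive per-band show counts and first/last dates without inventing values."""
    counts = Counter()
    first = {}
    last = {}
    for show in shows:
        band = show.get("band")
        if not band:
            continue  # no band on the row: nothing to attribute
        counts[band] += 1
        date = show.get("date")
        if not date:
            continue  # the show is counted, but a missing date stays missing
        if band not in first or date < first[band]:
            first[band] = date
        if band not in last or date > last[band]:
            last[band] = date
    return {
        band: {"shows": counts[band], "first": first.get(band), "last": last.get(band)}
        for band in counts
    }
```

A band whose rows all lack dates still appears with its show count, but its first and last values remain None rather than being guessed.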

3) Musicians and collaborators

Source of truth: the Personnel field on the Shows tab. It is a pipe-separated list like:

Name - instrument | Name - instrument | Name - instrument

All collaborator stats are derived from Personnel text
If a show has no Personnel value, it contributes no musician counts/pairs
Name normalization is formatting-only (trim whitespace, consistent casing)
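
The Personnel format lends itself to a short parsing sketch. The function names and the "personnel" key below are hypothetical, and "consistent casing" is interpreted here as a case-insensitive comparison key (casefold), which is an assumption rather than the site's documented behavior. The names are placeholders.

```python
from collections import Counter
from itertools import combinations


def parse_personnel(text):
    """Split 'Name - instrument | Name - instrument' into (name, instrument) tuples."""
    if not text:
        return []
    people = []
    for entry in text.split("|"):
        entry = " ".join(entry.split())  # trim and collapse whitespace
        if not entry:
            continue
        name, _, instrument = entry.partition(" - ")
        name = name.strip()
        if name:
            people.append((name, instrument.strip() or None))
    return people


def collaborator_stats(shows):
    """Count appearances and co-appearance pairs, keyed by casefolded name."""
    appearances = Counter()
    pairs = Counter()
    for show in shows:
        parsed = parse_personnel(show.get("personnel"))
        names = sorted({name.casefold() for name, _ in parsed})
        appearances.update(names)
        pairs.update(combinations(names, 2))
    return appearances, pairs


print(parse_personnel("Alex Example - bass |  Pat Placeholder - drums"))
```

Because normalization is formatting-only, two genuinely different spellings of the same person would still count separately in this sketch until the Personnel text itself is corrected.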

Maintenance rule: to correct musician stats, update Personnel on the show row. No other edit is required.

4) Venues and geography

Venues are stabilized using a canonical venue_key (typically venue-slug|city|state) so minor spelling changes do not fragment maps and counts.

Shows drive venue rollups (counts, first/last date, bands played)
Venues store geodata (lat/lon) with source + confidence
Low-confidence city-level fallbacks are allowed but should be upgraded over time
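
A canonical key of the venue-slug|city|state shape could be built roughly as follows. The slugify and venue_key helpers are hypothetical, and the exact slug rules (ASCII folding, lowercasing, hyphen-joined words) are assumptions; the source only states the general shape of the key.

```python
import re
import unicodedata


def slugify(text):
    """Lowercase, fold accents to ASCII, and join alphanumeric runs with hyphens."""
    folded = unicodedata.normalize("NFKD", text)
    ascii_text = folded.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def venue_key(venue, city, state):
    """Build a 'venue-slug|city|state' key that ignores case, spacing, and punctuation."""
    return "|".join([slugify(venue), slugify(city), state.strip().lower()])


print(venue_key("The Example Room", "Springfield", "IL"))
```

A key like this absorbs differences in case, spacing, and punctuation only; genuinely different spellings still need human review before they are merged.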

Tooling: Venue Harmonization helps detect near-duplicate spellings before they become canon.
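
How Venue Harmonization works internally is not described here. Purely as an illustration of near-duplicate detection, the sketch below flags similar venue spellings with Python's difflib. The near_duplicates function, the 0.85 threshold, and the sample names are all assumptions, and the output is a list of candidates for human review, not an automatic merge.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(names, threshold=0.85):
    """Return (name_a, name_b, ratio) for pairs whose similarity meets the threshold."""
    flagged = []
    for a, b in combinations(sorted(set(names)), 2):
        ratio = SequenceMatcher(None, a.casefold(), b.casefold()).ratio()
        if ratio >= threshold:
            flagged.append((a, b, round(ratio, 2)))
    return flagged


# Placeholder venue names, one with a transposed pair of letters.
candidates = near_duplicates(["The Example Room", "The Exampel Room", "Sample Hall"])
print(candidates)
```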

5) Recordings, setlists, and media

Archive.org: canonical recording links when available
Setlist.fm: used for setlist coverage and discovery where applicable
Cloudinary: media delivery and transformations (thumbnails, responsive images)

Media is treated as evidence and context: links are attributed, and the archive does not republish paywalled or restricted content.

6) Verification and transparency

Shows fall into two buckets:

Documented: verified date + venue + band (often with a source link)
Referenced: mentioned in press/listings but not fully verified (tracked for future backfill)
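
The two buckets can be expressed as a small classification function. This is a sketch under stated assumptions: the verification_status name, the field names, and especially the boolean "verified" flag are hypothetical stand-ins for however verification is actually recorded in the sheet.

```python
def verification_status(show):
    """Classify a show row as 'documented' or 'referenced'."""
    core_fields = ("date", "venue", "band")
    has_core = all(show.get(field) for field in core_fields)
    # Hypothetical flag: set by a human once date, venue, and band are confirmed.
    if has_core and show.get("verified"):
        return "documented"
    return "referenced"
```

Anything short of a full, verified date, venue, and band falls back to "referenced", so a row is never promoted by default.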

AI is used to accelerate discovery and cleanup, but AI output is treated as a proposal until verified by a human.

7) Relationship layers by version

The archive adds relationship intelligence in layers, intentionally and in order, so the system stays verifiable and maintainable as it grows.

v2: historical edges (Show ↔ Band, Show ↔ Venue).

v3: derived edges (Band ↔ Band via shared stages and bill order; opener/headliner directionality).

v4: explicit edges (Person ↔ Band memberships, time-aware and source-backed).

v5: traversal and graph queries (execution-layer change only; the canonical data rules remain the same).

No relationship layer is introduced without documentation updates
See also: What is 6 Degrees of Bill Taylor?
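
To make the v3 idea of derived edges concrete, the sketch below links bands that share a stage, meaning they appear on the same date at the same venue. The shared_stage_edges function and the "date", "venue_key", and "band" field names are assumptions, and bill-order directionality is deliberately left out of this minimal version.

```python
from collections import Counter, defaultdict
from itertools import combinations


def shared_stage_edges(shows):
    """Count undirected Band-Band edges for bands on the same date and venue."""
    bills = defaultdict(set)
    for show in shows:
        date = show.get("date")
        venue = show.get("venue_key")
        band = show.get("band")
        if date and venue and band:  # incomplete rows contribute no edges
            bills[(date, venue)].add(band)
    edges = Counter()
    for bands in bills.values():
        edges.update(combinations(sorted(bands), 2))
    return edges
```

Each edge here is fully recomputable from show rows, which is what distinguishes a derived edge from the explicit, source-backed edges introduced in v4.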

8) Membership model (v4+)

In v4, Person becomes a first-class entity and the archive begins capturing explicit membership relationships.

Membership edges are the backbone of 6 Degrees, and every edge requires provenance.

Fields: person, band, role, time range, confidence, status, source
Membership edges are time-aware (start/end or year ranges)
No edge exists without a source (link, citation, or verified evidence)
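
A membership edge with these fields might be modeled as below. The MembershipEdge class, the choice of year-based ranges, and the active_in helper are illustrative assumptions; only the field list and the "no source, no edge" rule come from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MembershipEdge:
    person: str
    band: str
    role: str
    start_year: Optional[int]  # None means the start is unknown
    end_year: Optional[int]    # None means ongoing or unknown
    confidence: str
    status: str
    source: str

    def __post_init__(self):
        # Enforce provenance: an edge without a source cannot be constructed.
        if not self.source or not self.source.strip():
            raise ValueError("membership edge requires a source")
        if (
            self.start_year is not None
            and self.end_year is not None
            and self.end_year < self.start_year
        ):
            raise ValueError("end_year precedes start_year")

    def active_in(self, year):
        """Return True if the membership covers the given year."""
        after_start = self.start_year is None or year >= self.start_year
        before_end = self.end_year is None or year <= self.end_year
        return after_start and before_end
```

Rejecting source-less edges at construction time means the provenance rule cannot be skipped by later code paths.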

9) AI & automation rules

AI can accelerate discovery, parsing, and suggestions, but it cannot be the authority for relationships or identity merges.

AI may propose, never approve.
All AI-generated relationship suggestions default to pending.
Evidence is mandatory (links, citations, or verifiable artifacts).
Nothing auto-merges; human review is required for canonical changes.
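
These rules amount to a small review gate, sketched below. The Suggestion class, the approve function, and the convention of marking AI-originated items with the string "ai" are all hypothetical; the sketch only shows how "pending by default", "evidence is mandatory", and "human review is required" could be enforced in code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Suggestion:
    summary: str
    proposed_by: str  # "ai" or a human identifier (hypothetical convention)
    evidence: List[str] = field(default_factory=list)
    status: str = "pending"  # every suggestion starts as pending
    reviewed_by: Optional[str] = None


def approve(suggestion, reviewer):
    """Approve a suggestion only with a human reviewer and at least one evidence item."""
    if not reviewer or reviewer == "ai":
        raise PermissionError("only a human reviewer can approve")
    if not suggestion.evidence:
        raise ValueError("evidence is mandatory")
    suggestion.status = "approved"
    suggestion.reviewed_by = reviewer
    return suggestion
```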

10) Cross-document contract

This project uses two complementary documents to prevent drift:

Project Journey explains why and when features evolve.

Data Methodology defines how data is represented and what is allowed.

Any expansion of 6 Degrees requires updates to both documents so the intent and rules stay locked.

Read: Project Journey
Read: What is 6 Degrees of Bill Taylor?

Suggest a correction

If you spot an error, missing show, or bad link, use Suggest an Edit. This workflow is designed to capture corrections without exposing the underlying spreadsheet.