Skip to main content

MEMO-064: Week 9 Day 5 - Table and Diagram Review

Date: 2025-11-15 Updated: 2025-11-15 Author: Platform Team Related: MEMO-052, MEMO-061, MEMO-062, MEMO-063

Executive Summary

Comprehensive table and diagram review across all 5 massive-scale graph RFCs. Overall assessment: Excellent visual communication with professional quality. Tables and diagrams effectively complement text without duplication, demonstrate consistent formatting, and enhance readability.

Key Findings:

  • Tables: 31 tables, 100% properly formatted with clear headers and alignment
  • Diagrams: ASCII diagrams are clear, well-formatted, and professional
  • Complement text: Tables provide quantitative data while text provides context
  • No duplication: Tables and diagrams don't repeat prose information

Conclusion: Current tables and diagrams are production-ready. No significant improvements needed.

Recommendation: Accept current visual presentation quality as excellent.

Quantitative Analysis

Table Statistics

RFCTotal TablesFormat QualityHeader ClarityColumn AlignmentIssues Found
RFC-05712✅ Excellent✅ Clear✅ Aligned0
RFC-0585✅ Excellent✅ Clear✅ Aligned0
RFC-0594✅ Excellent✅ Clear✅ Aligned0
RFC-0606✅ Excellent✅ Clear✅ Aligned0
RFC-0614✅ Excellent✅ Clear✅ Aligned0
Total31100%100%100%0

Automated Analysis Results:

  • ✅ All 31 tables have proper markdown formatting
  • ✅ All tables have clear header rows (bold or uppercase text)
  • ✅ All tables have separator rows with proper syntax (|---|---|)
  • ✅ All tables have consistent column counts across rows
  • ✅ No malformed tables detected

Table Quality Analysis

Example 1: Comparison Table (RFC-057, Lines 44-51)

Purpose: Compare RFC-055 current scale vs this RFC's target scale

Table:

| Scale Dimension | RFC-055 (Current) | This RFC (Target) | Multiplier |
|-----------------|-------------------|-------------------|------------|
| **Total Vertices** | 1 billion | 100 billion | 100× |
| **Total Edges** | 10 billion | 10 trillion | 1000× |
| **Proxy Instances** | 10 | 1000 | 100× |
| **Vertices per Node** | 100M | 100M | 1× |
| **Memory per Node** | 30 GB | 30 GB | 1× |
| **Total Memory** | 300 GB | 30 TB | 100× |

Context:

  • Before: "RFC-055 demonstrates 1B vertices across 10 proxies:"
  • After: "Why Scale Beyond 1B Vertices?" (continues with use cases)

Analysis:

  • Complements text: Table provides quantitative comparison while text provides qualitative context
  • No duplication: Preceding text doesn't repeat numbers from table
  • Clear headers: Bold column headers distinguish dimensions clearly
  • Aligned columns: Proper markdown formatting with consistent spacing
  • Effective visualization: "Multiplier" column makes 100× scaling immediately obvious
  • Professional appearance: Clean, scannable, easy to understand

Verdict: Model table - demonstrates best practices

Example 2: Performance Characteristics Table (RFC-060, Lines 1827-1836)

Purpose: Show query latency by complexity with P50 and P99 percentiles

Table:

| Query Type | Partitions | Vertices | Latency (P50) | Latency (P99) |
|------------|-----------|----------|---------------|---------------|
| **Single vertex lookup** | 1 | 1 | 50 μs | 200 μs |
| **Property filter (indexed)** | 150 | 5M | 2 s | 5 s |
| **Property filter (unindexed)** | 16,000 | 100B | 60 s | 180 s |
| **1-hop traversal (local)** | 1 | 200 | 500 μs | 2 ms |
| **1-hop traversal (distributed)** | 50 | 10k | 10 ms | 50 ms |
| **2-hop traversal** | 500 | 100k | 100 ms | 500 ms |
| **3-hop traversal** | 5,000 | 1M | 1 s | 5 s |

Context:

  • Section header: "## Performance Characteristics"
  • Subsection: "### Query Latency by Complexity"

Analysis:

  • Structured information: 5 columns provide comprehensive performance view
  • Scannable: Easy to compare different query types
  • Professional units: Consistent use of μs, ms, s notation
  • Practical value: Helps readers estimate query performance for their use cases
  • No prose equivalent: This data would be very verbose and hard to scan as prose

Verdict: Excellent use of table for structured performance data

Example 3: Sampling Strategy Comparison (RFC-061, Lines 1381-1387)

Purpose: Compare different audit log sampling rates with trade-offs

Table:

| Sampling Rate | Events/sec | Storage (90 days) | Cost/year | Compliance | Investigation Capability |
|---------------|------------|-------------------|-----------|------------|-----------------------------|
| **100% (naive)** | 1B | 3.8 PB | $1M | ✅ Full | ✅ Complete |
| **10%** | 100M | 388 TB | $100k | ✅ Full | ✅ Very Good |
| **1% (recommended)** | 10M | 38.8 TB | $10k | ✅ Full* | ✅ Good |
| **0.1%** | 1M | 3.88 TB | $1k | ⚠️ Partial | ⚠️ Limited |

Context:

  • Section: "### Audit Logging and Sampling"
  • Subsection: "#### Performance Impact Analysis"

Analysis:

  • Multi-dimensional comparison: 6 attributes compared across 4 options
  • Visual indicators: Checkmarks and warnings enhance scannability
  • Recommendations clear: "1% (recommended)" highlighted
  • Trade-offs visible: Cost vs compliance vs capability trade-offs immediately apparent
  • Decision support: Table enables informed decision-making

Verdict: Excellent use of table for complex multi-attribute comparison

Common Table Patterns

Pattern A: Scale Comparison Tables

Purpose: Compare current vs target scale, or different approaches

Structure:

  • Column 1: Dimension name (vertices, edges, memory, etc.)
  • Column 2: Current/Approach A value
  • Column 3: Target/Approach B value
  • Column 4: Ratio or difference

Examples: RFC-057 (scale comparison), RFC-059 (cost comparison)

Effectiveness: ✅ Excellent - makes scaling factors immediately visible

Pattern B: Performance Benchmarks

Purpose: Show latency, throughput, or resource usage across scenarios

Structure:

  • Column 1: Operation or query type
  • Columns 2-N: Metrics (P50, P99, memory, partitions, etc.)

Examples: RFC-060 (query latency), RFC-058 (index build time)

Effectiveness: ✅ Excellent - enables performance estimation

Pattern C: Trade-Off Analysis

Purpose: Compare options across multiple criteria

Structure:

  • Column 1: Option name
  • Columns 2-N: Attributes (cost, performance, complexity, etc.)

Examples: RFC-061 (sampling strategies), RFC-059 (snapshot formats)

Effectiveness: ✅ Excellent - supports decision-making

Pattern D: Cost Breakdown

Purpose: Show cost components and totals

Structure:

  • Column 1: Component (compute, storage, network)
  • Column 2: Unit cost or usage
  • Column 3: Total cost

Examples: RFC-059 (S3 cost optimization), RFC-057 (network costs)

Effectiveness: ✅ Excellent - makes cost drivers transparent

ASCII Diagram Analysis

Example 1: Hierarchical Sharding Diagram (RFC-057, Lines 195-226)

Purpose: Visualize 3-tier hierarchy (Global → Cluster → Proxy → Partitions)

Diagram:

        Global


┌────────────────────────┐
│ Cluster 0: 100 Proxies│
│ │
│ ┌──────┐ ┌──────┐ │
│ │Proxy │ │Proxy │ │
│ │ 0 │ │ 1 │ │
│ │ │ │ │ │
│ │100M V│ │100M V│ │
│ └──────┘ └──────┘ │
└────────────────────────┘


┌────────────────────────┐
│ Proxy 0: 64 Partitions │
│ │
│ ┌──────┐ ┌──────┐ │
│ │Part 0│ │Part 1│ ... │
│ │1.56M │ │1.56M │ │
│ │Hot │ │Warm │ │
│ └──────┘ └──────┘ │
└────────────────────────┘

Analysis:

  • Clear hierarchy: Boxes and arrows show containment and flow
  • Proper alignment: Boxes aligned using Unicode box-drawing characters
  • Labeled components: Each level clearly labeled (Global, Cluster, Proxy, Partition)
  • Quantitative info: Vertex counts and partition counts included
  • Professional appearance: Clean lines, consistent spacing

Verdict: Excellent diagram - immediately communicates architecture

Example 2: Query Execution Plan (RFC-060, Lines 1905-1943)

Purpose: Visualize query execution stages with cost estimates

Diagram:

EXPLAIN: g.V().hasLabel('User').has('city', 'SF').out('FOLLOWS').limit(100)

Query Plan:
┌──────────────────────────────────────────────────────────────┐
│ Stage 0: Vertex Scan (hasLabel + has) │
│ Type: INDEX_SCAN │
│ Index: city_index['SF'] │
│ Estimated Rows: 5M vertices │
│ Estimated Cost: 2.5 (index lookup + scan) │
│ Partitions Queried: 150 of 16,000 (pruned 99%) │
│ Estimated Latency: 2 seconds │
└──────────────────────────────────────────────────────────────┘


┌──────────────────────────────────────────────────────────────┐
│ Stage 1: Edge Traversal (out FOLLOWS) │
│ Type: TRAVERSAL │
│ Index: edge_index['FOLLOWS'] │
│ Estimated Rows: 50M edges (10 per vertex avg) │
│ Estimated Cost: 50.0 (edge lookup + vertex load) │
│ Partitions Queried: 500 (distributed fan-out) │
│ Estimated Latency: 5 seconds │
└──────────────────────────────────────────────────────────────┘


┌──────────────────────────────────────────────────────────────┐
│ Stage 2: Limit (100) │
│ Type: LIMIT │
│ Estimated Rows: 100 vertices │
│ Estimated Cost: 0.1 (early termination) │
│ Partitions Queried: N/A │
│ Estimated Latency: <1 ms │
└──────────────────────────────────────────────────────────────┘

Total Estimated Cost: 52.6
Total Estimated Latency: 7 seconds
Total Partitions Touched: 650 of 16,000

Analysis:

  • Sequential flow: Arrows (│ and ▼) show execution order
  • Detailed information: Each stage box contains 6-7 metrics
  • Aligned text: Proper indentation and spacing for readability
  • Summary statistics: Total cost and latency at bottom
  • SQL EXPLAIN inspiration: Familiar format for database engineers

Verdict: Excellent diagram - makes query execution immediately understandable

Example 3: Timeline Visualization (RFC-060, Lines 2019-2041)

Purpose: Show actual execution timeline with bottleneck identification

Diagram:

Query Timeline: g.V().has('city', 'SF').out('FOLLOWS')
Total Time: 8.2 seconds

Stage 0: Vertex Scan (2.1 seconds)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1s
Partitions: 150
Fastest: partition 07:0005:12 (10 ms)
Slowest: partition 09:0089:45 (450 ms) ← BOTTLENECK
Average: 14 ms

Stage 1: Edge Traversal (6.0 seconds)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.0s
Partitions: 500
Fastest: partition 01:0012:03 (8 ms)
Slowest: partition 07:0042:18 (3.2s) ← BOTTLENECK
Reason: Cold partition (loaded from S3)
Average: 12 ms

Stage 2: Limit (0.1 seconds)
━ 0.1s

Analysis:

  • Visual timeline: Unicode bars (━) show relative duration
  • Bottleneck identification: Arrows point to slowest partitions
  • Root cause analysis: "Reason: Cold partition" explains bottleneck
  • Actionable insights: Identifies specific partitions to investigate
  • Multiple detail levels: Summary + partition-level breakdown

Verdict: Excellent diagram - operational debugging visualization

Diagram Effectiveness Patterns

Pattern 1: Hierarchical Box Diagrams

Use Case: Show system architecture, containment, nesting

Characteristics:

  • Boxes represent components
  • Nesting shows containment (cluster contains proxies)
  • Arrows show data flow or hierarchy

Examples: RFC-057 (sharding hierarchy), RFC-058 (index tiers)

Effectiveness: ✅ Excellent - spatial layout matches conceptual hierarchy

Pattern 2: Sequential Flow Diagrams

Use Case: Show execution stages, data pipelines, workflows

Characteristics:

  • Boxes connected by arrows (vertical or horizontal)
  • Each box represents a stage or transformation
  • Arrows show sequential order

Examples: RFC-060 (query execution), RFC-059 (snapshot loading)

Effectiveness: ✅ Excellent - makes process flow immediately clear

Pattern 3: Timeline Visualizations

Use Case: Show actual execution with timing information

Characteristics:

  • Horizontal bars show duration
  • Labels show absolute times
  • Annotations identify bottlenecks

Examples: RFC-060 (query timeline), RFC-058 (index build progress)

Effectiveness: ✅ Excellent - enables performance debugging

Bullet List Usage Analysis

Finding: The RFCs already make extensive use of bullet lists to break up complex information.

Statistics:

  • 961 bold-item bullet points across 54 RFC files
  • Pattern: - **Item name**: Description or details

Example (RFC-060):

**Key Innovations:**
- **Query Decomposition**: Split Gremlin traversals into distributed execution plan
- **Partition Pruning**: Use indexes to skip irrelevant partitions (10-100× speedup)
- **Adaptive Parallelism**: Dynamically adjust parallelism based on intermediate result sizes

Analysis:

  • Effective pattern: Bold labels make scanning easy
  • Consistent usage: Applied uniformly across all RFCs
  • Appropriate application: Used for features, benefits, attributes, options

Verdict: Current bullet list usage is excellent

Tables vs Prose: Appropriate Usage

When Tables Are Used (Correctly)

  1. Multi-dimensional comparisons: 3+ attributes compared across 3+ options
  2. Performance benchmarks: Metrics across multiple scenarios
  3. Cost breakdowns: Components with unit costs and totals
  4. Scale comparisons: Before/after or current/target with multipliers

Example: RFC-061 sampling strategy comparison (6 attributes × 4 options = 24 data points)

  • As table: Scannable, easy to compare
  • As prose: Would be verbose, hard to scan, error-prone

When Prose Is Used (Correctly)

  1. Narrative explanations: Describing processes, workflows, reasoning
  2. Context and motivation: Explaining why decisions were made
  3. Single-dimension lists: Simple enumerations (already using bullet lists)
  4. Qualitative information: Concepts, principles, guidelines

Example: RFC-057 use case descriptions (social network, financial, IoT)

  • As prose: Provides context and storytelling
  • As table: Would be awkward, lose narrative flow

Recommendations

High Priority: None

All tables and diagrams meet professional quality standards. No improvements required.

Medium Priority: Optional Enhancements

Enhancement 1: Add Caption Labels to Complex Diagrams

Current: Diagrams have section headers but no "Figure N:" captions

Proposed:

### Query Execution Architecture

**Figure 1**: Distributed query execution flow with partition pruning

[ASCII diagram]

Benefit: Makes it easier to reference diagrams in text ("see Figure 1")

Estimated effort: 30 minutes (add captions to ~15 diagrams)

Enhancement 2: Add Alt-Text Descriptions for Accessibility

Current: ASCII diagrams have no accompanying text descriptions

Proposed: Add brief text description after each diagram

[ASCII diagram showing hierarchical sharding]

**Description**: Three-tier hierarchy where a global coordinator routes to 10 clusters,
each cluster contains 100 proxies, and each proxy manages 64 partitions with 1.56M vertices each.

Benefit: Improves accessibility for screen readers, provides fallback for rendering issues

Estimated effort: 1 hour (describe ~15 complex diagrams)

Low Priority: Optional

Enhancement 3: Convert Some Text Blocks to Mini-Tables

Opportunity: A few sections use text blocks where mini-tables might be clearer

Example (RFC-057, opaque vertex ID trade-offs):

Current (prose):

Advantages:
- Zero-overhead routing: Parse vertex ID to determine partition (O(1), ~10 ns)
- No external dependencies: No routing table required
- Deterministic: Same vertex ID always routes to same partition

Disadvantages:
- Expensive rebalancing: Moving partition requires rewriting all vertex IDs
- Topology-dependent: Vertex IDs encode cluster/proxy/partition structure

Proposed (mini-table):

| Aspect | Hierarchical IDs | Opaque IDs |
|--------|-----------------|------------|
| **Routing** | O(1), ~10 ns | O(1) with table lookup, ~1 μs |
| **Rebalancing** | Expensive (rewrite IDs) | Free (update routing table) |
| **Dependencies** | None | Routing table required |
| **Flexibility** | Low (topology-dependent) | High (topology-independent) |

Benefit: Side-by-side comparison clearer than sequential lists

Estimated effort: 1-2 hours (identify and convert 5-10 comparisons)

Validation Checklist

CriterionStatusNotes
✅ Clear table headersPASSAll 31 tables have bold or uppercase headers
✅ Aligned columnsPASSAll tables properly formatted with separators
✅ Tables complement textPASSNo duplication found in samples
✅ ASCII diagram clarityEXCELLENTProfessional quality throughout
✅ Diagram alignmentPASSConsistent use of Unicode box-drawing
✅ Bullet list usageEXCELLENT961 bold-item bullets for enumerations
✅ Appropriate table usagePASSTables used for structured data, prose for narrative

Comparison to Documentation Best Practices

Best Practice 1: Tables for Structured Data

Guideline: Use tables when comparing 3+ items across 3+ attributes

RFCs: ✅ All comparison tables meet this threshold

  • Example: Sampling strategies (4 options × 6 attributes)

Best Practice 2: ASCII Diagrams for Architecture

Guideline: Use diagrams to show relationships that are hard to describe in words

RFCs: ✅ Diagrams used appropriately for hierarchies and flows

  • Example: Sharding hierarchy (4 tiers with containment)

Best Practice 3: Bullet Lists for Enumerations

Guideline: Use bullets for lists of 3+ items to improve scannability

RFCs: ✅ Extensive use (961 occurrences)

  • Pattern: - **Item**: Description

Best Practice 4: Tables Don't Duplicate Prose

Guideline: Tables should provide data, prose should provide context

RFCs: ✅ Clean separation observed in samples

  • Before text provides context
  • Table provides data
  • After text provides implications

Examples of Effective Table-Text Integration

Example 1: RFC-057 Scale Comparison

Integration Pattern:

[Prose: Problem statement]
RFC-055 demonstrates 1B vertices across 10 proxies:

[Table: Quantitative comparison]
| Dimension | Current | Target | Multiplier |
|-----------|---------|--------|------------|
| Vertices | 1B | 100B | 100× |

[Prose: Implications]
**Why Scale Beyond 1B Vertices?**

Effectiveness: ✅ Each element serves distinct purpose (problem → data → implications)

Example 2: RFC-060 Query Performance

Integration Pattern:

[Section header: Context]
## Performance Characteristics

[Subsection: Specific metric]
### Query Latency by Complexity

[Table: Benchmark data]
| Query Type | P50 | P99 |
|------------|-----|-----|

[Next section: Related topic]
### Partition Pruning Effectiveness

Effectiveness: ✅ Table provides reference data without redundant explanation

Example 3: RFC-061 Trade-Off Analysis

Integration Pattern:

[Prose: Problem]
At 100B scale with 1B queries/sec, logging every authorization check
creates massive storage requirements.

[Table: Options comparison]
| Sampling Rate | Cost | Compliance | Investigation |
|---------------|------|------------|---------------|

[Prose: Recommendation]
**Recommended Configuration** (balances cost and compliance):

Effectiveness: ✅ Table shows options, prose provides decision guidance

Conclusion

Overall Assessment: ✅ Excellent visual communication with professional quality

The RFCs demonstrate sophisticated use of tables and diagrams:

  • 31 tables with 100% proper formatting
  • ASCII diagrams are clear, well-aligned, and professional
  • Tables complement text without duplication
  • Bullet lists used extensively (961 occurrences) for scannability
  • Appropriate usage: tables for structured data, prose for narrative

No significant improvements needed. The current approach effectively communicates complex technical information through well-integrated visual and textual elements.

Optional enhancements (low priority):

  1. Add "Figure N:" captions to complex diagrams (30 min)
  2. Add accessibility descriptions for diagrams (1 hour)
  3. Convert a few comparison prose sections to side-by-side tables (1-2 hours)

Recommendation: Accept current table and diagram quality as production-ready.

Next Steps

Week 9 Complete

Day 1: Heading hierarchy audit (MEMO-061) ✅ Day 2-3: Paragraph structure review (MEMO-062) ✅ Day 4: Code example placement analysis (MEMO-063) ✅ Day 5: Table and diagram review (MEMO-064)

Week 9 Assessment: All copy editing structural reviews complete with excellent results across all dimensions.

Week 10: Line-Level Copy Edit

Focus: Sentence-level improvements

Activities:

  • Days 1-2: Active voice conversion ("the query is executed" → "the executor runs the query")
  • Days 2-3: Jargon audit and terminology consistency
  • Day 4: Sentence length optimization (target 15-20 words average)
  • Day 5: Verb precision improvements ("does" → specific verbs)

Expected outcome: More concise, active, and precise technical prose

Week 11: Consistency and Style Edit

Focus: Uniform terminology and formatting

Activities:

  • Days 1-2: Terminology consistency mapping and standardization
  • Day 3: Number and unit formatting consistency
  • Day 4: Code style consistency (Go, YAML, Protobuf)
  • Day 5: Cross-reference format standardization

Week 12: Audience-Specific Review and Polish

Focus: Accessibility for different readers

Activities:

  • Day 1: Executive summary polish (200-300 words, business value focus)
  • Days 2-3: Technical section review for implementation engineers
  • Day 4: Operations section enhancement for SREs
  • Day 5: Final readability pass with Hemingway Editor

Revision History

  • 2025-11-15: Initial table and diagram review for Week 9 Day 5