Comprehensive mapping of all data sources, transformation pipelines, and API endpoints required for the ECIC Static Reporting Center.
1.1 Executive Summary
Purpose: Design unified API layer connecting all ECIC data sources to interactive Quarto dashboards Architecture: Event-driven data pipelines feeding centralized API serving real-time and historical data Security: Role-based access with data sensitivity classification at the API level
Type: CRM system of record API Limits: 100 requests/minute Authentication: API key (stored in Doppler) Update Pattern: 4-hour sync cycles with incremental updates
Critical Data: - Client master records (48 active contacts) - Custom fields mapping to SEC record types - Pipeline/opportunity tracking - Contact interaction history - Document attachment metadata
1.2.1.33. Google Workspace APIs
Services: Gmail, Drive, Calendar, Docs Authentication: OAuth 2.0 with service accounts Update Pattern: Daily ingestion (2 AM UTC) Retention: 180-day email metadata compliance
Data Types: - Email communication summaries - Document collaboration activity - Meeting/calendar data - Shared drive file inventory - Form responses and submissions
1.2.1.44. Altruist Custody Platform
Critical Limitation: No API access available Current Process: Manual copy/paste from platform Data Staleness Risk: Updates dependent on manual process Integration Need: Screen scraping or alternative solution
Essential Data: - Real-time account balances and positions - Transaction history and confirmations - Performance attribution by account - Fee billing and cost basis tracking
1.2.2Secondary Data Sources
1.2.2.15. Market Data & Benchmarks
Need: External market data for performance comparison Potential Sources: Alpha Vantage API, Yahoo Finance, SEC EDGAR Update Frequency: Daily market close
Requirements: - S&P 500, Russell indices for benchmarking - ESG/sustainable investing index performance - Sector performance for attribution analysis - Yield curve and fixed income benchmarks
1.2.2.26. Regulatory Data
Sources: SEC IARD system, Utah Division of Securities Type: Form ADV filings, registration status, regulatory notices Update Pattern: On-demand when filings occur
1.2.2.37. Vendor Performance Data
Sources: Direct vendor APIs where available Types: SLA metrics, cost data, service quality scores Integration Complexity: Varies by vendor API capabilities
1.3 API Architecture Design
1.3.1Core API Principles
1. Unified Data Layer
graph TB
A[LACRM API] --> E[Data Processing Layer]
B[Google Workspace] --> E
C[Hooks DuckDB] --> E
D[Altruist Manual] --> E
E --> F[Unified API Gateway]
F --> G[Dashboard Consumption]
F --> H[Report Generation]
F --> I[Compliance Monitoring]
style E fill:#e1f5fe
style F fill:#c8e6c9
style D fill:#ffcccc
2. Event-Driven Architecture - Real-time events for critical data changes - Batch processing for historical analysis - Webhook integration for external systems - Queue-based processing for reliability
/api/auth/ POST /login # Google SSO integration POST /refresh # Token refresh GET /permissions # Role-based permissions check POST /logout # Session termination
1.3.2.2Executive Dashboard APIs
/api/executive/ GET /dashboard # Real-time KPI summary GET /aum # Assets under management trends GET /clients # Client count and growth metrics GET /performance # Portfolio performance summary GET /alerts # Priority alerts and actions needed
Data Sources: Hooks DuckDB + LACRM + Altruist Update Frequency: Every 4 hours Cache Strategy: 1-hour cache with real-time invalidation Security Level: Executive (Level 5)
1.3.2.3Regulatory Intelligence APIs
/api/regulatory/ GET /compliance # SEC record compliance status GET /form-adv # Form ADV amendment tracking GET /examinations # Examination readiness checklist GET /filings # Regulatory filing status GET /audit-trail # Document completeness audit
/api/clients/ GET /performance # Portfolio performance by strategy GET /demographics # Client distribution analysis GET /satisfaction # Client satisfaction metrics GET /lifecycle # Client acquisition/retention GET /communications # Communication touch analysis
Data Sources: Hooks DuckDB + LACRM + Google Workspace + Altruist Update Frequency: Daily for aggregates, 4-hour for individual metrics Cache Strategy: 2-hour cache Security Level: Restricted (Level 3)
1.3.2.5Operations APIs
/api/operations/ GET /vendors # Vendor performance scorecards GET /processes # Process efficiency metrics GET /systems # System health and uptime GET /costs # Cost analysis and budget tracking GET /automation # Automation status and effectiveness
Data Sources: Multiple vendor APIs + system monitoring + cost tracking Update Frequency: Hourly for health, daily for performance metrics Cache Strategy: 30-minute cache Security Level: Internal (Level 2)
1.3.2.6Strategic Intelligence APIs
/api/strategic/ GET /market # Market analysis and trends GET /competition # Competitive positioning GET /growth # Growth planning metrics GET /initiatives # Strategic initiative tracking GET /scenarios # Risk/scenario modeling
Data Sources: External market data + internal strategic planning data Update Frequency: Daily for market data, weekly for strategic metrics Cache Strategy: 4-hour cache Security Level: Executive (Level 5)
Rationale: - Native async support for high-performance data serving - Automatic OpenAPI documentation generation - Built-in data validation with Pydantic - Excellent integration with DuckDB and other data sources
Problem: No API access for real-time custody data Solutions: - Screen Scraping: Automated browser interaction (legal/ToS review required) - CSV Import API: Manual export → automated processing pipeline - Alternative Custodian: Evaluate custodians with robust APIs - Reconciliation Reports: Enhanced manual process with validation
1.5.22. LACRM Rate Limiting
Problem: 100 requests/minute limit Solutions: - Intelligent Queuing: Batch requests and optimize query patterns - Incremental Sync: Only sync changed records since last update - Caching Strategy: Aggressive caching with smart invalidation - Webhook Integration: Real-time updates instead of polling
1.5.33. Google Workspace Quotas
Problem: API quotas for Gmail, Drive, Calendar Solutions: - Service Account Distribution: Multiple service accounts for load balancing - Batch Processing: Combine multiple operations in single requests - Selective Sync: Only sync relevant data based on business rules - Exponential Backoff: Robust retry mechanisms for quota exhaustion
1.5.44. Data Freshness vs Performance
Problem: Balance between real-time data and system performance Solutions: - Tiered Refresh Rates: Critical data (1hr), Important (4hr), Standard (daily) - Event-Driven Updates: Real-time updates only for material changes - Progressive Loading: Fast initial load with progressive detail enhancement - Background Processing: Heavy calculations during off-hours
performance_sla:executive_dashboard:target:"<2 seconds initial load"acceptable:"<5 seconds with cold cache"detailed_reports:target:"<10 seconds for complex analysis"acceptable:"<30 seconds for historical deep-dives"real_time_alerts:target:"<1 second for critical notifications"acceptable:"<5 seconds for standard alerts"
Observable JS Performance: How well does Observable JS handle large DuckDB result sets (10k+ rows)?
Real-time Updates: Can Quarto dashboards refresh data without full page reload?
Mobile Responsiveness: How do interactive dashboards perform on mobile devices?
Export Capabilities: Can Observable JS charts be exported to PDF/PNG for compliance documentation?
1.9.2Data Integration
Altruist Alternatives: What custodians provide robust APIs for RIA firms?
Screen Scraping Legality: Legal review of automated data extraction from Altruist platform
Data Quality Thresholds: What accuracy levels are acceptable for different report types?
Historical Data Retention: How much historical data should remain in active vs archive systems?
1.9.3Regulatory Compliance
SEC API Requirements: Are there specific SEC requirements for how advisers present data via APIs?
Audit Trail Standards: What level of API access logging is required for regulatory examinations?
Client Data Consent: Do clients need explicit consent for data aggregation and reporting?
Third-Party Integration: How do SEC custody rules apply to API-based data access?
1.9.4Performance & Scaling
DuckDB Performance: At what data volume do DuckDB queries become too slow for real-time dashboards?
Cloudflare Limitations: Are there Cloudflare Pages limitations for interactive dashboard complexity?
Concurrent User Load: How many simultaneous dashboard users can the architecture support?
Geographic Performance: How do dashboard load times vary for remote users?
1.9.5User Experience
Training Requirements: What training is needed for non-technical users to access interactive reports?
Dashboard Maintenance: How much ongoing maintenance do Observable JS dashboards require?
Print-Friendly Reports: How to ensure dashboards can produce print-friendly versions for meetings?
Accessibility Compliance: How to ensure dashboards meet ADA accessibility requirements?
Next Steps: 1. Review API architecture with technical team 2. Prioritize integration challenges based on business impact 3. Begin Phase 1 implementation with authentication and core endpoints 4. Establish testing framework for API performance and reliability