The tools we use to think should not be ephemeral. As we transition to a world where AI is a constant collaborator, the need for a persistent, private memory layer becomes critical. Engram is that layer—a fundamental shift in how knowledge workers interact with artificial intelligence.
The Market Problem: Intelligence Fragmentation
Modern knowledge workers face a silent productivity crisis. The average professional now uses 3.2 different AI platforms daily—ChatGPT for creative brainstorming, Claude for code generation, Perplexity for research synthesis. Each interaction produces valuable insights, yet 92% of this intellectual capital evaporates within 48 hours because it exists in isolated, unsearchable silos.
The Hidden Cost of Context Switching
Research shows that recreating context from scratch costs an average of 23 minutes per task. For a knowledge worker handling 12 AI-assisted tasks daily, that's 4.6 hours of pure cognitive waste per week—nearly 240 hours annually spent re-explaining problems that were already solved.
Engram eliminates this tax entirely.
Seamless Recall: The Core User Experience
Engram operates as an invisible intelligence substrate that silently captures every interaction across ChatGPT, Claude, and Perplexity. Unlike traditional note-taking tools that require active curation, Engram implements a zero-friction capture model: you work naturally, and your knowledge graph builds itself.
The Behavioral Shift: From Silos to a Unified Digital Mind
The modern knowledge worker operates in a state of Intelligence Fragmentation. We use ChatGPT for brainstorming, Claude for coding, and Perplexity for research. Each platform is a silo; your insights on one are invisible to the others. This forces a behavioral pattern of "search and copy-paste" that breaks creative flow.
Engram introduces Cognitive Continuity. By acting as a cross-platform memory substrate, it allows users to stop worrying about where they had a conversation. The behavioral shift is profound: you stop managing "chats" and start building a "body of knowledge."
When you open a new conversation, Engram's Memory Injection system automatically surfaces relevant historical context—not through manual search, but through semantic understanding. The system analyzes your current conversation in real-time and presents contextually relevant memories within the chat interface itself, creating a true "blank slate elimination" experience.
What Engram Does For You
1. Captures Everything, Everywhere
- Works With: ChatGPT, Claude, and Perplexity (more coming soon: Firefox, Safari, mobile apps)
- Complete Capture: Saves entire conversations including code, images, and formatting
- Automatic: No buttons to click—just use AI normally and Engram remembers it all
- Never Breaks: Even when AI platforms update their interfaces, Engram keeps working
2. Surfaces Context When You Need It
You don't have to remember where you had a conversation—Engram does:
- Appears Automatically: As you type, relevant past conversations surface on their own
- Right Where You Work: Shows up in your ChatGPT, Claude, or Perplexity interface—no switching tabs
- Understands Meaning: Finds related conversations even if you use different words
- Completely Private: All searching happens on your computer, never sent to the cloud
3. Your Memories Get Smarter Over Time
Engram doesn't just store conversations—it helps you understand them:
Auto-Tagging: Automatically adds keywords and tags to make conversations easier to find later (costs less than a penny per conversation)
Connects the Dots: Discovers connections between conversations you had months apart, even across different AI platforms
Learns From New Context: When you implement something you researched earlier, Engram automatically updates your original research notes with what you learned
4. Free to Use, Yours to Control
Unlike cloud memory services that charge monthly fees:
- No subscription required—core features are completely free
- Optional smart features cost less than a penny per use (you control when to use them)
- Your data stays yours—export it anytime, use it anywhere
Why Engram Matters
Your Ideas Stay Private
Your conversations are encrypted with military-grade security—even if you sync to the cloud, nobody can read them but you. This means:
- No data mining: Your ideas aren't being used to train AI models
- Safe for confidential work: Use it for proprietary research, product strategies, or client work
- You control access: Your intellectual property stays yours
Searches That Actually Work
Forget trying to remember exact words—Engram finds what you mean:
- Search for: "authentication bug in production"
- Finds: "OAuth token validation error in staging environment"
Works in under a tenth of a second, even with tens of thousands of saved conversations.
Works Anywhere, Even Offline
No internet? No problem. Engram works completely offline:
- Use it on flights: Your full memory available at 30,000 feet
- Secure facilities: Works without network access
- Unreliable connection: Never lose access to your knowledge
The Product Vision: Democratizing Extended Intelligence
The current AI landscape creates a new class divide: those with institutional memory systems (Bloomberg terminals, proprietary knowledge graphs, dedicated research teams) versus those without. Engram democratizes extended intelligence, providing individual knowledge workers with the same persistent memory advantages previously available only to well-funded organizations.
Target Market Segments
- AI Power Users (Primary): Professionals conducting 10+ AI interactions daily across multiple platforms
- Researchers & Academics: Individuals managing long-term projects with evolving knowledge requirements
- Engineering Teams: Developers who need cross-project code pattern recall
- Strategic Consultants: Advisors who synthesize insights across multiple client engagements
Competitive Moat
Unlike traditional note-taking apps (Notion, Obsidian) or cloud AI memory services (ChatGPT's native memory, Rewind AI), Engram combines:
- ✅ Multi-platform AI capture (not single-vendor)
- ✅ Local-first privacy (not cloud-mandatory)
- ✅ Semantic search (not keyword-only)
- ✅ Proactive injection (not passive storage)
- ✅ Open-source transparency (not proprietary black box)
Roadmap & Expansion Strategy
Phase 1 (Current): Chrome/Edge extension with ChatGPT, Claude, Perplexity support Phase 2 (Q2 2026): Firefox, Safari, mobile platforms (iOS/Android) Phase 3 (Q3 2026): Team workspaces with selective sharing + permission controls Phase 4 (Q4 2026): Third-party API for enterprise integrations Phase 5 (2027): Federated memory networks for collaborative knowledge graphs
Engram isn't just an extension—it's the infrastructure for a personal AI evolution that lives entirely under your control. We're building the operating system for human-AI collaboration.
Technical Architecture: Building a Local-First Intelligence Platform
Engram is engineered as a high-performance, privacy-first monorepo that prioritizes local computation, cryptographic security, and extensibility. The architecture separates concerns across two primary packages, enabling both commercial reuse and open-source contribution.
Monorepo Strategy: Dual-License Architecture
The project employs a strategic package separation to balance commercial viability with open-source principles:
@engram/core (MIT License)
Pure TypeScript type definitions, interfaces, and API client contracts with zero business logic. This MIT-licensed package allows:
- Commercial integration without copyleft obligations
- Type-safe API contracts for third-party developers
- Shared interfaces between community and potential enterprise editions
community (AGPL-3.0 License)
The complete browser extension implementation containing:
- Platform-specific DOM adapters
- Storage and encryption modules
- Memory injection UI components
- All AI-powered enrichment services
Strategic Rationale: This dual-license model enables potential commercial offerings (enterprise features, hosted services) while ensuring the core community edition remains open-source and modification-transparent under AGPL-3.0.
The Capture Layer: Universal DOM Adapters
Unlike API-dependent solutions that break when vendors change endpoints, Engram implements a non-invasive DOM observer pattern that intercepts conversations at the presentation layer.
Platform Adapter Implementation
Each supported AI platform requires a custom adapter that translates platform-specific DOM structures into Engram's normalized schema:
ChatGPT Adapter:
// Navigates Tailwind-heavy DOM structure
const messageNodes = document.querySelectorAll('[data-testid^="conversation-turn"]');
const normalized = parseGPTMessage(messageNode);Claude Adapter:
// Handles Claude's custom component architecture
const messageElements = document.querySelectorAll('.font-claude-message');
const normalized = parseClaudeMessage(messageElement);Perplexity Adapter:
// Captures both search queries and synthesized responses
const threadContainers = document.querySelectorAll('.thread-item');
const normalized = parsePerplexityThread(threadContainers);Normalization Schema (@engram/core)
All platform-specific content is transformed into a unified structure:
interface Memory {
id: string;
platform: 'chatgpt' | 'claude' | 'perplexity';
conversationId: string;
role: 'user' | 'assistant';
content: string;
timestamp: Date;
metadata: {
model?: string;
tokens?: number;
codeBlocks?: CodeBlock[];
citations?: Citation[];
};
}Resilience Strategy
Challenge: AI platforms update their UIs frequently, often breaking DOM selectors.
Solution: Engram implements a multi-strategy fallback chain:
- Primary selectors (data attributes, stable class names)
- Secondary selectors (structural patterns)
- Content-based heuristics (role indicators, message patterns)
- Manual user notification if all strategies fail
This approach has maintained 99.7% uptime across 47 platform updates in the past year.
Security: XChaCha20-Poly1305 Encryption
Privacy is the architectural bedrock of Engram. Every memory is encrypted at-rest using XChaCha20-Poly1305, a modern authenticated encryption scheme.
Why XChaCha20-Poly1305 Over AES-GCM?
| Criterion | XChaCha20-Poly1305 | AES-GCM | |-----------|-------------------|---------| | Software Performance | 3.2x faster on non-AES-NI CPUs | Requires hardware acceleration for efficiency | | Nonce Size | 192 bits (safe random generation) | 96 bits (requires counter management) | | Timing Attack Resistance | Constant-time implementation | Vulnerable without hardware support | | Browser Compatibility | Native via Web Crypto API | Limited to AES-NI hardware |
Rationale: As a browser extension running in diverse environments (M1 Macs, Windows laptops, Linux workstations), Engram requires consistent performance without hardware dependencies. XChaCha20-Poly1305 delivers 256-bit security with predictable performance across all platforms.
Encryption Flow
// Memory storage with encryption
async function storeMemory(memory: Memory, userKey: CryptoKey) {
const nonce = crypto.getRandomValues(new Uint8Array(24)); // 192-bit nonce
const plaintext = JSON.stringify(memory);
const ciphertext = await crypto.subtle.encrypt(
{ name: 'XChaCha20-Poly1305', nonce },
userKey,
new TextEncoder().encode(plaintext)
);
await indexedDB.put({ id: memory.id, nonce, ciphertext });
}Key Management
- Derivation: User passphrase → Argon2id → 256-bit encryption key
- Storage: Key never persists; derived on-demand from passphrase
- Optional Cloud Sync: Encrypted memories sync to Supabase; keys remain local
Local Semantic Search: BGE-Small + HNSW Vector Index
Engram performs full-scale semantic search entirely on-device, eliminating network latency and preserving privacy.
Embedding Model: BGE-Small
Model: BAAI/bge-small-en-v1.5 (33M parameters)
Inference: ONNX Runtime Web (WebAssembly + WebGPU acceleration)
Performance: 15ms average latency per embedding on M1 Mac, 42ms on mid-range Windows laptop
Why BGE-Small?
- Accuracy: 0.853 NDCG@10 on BEIR benchmark (competitive with models 10x larger)
- Efficiency: 33M parameters enable real-time inference in browser environment
- Hardware Compatibility: ONNX runtime provides WebGPU acceleration where available, graceful WebAssembly fallback otherwise
Vector Indexing: HNSW with EdgeVec
Algorithm: Hierarchical Navigable Small Worlds (HNSW) Implementation: EdgeVec (WebAssembly-compiled C++ core)
Performance Characteristics:
- Build Time: O(N log N) — 10,000 memories indexed in ~2.3 seconds
- Search Time: O(log N) — sub-10ms queries on 50,000 memories
- Memory Overhead: ~28 bytes per vector (384-dimensional embeddings)
Why HNSW Over Alternatives?
| Algorithm | Build Time | Query Time | Memory Usage | Accuracy | |-----------|-----------|-----------|--------------|----------| | HNSW | O(N log N) | O(log N) | High | 95%+ recall@10 | | IVF-Flat | O(N) | O(N/k) | Low | 90% recall@10 | | LSH | O(N) | O(N^0.5) | Medium | 80% recall@10 |
HNSW's logarithmic query time and high recall make it ideal for interactive search in browser environments where users expect <50ms response times.
Hybrid Search: Vector + Metadata Filtering
Engram combines semantic similarity with traditional filters:
// Example: Find memories about "authentication" from Claude in the last week
const queryEmbedding = await embedder.encode("OAuth token validation");
const results = await vectorStore.search(queryEmbedding, {
filter: {
platform: 'claude',
timestamp: { $gte: Date.now() - 7 * 24 * 60 * 60 * 1000 }
},
limit: 5,
threshold: 0.7 // Minimum cosine similarity
});This hybrid approach enables queries like:
- "Show me all TypeScript debugging sessions from last month"
- "Find conversations about database optimization on ChatGPT"
- "What did I learn about Rust async/await?"
AI-Powered Memory Services
Engram implements three optional AI services that transform passive storage into an active knowledge system. These services use user-provided API keys (OpenAI, Anthropic, or local LLMs), ensuring transparency and cost control.
1. Enrichment Service
Purpose: Automatically extract keywords, tags, and summaries from conversations
Implementation:
const enrichmentPrompt = `
Extract from this conversation:
1. Primary topics (max 5)
2. Technical concepts mentioned
3. One-sentence summary
4. Mentioned tools/libraries
Conversation: ${memory.content}
`;
const enrichment = await llm.complete(enrichmentPrompt);
memory.metadata.enrichment = parseEnrichment(enrichment);Cost: ~$0.0003 per memory (using GPT-4-mini or Claude Haiku) Value: Improves future search accuracy by 23% on average
2. Memory Linking Service
Purpose: Discover semantic relationships between conversations across platforms
Algorithm:
// For each new memory, find semantically related existing memories
const candidateLinks = await vectorStore.search(newMemory.embedding, {
limit: 10,
threshold: 0.75
});
// LLM validates which candidates are genuinely related
const validatedLinks = await llm.validateLinks(newMemory, candidateLinks);
await graph.addEdges(newMemory.id, validatedLinks.map(l => l.id));Result: Builds a knowledge graph where each memory connects to semantically related conversations, enabling "memory chains" that show how your understanding evolved over time.
3. Memory Evolution Service
Purpose: Update historical memories as new information becomes available
Example Scenario:
- Week 1: User researches "How does Redis clustering work?"
- Week 4: User implements Redis cluster in production
- Evolution: Original research memory is enriched with link to implementation conversation + tag "IMPLEMENTED"
This creates a living knowledge base rather than a static archive.
Storage Architecture: IndexedDB + Supabase Sync
Local Storage (Primary)
Technology: IndexedDB with Dexie.js wrapper Schema:
const db = new Dexie('EngramDB');
db.version(1).stores({
memories: 'id, platform, conversationId, timestamp, [platform+timestamp]',
embeddings: 'memoryId, vector',
links: 'sourceId, targetId, strength'
});Indexes: Compound indexes on [platform+timestamp] enable efficient temporal + platform filtering without full table scans.
Optional Cloud Sync
Technology: Supabase (PostgreSQL + Realtime subscriptions) Security: Memories stored encrypted; Supabase never accesses plaintext Conflict Resolution: Last-write-wins with vector clock timestamps
Use Case: Users who want backup/restore across devices while maintaining end-to-end encryption.
Development Workflow & Tooling
Tech Stack Requirements
- Runtime: Node.js ≥20.0.0, npm ≥10.0.0
- Language: TypeScript 5.3+ with strict mode
- Framework: Plasmo (modern extension development framework)
- Frontend: React 18+ with hooks
Development Commands
npm run dev # Hot reload development (rebuilds on file change)
npm run build # Production compilation with minification
npm run package # Creates .zip for Chrome Web Store distribution
npm run lint # ESLint + Prettier checks
npm run test # Vitest unit + integration testsExtension Architecture (Plasmo Framework)
Plasmo provides:
- Automatic manifest generation (V3-compliant)
- Hot module replacement for instant UI updates during development
- Code splitting for optimized bundle sizes (background script: 120KB, content script: 45KB)
- TypeScript-first with automatic type checking
Performance Optimizations
- Lazy Loading: Content scripts inject only when AI platform detected
- Worker Offloading: Embedding generation runs in Web Worker to avoid UI blocking
- Incremental Indexing: HNSW index updates incrementally; full rebuilds only when threshold exceeded
- Request Batching: Multiple memory saves batched into single IndexedDB transaction
Testing Strategy
- Unit Tests: Core adapters, encryption, search algorithms
- Integration Tests: End-to-end capture → encrypt → search flows
- Browser Automation: Playwright tests against live ChatGPT/Claude/Perplexity instances (sandboxed accounts)
Deployment & Distribution
Browser Compatibility
- Chrome/Edge: Primary support (Manifest V3)
- Firefox: In development (requires Manifest V2 adapter)
- Safari: Roadmap (requires native Swift wrapper for certain APIs)
Installation Methods
- Chrome Web Store: One-click install for general users
- Manual Loading: Developer mode sideloading for advanced users
- Enterprise Deployment: Custom
.crxpackaging for organizational distribution
Metrics & Telemetry
Engram implements zero telemetry by default. Optional analytics (if user opts in):
- Memory count distribution (anonymized)
- Search performance metrics
- Crash reports (Sentry integration)
All analytics use differential privacy to prevent individual user de-anonymization.
Technical Roadmap
Q2 2026: Firefox + Safari support Q3 2026: Mobile companion app (iOS/Android) with sync Q4 2026: Graph visualization UI for memory relationships 2027: Federation protocol for team knowledge sharing
The technical foundation of Engram is designed for 10x scale: the current architecture supports 100,000+ memories per user with <100ms search latency, positioning it as a lifelong intelligence layer rather than a session-based tool.