Pending:

  • Web application firewall (WAF)
  • OSI model: what operates at which layer
  • Back-of-the-envelope estimation in system design
  • CoDel strategy in more detail
  • Write-ahead logging (WAL)
  • Transformers
  • Feed-forward network (FFN)
  • Attention
  • Layer-wise load balancers (LBs) in more detail

Advanced:

  • Split a monolith safely
  • Active-active conflict resolution
  • Change Data Capture vs dual writes
  • Search index freshness vs ranking quality
  • Rebalancing shards under skewed traffic
  • Noisy neighbor problem in multi-tenant systems
  • Watermarks / late-arriving events in stream processing
  • Exactly-once processing vs practical deduplication

AI fundamentals:

Here are the fundamentals I would start with:

➤ LLM Basics ↬ Tokens ↬ Context Window ↬ Prompt Design ↬ System Prompts ↬ Temperature ↬ Top-p Sampling ↬ Structured Outputs ↬ JSON Mode ↬ Function Calling ↬ Tool Calling ↬ Agents ↬ Memory ↬ Guardrails ↬ Hallucinations ↬ Model Latency ↬ Model Routing ↬ Small vs Large Models ↬ Fine-tuning vs Prompting ↬ Open-source vs Closed Models

➤ RAG & Retrieval ↬ Embeddings ↬ Vector Search ↬ Vector Databases ↬ Chunking ↬ Chunk Overlap ↬ Metadata Filtering ↬ Hybrid Search ↬ Keyword Search ↬ Semantic Search ↬ Reranking ↬ Retrieval Recall ↬ Retrieval Precision ↬ Query Rewriting ↬ Document Freshness ↬ Permission-aware Retrieval ↬ Citation Grounding ↬ Evidence Selection ↬ Context Packing ↬ Missing Information Detection

➤ AI System Architecture ↬ API Gateway ↬ Request Routing ↬ Model Gateway ↬ Prompt Service ↬ Inference Service ↬ Retrieval Service ↬ Ranking Service ↬ Feature Store ↬ Offline Pipelines ↬ Online Serving ↬ Async Processing ↬ Queueing ↬ Streaming Responses ↬ Rate Limiting ↬ Fan-out/Fan-in ↬ Batch Inference ↬ Real-time Inference ↬ Human-in-the-loop Systems ↬ Fallback Workflows

➤ Cost & Performance ↬ Token Budgeting ↬ Prompt Compression ↬ Prompt Caching ↬ Semantic Caching ↬ Response Caching ↬ Batch Requests ↬ Model Quantization ↬ Distillation ↬ Latency Budgets ↬ Cold Starts ↬ GPU Utilization ↬ Throughput ↬ Cost per Query ↬ Cost per User ↬ Model Selection ↬ Inference Scaling ↬ Backpressure ↬ Load Shedding

➤ Evaluation & Quality ↬ Offline Evals ↬ Online Evals ↬ Golden Dataset ↬ Human Review ↬ LLM-as-Judge ↬ A/B Testing ↬ Regression Testing ↬ Answer Relevance ↬ Factual Accuracy ↬ Faithfulness ↬ Groundedness ↬ Toxicity Checks ↬ Safety Checks ↬ Drift Detection ↬ Feedback Loops ↬ Confidence Scoring ↬ Escalation Criteria ↬ Quality Monitoring

➤ Reliability & Security ↬ Timeouts ↬ Retries ↬ Circuit Breakers ↬ Failover ↬ Model Fallbacks ↬ Graceful Degradation ↬ Observability ↬ Tracing ↬ Prompt Logs ↬ Token Metrics ↬ Error Budgets ↬ PII Redaction ↬ Data Privacy ↬ Access Control ↬ Prompt Injection ↬ Jailbreak Defense ↬ Audit Logs ↬ Compliance