RAG & Evaluation
Showing 20 articles in RAG & Evaluation.
The Hidden Environmental Cost of AI: Who Really Pays for Your LLM?
Every LLM query has a carbon footprint. Discover the hidden environmental costs of AI systems, how they are externalized to the public, and practical strategies for building more sustainable enterprise AI solutions.
Building Production-Grade Video Search: HNSW, Vector Indexing, and Multimodal RAG
Master the infrastructure behind production video search systems. Learn HNSW graph indexing, hierarchical retrieval strategies, hybrid search, and VideoRAG architectures that power platforms like YouTube and TikTok at billion-video scale.
Understanding Multimodal Embeddings: The Evolution from CLIP to Unified Foundation Models
Explore the paradigm shift in multimodal AI from isolated CLIP-style encoders to unified foundation models like Omni-Embed and VLM2Vec-V2. Learn how instruction-tuned transformers are revolutionizing cross-modal retrieval and embedding generation.
Bridging Legal Requirements and Technical Implementation: A Practical Guide to AI Governance Frameworks
Explore how modern technology platforms like OpenMetadata, DataHub, and Apache Atlas enable organizations to meet evolving legal requirements, from the EU AI Act to NIST frameworks, while building robust AI governance systems.
Let's Talk 0.1.5 Release: Enhanced Self-Hosting and Production-Ready Features
Announcing Let's Talk 0.1.5 with comprehensive self-hosting capabilities, enhanced security, modular architecture, and production-ready features for AI-powered interactive chat systems.
Data Governance for AI and RAG Systems: A Strategic Imperative
Discover why specialized data governance frameworks are critical for AI and RAG systems, and learn practical strategies to ensure responsible, secure, and effective AI deployment.
Responsible RAG: Ethical Considerations in Retrieval-Augmented Generation
Explore the ethical landscape of Retrieval-Augmented Generation (RAG) systems, including citation, attribution, bias, and transparency, and learn how to evaluate fairness and responsibility using Ragas metrics.
The Economics of RAG: Cost Optimization for Production Systems
A comprehensive guide to understanding and optimizing the costs of Retrieval-Augmented Generation (RAG) systems in production, from token usage and embedding storage to infrastructure and operational overhead.
Zero-Shot RAG Systems: The Data Guy Show Podcast Episode
Join Nazz and Mo on The Data Guy Show as they explore how to build Retrieval-Augmented Generation systems that work out-of-the-box with minimal tuning, featuring real-world examples and practical insights.
You Can't Handle the Truth... Without Context!
Discover why context is the ultimate key to getting truthful, grounded answers from AI systems. Learn how proper context transforms LLM hallucinations into reliable, factual responses through real-world examples and practical techniques.
Zero-Shot RAG: Building Systems That Work Out-of-the-Box
Discover how to build Retrieval-Augmented Generation systems that perform effectively with minimal tuning, allowing for faster deployment and reduced development overhead while maintaining high-quality responses.
Evaluating Advanced RAG Retrievers: A Practical Comparison
A hands-on, metric-driven comparison of advanced retrieval strategies for RAG systems using LangChain and Ragas. See which retriever wins on accuracy, speed, and cost.
Part 8: Building Feedback Loops with Ragas
A research-driven guide to designing robust, actionable feedback loops for LLM and RAG systems using Ragas. Learn how to select metrics, set baselines, define thresholds, and incorporate user and human feedback for continuous improvement.
Part 7: Integrations and Observability with Ragas
Learn how to integrate Ragas with the rest of your LLM tooling and add observability to your evaluation workflows.
Part 6: Evaluating AI Agents: Beyond Simple Answers with Ragas
Learn how to evaluate complex AI agents using Ragas' specialized metrics for goal accuracy, tool call accuracy, and topic adherence to build more reliable and effective agent-based applications.
Part 5: Advanced Metrics and Customization with Ragas
Explore advanced metrics and customization techniques in Ragas for evaluating LLM applications, including creating custom metrics, domain-specific evaluation, composite scoring, and best practices for building a comprehensive evaluation ecosystem.
Part 4: Generating Test Data with Ragas
Discover how to generate robust test datasets for evaluating Retrieval-Augmented Generation systems using Ragas, including document-based, domain-specific, and adversarial test generation techniques.
Part 3: Evaluating RAG Systems with Ragas
Learn specialized techniques for comprehensive evaluation of Retrieval-Augmented Generation systems using Ragas, including metrics for retrieval quality, generation quality, and end-to-end performance.
Part 2: Basic Evaluation Workflow with Ragas
Learn how to set up a basic evaluation workflow for LLM applications using Ragas. This guide walks you through data preparation, metric selection, and result analysis.
Part 1: Introduction to Ragas: The Essential Evaluation Framework for LLM Applications
Explore the essential evaluation framework for LLM applications with Ragas. Learn how to assess performance, ensure accuracy, and improve reliability in Retrieval-Augmented Generation systems.