Blog

Insights on AI safety, agent testing, and building reliable LLM applications.

2025-03-15

Introducing Agent Probe: Automated AI Agent Testing

We're launching Agent Probe — a framework that automatically tests LLM-based agents for hallucinations, security vulnerabilities, PII leaks, bias, and toxicity. Here's why we built it and how it works.

Announcement

2025-03-10

The 6-Layer AI Test Pyramid Explained

Traditional software has the testing pyramid. We adapted it for AI agents. Learn how our 6-layer approach — from accuracy to ethics — provides comprehensive coverage for LLM-based applications.

Technical

2025-03-05

Why Your AI Agent Needs a Security Test

Prompt injection and jailbreak attacks are real threats to production AI. We analyze common attack vectors and show how Agent Probe's security evaluator catches them using JailbreakBench datasets.

Security

2025-02-28

Detecting Hallucinations in LLM Agents

When your AI agent confidently gives a customer false information, the damage is already done. We explore how TruthfulQA-based evaluation catches hallucinations before they reach production.

Evaluation

2025-02-20

Bias Testing with BBQ: 20 Categories, Zero Tolerance

Does your agent respond differently based on gender, race, or age? We explain how the BBQ (Bias Benchmark for QA) dataset enables systematic bias detection across 20 demographic categories.

Ethics

2025-02-15

EU AI Act: What It Means for AI Agent Testing

The EU AI Act entered into force in August 2024, and it requires systematic testing of high-risk AI systems. We break down the regulation's risk tiers and show how Agent Probe helps you stay compliant.

Compliance