Video Tutorials

Step-by-step video guides for every feature of Agent Probe.

Product Demo

See Agent Probe in Action

Demo 1

Agent Probe — Full Product Demo

A complete walkthrough from login to results — model selection, batch testing, live chat evaluation, and CI/CD integration.

~5 min

Demo 2

Agent Probe — Advanced Features

Deep dive into evaluator configuration, custom datasets, webhook setup, and multi-model comparison.

~5 min

Tutorial Series

Step-by-Step Tutorials

6-part series covering everything from testing fundamentals to custom datasets.

Video 1

Introduction — Why Test AI Agents?

The problems Agent Probe addresses: hallucination, prompt injection, PII leaks, bias, and toxicity. Also covers why traditional testing isn't enough.

~4 min

Video 2

Test Pyramid & Evaluators

6-layer test pyramid walkthrough. All 16 evaluators explained. Judge model concept — what it is and why it matters.

~5 min

Video 3

Running Your First Test & Manual Chat

Login → model selection → configuration → batch run → real-time results. Live bias, security (jailbreak), and PII detection demo.

~5 min, 5 parts

Video 4

Reading Results & Model Comparison

Reading test cards (score, pass/fail, judge reasoning). Test history. Side-by-side model comparison: GPT-4o-mini vs Claude.

~5 min, 3 parts

Video 5

CI/CD — Webhooks & API Keys

Creating webhooks with cron scheduling. API key generation. cURL integration. User management and approval workflow.

~5 min

Video 6

Custom Datasets

JSON format explained. Creating accuracy, security, and PII test data. Upload via drag & drop. Running domain-specific evaluations.

~4 min

Advanced

Technical Deep Dives

For developers and tech leads who want to understand exactly how Agent Probe works under the hood.

Chapter A

Architecture & Pipeline

FastAPI internals, ThreadPoolExecutor parallelism, asyncio.gather, BaseEvaluator class, scoring conventions, LLM-as-judge vs rule-based strategies.

~4 min, 2 parts
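
The pieces named in Chapter A can be pictured with a small sketch. This is not Agent Probe's actual code: only the names `BaseEvaluator`, `ThreadPoolExecutor`, and `asyncio.gather` come from the description above. The `EvalResult` fields, the 0.0 to 1.0 score range, the 0.5 pass threshold, and both example evaluators are assumptions made for illustration.

```python
import asyncio
import re
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class EvalResult:
    evaluator: str
    score: float      # assumed convention: 0.0 (worst) to 1.0 (best)
    passed: bool
    reasoning: str


class BaseEvaluator(ABC):
    """Hypothetical base class: subclasses only implement score()."""
    name = "base"
    threshold = 0.5   # assumed pass mark, not taken from Agent Probe

    @abstractmethod
    def score(self, prompt: str, response: str) -> tuple[float, str]:
        """Return (score, reasoning) for one prompt/response pair."""

    def evaluate(self, prompt: str, response: str) -> EvalResult:
        value, reasoning = self.score(prompt, response)
        return EvalResult(self.name, value, value >= self.threshold, reasoning)


class EmailLeakEvaluator(BaseEvaluator):
    """Rule-based strategy: a regex check, no model call."""
    name = "pii_email"
    _pattern = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

    def score(self, prompt, response):
        hits = self._pattern.findall(response)
        if hits:
            return 0.0, f"found {len(hits)} email address(es)"
        return 1.0, "no email addresses found"


class StubJudgeEvaluator(BaseEvaluator):
    """Stand-in for LLM-as-judge: a real one would call a judge model here."""
    name = "refusal_judge"

    def score(self, prompt, response):
        refused = "can't help" in response.lower()
        return (1.0, "response refuses") if refused else (0.0, "response complies")


async def run_evaluators(evaluators, prompt, response, max_workers=4):
    """Run blocking evaluators on a thread pool and await them together."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        tasks = [
            loop.run_in_executor(pool, ev.evaluate, prompt, response)
            for ev in evaluators
        ]
        # gather returns results in the same order as the evaluators list
        return await asyncio.gather(*tasks)
```

Calling `asyncio.run(run_evaluators([EmailLeakEvaluator(), StubJudgeEvaluator()], "Who is the admin?", "Contact me at a@b.com"))` returns one `EvalResult` per evaluator. In this sketch both fail, since the response contains an email address and does not refuse.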

Chapter B

Evaluators In Depth

Bias (BBQ + DeepEval), Security (Garak 156 patterns), Hallucination (TruthfulQA + context), PII (Presidio NER), Accuracy (MMLU exact match + LLM judge).

~4 min, 5 parts

Chapter C

Dataset Architecture & Data Flow

Golden Datasets (BBQ, ToxiGen, TruthfulQA, MMLU, JailbreakBench), JSON schema, end-to-end request → evaluator → score pipeline.

~2 min, 2 parts
