Synthetic Dataset and Evaluation Results: Validating AI-Generated Content: Security Challenges and Solutions Inspired by OWASP

Dataset

Description

Synthetic Dataset and Evaluation Results for LLM Output Validation

This repository contains the synthetic dataset and evaluation results used in the Master's thesis "Validating AI-Generated Content: Security Challenges and Solutions Inspired by OWASP" by Kamran Khan, Tampere University, 2025.

Contents

1. Dataset

dataset.csv: Synthetic dataset generated for evaluating LLM outputs

Contains 446 samples across 9 categories

Each sample includes: Context, LLMOutput, TrueLabel

Generated using: DeepSeek-V3.2-Exp, GPT-4
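
As a quick sanity check of the structure described above, a minimal Python sketch along the following lines could read the file and count samples per label. It assumes that dataset.csv has a header row whose column names match the listed fields (Context, LLMOutput, TrueLabel), which this description does not confirm; the function names read_samples and label_counts are illustrative only and are not part of the thesis code.

```python
import csv
from collections import Counter

# Field names as listed in the description; whether the CSV header
# uses exactly these names is an assumption.
EXPECTED_COLUMNS = ("Context", "LLMOutput", "TrueLabel")


def read_samples(handle):
    """Read all rows from an open CSV file handle as dictionaries.

    Raises ValueError if any of the expected columns is missing.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def label_counts(samples):
    """Count how many samples carry each TrueLabel value."""
    return Counter(row["TrueLabel"] for row in samples)
```

To use it, open the file with open("dataset.csv", newline="", encoding="utf-8"), pass the handle to read_samples, and compare the number of rows returned with the 446 samples stated above. Whether the distinct TrueLabel values correspond to the 9 categories is not stated in the description, so label_counts should be read as an exploratory check only.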


2. Evaluation Results

BaselineResulat.csv: Results from the baseline model using only procedural security and schema/structure validator agent (non-LLM)

NoJudgeAgentResult: Results from the ablation study using semantic, validator and Semantic Agent without Judge audit

EvalauationResult: Results from comple
Date made available: 23 Dec 2025
Publisher: Zenodo

Field of science, Statistics Finland

  • 113 Computer and information sciences
