Skill

evaluate-spring-ai

Sets up Dokimos evaluation for Spring AI apps including ChatClient, RAG pipelines, and advisor chains. Use for Spring Boot LLM testing and benchmarking.

Java

Spring Boot

testing

ai-ml

npx claudepluginhub dokimos-dev/dokimos --plugin evaluate-spring-ai

Tool Access

This skill uses the workspace's default tool permissions.

Stats

Parent Repo Stars: 20

Parent Repo Forks: 3

Last Commit: Feb 21, 2026

Evaluate Spring AI

Set up Dokimos evaluation for a Spring AI application. The user will describe their application and evaluation goals via $ARGUMENTS.

Where things live

Spring AI support: dokimos-spring-ai/src/main/java/dev/dokimos/springai/SpringAiSupport.java
Examples: dokimos-examples/src/main/java/dev/dokimos/examples/springai/
Full Spring Boot example: dokimos-examples/src/main/java/dev/dokimos/examples/springai/tutorial/
Maven dependency: dev.dokimos:dokimos-spring-ai

Before writing code, read SpringAiSupport.java to understand the available utilities.

Key utilities

SpringAiSupport provides:

asJudge(ChatClient.Builder) — wraps a Spring AI ChatClient.Builder into a JudgeLM
asJudge(ChatModel) — wraps a ChatModel directly into a JudgeLM
toTestCase(EvaluationRequest) — converts Spring AI's EvaluationRequest to Dokimos EvalTestCase
toEvaluationResponse(EvalResult) — converts Dokimos EvalResult back to Spring AI EvaluationResponse

Evaluation patterns

Simple ChatClient evaluation

@SpringBootTest
class MyChatEvaluationTest {

    // Spring Boot auto-configures the builder for the model under test.
    @Autowired
    private ChatClient.Builder chatClientBuilder;

    @Test
    void evaluateChatbot() {
        ChatClient chatClient = chatClientBuilder.build();

        // The task sends each dataset example's input to the model and
        // returns the answer under the "output" key.
        Task task = example -> {
            String response = chatClient.prompt()
                    .user(example.input())
                    .call()
                    .content();
            return Map.of("output", response);
        };

        // The same builder backs the judge here; asJudge(ChatModel) can wrap
        // a different model if the judge should be independent.
        JudgeLM judge = SpringAiSupport.asJudge(chatClientBuilder);

        ExperimentResult result = Experiment.builder()
                .name("Chatbot Evaluation")
                .dataset(Dataset.fromJson(Path.of("src/test/resources/datasets/qa.json")))
                .task(task)
                .evaluator(LLMJudgeEvaluator.builder()
                        .name("answer-quality")
                        .judge(judge)
                        .criteria("Is the response helpful and accurate?")
                        // Fields of each test case that are shown to the judge.
                        .evaluationParams(List.of(
                                EvalTestCaseParam.INPUT,
                                EvalTestCaseParam.ACTUAL_OUTPUT,
                                EvalTestCaseParam.EXPECTED_OUTPUT))
                        .threshold(0.7)
                        .build())
                .build()
                .run();
    }
}
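The `threshold(0.7)` setting above can be read as a cut-off on the judge's score. The following is a minimal, dependency-free sketch of that idea, not Dokimos code: the class `ThresholdSketch`, its methods, and the assumption that a score passes when it is greater than or equal to the threshold are all hypothetical and should be checked against the Dokimos sources.

```java
import java.util.List;

/** Illustrative only: not part of Dokimos. */
final class ThresholdSketch {

    /** Assumption: a score passes when it is greater than or equal to the threshold. */
    static boolean passes(double score, double threshold) {
        return score >= threshold;
    }

    /** Fraction of scores that pass; 0.0 for an empty list. */
    static double passRate(List<Double> scores, double threshold) {
        if (scores.isEmpty()) {
            return 0.0;
        }
        long passed = scores.stream().filter(s -> passes(s, threshold)).count();
        return (double) passed / scores.size();
    }
}
```

Under this reading, judge scores of 0.9, 0.7, 0.4 and 0.65 with a threshold of 0.7 would give two passing examples out of four.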

RAG evaluation with advisors

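Task task = example -> {
    String input = example.input();
    // Attach the RAG advisor so the model answers from retrieved documents.
    ChatClient.ChatClientRequestSpec request = chatClient.prompt().user(input);
    request.advisors(new QuestionAnswerAdvisor(vectorStore));

    String response = request.call().content();
    // Run the same query again to record the context for retrieval-aware evaluators.
    // This is a separate search from the one the advisor performed.
    List<Document> docs = vectorStore.similaritySearch(input);
    List<String> context = docs.stream().map(Document::getText).toList();

    return Map.of("output", response, "context", context);
};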
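The task above returns two keys, "output" and "context", and queries the vector store separately from the advisor. If the advisor is configured with a different search request, the recorded context may not match what the model actually saw. The sketch below shows the same output shape with retrieval performed once and reused. It uses only the Java standard library; `RagTaskSketch`, `Retriever` and `Generator` are hypothetical stand-ins, not Spring AI or Dokimos types.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Illustrative stand-ins only: none of these types come from Spring AI or Dokimos. */
final class RagTaskSketch {

    /** Hypothetical retriever: maps a question to the text of the retrieved documents. */
    interface Retriever extends Function<String, List<String>> {}

    /** Hypothetical generator: produces an answer from a question and its context. */
    interface Generator {
        String answer(String question, List<String> context);
    }

    /** Runs retrieval once and returns both the answer and the context that produced it. */
    static Map<String, Object> run(String input, Retriever retriever, Generator generator) {
        List<String> context = retriever.apply(input);
        String output = generator.answer(input, context);
        return Map.of("output", output, "context", context);
    }
}
```

Because both collaborators are plain functional interfaces, the wiring can be exercised with lambdas in a unit test before a real model or vector store is involved.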

Converting between Spring AI and Dokimos types

EvaluationRequest request = new EvaluationRequest(userText, documents, responseContent);
EvalTestCase testCase = SpringAiSupport.toTestCase(request);
EvalResult result = evaluator.evaluate(testCase);
EvaluationResponse response = SpringAiSupport.toEvaluationResponse(result);

Dependencies

<dependency>
    <groupId>dev.dokimos</groupId>
    <artifactId>dokimos-spring-ai</artifactId>
    <version>${dokimos.version}</version>
</dependency>

dokimos-spring-ai declares Spring AI with provided scope, so Spring AI is not pulled in transitively. The project must declare its own Spring AI dependency and version.

Steps

1. Understand from $ARGUMENTS what the Spring AI application does.
2. Determine whether it is a simple ChatClient app or uses RAG advisors.
3. Choose appropriate evaluators for the use case.
4. Create a dataset matching the application's domain.
5. Wire the evaluation using the SpringAiSupport utilities.
6. For Spring Boot apps, set up the tests with @SpringBootTest.