Evaluate cornerstone content readiness, using your test set of questions and SME input, to prepare your site for Expert GenSearch.
The best way to evaluate the responses from your test set is to use the Completions download as a GenResponse tracker.
To evaluate GenResponses:
- Run the Completions download report.
- Use the report to create a tracking spreadsheet.
- Have SMEs review the responses.
- Evaluate your findings.
Run the Completions download report
The Completions Report API endpoint gives you data about the queries and responses site visitors are generating using Expert LLM tools on your site. This information shows:
- How users ask questions
- The responses generative AI returned for those questions
- The pages that responses were pulled from to answer each query
Each row in the report represents a Completions event.
API endpoint: GET {site-url}/@api/deki/llm/completion/report?month=YYYY-MM
Admins can call the report from the browser by modifying their URL as follows: {site-url}/@api/deki/llm/completion/report?month=YYYY-MM
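As a minimal sketch, the report URL for a given month could be assembled like this. The helper name `completions_report_url` and the site `https://example.com` are hypothetical; only the path and the `month=YYYY-MM` parameter come from the endpoint above. The sketch only builds the URL. It does not cover authentication or the format of the returned file.

```python
import re
from urllib.parse import urlencode

# Path taken from the endpoint documented above.
REPORT_PATH = "/@api/deki/llm/completion/report"


def completions_report_url(site_url: str, month: str) -> str:
    """Return the Completions report URL for one month (hypothetical helper).

    `month` must use the YYYY-MM form shown in the endpoint.
    """
    if not re.fullmatch(r"\d{4}-(0[1-9]|1[0-2])", month):
        raise ValueError(f"month must be YYYY-MM, got {month!r}")
    # Drop any trailing slash so the path is not doubled up.
    return f"{site_url.rstrip('/')}{REPORT_PATH}?{urlencode({'month': month})}"


if __name__ == "__main__":
    # example.com is a placeholder for your own site URL.
    print(completions_report_url("https://example.com", "2024-05"))
```

An admin who is signed in to the site could paste the resulting URL into the browser address bar.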
Use the report to create a tracking spreadsheet
Add the following columns to the download:
- Quality Score
- Expected Answer
- Answer Source
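If you would rather script this step than add the columns by hand, the sketch below appends the three tracking columns to the download. It assumes the download is a CSV file, which you should confirm against your own report. The `Query` and `Response` column names in the sample are hypothetical stand-ins for whatever headers your download contains.

```python
import csv
import io

# The three tracking columns listed above.
TRACKING_COLUMNS = ["Quality Score", "Expected Answer", "Answer Source"]


def add_tracking_columns(report_csv: str) -> str:
    """Return the CSV text with empty tracking columns appended.

    Assumes the report is CSV with a header row (an assumption,
    not something the report format guarantees).
    """
    reader = csv.DictReader(io.StringIO(report_csv))
    existing = list(reader.fieldnames or [])
    # Only add columns that are not already present.
    fieldnames = existing + [c for c in TRACKING_COLUMNS if c not in existing]

    out = io.StringIO()
    # restval="" leaves the new columns blank for SMEs to fill in.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical sample; real downloads will have different headers.
    sample = "Query,Response\nWhat are Touchpoints?,A Touchpoint is a tool.\n"
    print(add_tracking_columns(sample))
```

The new columns are left blank so that SMEs can fill them in during review.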
Have SMEs review the responses
For each response, SMEs fill in the three added columns:
- Expected Answer: Provide the correct or desired response that should have been generated.
- Answer Source: Indicate the source of the expected answer (e.g., a specific documentation page or knowledge base article).
- Quality Score: Assign a quality score from 1 to 5, where:
  - 5: Excellent response.
  - 4: Good response with minor improvements needed.
  - 3: Acceptable response needing enhancements.
  - 2: Poor response requiring significant improvements.
  - 1: Unsatisfactory response.
Example of response evaluation:
A user asks GenSearch, "What are Touchpoints?"
- Excellent answer (Rating = 5): "A Touchpoint is an easy-to-use embeddable tool that allows the extension of Expert content into third-party locations to efficiently achieve desired customer success outcomes."
  - This response scores highly because it is a context-specific overview of what Expert Touchpoints are and what they can do.
- Poor answer (Rating = 2): "A touchpoint is a point of contact or interaction between a business or brand and a customer."
  - This response scores poorly, but above a 1, because it is not incorrect. However, it draws more on the LLM's foundational training than on specific information in the Expert KB, so it likely will not meet audience intent.
Evaluate your findings
Measure your baseline using the metrics you choose, such as:
- The percentage of satisfactory versus unsatisfactory responses
- The total number of SME Quality Score points
Example: With 100 questions, the maximum available is 500 points (100 questions × 5 points each). You might aim for ≥350 to start, and endeavor to improve your content from there.
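The baseline arithmetic can be sketched as follows. The function name `baseline_metrics` is hypothetical. The cut-off that counts a score of 3 or higher as satisfactory is also an assumption made for illustration; choose whatever threshold suits your team.

```python
def baseline_metrics(scores, satisfactory_threshold=3):
    """Summarize SME Quality Scores (each an integer from 1 to 5).

    `satisfactory_threshold` is an assumed cut-off, not a fixed rule.
    """
    if not scores:
        raise ValueError("at least one score is required")
    if any(s not in (1, 2, 3, 4, 5) for s in scores):
        raise ValueError("scores must be integers from 1 to 5")

    total = sum(scores)
    maximum = 5 * len(scores)  # 5 points available per question
    satisfactory = sum(1 for s in scores if s >= satisfactory_threshold)
    return {
        "total_points": total,
        "max_points": maximum,
        "percent_of_max": 100 * total / maximum,
        "percent_satisfactory": 100 * satisfactory / len(scores),
    }


if __name__ == "__main__":
    # Made-up scores for 100 questions: fifty 4s and fifty 3s.
    metrics = baseline_metrics([4] * 50 + [3] * 50)
    print(metrics["total_points"], "of", metrics["max_points"])  # 350 of 500
```

Re-running the same calculation after each round of content improvements gives a like-for-like comparison against the baseline.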