How to ground a Microsoft Agent with a Sitemap-Powered knowledge base

The term “Agent” in the Microsoft Copilot ecosystem represents an intelligent, autonomous entity built to perform tasks, answer questions, and automate processes. Whether it’s a customer service bot, a data analysis assistant, or a project manager’s aide, an agent’s effectiveness is directly tied to the quality and structure of its knowledge base.

A powerful, yet often overlooked, way to ground a Microsoft Agent is to use a sitemap as the foundation of its knowledge. Here’s a guide to creating a highly effective Microsoft Agent with Copilot Studio, using a sitemap for its data.

The agent knowledge gap: Why a Sitemap?

Microsoft Copilot Studio allows you to connect an agent to various knowledge sources, including websites, files in SharePoint and OneDrive, and data from Power Platform connectors. However, for a website, Copilot Studio’s generative AI can struggle to understand the hierarchical relationships and key metadata hidden within the pages.

A sitemap, typically an .xml file, is a structured map of a website. It contains a list of URLs and, crucially, metadata about each one, such as the last modified date, change frequency, and relative priority. By using a sitemap, you are providing the generative AI with a pre-organized, machine-readable blueprint of your knowledge. This is particularly effective for agents that need to answer questions based on a consistent, well-defined set of information.

Designing your Sitemap for the agent

To make your sitemap an effective knowledge base, you must treat each <url> tag as a knowledge “record” for your agent. The metadata within the sitemap can be used to store key information about each piece of content.
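To make the idea of one <url> tag per knowledge record concrete, here is a minimal sketch in Python. The sample sitemap, its example.com URLs, and the function name parse_sitemap are placeholders chosen for illustration; they are not taken from the actual website or from Copilot Studio.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample: the URLs, dates and values are placeholders.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/category/business-central/</loc>
    <lastmod>2025-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/category/power-automate/</loc>
    <lastmod>2025-02-01</lastmod>
  </url>
</urlset>
"""


def parse_sitemap(xml_text):
    """Return one dict per <url> entry; absent optional tags become None."""
    root = ET.fromstring(xml_text)
    records = []
    for url in root.findall("sm:url", NS):
        records.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            "priority": url.findtext("sm:priority", namespaces=NS),
        })
    return records


if __name__ == "__main__":
    for record in parse_sitemap(SAMPLE_SITEMAP):
        print(record)
```

Each dictionary corresponds to one record: the URL identifies the content, and the optional tags carry the metadata an agent could reason over.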

Here’s an example of a sitemap based on this website:

In this example, the categories are used to classify the blogs.

Integrating the Sitemap with Copilot Studio

The integration process is straightforward and leverages Copilot Studio’s built-in generative AI capabilities.

  1. Host the Sitemap: Upload your sitemap.xml file to a public or internally accessible URL. For enterprise applications, a SharePoint site or a dedicated web server is ideal.
  2. Connect in Copilot Studio: In your Copilot Studio project, navigate to the Generative AI settings.
  3. Specify the URL: Under the “Boost conversational coverage with generative answers” section, add the URL to your sitemap (e.g., https://yourcompany.com/sitemap.xml).
  4. Wait for Indexing: Copilot Studio will automatically crawl and index the sitemap and its contents. The generative AI will now use this structured data to “ground” its answers, ensuring it provides accurate information from the sitemap.
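Before hosting the file, it can help to check that the sitemap is well formed. The following is a rough pre-flight sketch in plain Python, not part of Copilot Studio; the function name check_sitemap and the specific checks are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import date
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
URLSET_TAG = "{%s}urlset" % SITEMAP_NS
URL_TAG = "{%s}url" % SITEMAP_NS
LOC_TAG = "{%s}loc" % SITEMAP_NS
LASTMOD_TAG = "{%s}lastmod" % SITEMAP_NS


def check_sitemap(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return ["not well-formed XML: %s" % exc]

    problems = []
    if root.tag != URLSET_TAG:
        problems.append("unexpected root element: %s" % root.tag)

    for index, url in enumerate(root.iter(URL_TAG), start=1):
        # Every entry needs an absolute http(s) address in <loc>.
        loc = (url.findtext(LOC_TAG) or "").strip()
        parsed = urlparse(loc)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("entry %d: missing or non-absolute <loc>" % index)

        # Simplification: this sketch only accepts a <lastmod> that starts
        # with a full YYYY-MM-DD date, which is stricter than the protocol.
        lastmod = (url.findtext(LASTMOD_TAG) or "").strip()
        if lastmod:
            try:
                date.fromisoformat(lastmod[:10])
            except ValueError:
                problems.append("entry %d: unparseable <lastmod> %r" % (index, lastmod))
    return problems
```

Calling check_sitemap on the file contents and getting an empty list back only means these basic checks passed; it says nothing about how the sitemap will be indexed.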

However, for this example I took a shortcut: I simply created a PDF of the sitemap and added it as knowledge to the agent.

Creating agent behaviors from the Sitemap

  • Generative Answers: When a user asks a question like, “How many blogs did Olaf write on Business Central?”, the agent’s generative AI will search the sitemap and its linked pages. The structured data helps the agent retrieve the correct page and provide a concise, accurate response based on its content.
  • Structured Information Retrieval: You can create custom topics that explicitly use the sitemap data. For example, you could have a topic that triggers when a user asks about last updated policies. The agent can then use the lastmod tags from the sitemap to list the most recently updated documents.
  • Proactive Notifications: For a more advanced agent, you could use Power Automate to monitor the sitemap file for changes. When a <lastmod> date is updated, the flow can trigger a message from the agent to a designated channel (e.g., a Teams channel) notifying users of the updated policy.
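The logic behind the last two behaviors can be sketched outside Copilot Studio. The example below is plain Python, not a Copilot Studio topic or a Power Automate flow, and the function names are hypothetical. It lists the most recently updated URLs and compares two snapshots of a sitemap to find entries that are new or whose <lastmod> changed.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def lastmod_by_url(xml_text):
    """Map each <loc> to its <lastmod> string (None when the tag is absent)."""
    root = ET.fromstring(xml_text)
    result = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        if loc:  # skip entries without an address
            result[loc.strip()] = url.findtext("sm:lastmod", namespaces=NS)
    return result


def most_recent(xml_text, limit=5):
    """Return up to `limit` URLs that carry a <lastmod>, newest first.

    Assumes all dates use the same ISO format (for example YYYY-MM-DD),
    so that sorting the strings also sorts the dates.
    """
    dated = [(lastmod, loc) for loc, lastmod in lastmod_by_url(xml_text).items() if lastmod]
    dated.sort(reverse=True)
    return [loc for _, loc in dated[:limit]]


def changed_urls(old_xml, new_xml):
    """Return URLs that are new, or whose <lastmod> differs between snapshots."""
    old = lastmod_by_url(old_xml)
    new = lastmod_by_url(new_xml)
    return sorted(loc for loc, lastmod in new.items()
                  if loc not in old or old[loc] != lastmod)
```

A scheduled job could keep the previous copy of the sitemap, call changed_urls against the current one, and pass any non-empty result on to whatever sends the notification.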

To test this, I asked the agent: “How many blogs did Olaf write on Business Central?”

The answer:

Based on the available sitemap data, there is a category specifically for Business Central on Olaf’s Power Bites website, indicated by the URL https://olafspowerbites.com/category/business-central/

However, the exact number of blog posts within this category is not specified in the provided information. The sitemap confirms the existence of the Business Central category but does not enumerate the individual blog posts under it.

And the URL gives:

The agent points to the correct sitemap URL, where the answer can be found.

The conclusion

By using a sitemap as the backbone of your agent’s knowledge, you create a system that is not only smart but also highly manageable, scalable, and auditable, all within the robust framework of Copilot Studio.
