From Local to Global Sensemaking: First Impressions of Microsoft GraphRAG (MS GraphRAG)


TL;DR GraphRAG replaces vector search with a lightweight knowledge-graph index and a map-reduce summarization step. The result: LLMs can tackle global questions such as “What themes span this entire corpus?” while remaining fast and token-efficient. In head-to-head tests against GPT-4-powered vector RAG, GraphRAG won 72-83% of comparisons on answer comprehensiveness and 62-82% on diversity, while using up to 97% fewer context tokens for some query modes.

Figure 1: An LLM-generated knowledge graph built using GPT-4 Turbo and Microsoft GraphRAG

What is Naive (Vector) RAG?

Retrieval Augmented Generation (RAG) refers to a system where a user query is used to retrieve the most relevant information from external data sources. This retrieved information is then provided to an LLM, which uses it to generate an answer to the user query. Essentially, Naive RAG follows a two-stage approach: Retrieval and Generation.

Figure 2: Traditional RAG workflow showing retrieval and generation stages

The Retrieval Stage consists of two phases:

  • Indexing Phase: Documents are split into chunks, converted into vector embeddings using an embedding model, and stored in a vector index. This creates a searchable knowledge base where semantically similar content clusters together in the vector space.
  • Search Phase: When a user asks a question, the query is embedded using the same model, and the system performs approximate nearest neighbor search to find the most semantically similar document chunks. These are then ranked by similarity score to produce the top-k results.

Generation Stage: The retrieved top-k results are combined with the user query in a prompt template and fed to the language model to generate the final answer.
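
To make these stages concrete, here is a minimal sketch of the whole naive RAG loop. The `embed` and `llm` functions are hypothetical placeholders for a real embedding model and chat API, and retrieval is a brute-force cosine similarity scan rather than a proper approximate nearest neighbor index:

```python
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    # Placeholder: swap in a real embedding model. Random vectors are only
    # here so the sketch runs end to end.
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), 384))

def llm(prompt: str) -> str:
    # Placeholder: swap in a real chat-completion API call.
    return "<LLM answer goes here>"

def build_index(chunks: list[str]) -> np.ndarray:
    """Indexing phase: embed each chunk once, normalize for cosine similarity."""
    vecs = embed(chunks)
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def retrieve(query: str, chunks: list[str], index: np.ndarray, k: int = 5) -> list[str]:
    """Search phase: embed the query and rank chunks by cosine similarity."""
    q = embed([query])[0]
    scores = index @ (q / np.linalg.norm(q))
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

def answer(query: str, chunks: list[str], index: np.ndarray) -> str:
    """Generation stage: top-k chunks plus the query go into a prompt template."""
    context = "\n\n".join(retrieve(query, chunks, index))
    return llm(f"Context:\n{context}\n\nAnswer the question: {query}")
```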

This approach works well for local questions where the answer can be found in a few relevant chunks. However, it struggles with global reasoning tasks that require understanding connections and themes across the entire corpus.

Why Does Classic Naive RAG Break Down?

Naive RAG excels when answers exist in a handful of chunks that fit the model’s context window. Global “sensemaking” questions, however, require reasoning across the entire corpus rather than just the top-k nearest neighbours: the answer depends on connections among entities (people, places, organizations) and the relations between them, which are scattered across many documents. A query like “What overarching themes emerge across the entire dataset?” makes this clear; no small set of retrieved chunks contains the answer.

To address this limitation, Microsoft proposed and implemented GraphRAG, which has become one of the most popular approaches. In this article, we will dive into how GraphRAG works, explore its key innovations, and see why it is proving so effective for global reasoning tasks.

How Does MS GraphRAG Work?

GraphRAG is a graph-based extension of Naive RAG designed for global sensemaking over large text corpora. It uses an LLM to first build a knowledge graph of entities and their relationships, then clusters related entities into hierarchical communities. Each community is summarized by the LLM, creating a structured overview of the corpus. At query time, GraphRAG applies a map-reduce strategy: community summaries are used to generate partial answers in parallel, which are then merged into a final, coherent response.
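
The query-time map-reduce step can be sketched in a few lines, reusing the placeholder `llm` from the earlier snippet. (The real system runs the map calls in parallel and scores partial answers for relevance rather than string-matching a keyword, so treat this as an illustration of the shape of the computation:)

```python
def map_reduce_answer(query: str, community_summaries: list[str]) -> str:
    # Map: each community summary independently yields a partial answer.
    partials = [
        llm(f"Community summary:\n{s}\n\n"
            f"Answer from this summary only, or reply IRRELEVANT:\n{query}")
        for s in community_summaries
    ]
    useful = [p for p in partials if "IRRELEVANT" not in p]
    # Reduce: merge the partial answers into one coherent global response.
    return llm("Combine these partial answers into a single final answer.\n\n"
               + "\n\n".join(useful) + f"\n\nQuestion: {query}")
```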

The first part of this workflow builds the knowledge graph in three stages:

1. Chunking the Corpus

In this step, the raw documents are split into smaller text chunks. Chunk size is a key design choice. Larger chunks reduce the number of LLM calls but may miss fine-grained information, while smaller chunks improve recall but come at a higher processing cost.
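
A minimal chunker might look like the following. It approximates token counts with whitespace-separated words; a production pipeline would measure chunk size with the LLM’s actual tokenizer:

```python
def chunk_text(text: str, chunk_size: int = 600, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_size words."""
    words = text.split()
    chunks, start = [], 0
    while start < len(words):
        end = start + chunk_size
        chunks.append(" ".join(words[start:end]))
        if end >= len(words):
            break
        start = end - overlap  # overlap preserves context across chunk boundaries
    return chunks
```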

2. Extracting Entities, Relationships, and Claims

The LLM processes each chunk to extract important entities, their relationships, and relevant claims. For instance, from a sentence about a tech acquisition, the model might extract entities like NeoChip and Quantum Systems, the relationship acquired, and the claim that the acquisition happened in 2016.
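
The extraction step is prompt-driven. Below is a simplified sketch with a made-up delimiter format and the placeholder `llm` from earlier; the actual GraphRAG prompts are longer, few-shot, and include follow-up “gleaning” rounds to catch missed entities:

```python
EXTRACTION_PROMPT = """Extract entities, relationships, and claims from the text.
Format each finding on its own line:
  ENTITY<|>name<|>type<|>description
  RELATION<|>source<|>target<|>description
  CLAIM<|>subject<|>statement

Text:
{chunk}
"""

def extract(chunk: str) -> dict[str, list[list[str]]]:
    out = {"ENTITY": [], "RELATION": [], "CLAIM": []}
    for line in llm(EXTRACTION_PROMPT.format(chunk=chunk)).splitlines():
        parts = [p.strip() for p in line.split("<|>")]
        if parts[0] in out:
            out[parts[0]].append(parts[1:])
    return out

# e.g. extract("NeoChip acquired Quantum Systems in 2016.") might yield:
# {"ENTITY": [["NeoChip", "ORG", "..."], ["Quantum Systems", "ORG", "..."]],
#  "RELATION": [["NeoChip", "Quantum Systems", "acquired"]],
#  "CLAIM": [["NeoChip", "acquisition completed in 2016"]]}
```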

3. Constructing the Knowledge Graph

The extracted entities and relationships are aggregated across the entire corpus to build a knowledge graph. Entities become nodes, relationships become edges, and repeated mentions strengthen edge weights. Descriptions are summarized and duplicates are reconciled, resulting in a structured and interconnected representation of the dataset (documents).
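
The aggregation is easy to sketch with networkx: nodes collect entity descriptions, and each repeated co-mention increments an edge weight. (Reconciling duplicate entities and summarizing the collected descriptions are themselves LLM steps in the real pipeline, omitted here.)

```python
import networkx as nx

def build_graph(extractions: list[dict]) -> nx.Graph:
    G = nx.Graph()
    for ex in extractions:  # one extraction result per chunk
        for name, etype, desc in ex["ENTITY"]:
            if not G.has_node(name):
                G.add_node(name, type=etype, descriptions=[])
            G.nodes[name]["descriptions"].append(desc)  # later summarized by the LLM
        for src, dst, desc in ex["RELATION"]:
            if G.has_edge(src, dst):
                G[src][dst]["weight"] += 1  # repeated mentions strengthen the edge
            else:
                G.add_edge(src, dst, weight=1)
    return G
```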

4. Community Detection: Structuring the Graph

Once the knowledge graph is constructed, GraphRAG applies a hierarchical community detection algorithm, specifically Leiden clustering, to group related entities into tightly connected subgraphs. This step reveals the internal structure of the corpus by identifying clusters of entities that are semantically or contextually related.

The process is recursive: it first finds broad, high-level clusters (root communities), then drills down to identify finer-grained sub-communities within them. This hierarchical grouping is key for enabling divide-and-conquer summarization, where each community is summarized independently before being merged into global insights.
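
MS GraphRAG uses a hierarchical Leiden implementation (graspologic’s `hierarchical_leiden`); a flat, single-level sketch with the `igraph` and `leidenalg` packages, using a toy weighted edge list in place of the graph built above, looks like this:

```python
import igraph as ig
import leidenalg as la

# Toy weighted edge list; in practice these come from the knowledge graph.
edges = [("NeoChip", "Quantum Systems", 3), ("NeoChip", "FabCo", 1),
         ("Quantum Systems", "FabCo", 2), ("Alice", "Bob", 4), ("Bob", "Carol", 2)]

g = ig.Graph.TupleList(edges, weights=True)  # vertices created from names
partition = la.find_partition(g, la.ModularityVertexPartition, weights="weight")

for cluster_id, members in enumerate(partition):
    print(cluster_id, [g.vs[v]["name"] for v in members])
```

Recursively re-running the partitioning inside each community (which the hierarchical variant automates) yields the multi-level structure shown in the visualization.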

In the visualization below, you can see this structure in action:

Figure 3: Hierarchical community structure detected by Leiden clustering

Left (a): Root-level communities (Level 0) show the most general groupings across the corpus.

Right (b): Sub-communities (Level 1) reveal a more detailed breakdown within each root cluster.

5. Summarizing Communities