[Bug]: graphrag index creates 637 AsyncAzureOpenAI on gutenberg QuickStart #1517

Open · 3 tasks done
mmaitre314 opened this issue Dec 15, 2024 · 3 comments
Labels: awaiting_response, bug, stale, triage

mmaitre314 (Contributor)

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

Describe the bug

GraphRAG v1.0.0 repeatedly calls fnllm.openai.create_openai_client() during indexing instead of reusing a single OpenAI client. Since fnllm creates a new DefaultAzureCredential for each create_openai_client() call (code), every call restarts the authentication flow and adds to indexing runtime.
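For illustration, a minimal sketch of the reuse pattern this report asks for, written directly against the azure-identity and openai packages (the helper name build_shared_client is hypothetical, not a GraphRAG or fnllm API): the credential and token provider are created once and shared by every client, so the Entra flow runs a single time.

# Hypothetical sketch of the expected reuse pattern, not GraphRAG/fnllm code.
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AsyncAzureOpenAI

# One credential + token provider for the whole indexing run.
_credential = DefaultAzureCredential()
_token_provider = get_bearer_token_provider(
    _credential, "https://cognitiveservices.azure.com/.default"
)

def build_shared_client(endpoint: str, api_version: str) -> AsyncAzureOpenAI:
    """Build a client that reuses the shared Entra token provider."""
    return AsyncAzureOpenAI(
        azure_endpoint=endpoint,
        api_version=api_version,
        azure_ad_token_provider=_token_provider,
    )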

Steps to reproduce

Follow the GraphRAG quickstart (https://microsoft.github.io/graphrag/get_started/) up to the graphrag index --root ./ragtest step, using Entra authentication instead of an API key. Open indexing-engine.log and observe repeated log lines like this:

azure.identity._credentials.managed_identity INFO ManagedIdentityCredential will use IMDS
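To quantify the repetition (the 637 clients mentioned in the title), these log lines can simply be counted; a minimal sketch, assuming the log location shown below and the exact message text above:

# Count how many times the credential was re-created during indexing.
from pathlib import Path

log_path = Path("ragtest/logs/indexing-engine.log")  # assumed location, adjust to your layout
needle = "ManagedIdentityCredential will use IMDS"

count = sum(needle in line for line in log_path.read_text(encoding="utf-8").splitlines())
print(f"{needle!r} appeared {count} times")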

Expected Behavior

The OpenAI client is reused and only one Entra access token is acquired for authentication.

GraphRAG Config Used

encoding_model: o200k_base

llm:
  type: azure_openai_chat
  model: gpt-4o-mini
  model_supports_json: true
  api_base: https://<snip>.openai.azure.com
  api_version: 2024-08-01-preview
  deployment_name: gpt-4o-mini

parallelization:
  stagger: 0.3
  # num_threads: 50

async_mode: threaded # or asyncio

embeddings:
  async_mode: threaded # or asyncio
  vector_store:
    type: lancedb
    db_uri: 'output\lancedb'
    container_name: default
    overwrite: true
  llm:
    type: azure_openai_embedding
    model: text-embedding-3-small
    api_base: https://<snip>.openai.azure.com
    api_version: "2023-05-15"
    deployment_name: text-embedding-3-small

### Input settings ###

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "../../inputs/gutenberg"
  file_encoding: utf-8
  file_pattern: ".*\\.txt$"

chunks:
  size: 1200
  overlap: 100
  group_by_columns: [id]

### Storage settings ###
## If blob storage is specified in the following four sections,
## connection_string and container_name must be provided

cache:
  type: file # or blob
  base_dir: "cache"

reporting:
  type: file # or console, blob
  base_dir: "logs"

storage:
  type: file # or blob
  base_dir: "output"

## only turn this on if running `graphrag index` with custom settings
## we normally use `graphrag update` with the defaults
update_index_storage:
  # type: file # or blob
  # base_dir: "update_output"

### Workflow settings ###

skip_workflows: []

entity_extraction:
  prompt: "../../prompts/default/entity_extraction.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 1

summarize_descriptions:
  prompt: "../../prompts/default/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  enabled: false
  prompt: "../../prompts/default/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 1

community_reports:
  prompt: "../../prompts/default/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes

snapshots:
  graphml: false
  embeddings: false
  transient: false

### Query settings ###
## The prompt locations are required here, but each search method has a number of optional knobs that can be tuned.
## See the config docs: https://microsoft.github.io/graphrag/config/yaml/#query

local_search:
  prompt: "../../prompts/default/local_search_system_prompt.txt"

global_search:
  map_prompt: "../../prompts/default/global_search_map_system_prompt.txt"
  reduce_prompt: "../../prompts/default/global_search_reduce_system_prompt.txt"
  knowledge_prompt: "../../prompts/default/global_search_knowledge_system_prompt.txt"

drift_search:
  prompt: "../../prompts/default/drift_search_system_prompt.txt"

Logs and screenshots

No response

Additional Information

  • GraphRAG Version: 1.0.0
  • Operating System: Windows 11
  • Python Version: 3.12
  • Related Issues:
mmaitre314 added the bug and triage labels on Dec 15, 2024
mmaitre314 (Contributor, Author) commented Dec 15, 2024

I found a workaround by overriding the LLM loaders in graphrag.index.llm.load_llm.loaders to ensure the OpenAI clients get reused across queries:

import os
from pathlib import Path
from typing import Any
from fnllm import JsonStrategy, LLMEvents
from fnllm.openai import (
    AzureOpenAIConfig,
    create_openai_chat_llm,
    create_openai_client,
    create_openai_embeddings_llm,
)
from fnllm.openai.types.chat.parameters import OpenAIChatParameters
from graphrag.cli.index import index_cli
from graphrag.config.enums import LLMType
from graphrag.logger.types import LoggerType
from graphrag.index.llm.load_llm import loaders
from graphrag.index.typing import ErrorHandlerFn
import graphrag.config.defaults as defs
import tiktoken

def main():

    _initialize_llm_loader(
        LLMType.AzureOpenAIChat,
        model="gpt-4o-mini",
        model_supports_json=True,
        api_base="https://<snip>.openai.azure.com",
        api_version="2024-08-01-preview",
        deployment_name="gpt-4o-mini",
    )

    _initialize_llm_loader(
        LLMType.AzureOpenAIEmbedding,
        model="text-embedding-3-small",
        model_supports_json=False,
        api_base="https://<snip>.openai.azure.com",
        api_version="2023-05-15",
        deployment_name="text-embedding-3-small",
    )

    index_cli(
        root_dir=Path(os.path.dirname(__file__)),
        verbose=True,
        resume=None,
        memprofile=False,
        cache=True,
        logger=LoggerType.PRINT,
        config_filepath=None,
        dry_run=False,
        skip_validation=False,
        output_dir=None,
    )

def _initialize_llm_loader(
        type: LLMType,
        model: str,
        model_supports_json: bool,
        api_base: str,
        api_version: str,
        deployment_name: str,
        ) -> None:
    
    openai_config=AzureOpenAIConfig(
        model=model,
        encoding=tiktoken.encoding_name_for_model(model),
        deployment=deployment_name,
        endpoint=api_base,
        json_strategy=JsonStrategy.VALID if model_supports_json else JsonStrategy.LOOSE,
        api_version=api_version,
        max_retries=defs.LLM_MAX_RETRIES,
        max_retry_wait=defs.LLM_MAX_RETRY_WAIT,
        requests_per_minute=defs.LLM_REQUESTS_PER_MINUTE,
        tokens_per_minute=defs.LLM_TOKENS_PER_MINUTE,
        timeout=defs.LLM_REQUEST_TIMEOUT,
        max_concurrency=defs.LLM_CONCURRENT_REQUESTS,
        chat_parameters=OpenAIChatParameters(
            frequency_penalty=defs.LLM_FREQUENCY_PENALTY,
            presence_penalty=defs.LLM_PRESENCE_PENALTY,
            top_p=defs.LLM_TOP_P,
            max_tokens=defs.LLM_MAX_TOKENS,
            n=defs.LLM_N,
            temperature=defs.LLM_TEMPERATURE,
        ),
    )

    openai_client = create_openai_client(openai_config)

    if type == LLMType.AzureOpenAIChat:
        loaders[type]["load"] = lambda on_error, cache, _: create_openai_chat_llm(
            openai_config,
            client=openai_client,
            cache=cache,
            events=GraphRagLLMEvents(on_error),
        )
    elif type == LLMType.AzureOpenAIEmbedding:
        loaders[type]["load"] = lambda on_error, cache, _: create_openai_embeddings_llm(
            openai_config,
            client=openai_client,
            cache=cache,
            events=GraphRagLLMEvents(on_error),
        )
    else:
        raise ValueError(f"Unsupported LLM type: {type}")

class GraphRagLLMEvents(LLMEvents):
    def __init__(self, on_error: ErrorHandlerFn):
        self._on_error = on_error

    async def on_error(
        self,
        error: BaseException | None,
        traceback: str | None = None,
        arguments: dict[str, Any] | None = None,
    ) -> None:
        self._on_error(error, traceback, arguments)

if __name__ == "__main__":
    main()
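The key point of the workaround is that create_openai_client() now runs only once per LLM type; the lambdas registered in loaders close over that shared openai_client, so every subsequent load returns an LLM bound to the already-authenticated client instead of constructing a new one.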

natoverse (Collaborator) commented

We believe this was a bug introduced during our adoption of fnllm as the underlying LLM library. We just pushed out a 1.0.1 patch today; please let us know if your problem still exists with that version.
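If helpful, the installed version can be confirmed before retrying with a standard-library call (nothing GraphRAG-specific assumed here):

# Check which GraphRAG version is installed; expect 1.0.1 or later after upgrading.
from importlib.metadata import version

print(version("graphrag"))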

natoverse added the awaiting_response label on Dec 18, 2024
github-actions (bot) commented

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

github-actions bot added the stale label on Dec 26, 2024