Deploying GraphRAG + ollama locally in an offline environment

清幽茗香 · Posted on 2024-9-3 15:25:09

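The GraphRAG paper was published by Microsoft: From Local to Global: A Graph RAG Approach to Query-Focused Summarization. It can be read at https://arxiv.org/pdf/2404.16130

1. Deploying ollama

Prepare the conda environment:

    # Running the installer with sh may fail with a "[[ not found" error; in that case run it with bash
    sh Miniconda3-....sh
    # Check that the installation succeeded
    ../miniconda3/bin/conda --version
    # Activate the base environment
    ../miniconda3/bin/conda init bash
    source ~/.bashrc

Install the ollama Python package:

    pip install ollama

Download ollama's install script:

    https://ollama.com/install.sh

After downloading, modify it as follows:

    status "Downloading ollama..."
    # Comment out this line
    # curl --fail --show-error --location --progress-bar -o $TEMP_DIR/ollama "https://ollama.com/download/ollama-linux-${ARCH}${VER_PARAM}"
    status "Installing ollama to $BINDIR..."
    $SUDO install -o0 -g0 -m755 -d $BINDIR
    # $SUDO install -o0 -g0 -m755 $TEMP_DIR/ollama $BINDIR/ollama
    # Point the install at the local package instead
    $SUDO install -o0 -g0 -m755 ./ollama-linux-amd64 $BINDIR/ollama

Download the release package from https://github.com/ollama/ollama/releases/ and place it in the same directory as install.sh.

Run the installer:

    ./install.sh

When installed directly on the host, ollama starts by default. If it has not started, start the ollama service manually; once it is running, http://127.0.0.1:11434 can be opened in a browser.

    ollama serve

2. Downloading models

Visit https://www.ollama.com/library to see the models ollama provides. GraphRAG needs one LLM and one embedding model. Models are stored under …/.ollama/models.

    ollama pull mistral            # llm
    ollama pull nomic-embed-text   # embedding
    ollama list

ollama uses model files in GGUF format. If pull cannot connect from the intranet, install ollama on a machine with internet access, download the models there, and copy the models folder into the intranet.

The models can be called through the ollama API as follows:

    # llm
    curl http://localhost:11434/api/chat -d '{
      "model": "mistral",
      "messages": [
        {"role": "user", "content": "why is the sky blue?"}
      ],
      "stream": false
    }'

    # embedding
    curl http://localhost:11434/api/embeddings -d '{
      "model": "nomic-embed-text",
      "prompt": "The sky is blue because of Rayleigh scattering"
    }'

3. Running GraphRAG

Install the graphrag package:

    pip install graphrag

Create an input directory under some working directory and put one or more txt files into it. The official example, pg24022.txt, is Dickens's novel A Christmas Carol; it can be downloaded as follows.

    curl https://www.gutenberg.org/cache/epub/24022/pg24022.txt

Initialize the workspace. This generates a set of empty folders and files: cache, output, settings.yaml, and so on.

    python -m graphrag.index --init --root ./graphrag

Set GRAPHRAG_API_KEY in the .env file to ollama, then edit the generated settings.yaml:

    llm:
      model: mistral
      # Note that this uses v1. The ollama API test above called /api/chat,
      # but graphrag calls /v1/chat/completions.
      api_base: http://localhost:11434/v1
    embeddings:
      llm:
        model: nomic-embed-text
        api_base: http://localhost:11434/api

After editing, build the graph:

    python -m graphrag.index --root ./graphrag

The indexing workflow follows the implementation approach described in Microsoft's paper. During execution, GraphRAG performs these steps:

Source Documents → Text Chunks: split the source documents into text chunks.
Text Chunks → Element Instances: extract graph node and edge instances from each text chunk.
Element Instances → Element Summaries: generate a summary for each graph element.
Element Summaries → Graph Communities: partition the graph into communities with a community detection algorithm.
Graph Communities → Community Summaries: generate a summary for each community.
Community Summaries → Community Answers → Global Answer: use the community summaries to generate partial answers, then aggregate those partial answers into a global answer.

Indexing raised an error. The fix was to edit graphrag\llm\openai\openai_embeddings_llm.py; note that model must be changed to the embedding model you actually use.

    # Requires "import ollama" at the top of the file
    embedding_list = []
    for inp in input:
        embedding = ollama.embeddings(model="nomic-embed-text", prompt=inp)
        embedding_list.append(embedding["embedding"])
    return embedding_list

Indexing raised another error. graphrag depends on tiktoken, which automatically downloads the cl100k_base encoding when a network connection is available; in an offline environment this has to be changed.

    Exception type: Exception value: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by NameResolutionError(": Failed to resolve 'openaipublic.blob.core.windows.net' ([Errno -3] Temporary failure in name resolution)"))

First, confirm the blobpath from the error message: https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken

Download the cl100k_base.tiktoken file manually and, following the way cache_key is computed in the read_file_cached() function in the tiktoken source, rename the file to: 9b5ad71b2ce5302211f9c61530b329a4922fc6a4

Where graphrag calls tiktoken, specify the location of the cl100k_base.tiktoken file:

    import os
    os.environ["TIKTOKEN_CACHE_DIR"] = "/mnt/temp/graphrag"

That approach failed. Modifying the read_file() function directly solved the problem, and the error no longer appeared.

    # Relies on hashlib and os being imported in the module
    def read_file():
        blobpath = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"
        cache_key = hashlib.sha1(blobpath.encode()).hexdigest()
        cache_dir = "/mnt/temp/graphrag"
        cache_path = os.path.join(cache_dir, cache_key)
        # Read the renamed local copy instead of downloading it
        with open(cache_path, "rb") as f:
            data = f.read()
        return data

When the run finishes, the results are saved in the output directory as a set of result files. The graph itself is stored in parquet and graphml files.

    output/{current_datetime}/artifacts/

create_final_entities.parquet: the extracted entities
create_final_nodes.parquet: information about each kind of node in the knowledge graph
create_final_relationships.parquet: the relationship data
create_final_community_reports.parquet: descriptions of the communities produced by the community detection algorithm
create_final_text_units.parquet: the text chunks the source files were split into

Before running a global query, install one more dependency.

    pip install langchain_community

    # --root must point to the workspace that was indexed (./graphrag in the steps above)

    # Global query
    # Ask about the themes of the story
    # "Global" means the question is asked of the whole book
    # In global mode the data comes from create_final_nodes.parquet,
    # create_final_entities.parquet and create_final_community_reports.parquet
    python -m graphrag.query --root ./ragtest --method global "What are the top themes in this story?"

    # Local query
    # Ask about specific details
    # Local search identifies a set of entities in the knowledge graph that are
    # semantically related to the user input. These entities serve as entry points
    # into the graph, from which further details can be extracted: connected
    # entities, relationships, entity covariates, and community reports. It also
    # extracts relevant text chunks from the raw input documents associated with
    # the identified entities. The candidate data sources are then prioritized and
    # filtered to fit a single context window of predefined size, which is used to
    # generate the response to the user query.
    # In local mode the data comes from the sources used by global mode plus
    # create_final_text_units.parquet and create_final_relationships.parquet
    python -m graphrag.query --root ./ragtest --method local "Who is Scrooge, and what are his main relationships?"

Calling graphrag.query raised an OSError. Tracing it showed that the query path uses lance.write_dataset() to write vectors to a local directory, while the installation and initialization above were done under /mnt, a directory mounted into the container with no write permission. The whole workspace was therefore moved to a personal directory under /home.

The global query failed with: json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Edit …/graphrag/query/structured_search/global_search/search.py:

    # Locate the failing function: _map_response_single_batch
    # The input format of the ollama LLM API differs from the format graphrag
    # uses for its default LLM, so the messages need to be changed.
    # The ollama documentation gives the input format of the
    # /v1/chat/completions endpoint for the mistral model.
    # search_messages = [
    #     {"role": "system", "content": search_prompt},
    #     {"role": "user", "content": query},
    # ]
    search_messages = [
        {"role": "user", "content": search_prompt + "\n\n### USER QUESTION ###\n\n" + query}
    ]

After this change the global query ran successfully.

The local query failed with: ZeroDivisionError: Weights sum to zero, can't be normalized

Edit …/graphrag\query\llm\oai\embedding.py:

    # Adjust to your own setup; details differ between ollama versions
    from langchain_community.embeddings.ollama import OllamaEmbeddings

    # Locate:  for attempt in retryer:
    #              with attempt:
    # and change the embedding computation to:
    embedding = (
        OllamaEmbeddings(model=self.model).embed_query(text) or []
    )

    # Locate:  async for attempt in retryer:
    #              with attempt:
    # and change the embedding computation to the following
    # (aembed_query is the awaitable variant; embed_query returns a plain list and cannot be awaited):
    embedding = (
        await OllamaEmbeddings(model=self.model).aembed_query(text) or []
    )

After this change the local query ran successfully.

4. Discussion

GraphRAG's core selling point is that it addresses, to a certain extent, the Query-Focused Summarization (QFS) task. It proposes a global Graph RAG approach that combines knowledge graph generation, retrieval-augmented generation (RAG), and query-focused summarization to support human sensemaking over an entire text corpus. In the project, entity extraction is done entirely by the LLM, and no constraints are placed on the schema of the extracted triples.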
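For the offline tiktoken step in section 3, the renaming of cl100k_base.tiktoken can be scripted rather than done by hand. The sketch below is only an illustration: the helper names tiktoken_cache_key and stage_encoding are hypothetical (they are not part of tiktoken or graphrag), and it assumes the cache key is the SHA-1 hex digest of the blob URL, which is what the read_file() patch in the article computes.

```python
import hashlib
import os
import shutil
import tempfile

# URL taken from the error message quoted in the article.
BLOBPATH = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"


def tiktoken_cache_key(blobpath: str) -> str:
    """Hypothetical helper: SHA-1 hex digest of the blob URL,
    mirroring the cache_key line in the article's read_file() patch."""
    return hashlib.sha1(blobpath.encode()).hexdigest()


def stage_encoding(downloaded_file: str, cache_dir: str, blobpath: str = BLOBPATH) -> str:
    """Hypothetical helper: copy a manually downloaded encoding file
    into cache_dir under its cache-key name and return the new path."""
    os.makedirs(cache_dir, exist_ok=True)
    target = os.path.join(cache_dir, tiktoken_cache_key(blobpath))
    shutil.copyfile(downloaded_file, target)
    return target


if __name__ == "__main__":
    # Demo with a throwaway placeholder file; in practice pass the real
    # cl100k_base.tiktoken and the cache directory you intend to use.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "cl100k_base.tiktoken")
        with open(src, "wb") as f:
            f.write(b"placeholder")
        print(stage_encoding(src, os.path.join(tmp, "cache")))
```

Calling tiktoken_cache_key(BLOBPATH) on the machine where the file was downloaded is a quick way to check the 40-character file name reported above before copying the file into the intranet.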

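The global-search fix in section 3 replaces a system message plus a user message with a single user message. A minimal sketch of the same idea as a reusable function is shown here; fold_system_prompt is a hypothetical name, not a graphrag or ollama API, and the separator string is assumed to match the one used in the article's patch.

```python
from typing import Dict, List

# Separator assumed to match the article's patch to search.py.
SEPARATOR = "\n\n### USER QUESTION ###\n\n"


def fold_system_prompt(
    messages: List[Dict[str, str]], separator: str = SEPARATOR
) -> List[Dict[str, str]]:
    """Hypothetical helper: prepend all system messages to the first
    user message so that the request contains no 'system' role."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    # Copy the remaining messages so the caller's list is not mutated.
    others = [dict(m) for m in messages if m["role"] != "system"]
    if not system_parts:
        return others
    prefix = "\n\n".join(system_parts)
    for m in others:
        if m["role"] == "user":
            m["content"] = prefix + separator + m["content"]
            return others
    # No user message present: send the system text as a user message.
    return [{"role": "user", "content": prefix}] + others


if __name__ == "__main__":
    original = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What are the top themes in this story?"},
    ]
    print(fold_system_prompt(original))
```

Whether folding the prompt this way is needed depends on the model and ollama version in use; the article only reports that it resolved the JSONDecodeError with mistral.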