Source: reddit.com · Developer community · RSS

What (and how) are you using for free, local web-search and web-fetching with LLM agents?


I am relatively new to self-hosted agentic LLMs and want to find out which popular, high-quality tools I can connect to a self-hosted agent so that it can search the web on demand or read links I provide (web-fetching).

I've heard about the ability to self-host SearXNG, but I have a few questions:

* How do I provide access to it for the agent? Should I write/download an MCP (Model Context Protocol) server for it, use some Skill with scripts, or use a custom script and put it in a harness like Pi.dev?
* How should I deal with extracting useful data from HTML? Should I use something like microsoft/markitdown to feed only the useful text to the LLM as a result?
* How do I handle bot detection? I know some websites (especially those protected by Cloudflare) reject "robots" visiting their sites, meaning I might need to use headless browsers to simulate human behavior. But how do I deal with CAPTCHAs from search engines, Cloudflare, or Google?
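To make the first two questions concrete, here is a minimal sketch of what a "custom script" tool could look like. It assumes a SearXNG instance at `http://localhost:8080` (a hypothetical address) with the `json` output format enabled in its settings. The function names `web_search` and `html_to_text` are made up for illustration. The HTML-to-text step uses Python's standard-library parser as a simple stand-in, not microsoft/markitdown, and it does nothing about bot detection or CAPTCHAs.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a local SearXNG instance with JSON output enabled.
SEARXNG_URL = "http://localhost:8080"


def build_search_url(query: str, base_url: str = SEARXNG_URL) -> str:
    """Build a SearXNG search URL that asks for JSON results."""
    params = urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def parse_results(payload: dict, limit: int = 5) -> list:
    """Keep only the fields an LLM usually needs from each result."""
    return [
        {
            "title": r.get("title", ""),
            "url": r.get("url", ""),
            "snippet": r.get("content", ""),
        }
        for r in payload.get("results", [])[:limit]
    ]


def web_search(query: str, limit: int = 5) -> list:
    """Query the (assumed) local SearXNG instance; needs network access."""
    with urlopen(build_search_url(query), timeout=10) as resp:
        return parse_results(json.load(resp), limit)


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style-like elements."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    """Reduce an HTML document to newline-separated visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Either function could be exposed to the agent as a tool, whether through an MCP server or a plain script in the harness; the sketch only covers the request and the cleanup of the response, which is the part that stays the same across those options.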

I recently came across an advertised project that bundles solutions for these problems: Johell1NS/browser-search. Has anyone tried it?

If you know of any other tool setups or approaches to handle web-search and web-fetch locally (preferably via docker-compose), I would be glad to hear them.
