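Generate an llm.txt file (for LLM crawlers)
Generate an llm.txt file that helps AI/LLM crawlers understand and index your site's content. It is similar to robots.txt but designed for AI, and covers content licensing, a site description, and pointers to important pages.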
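/plugin marketplace add huifer/claude-code-seo
/plugin install huifer-claude-seo-assistant@huifer/claude-code-seo
Generates an llm.txt file that helps AI/LLM crawlers (such as GPTBot, ClaudeBot, and Perplexity) understand and index your site's content. llm.txt is similar to robots.txt, but it is a protocol designed specifically for LLM crawlers.
--verbose: generate more detailed descriptions (optional)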
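What is llm.txt?
llm.txt is an emerging convention (not yet a formal standard), similar to robots.txt but designed specifically for AI/LLM crawlers.
Why do you need llm.txt?
```bash
/llm-txt
```
Output:
# llm.txt file
The llm.txt file generated for your Next.js project: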
```txt
# LLM Crawler Directives
# Allow LLM crawlers to index this site
User-agent: *
Allow: /
# Site Information
Name: YourBrand
Description: [Site description, 1-2 sentences]
Website: https://yourdomain.com
Language: zh-CN, en-US
# Content Scope
Topics: [List of main topics]
Content-Type: blog, documentation, tutorials
# Important Pages
Homepage: https://yourdomain.com
About: https://yourdomain.com/about
Blog: https://yourdomain.com/blog
# Attribution Requirements
Require-Attribution: true
Attribution-URL: https://yourdomain.com
# License
License: https://yourdomain.com/license
Content-Use: educational, informational
# Contact
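Contact-Email: contact@yourdomain.com
```
Place this file in the public/ folder at the project root:
Path: public/llm.txt
Copy the content above into public/llm.txt.
Once the file is created, visit https://yourdomain.com/llm.txt to verify that it is accessible.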
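Name (site name):
SF Plumbing Services or Tech Blog
Description (site description):
Professional plumbing repair and maintenance services covering the San Francisco Bay Area, including emergency repairs, preventive maintenance, and pipe installation.
Topics:
plumbing, emergency services, home maintenance, DIY tutorials
Require-Attribution:
- true - AI must provide a link when citing the content
- false - not required
Content-Use:
- educational, informational
- all (allow all uses)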
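The basic template can also be assembled from a small config object instead of being copied by hand. The TypeScript sketch below shows one way to do that. `SiteInfo` and `renderLlmTxt` are hypothetical names introduced for illustration (they are not part of the plugin), and the output covers only a subset of the template's fields.

```typescript
// Hypothetical config shape; the field names mirror the basic template.
interface SiteInfo {
  name: string;
  description: string;
  website: string;
  languages: string[];
  topics: string[];
  requireAttribution: boolean;
  contentUse: string[];
  contactEmail: string;
}

// Render a subset of the basic llm.txt layout as plain text.
// This is an illustrative sketch, not the plugin's own generator.
function renderLlmTxt(site: SiteInfo): string {
  const lines: string[] = [
    "# LLM Crawler Directives",
    "User-agent: *",
    "Allow: /",
    "# Site Information",
    `Name: ${site.name}`,
    `Description: ${site.description}`,
    `Website: ${site.website}`,
    `Language: ${site.languages.join(", ")}`,
    "# Content Scope",
    `Topics: ${site.topics.join(", ")}`,
    "# Attribution Requirements",
    `Require-Attribution: ${site.requireAttribution}`,
    `Attribution-URL: ${site.website}`,
    "# License",
    `Content-Use: ${site.contentUse.join(", ")}`,
    "# Contact",
    `Contact-Email: ${site.contactEmail}`,
  ];
  // End with a newline so the file is a well-formed text file.
  return lines.join("\n") + "\n";
}
```

The returned string can be written to public/llm.txt as described above. Keeping the values in one object makes it harder for the attribution URL and the site URL to drift apart.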
### Example 2: Verbose mode
```bash
/llm-txt --verbose
```
Output:
# llm.txt file (verbose)
```txt
# LLM Crawler Directives
# Allow LLM crawlers to index this site
User-agent: *
Allow: /
# Disallow admin and private areas
Disallow: /admin/
Disallow: /api/
Disallow: /private/
# Site Information
Name: YourBrand
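Description: YourBrand is an authoritative site focused on [topic]. We provide [service type] to help [target audience] [solve a problem]. Our content includes [content types] and aims to [goal].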
Website: https://yourdomain.com
Language: zh-CN, en-US
Founded: 2020
# Content Scope
Topics: [topic 1], [topic 2], [topic 3], [topic 4]
Content-Type: blog posts, tutorials, guides, case studies
Target-Audience: [audience description]
Update-Frequency: Weekly
# Important Pages
Homepage: https://yourdomain.com
About: https://yourdomain.com/about
Blog: https://yourdomain.com/blog
Services: https://yourdomain.com/services
Contact: https://yourdomain.com/contact
# Featured Content
Featured-Article-1: https://yourdomain.com/blog/[article-1] - [description]
Featured-Article-2: https://yourdomain.com/blog/[article-2] - [description]
Featured-Guide: https://yourdomain.com/guides/[guide] - [description]
# Content Guidelines
Content-Style: Professional, educational, practical
Tone: Informative, helpful, expert
Accuracy: All content is fact-checked and regularly updated
# Attribution Requirements
Require-Attribution: true
Attribution-URL: https://yourdomain.com
Attribution-Text: "Source: YourBrand (https://yourdomain.com)"
# License
License: https://yourdomain.com/license
Content-License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Content-Use: educational, informational, with attribution
Commercial-Use: Contact for permission
# Contact
Contact-Email: contact@yourdomain.com
Contact-Form: https://yourdomain.com/contact
Social-Media: https://twitter.com/yourbrand, https://linkedin.com/company/yourbrand
# Additional Information
Last-Updated: 2024-01-15
API-Documentation: https://yourdomain.com/api-docs
RSS-Feed: https://yourdomain.com/feed.xml
```
Description (detailed description):
SF Plumbing Services is a professional plumbing repair company in the San Francisco Bay Area. We provide 24/7 emergency plumbing service, preventive maintenance, and pipe installation. Our team consists of licensed professional plumbers with more than 15 years of service experience.
Language (languages):
zh-CN, en-US, es-ES
Founded (year founded):
Target-Audience (target audience):
Homeowners, property managers, DIY enthusiasts
Update-Frequency (update frequency):
Daily, Weekly, Monthly
Content-Style (content style):
Professional, educational, practical
List the content you want AI to pay particular attention to:
```txt
Featured-Article-1: https://yourdomain.com/blog/seo-guide - Complete SEO guide for beginners
Featured-Article-2: https://yourdomain.com/blog/ai-trends - AI trends in 2024
Featured-Guide: https://yourdomain.com/guides/plumbing-diy - DIY plumbing maintenance guide
```
Accuracy (accuracy):
All content is fact-checked and regularly updated
Tone (tone):
Informative, helpful, expert-friendly
Content-License (content license):
- CC BY 4.0 - Creative Commons Attribution
- CC BY-NC 4.0 - Non-commercial only
- CC BY-SA 4.0 - ShareAlike
- All Rights Reserved - all rights reserved
Commercial-Use (commercial use):
- Allowed - commercial use is allowed
- Contact for permission - permission must be requested first
- Not allowed - commercial use is not allowed
API-Documentation (API documentation):
RSS-Feed (RSS feed):
Last-Updated (last updated):
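Fields with a fixed set of options, such as Update-Frequency and Commercial-Use, can be checked mechanically. The sketch below parses the file into fields and flags values outside the listed options. `parseFields` and `checkEnumFields` are hypothetical helpers, and the `Key: value` line format with `#` comments is inferred from the examples in this document rather than taken from a specification.

```typescript
// Parse "Key: value" lines into a map of key -> list of values.
// Assumption: "#" starts a comment line and repeated keys (e.g. Disallow)
// are collected in order; this is inferred from the examples, not a spec.
function parseFields(text: string): Record<string, string[]> {
  const fields: Record<string, string[]> = {};
  text.split("\n").forEach((raw) => {
    const line = raw.trim();
    if (line === "" || line.charAt(0) === "#") return;
    const idx = line.indexOf(":"); // first colon only, so URLs stay intact
    if (idx <= 0) return; // not a "Key: value" line
    const key = line.slice(0, idx).trim();
    const value = line.slice(idx + 1).trim();
    const list = fields[key] || [];
    list.push(value);
    fields[key] = list;
  });
  return fields;
}

// Options as listed in the field reference.
const ALLOWED_VALUES: Record<string, string[]> = {
  "Update-Frequency": ["Daily", "Weekly", "Monthly"],
  "Commercial-Use": ["Allowed", "Contact for permission", "Not allowed"],
};

// Return one message per value that is not among the listed options.
function checkEnumFields(fields: Record<string, string[]>): string[] {
  const problems: string[] = [];
  Object.keys(ALLOWED_VALUES).forEach((key) => {
    const allowed = ALLOWED_VALUES[key] || [];
    (fields[key] || []).forEach((value) => {
      if (allowed.indexOf(value) === -1) {
        problems.push(`${key}: unexpected value "${value}"`);
      }
    });
  });
  return problems;
}
```

Treat the results as hints rather than errors: a free-form value such as `Commercial-Use: Not allowed without permission` is reported even though its intent is clear.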
### Example 3: Blog site
```bash
/llm-txt --verbose
```
Output (for a blog):
# llm.txt for Blog
```txt
# LLM Crawler Directives
# Allow LLM crawlers
User-agent: *
Allow: /
# Site Information
Name: Tech Insights Blog
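Description: Tech Insights is a technical blog focused on programming, web development, and artificial intelligence. We publish practical tutorials, in-depth technical analysis, and coverage of the latest industry trends. Our goal is to provide valuable learning resources for the developer community.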
Website: https://techinsights.com
Language: en-US, zh-CN
Founded: 2020
Author: [Your Name]
# Content Scope
Topics: programming, web development, JavaScript, React, Next.js, AI, machine learning, tutorials, best practices
Content-Type: blog posts, tutorials, code examples, case studies
Target-Audience: Software developers, web developers, students
Update-Frequency: Weekly
# Important Pages
Homepage: https://techinsights.com
About: https://techinsights.com/about
Blog: https://techinsights.com/blog
Archive: https://techinsights.com/archive
# Featured Content
Featured-Post-1: https://techinsights.com/blog/nextjs-SEO - Complete guide to Next.js SEO
Featured-Post-2: https://techinsights.com/blog/react-hooks - React Hooks in depth
Featured-Tutorial: https://techinsights.com/tutorials/graphql - Introduction to GraphQL
# Content Guidelines
Content-Style: Technical, educational, practical
Code-Quality: All code examples are tested and maintained
Tone: Professional, approachable, beginner-friendly
# Attribution Requirements
Require-Attribution: true
Attribution-URL: https://techinsights.com
Attribution-Text: "Source: Tech Insights Blog (https://techinsights.com)"
# License
License: https://techinsights.com/license
Content-License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Content-Use: educational, informational, with attribution
Commercial-Use: Contact for permission
# Contact
Contact-Email: hello@techinsights.com
Contact-Form: https://techinsights.com/contact
Social-Media: https://twitter.com/techinsights, https://github.com/techinsights
# Additional Information
Last-Updated: 2024-01-15
RSS-Feed: https://techinsights.com/feed.xml
Newsletter: https://techinsights.com/subscribe
```
### Example 4: Business service site
```bash
/llm-txt
```
Output (for a business site):
# llm.txt for Business Website
```txt
# LLM Crawler Directives
# Allow LLM crawlers
User-agent: *
Allow: /
# Disallow private areas
Disallow: /admin/
Disallow: /dashboard/
Disallow: /api/
# Site Information
Name: SF Plumbing Services
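Description: SF Plumbing Services is a professional plumbing repair company in the San Francisco Bay Area. We provide 24/7 emergency plumbing service, preventive maintenance, and pipe installation. Our team consists of licensed professional plumbers with more than 15 years of service experience.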
Website: https://sfplumbing.com
Language: en-US
Founded: 2008
# Content Scope
Topics: plumbing, emergency services, home maintenance, pipe repair, drain cleaning, water heaters
Content-Type: service pages, guides, FAQ, blog posts
Target-Audience: Homeowners, property managers, businesses
# Important Pages
Homepage: https://sfplumbing.com
Services: https://sfplumbing.com/services
Emergency: https://sfplumbing.com/emergency
About: https://sfplumbing.com/about
Blog: https://sfplumbing.com/blog
Contact: https://sfplumbing.com/contact
# Featured Content
Service-Guide: https://sfplumbing.com/guide/emergency-plumbing - Guide to handling emergency plumbing problems
FAQ: https://sfplumbing.com/faq - Frequently asked questions
# Content Guidelines
Content-Style: Professional, service-oriented
Accuracy: All information is verified by licensed plumbers
Tone: Helpful, trustworthy, expert
# Attribution Requirements
Require-Attribution: true
Attribution-URL: https://sfplumbing.com
Attribution-Text: "Source: SF Plumbing Services (https://sfplumbing.com)"
# License
License: https://sfplumbing.com/terms
Content-License: All Rights Reserved
Content-Use: informational, with attribution
Commercial-Use: Not allowed without permission
# Contact
Contact-Email: info@sfplumbing.com
Phone: +1-415-555-0123
Contact-Form: https://sfplumbing.com/contact
Social-Media: https://twitter.com/sfplumbing, https://facebook.com/sfplumbing
# Additional Information
Last-Updated: 2024-01-15
Service-Areas: San Francisco, Oakland, San Jose, Peninsula
Hours: 24/7 Emergency Service
```
Service-Areas (service areas):
Hours (business hours):
Phone (phone):
### Example 5: Disallow LLM indexing
```bash
/llm-txt
```
Output (disallowing LLMs):
# llm.txt for Restrictive Policy
```txt
# LLM Crawler Directives
# Disallow all LLM crawlers
User-agent: *
Disallow: /
# Site Information
Name: YourBrand
Description: [description]
Website: https://yourdomain.com
# Content Policy
Content-Use: Not allowed for LLM training or indexing
License: All Rights Reserved
Contact: legal@yourdomain.com for inquiries
# Alternative: Allow with attribution
# User-agent: *
# Allow: /
# Require-Attribution: true
# Content-License: CC BY 4.0
```
Disallow entirely:
Allow without granting a license:
Allow conditionally:
## Best practices
### 1. State the content license explicitly
```txt
# Recommended
Content-License: CC BY 4.0
Content-Use: educational, informational
Require-Attribution: true
# Avoid
Content-License: None
Content-Use: Unknown
```
```txt
Last-Updated: 2024-01-15
Update-Frequency: Weekly

Contact-Email: contact@yourdomain.com
Contact-Form: https://yourdomain.com/contact

Featured-Article-1: https://yourdomain.com/article-1
Featured-Article-2: https://yourdomain.com/article-2
Featured-Guide: https://yourdomain.com/guide

Content-Style: Professional, educational
Accuracy: All content is fact-checked
Tone: Informative, helpful
```
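These practices (an explicit license, a recent Last-Updated date, and at least one contact field) lend themselves to an automated check. The sketch below is one possible linter. `lintBestPractices` is a hypothetical helper, and the 180-day staleness threshold is an arbitrary assumption, not a value taken from this document.

```typescript
// Flat view of an llm.txt file: one value per field name.
type Fields = { [key: string]: string | undefined };

// Assumption: anything older than this is reported as stale.
const MAX_AGE_DAYS = 180;

// Return a warning for each best practice that the fields do not meet.
function lintBestPractices(fields: Fields, today: Date): string[] {
  const warnings: string[] = [];

  const license = fields["Content-License"];
  if (!license || license === "None") {
    warnings.push("Content-License is missing or set to None");
  }

  const use = fields["Content-Use"];
  if (!use || use === "Unknown") {
    warnings.push("Content-Use is missing or set to Unknown");
  }

  if (!fields["Contact-Email"] && !fields["Contact-Form"]) {
    warnings.push("No contact field (Contact-Email or Contact-Form)");
  }

  const updated = fields["Last-Updated"];
  const updatedMs = updated ? Date.parse(updated) : NaN;
  if (isNaN(updatedMs)) {
    warnings.push("Last-Updated is missing or not a valid date");
  } else {
    const ageDays = (today.getTime() - updatedMs) / (24 * 60 * 60 * 1000);
    if (ageDays > MAX_AGE_DAYS) {
      warnings.push(`Last-Updated is more than ${MAX_AGE_DAYS} days old`);
    }
  }
  return warnings;
}
```

Passing `today` as a parameter rather than calling `new Date()` inside the function keeps the check deterministic, which makes it easier to run in CI.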
Place llm.txt in the public/ directory:
```txt
public/
  llm.txt
  robots.txt
  sitemap.xml
```
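app/llm.txt/route.ts:
```typescript
import { NextResponse } from 'next/server'

// Route handler that serves the file at /llm.txt
export async function GET() {
  // The template body starts at column 0 so no indentation leaks into the file
  const llmTxt = `
# LLM Crawler Directives
User-agent: *
Allow: /
Name: YourBrand
Description: Your description
Website: https://yourdomain.com
...
`.trim()

  return new NextResponse(llmTxt, {
    headers: {
      'Content-Type': 'text/plain',
      // Allow caches to keep the file for one day
      'Cache-Control': 'public, max-age=86400',
    },
  })
}
```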
```bash
# Development
curl http://localhost:3000/llm.txt
# Production
curl https://yourdomain.com/llm.txt
```
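The same check can be scripted so that it runs after each deploy. The sketch below validates the pieces of a response that matter here: status, content type, and a non-empty body. `FetchedFile` and `checkLlmTxtResponse` are hypothetical names. The function is kept free of network calls so that it can be tested in isolation.

```typescript
// The parts of an HTTP response that the check needs.
interface FetchedFile {
  status: number;
  contentType: string | null;
  body: string;
}

// Return a list of problems; an empty list means the file looks fine.
function checkLlmTxtResponse(res: FetchedFile): string[] {
  const problems: string[] = [];
  if (res.status !== 200) {
    problems.push(`expected HTTP 200, got ${res.status}`);
  }
  // Accept parameters such as "text/plain; charset=utf-8".
  const mediaType = (res.contentType || "").toLowerCase();
  if (mediaType.indexOf("text/plain") !== 0) {
    problems.push(`expected text/plain, got "${res.contentType}"`);
  }
  if (res.body.trim() === "") {
    problems.push("response body is empty");
  }
  return problems;
}
```

With the global `fetch` available in Node 18 and later, the inputs can be taken from `response.status`, `response.headers.get("content-type")`, and `await response.text()`.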
Visit https://yourdomain.com/llm.txt and confirm that llm.txt is served at /llm.txt with the text/plain content type.
A: Use both files together for the best control.
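A: It is not required, but creating one is recommended. Without an llm.txt:
A: Use the following configuration:
```txt
User-agent: *
Disallow: /
```
A: llm.txt is not itself a legal document, but it:
For legal protection, it is recommended to:
A: A growing number of AI companies are adding support:
Support should broaden as the convention gains adoption.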
A: It depends on your goals:
Open sharing (recommended for blogs):
```txt
Content-License: CC BY 4.0
Require-Attribution: true
```
Restricted use:
```txt
Content-License: All Rights Reserved
Content-Use: Contact for permission
```
Business sites:
```txt
Content-License: All Rights Reserved
Content-Use: Informational only
Commercial-Use: Not allowed
```
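The three options can be kept as presets so that a generator emits a consistent license section for each goal. The sketch below shows one way to do that. `SiteGoal`, `LICENSE_PRESETS`, and `licenseBlock` are hypothetical names, and the preset values are copied from the three snippets in this answer.

```typescript
// The three goals described in this answer.
type SiteGoal = "open" | "restricted" | "business";

// License lines per goal, copied from the snippets in this answer.
const LICENSE_PRESETS: Record<SiteGoal, string[]> = {
  open: ["Content-License: CC BY 4.0", "Require-Attribution: true"],
  restricted: [
    "Content-License: All Rights Reserved",
    "Content-Use: Contact for permission",
  ],
  business: [
    "Content-License: All Rights Reserved",
    "Content-Use: Informational only",
    "Commercial-Use: Not allowed",
  ],
};

// Build the "# License" section for the chosen goal.
function licenseBlock(goal: SiteGoal): string {
  return ["# License"].concat(LICENSE_PRESETS[goal]).join("\n");
}
```

Because `goal` is a union type, a misspelled preset name fails at compile time instead of silently producing an empty license section.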
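llm.txt works alongside other standards:
robots.txt - search engine crawler control
```txt
User-agent: *
Allow: /
```
llm.txt - AI/LLM crawler control
```txt
User-agent: *
Allow: /
```
sitemap.xml - site structure
```xml
<url><loc>https://yourdomain.com/</loc></url>
```
ads.txt - ad seller verification
```txt
google.com, pub-XXXXXXXXXXXXXXXX, DIRECT, f08c47fec0942fa0
```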
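Because llm.txt is meant to complement robots.txt, the same allow or disallow decision can be mirrored in robots.txt for named AI crawlers. The sketch below generates those rules. `robotsRulesForAiBots` is a hypothetical helper, and the user-agent list (GPTBot, ClaudeBot, PerplexityBot) is an assumption that should be checked against each vendor's current crawler documentation.

```typescript
// Assumption: these tokens match the crawlers you want to address;
// verify them against each vendor's published documentation.
const AI_USER_AGENTS: string[] = ["GPTBot", "ClaudeBot", "PerplexityBot"];

// Build one robots.txt group per AI crawler.
// allow = false blocks the whole site; otherwise only the given paths are blocked.
function robotsRulesForAiBots(allow: boolean, disallowedPaths: string[]): string {
  const groups = AI_USER_AGENTS.map((agent) => {
    const lines: string[] = [`User-agent: ${agent}`];
    if (!allow) {
      lines.push("Disallow: /");
    } else {
      lines.push("Allow: /");
      disallowedPaths.forEach((path) => {
        lines.push(`Disallow: ${path}`);
      });
    }
    return lines.join("\n");
  });
  // Groups are separated by a blank line.
  return groups.join("\n\n") + "\n";
}
```

For example, `robotsRulesForAiBots(true, ["/admin/", "/api/"])` mirrors the private-area rules used in the verbose llm.txt example.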
- /robots-txt - generate a robots.txt file
- /seo-check - verify that the file is accessible
- /seo-audit - check the overall configuration