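SEO Automation: Batch Content Generation from Keywords
Built on Qwen2.5-7B plus keyword intent analysis, this pipeline batch-generates SEO content for independent cross-border e-commerce stores. It produces 500 SEO articles per week and increased Google organic traffic by 300%.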
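Project Background
Independent-store sellers hit a content production bottleneck: product pages need SEO copy and the blog needs a steady stream of relevant articles, but manual writing is slow and expensive. An 800-word SEO article takes 2-3 hours by hand once writing, keyword placement, and the meta description are included, so 20 articles a week is about the ceiling.
This system builds an SEO content generation pipeline on Qwen2.5-7B: given a keyword list, it automatically produces the title, the H1-H3 structure, the body, and the meta description, along with XML sitemap update instructions. Output is 500 articles per week, and Google organic traffic rose by 300% within 6 months.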
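The per-article output described above can be pictured as a small JSON record that is checked before publishing. The sketch below is only an illustration: it assumes the field names (`title`, `h2`, `h3`, `body`, `meta`) and the "under 155 characters" meta limit used by the generation prompt in this write-up, and `validate_article` is a hypothetical helper, not part of the original system.

```python
import json

# Field names assumed from the generation prompt's output format.
REQUIRED_FIELDS = ("title", "h2", "h3", "body", "meta")
MAX_META_CHARS = 155  # the prompt asks for a meta description under 155 chars


def validate_article(raw: str) -> list[str]:
    """Return a list of problems found in one generated article (empty = OK)."""
    try:
        article = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(article, dict):
        return ["top-level JSON value is not an object"]

    problems = []
    for field in REQUIRED_FIELDS:
        # Treat missing keys, empty strings, and empty lists the same way.
        if not article.get(field):
            problems.append(f"missing or empty field: {field}")

    meta = article.get("meta")
    if isinstance(meta, str) and len(meta) >= MAX_META_CHARS:
        problems.append(
            f"meta description is {len(meta)} chars (limit: under {MAX_META_CHARS})"
        )
    return problems
```

An article that returns an empty list can move on to the quality filter; anything else is a candidate for regeneration.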
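Technical Architecture
The pipeline has four stages: keyword clustering (group similar keywords by search volume and competition to avoid duplicate content) → intent classification (informational / commercial / transactional) → content generation (a structured prompt controls article length and heading tags) → quality filtering (keyword density check plus AI self-scoring).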
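The four stages above could be wired together roughly as follows. This is a minimal sketch, not the project's actual orchestration code: `run_pipeline` and its parameter names are hypothetical, every stage is passed in as a callable, and using the first keyword of each cluster as the article's primary keyword is an assumption.

```python
from typing import Callable, Iterable


def run_pipeline(
    keywords: Iterable[str],
    cluster: Callable[[list[str]], dict[int, list[str]]],
    classify: Callable[[str], str],
    generate: Callable[[str, str], str],
    passes_quality: Callable[[str, str], bool],
) -> list[dict]:
    """Run the four stages in order; one article is produced per keyword cluster."""
    results = []
    # Stage 1: group keywords so that each group maps to a single article.
    for cluster_id, group in cluster(list(keywords)).items():
        primary = group[0]  # assumption: the first keyword leads the group
        intent = classify(primary)            # stage 2: intent classification
        article = generate(primary, intent)   # stage 3: content generation
        if passes_quality(article, primary):  # stage 4: quality filtering
            results.append(
                {
                    "cluster": cluster_id,
                    "keyword": primary,
                    "intent": intent,
                    "article": article,
                }
            )
    return results
```

Passing the stages in as callables keeps the skeleton testable with stubs and leaves room to swap the model or the clustering method without touching the loop.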
import os
from collections import defaultdict

import openai

# Assumes Qwen2.5-7B is served behind an OpenAI-compatible endpoint
# (set OPENAI_BASE_URL or pass base_url=... to point the client at it).
# The key is read from the environment rather than an undefined global.
openai_client = openai.OpenAI(api_key=os.environ["OPENAI_API_KEY"])

INTENT_PROMPT = """Classify the search intent of: {keyword}
Options: informational, commercial, transactional.
Return only one word."""

CONTENT_PROMPT = """Write a {word_count}-word SEO article for keyword: "{keyword}"
Requirements:
- Include H2 and H3 headings naturally
- Primary keyword density: {density}%
- Include 3 related LSI keywords naturally
- End with a call-to-action
- Meta description: under 155 chars
Output JSON: {{"title": "...", "h2": [...], "h3": [...], "body": "...", "meta": "..."}}"""


def classify_intent(keyword: str) -> str:
    """Ask the model for a one-word intent label and normalize it."""
    resp = openai_client.chat.completions.create(
        model='qwen2.5-7b',
        messages=[{'role': 'user', 'content': INTENT_PROMPT.format(keyword=keyword)}],
    )
    return resp.choices[0].message.content.strip().lower()


def generate_seo_article(keyword: str, word_count: int = 800, density: float = 1.8) -> str:
    """Return the generated article as a JSON string."""
    resp = openai_client.chat.completions.create(
        model='qwen2.5-7b',
        messages=[{'role': 'user', 'content': CONTENT_PROMPT.format(
            keyword=keyword, word_count=word_count, density=density
        )}],
        response_format={'type': 'json_object'},
    )
    return resp.choices[0].message.content


keywords = ['best running shoes for flat feet', 'marathon training plan beginner', 'hydration pack for trail running']
for kw in keywords:
    intent = classify_intent(kw)
    print(f"Keyword: {kw} -> Intent: {intent}")
    article = generate_seo_article(kw)
    print(f"Generated article: {article[:100]}...")
Core Code
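Keyword clustering is the key to avoiding internal competition: keywords from the same product line must be spread across different articles so that no two articles compete for the same search term. Agglomerative clustering groups keywords whose search volumes differ by less than 20% and whose competition scores differ by less than 30%, and each group is assigned one main article. Note that the sample code approximates this rule with a fixed number of clusters rather than explicit thresholds.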
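To make the 20% / 30% rule concrete, the thresholds can be applied directly. The sketch below is a greedy stand-in, not agglomerative clustering and not the project's code: `rel_diff` and `group_by_thresholds` are hypothetical helpers, and measuring each difference relative to the larger of the two values is an assumption, since the text does not define it.

```python
def rel_diff(a: float, b: float) -> float:
    """Relative difference, measured against the larger of the two values."""
    largest = max(a, b)
    return abs(a - b) / largest if largest else 0.0


def group_by_thresholds(
    keywords,
    volumes,
    competition,
    max_volume_diff=0.20,       # "search volume differs by < 20%"
    max_competition_diff=0.30,  # "competition differs by < 30%"
):
    """Greedy grouping: a keyword joins the first group in which it is
    within both thresholds of every existing member; otherwise it starts
    a new group."""
    # Visit keywords from highest to lowest search volume.
    rows = sorted(zip(keywords, volumes, competition), key=lambda r: -r[1])
    groups = []  # each group is a list of (keyword, volume, competition)
    for row in rows:
        for group in groups:
            if all(
                rel_diff(row[1], member[1]) < max_volume_diff
                and rel_diff(row[2], member[2]) < max_competition_diff
                for member in group
            ):
                group.append(row)
                break
        else:
            groups.append([row])
    return [[kw for kw, _, _ in group] for group in groups]
```

With the four example keywords used in this section (volumes 45000, 38000, 12000, 8000 and competition 8.5, 7.2, 3.1, 2.8), this groups 'running shoes men' with 'running shoes women' and leaves 'trail running shoes' and 'marathon shoes' in groups of their own.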
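from collections import defaultdict

import numpy as np
from sklearn.cluster import AgglomerativeClustering


def cluster_keywords(keywords: list, volumes: list, competition: list):
    # Convert to arrays so element-wise division works
    # (dividing a plain list by a number raises TypeError).
    volumes = np.asarray(volumes, dtype=float)
    competition = np.asarray(competition, dtype=float)
    # Two normalized features: log-scaled search volume and competition.
    features = np.column_stack([
        np.log(volumes) / np.log(volumes.max()),
        competition / competition.max(),
    ])
    # Roughly one cluster per 5 keywords, at least 3,
    # but never more clusters than keywords.
    n_clusters = min(max(len(keywords) // 5, 3), len(keywords))
    model = AgglomerativeClustering(n_clusters=n_clusters)
    labels = model.fit_predict(features)
    clusters = defaultdict(list)
    for kw, label in zip(keywords, labels):
        clusters[int(label)].append(kw)
    return clusters


keywords = ['running shoes men', 'running shoes women', 'trail running shoes', 'marathon shoes']
volumes = [45000, 38000, 12000, 8000]
competition = [8.5, 7.2, 3.1, 2.8]
clusters = cluster_keywords(keywords, volumes, competition)
for cid, kws in clusters.items():
    print(f"Cluster {cid}: {kws}")
Key Metrics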
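· Generation speed: an 800-word article takes 8 s on average, with a sustained output of 500 articles per week
· Keyword density pass rate: target primary keyword density of 1.5-2.5%, met by 94% of articles
· Google organic traffic: from 1,200 UV/day to 4,800 UV/day over 6 months
· Human review pass rate: in manual spot checks of generated articles, 96% were grammatically correct and 4% needed edits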
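The keyword density check behind the 1.5-2.5% target could look like the sketch below. It assumes density means exact phrase occurrences per 100 words, which is one common definition; the write-up does not say which one the system uses. `keyword_density` and `density_ok` are hypothetical names, and the tokenizer only handles ASCII English text.

```python
import re


def _tokens(text: str) -> list[str]:
    """Lowercase word tokens (ASCII letters, digits, apostrophes only)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_density(body: str, keyword: str) -> float:
    """Exact occurrences of the keyword phrase per 100 words of body text."""
    words = _tokens(body)
    phrase = _tokens(keyword)
    if not words or not phrase:
        return 0.0
    n = len(phrase)
    # Slide a window of the phrase length across the body tokens.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase)
    return 100.0 * hits / len(words)


def density_ok(body: str, keyword: str, low: float = 1.5, high: float = 2.5) -> bool:
    """True when the density falls inside the target range (inclusive)."""
    return low <= keyword_density(body, keyword) <= high
```

Under this definition an 800-word article would need the exact phrase 12 to 20 times to land in the 1.5-2.5% range.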
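AI self-scoring: Qwen2.5-7B scores the articles it generates on keyword coverage, structural completeness, and readability; a score below 70 automatically triggers regeneration. In testing, self-scores and human scores correlated at R² = 0.71, which is good enough for a first-pass filter.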
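The regenerate-below-70 rule can be sketched as a small retry loop. This is an illustration under assumptions: `generate_until_pass` is a hypothetical helper, the generator and scorer are passed in as callables, and the cap of 3 attempts is invented here because the write-up does not state one.

```python
from typing import Callable, Optional, Tuple


def generate_until_pass(
    generate: Callable[[], str],
    score: Callable[[str], float],
    threshold: float = 70.0,
    max_attempts: int = 3,  # assumption: the source does not state a retry cap
) -> Tuple[Optional[str], float, int]:
    """Regenerate while the self-score is below `threshold`.

    Returns (article, score, attempts). The article is None when no
    attempt reached the threshold; the score is then the best one seen.
    """
    best_score = float("-inf")
    for attempt in range(1, max_attempts + 1):
        article = generate()
        current = score(article)
        best_score = max(best_score, current)
        if current >= threshold:
            return article, current, attempt
    return None, best_score, max_attempts
```

Capping the attempts keeps a keyword that never scores well from looping indefinitely; such keywords can be routed to human review instead.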