ChatGLM is a series of open-source bilingual dialogue language models (Chinese and English) based on the General Language Model (GLM) architecture.
ChatGLM-6B
- Parameters: 6.2 billion
- Features:
- Supports Chinese and English bilingual Q&A.
- Uses model quantization, requiring as little as 6GB of GPU memory at the INT4 quantization level (a loading sketch follows this list).
- Trained on approximately 1T Chinese and English tokens, supplemented by supervised fine-tuning, feedback bootstrapping, and reinforcement learning from human feedback.
- Provides smooth dialogue with low deployment barriers.
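The low deployment barrier can be illustrated with a short loading sketch. It follows the Hugging Face transformers usage published in the THUDM/chatglm-6b repository; the chat and quantize methods come from that repo's custom modeling code loaded via trust_remote_code=True, and exact method names or the recommended call order may vary between released versions.

```python
# Minimal sketch: load ChatGLM-6B with INT4 weight quantization so inference
# fits in roughly 6 GB of GPU memory, then run a single-turn chat.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = (
    AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    .half()          # FP16 weights
    .quantize(4)     # INT4 quantization of the linear layers (repo-provided method)
    .cuda()
    .eval()
)

# chat() returns the reply plus the updated dialogue history
response, history = model.chat(tokenizer, "What is ChatGLM?", history=[])
print(response)
```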
ChatGLM2-6B
- Parameters: 6.2 billion
- Features:
- Developed based on the experience from the first-generation model, with comprehensive base model upgrades.
- Trained on 1.4T Chinese and English tokens using a hybrid objective function from GLM, aligned with human preferences.
- Significant performance improvements across multiple datasets (e.g., MMLU, CEval, GSM8K, BBH).
- Extended context length to 32K, allowing more rounds of dialogue (see the multi-turn sketch after this list).
- Inference speed increased by 42%, and at INT4 quantization, the context length supported by 6GB of GPU memory increased from 1K to 8K.
- Model weights are fully open for academic research and allow free commercial use.
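As a rough illustration of the longer context in multi-turn use, the sketch below carries the history returned by each call into the next turn. It reuses the same repository-provided chat interface as above, so the same version caveats apply; the queries are only examples.

```python
# Multi-turn sketch with ChatGLM2-6B: the growing `history` list is how earlier
# turns stay inside the (now much longer) context window.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).quantize(4).cuda().eval()

history = []
for query in [
    "Summarize the GLM pretraining objective in two sentences.",
    "Now contrast it with a purely autoregressive objective.",
]:
    response, history = model.chat(tokenizer, query, history=history)
    print(response)
```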
ChatGLM3-6B
- Parameters: 6.2 billion
- Features:
- Jointly released by Zhipu AI and the Knowledge Engineering Group (KEG) at Tsinghua University.
- Retains the conversational fluency and low deployment threshold of the previous two generations.
- Introduces new features and improvements, further enhancing dialogue performance and user experience.
GLM-4 Series
- Models Include: GLM-4, GLM-4-Air, GLM-4-9B, etc.
- Features:
- Pretrained on 10 trillion tokens, primarily bilingual (Chinese and English), with small-scale corpora in 24 other languages.
- Achieves high-quality alignment through multi-stage post-training, including supervised fine-tuning and human feedback learning.
- Performs exceptionally well across multiple benchmarks, such as MMLU, GSM8K, MATH, BBH, GPQA, and HumanEval, approaching or surpassing GPT-4.
- The GLM-4 All Tools model is further aligned to understand user intent and autonomously decide which tools to use (e.g., web browsers, Python interpreters, text-to-image models) to complete complex tasks, as sketched below.
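The "decide which tool to use" behaviour can be pictured as a loop in which the model emits a structured plan and the surrounding program executes the chosen tool and feeds the result back. The sketch below is purely illustrative: the call_glm4 stub, the tool names, and the JSON reply format are hypothetical placeholders, not the actual GLM-4 All Tools protocol or API.

```python
# Hypothetical tool-routing loop (not the real GLM-4 All Tools protocol).
import json

def call_glm4(messages):
    # Placeholder for a real model call; returns a canned plan so the sketch runs end to end.
    return json.dumps({"tool": "python", "input": "2 ** 10"})

def run_python(code):
    # Toy "Python interpreter" tool; a real system would sandbox execution.
    return str(eval(code))

def browse(url):
    return f"(contents of {url})"

TOOLS = {"python": run_python, "browser": browse}

def all_tools_step(messages):
    plan = json.loads(call_glm4(messages))           # e.g. {"tool": "python", "input": "2 ** 10"}
    tool = TOOLS.get(plan.get("tool"))
    if tool is None:
        return messages + [{"role": "assistant", "content": plan.get("answer", "")}]
    observation = tool(plan["input"])
    # Feed the tool result back so the model can continue reasoning on the next call.
    return messages + [{"role": "observation", "content": observation}]

print(all_tools_step([{"role": "user", "content": "What is 2 to the 10th power?"}]))
```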
Applications
Smart Customer Service
ChatGLM can be used in enterprise intelligent customer service systems to improve efficiency and reduce labor costs while enhancing customer satisfaction. Specific applications include:
- Customer Inquiries: Responds in real-time to common customer questions, providing product information and usage guidance.
- Complaint Handling: Automatically records and categorizes complaints, providing initial solutions or transferring to human customer service.
Education
In the education sector, ChatGLM can be used to create interactive learning tools, answer student questions, and offer study suggestions and resource recommendations. Specific applications include:
- Online Tutoring: Provides personalized tutoring to students, answering academic questions.
- Educational Content Generation: Automatically generates teaching materials, exercises, and answer explanations.
Healthcare
In healthcare, ChatGLM can assist doctors and patients with initial health consultations and diagnostic advice, improving the efficiency and quality of medical services. Specific applications include:
- Health Consultations: Answers common health questions from patients and provides prevention and wellness suggestions.
- Condition Tracking: Helps doctors record and analyze patient condition changes, providing personalized treatment suggestions.
Finance
In the financial sector, ChatGLM can be applied to customer service, risk assessment, and market analysis, enhancing the intelligence level of financial services. Specific applications include:
- Investment Advice: Provides real-time market analysis and investment suggestions.
- Risk Management: Automatically identifies and evaluates potential financial risks, offering warnings and countermeasures.
Content Generation
ChatGLM can generate various types of text content, such as news reports, blog posts, and product descriptions, helping content creators improve efficiency. Specific applications include:
- News Generation: Automatically generates news reports based on input keywords or events.
- Marketing Copy: Creates engaging product descriptions and advertisement copy.
Language Translation
ChatGLM supports bilingual Chinese-English processing and can be used for language translation and cross-language communication, helping users overcome language barriers. Specific applications include:
- Real-Time Translation: Provides instant text translation, primarily between Chinese and English.
- Cross-Language Dialogue: Facilitates real-time conversations across different languages.
Programming Assistance
ChatGLM can also assist in programming by helping developers generate code, debug programs, and solve coding problems. Specific applications include:
- Code Generation: Generates code snippets from natural-language descriptions (see the prompting sketch after this list).
- Error Troubleshooting: Analyzes code errors and provides suggestions for fixes.
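Most of the application areas above boil down to prompting a deployed ChatGLM instance. The sketch below shows that pattern for translation and code generation, reusing the repository-provided chat interface from earlier; the model name and prompts are only examples.

```python
# Prompt-driven usage sketch: the same chat call covers translation, code
# generation, and similar tasks; only the prompt changes.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True).quantize(4).cuda().eval()

prompts = [
    "Translate into English: 今天的会议改到下午三点。",
    "Write a Python function that checks whether a string is a palindrome, "
    "and explain its time complexity in one sentence.",
]
for prompt in prompts:
    response, _ = model.chat(tokenizer, prompt, history=[])
    print(response, "\n")
```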
Open-Source Versions
ChatGLM-6B
- Parameters: 6.2 billion
- Features:
- Bilingual Support: ChatGLM-6B was trained on 1T tokens from Chinese and English corpora at a 1:1 ratio, enabling bilingual capabilities.
- Low Deployment Barriers: ChatGLM-6B can run on just 6GB of GPU memory using INT4 quantization, making it easy for individual users and small businesses to deploy.
- Efficient Fine-Tuning: Provides an efficient parameter fine-tuning method based on P-Tuning v2, allowing developers to customize the model for specific use cases (a checkpoint-loading sketch follows this list).
- Open Source License: The model weights are fully open for academic research and allow free commercial use, promoting wide technological adoption and development.
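With P-Tuning v2, the ChatGLM-6B repository trains a small prefix encoder while the base model stays frozen, and the resulting checkpoint is loaded back on top of the base weights. The sketch below roughly follows the loading example in that repo's ptuning directory; CHECKPOINT_PATH is a placeholder, pre_seq_len must match the value used during training, and exact names may be version-dependent.

```python
# Sketch: load a P-Tuning v2 prefix checkpoint on top of the frozen ChatGLM-6B base.
import os
import torch
from transformers import AutoConfig, AutoModel, AutoTokenizer

CHECKPOINT_PATH = "output/my-ptuning-checkpoint"   # placeholder path to the trained prefix weights
PRE_SEQ_LEN = 128                                  # must match the value used during P-Tuning training

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
config = AutoConfig.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True, pre_seq_len=PRE_SEQ_LEN)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", config=config, trust_remote_code=True)

# Keep only the prefix-encoder weights from the fine-tuned checkpoint.
prefix_state_dict = torch.load(os.path.join(CHECKPOINT_PATH, "pytorch_model.bin"))
prefix_only = {
    k[len("transformer.prefix_encoder."):]: v
    for k, v in prefix_state_dict.items()
    if k.startswith("transformer.prefix_encoder.")
}
model.transformer.prefix_encoder.load_state_dict(prefix_only)

model = model.half().cuda()
model.transformer.prefix_encoder.float()   # prefix encoder is typically kept in FP32
model = model.eval()
```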
ChatGLM2-6B
- Parameters: 6.2 billion
- Features:
- Performance Enhancement: ChatGLM2-6B builds on the first-generation model, using GLM's hybrid objective function; it was pretrained on 1.4T Chinese and English tokens and then aligned with human preferences, yielding significant performance improvements across multiple datasets.
- Extended Context Length: With FlashAttention, the base context length is extended from 2K to 32K, and an 8K context length is used during dialogue training, allowing for longer multi-turn conversations.
- More Efficient Inference: With Multi-Query Attention, inference speed is increased by 42%, and INT4 quantization allows for a conversation length increase from 1K to 8K tokens on 6GB of GPU memory.
- Open License: ChatGLM2-6B’s weights are fully open for academic research and allow free commercial use.
ChatGLM3-6B
- Parameters: 6.2 billion
- Features:
- Base Model Upgrades: ChatGLM3-6B uses more diverse training data, more training steps, and optimized training strategies, excelling across datasets on semantics, math, reasoning, code, and knowledge.
- Functionality Support: ChatGLM3-6B introduces a newly designed prompt format, natively supporting tool invocation (Function Call), code execution (Code Interpreter), and agent tasks for complex scenarios (see the tool-call sketch below).
- Open-Source Lineup: In addition to the dialogue model ChatGLM3-6B, the base model ChatGLM3-6B-Base and the long-text dialogue model ChatGLM3-6B-32K are also open-sourced.
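To illustrate the native Function Call support, the sketch below loosely follows the tool-calling demo in the ChatGLM3-6B repository: tools are described in a system message, and when the model decides to call one it returns a structured reply with the tool name and parameters. The get_weather tool, the prompt wording, and the exact shape of the returned object are assumptions that may differ between repo versions.

```python
# Sketch of ChatGLM3-6B tool invocation (Function Call); format details may vary by version.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True).quantize(4).cuda().eval()

# Hypothetical tool schema registered via a system message.
tools = [{
    "name": "get_weather",
    "description": "Query the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name"}},
        "required": ["city"],
    },
}]
system = {
    "role": "system",
    "content": "Answer the following questions as best as you can. You have access to the following tools:",
    "tools": tools,
}

history = [system]
response, history = model.chat(tokenizer, "What's the weather like in Beijing?", history=history)
print(response)   # expected to be a structured tool call, e.g. {"name": "get_weather", "parameters": {...}}
```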