Stable Diffusion

Stable Diffusion is a deep learning model, specifically a latent diffusion model, for text-to-image generation. Given a natural-language description, it can generate high-quality, realistic images.

Stable Diffusion 1.x Series

  • Stable Diffusion 1.1: One of the earliest versions, trained for 237,000 steps at a 256×256 resolution and 194,000 steps at a 512×512 resolution.
  • Stable Diffusion 1.2: Continued training from version 1.1 with an additional 515,000 steps, using the “laion-improved-aesthetics” dataset.
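  • Stable Diffusion 1.3 and 1.4: Both resumed training from version 1.2 at 512×512, for 195,000 and 225,000 additional steps respectively. Version 1.3 used the “laion-improved-aesthetics” dataset and version 1.4 used the “laion-aesthetics v2 5+” dataset.
  • Stable Diffusion 1.5: One of the most popular versions, resumed from version 1.2 and trained for a further 595,000 steps at 512×512 on the “laion-aesthetics v2 5+” dataset.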
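
Because each 1.x checkpoint continues training from an earlier one, the step counts above add up along a lineage. The following is a minimal sketch of that arithmetic. The `STAGES` table and the `cumulative_steps` helper are hypothetical names introduced for illustration, and the assumption that 1.5 resumes from 1.2 comes from the public model cards rather than from the figures listed here.

```python
from typing import Optional

# Additional training steps for each checkpoint, using the figures listed above.
# Assumption: 1.5 resumes from 1.2 (per the public model cards).
STAGES = {
    "1.1": {"parent": None, "steps": 237_000 + 194_000},  # 256x256 stage + 512x512 stage
    "1.2": {"parent": "1.1", "steps": 515_000},
    "1.5": {"parent": "1.2", "steps": 595_000},
}


def cumulative_steps(version: str) -> int:
    """Sum the training steps of a checkpoint and all of its ancestors."""
    total = 0
    current: Optional[str] = version
    while current is not None:
        stage = STAGES[current]
        total += stage["steps"]
        current = stage["parent"]
    return total


if __name__ == "__main__":
    for name in STAGES:
        print(f"Stable Diffusion {name}: {cumulative_steps(name):,} cumulative steps")
```

Under these assumptions, version 1.5 reflects roughly 1.54 million optimization steps in total, most of them at 512×512.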

Stable Diffusion 2.x Series

  • Stable Diffusion 2.0: Introduced a new text encoder (OpenCLIP) and a 4× super-resolution upscaler model, which makes higher-resolution output possible (up to 2048×2048). It also added depth-guided image generation (depth2img), which generates new images conditioned on the depth information of an input image.
  • Stable Diffusion 2.1: A further refinement of version 2.0, though it saw lower usage than the 1.x models.

Stable Diffusion 3.x Series

  • Stable Diffusion 3: Announced in February 2024 as an early preview, with significant improvements in multi-subject prompts, image quality, and spelling accuracy. The model family ranges from 800M to 8B parameters.

Stable Diffusion XL (SDXL)

  • SDXL 1.0: Released in July 2023, this version supports native 1024×1024 resolution and improves the generation of limbs and text in images.
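
One way to try the native 1024×1024 resolution mentioned above is through the Hugging Face `diffusers` library. The sketch below assumes that library, PyTorch, a CUDA-capable GPU, and the `stabilityai/stable-diffusion-xl-base-1.0` checkpoint, none of which this document specifies. `sdxl_call_kwargs` and `generate_sdxl_image` are hypothetical helper names, and the step count of 30 is an arbitrary illustrative default.

```python
def sdxl_call_kwargs(prompt: str, width: int = 1024, height: int = 1024, steps: int = 30) -> dict:
    """Build keyword arguments for an SDXL pipeline call.

    The pipeline expects dimensions that are multiples of 8, so check early.
    """
    if width % 8 != 0 or height % 8 != 0:
        raise ValueError("width and height must be multiples of 8")
    return {
        "prompt": prompt,
        "width": width,
        "height": height,
        "num_inference_steps": steps,
    }


def generate_sdxl_image(prompt: str, output_path: str = "sdxl_output.png") -> str:
    """Generate one image with SDXL 1.0 and save it (needs a GPU and a model download)."""
    # Imported lazily so the helper above can be used without the GPU stack installed.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",  # assumed checkpoint name
        torch_dtype=torch.float16,
    )
    pipe = pipe.to("cuda")
    image = pipe(**sdxl_call_kwargs(prompt)).images[0]
    image.save(output_path)
    return output_path
```

Calling `generate_sdxl_image("a lighthouse at dusk, oil painting")` would download the weights on first use and write a PNG to the working directory.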

Pricing Models

Subscription Plans

  • Basic Plan: $10 per month, includes 200 minutes of GPU time.
  • Standard Plan: $30 per month, suitable for users needing more computational resources.
  • Premium Plan: $60 per month, offering higher computational resources and more features.

API Access

  • Credits-Based Pricing: Users can purchase API access through credit plans, priced at $29, $49, and $149. These credit plans have no training fees, only API access fees.
  • Credit Pricing: 1,000 credits cost $10, enough to generate approximately 500 SDXL images.
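
The figures quoted above imply some simple unit costs. The following is a rough sketch of that arithmetic using only the numbers listed in this section. The function names are hypothetical, and the budget estimate assumes that credits scale linearly with spend, which this section does not state.

```python
# Figures quoted in the pricing lists above.
BASIC_PLAN_USD = 10
BASIC_PLAN_GPU_MINUTES = 200
USD_PER_1000_CREDITS = 10
IMAGES_PER_1000_CREDITS = 500  # approximate, for SDXL


def usd_per_gpu_minute() -> float:
    """Effective price of one GPU minute on the Basic Plan."""
    return BASIC_PLAN_USD / BASIC_PLAN_GPU_MINUTES


def credits_per_image() -> float:
    """Approximate number of credits consumed by one SDXL image."""
    return 1000 / IMAGES_PER_1000_CREDITS


def usd_per_image() -> float:
    """Approximate cost of one SDXL image at the quoted credit price."""
    return USD_PER_1000_CREDITS / IMAGES_PER_1000_CREDITS


def images_for_budget(usd: int) -> int:
    """Approximate SDXL images for a whole-dollar budget (assumes linear pricing)."""
    return (usd * IMAGES_PER_1000_CREDITS) // USD_PER_1000_CREDITS


if __name__ == "__main__":
    print(f"GPU minute: ${usd_per_gpu_minute():.2f}")
    print(f"Per image:  ${usd_per_image():.2f} ({credits_per_image():.0f} credits)")
    print(f"$25 budget: about {images_for_budget(25)} images")
```

At these rates one SDXL image costs about two credits, or roughly two cents.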

Key Application Scenarios

Art Creation and Design

  • Creative Image Generation: Artists and designers can use Stable Diffusion to generate creative images, explore new visual styles, or find inspiration.
  • Illustration and Concept Art: Used for generating illustrations, concept art, and character designs, helping artists quickly realize their creative ideas.
  • Style Transfer: Transforms photos into specific artistic styles, such as Van Gogh’s painting style, or makes localized modifications, such as changing sky colors or enhancing details.

Game Development

  • Game Asset Generation: Game developers can use Stable Diffusion to quickly create game assets like characters, environments, and props, accelerating prototype design and game content iteration.
  • Dynamic Scene Generation: Generates high-quality game scenes based on text descriptions, improving development efficiency and enriching the visual diversity of games.

Film and Animation Production

  • Background and Special Effects Generation: In film and animation production, Stable Diffusion can be used to generate backgrounds, special effects, or character designs, helping artists and directors bring their visual ideas to life.
  • Video Content Creation: Designers can input keywords, descriptions, or style prompts to quickly generate various concept designs and artistic works.
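
Combining keywords, a description, and a style hint into a single prompt can be sketched as follows. `build_prompt` is a hypothetical helper, and the comma-separated layout is only a common community convention, not something the model requires.

```python
from typing import Iterable, Optional


def build_prompt(description: str, keywords: Iterable[str] = (), style: Optional[str] = None) -> str:
    """Join a description, optional keywords, and an optional style hint into one prompt."""
    parts = [description.strip()]
    # Skip empty or whitespace-only keywords.
    parts.extend(k.strip() for k in keywords if k.strip())
    if style:
        parts.append(f"in the style of {style.strip()}")
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_prompt("a ruined castle on a cliff", ["fog", "dawn light"], "a watercolor painting"))
```

The resulting string can be passed as the text prompt to whichever Stable Diffusion interface is in use.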

Advertising and Marketing

  • Ad Creatives: Used to generate creative advertising images, product packaging designs, website backgrounds, and more, enhancing the visual appeal of advertisements.
  • Social Media Content: News outlets, social media platforms, and marketing teams can use the technology to automatically generate visual content for articles, blogs, and reports based on text descriptions.

Architecture and Product Design

  • Concept Sketches and Renderings: Designers can use Stable Diffusion to quickly generate concept sketches, product renderings, scene layouts, and more, significantly boosting work efficiency.
  • 2.5D Architectural Scenes: Generates architectural scene renderings with a sense of depth, for use in architectural design, game development, and other fields.

Education and Training

  • Online Educational Materials: Generates images for educational and training materials, helping students better understand complex concepts.
  • Medical Simulation: In the medical field, after appropriate training, models like Stable Diffusion can be used to simulate disease-affected areas or reconstruct normal tissue structures.

E-Commerce and Retail

  • Product Image Generation: Generates brand-consistent product images, such as models for beauty and skincare products or packaging designs, enhancing product display effectiveness.
  • Virtual Try-On: In the fashion industry, Stable Diffusion can simulate the effect of wearing different clothing, allowing users to virtually try on outfits.

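Stable Diffusion grew out of research by the CompVis group at LMU Munich and Runway, with support from Stability AI, which has released the subsequent versions. The model weights are openly released, so users can download, run, and modify them. Permitted uses, including commercial ones, depend on the license attached to each version.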
