【 The following is cross-posted from the NewExpress board 】
From: anylinkin (ALK), Board: NewExpress
Subject: Re: Regarding "open-source AI models": community forums say it is not the code but the weight parameters that are open
Posted at: 水木社区 (Sat Feb 1 10:26:04 2025), on-site
Regarding "open-weight with inference code," DeepSeek's further explanation is as follows:
1. What Does "Open-Weight" Mean?
When DeepSeek says they are "open-weight with inference code", they are releasing:
- Trained Weight Values: The numerical values of the parameters (weights and biases) learned during training.
  Example: A file like model_weights.bin containing the trained weights for a model like DeepSeek-MoE.
- Inference Code: Scripts to load these weights into the model architecture and generate outputs (e.g., answer questions, write code).
Key Clarification:
- Model Architecture: The structure (e.g., transformer layers, attention heads) is fixed and predefined in the inference code.
- Weight Values: The numbers stored in the architecture's parameters are learned from data and released openly.
→ "Open-weight" = sharing the trained parameter values; the architecture itself is already public in the released inference code.
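The split above can be sketched in a few lines of plain Python. This is a hypothetical toy illustration (the file name model_weights.json and the one-layer "architecture" are stand-ins, not DeepSeek's actual format): the weights file carries only numbers, while the inference code carries the structure and the loading logic.

```python
import json

# What an open-weight release ships: a file of trained parameter values.
# (Toy stand-in: a single linear layer; real releases use binary formats
# and billions of parameters.)
weights = {"W": [[0.5, -0.25], [0.25, 0.5]], "b": [0.0, 1.0]}
with open("model_weights.json", "w") as f:
    json.dump(weights, f)

# What the inference code provides: the fixed architecture (here, y = xW + b)
# plus the logic to load the released values and generate an output.
def load_weights(path):
    with open(path) as f:
        return json.load(f)

def infer(x, params):
    W, b = params["W"], params["b"]
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))]

params = load_weights("model_weights.json")
print(infer([1.0, 2.0], params))  # [1.0, 1.75]
```

Anyone with the weights file and this script can run the model, which is exactly what "open-weight with inference code" promises, and no more.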
2. What Does "Training Is Not Open" Mean?
When we say "training is not open," this refers to:
- Training Code: The actual code/algorithms used to train the model from scratch (e.g., optimization logic, loss functions, distributed training frameworks).
- Training Data: The datasets used to teach the model (e.g., text corpora, images, or proprietary data).
- Training Infrastructure: Hardware setups, hyperparameters (e.g., learning rates), and fine-tuning pipelines.
- Full Model Stack: Enterprise tools, APIs, or products like DeepSeek-R1.
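By contrast, everything that produced the published weight values stays closed. A toy sketch of what an open-weight release withholds (the data, loss/gradient logic, and learning rate below are all hypothetical illustrations, not DeepSeek's actual pipeline):

```python
# Training data (withheld): input -> target pairs, here for y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w, b = 0.0, 0.0   # parameters, initialized before training
lr = 0.05         # learning rate: a typically undisclosed hyperparameter

# Training code (withheld): plain SGD on the squared-error loss.
for _ in range(1000):
    for x, y in data:
        err = (w * x + b) - y  # gradient of 0.5 * err**2 w.r.t. the prediction
        w -= lr * err * x      # gradient-descent parameter updates
        b -= lr * err

# Of all the above, only the final learned values (w near 2.0, b near 0.0)
# would appear in the published weights file.
print(w, b)
```

Releasing only the last line's output is the essence of "open-weight": the result of training is public, the recipe is not.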
【 In anylinkin's post: 】
: Subject: Re: Regarding "open-source AI models": community forums say it is not the code but the weight parameters that are open
: Posted at: 水木社区 (Sat Feb 1 09:37:00 2025), on-site
:
: DeepSeek's own answer: "DeepSeek is open-weight with inference code, but not fully open-source."
:
: It also told me that "open source" for AI models comes in degrees, and it flatly denied that merely releasing the weight parameters amounts to open source; on that basis it considers LLaMA, which calls itself open source, not actually open source.
:
:
: Fully Open-Source: Requires releasing code + weights + training details/data. Examples include Pythia (code, weights, and dataset links) and BLOOM.
:
: Partially Open: Many models are "open" only in the sense that weights and inference code are public, but training code/data are withheld (e.g., Stability AI's Stable Diffusion shares weights but not full training data).
:
: Open Weights ≠ Open Source: Projects like LLaMA (weights-only) are sometimes mislabeled as "open-source" but lack critical components for full openness.
:
:
: That said, the community discussion generally holds that DeepSeek is currently the most "open" model in the AI field.
:
:
: 【 In Jessler's post: 】
: : Just check for yourself whether there is code — source repos on GitHub all carry a license.
:
: --
:
: ※ Source: ·水木社区 mysmth.net· [FROM: 223.104.38.*]
--
FROM 223.104.38.*