【 The following is cross-posted from the NewExpress board 】
From: anylinkin (ALK), Board: NewExpress
Subject: Re: Regarding "open-source AI models": community forums say it is not the code but the weight parameters that are open
Posted at: 水木社区 (Sat Feb 1 10:26:04 2025), on-site
Regarding "open-weight with inference code," DeepSeek's further explanation is as follows:
1. What Does "Open-Weight" Mean?
When DeepSeek says they are "open-weight with inference code", they are releasing:
- Trained Weight Values: The numerical values of the parameters (weights and biases) learned during training.
  Example: A file like model_weights.bin containing the trained weights for a model like DeepSeek-MoE.
- Inference Code: Scripts to load these weights into the model architecture and generate outputs (e.g., answer questions, write code).
Key Clarification:
- Model Architecture: The structure (e.g., transformer layers, attention heads) is fixed and predefined in the inference code.
- Weight Values: The numbers stored in the architecture's parameters are learned from data and released openly.
→ "Open-weight" = sharing the trained parameter values; the architecture itself is already public in the released inference code.
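The split above can be sketched in a few lines of plain Python. This is a hypothetical toy illustration (the file name model_weights.json and the one-layer "architecture" are stand-ins, not DeepSeek's actual format): the weights file carries only numbers, while the inference code carries the structure and the loading logic.

```python
import json

# What an open-weight release ships: a file of trained parameter values.
# (Toy stand-in: a single linear layer; real releases use binary formats
# and billions of parameters.)
weights = {"W": [[0.5, -0.25], [0.25, 0.5]], "b": [0.0, 1.0]}
with open("model_weights.json", "w") as f:
    json.dump(weights, f)

# What the inference code provides: the fixed architecture (here, y = xW + b)
# plus the logic to load the released values and generate an output.
def load_weights(path):
    with open(path) as f:
        return json.load(f)

def infer(x, params):
    W, b = params["W"], params["b"]
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))]

params = load_weights("model_weights.json")
print(infer([1.0, 2.0], params))  # [1.0, 1.75]
```

Anyone with the weights file and this script can run the model, which is exactly what "open-weight with inference code" promises, and no more.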
2. What Does "Training Is Not Open" Mean?
When we say "training is not open," this refers to:
- Training Code: The actual code/algorithms used to train the model from scratch (e.g., optimization logic, loss functions, distributed training frameworks).
- Training Data: The datasets used to teach the model (e.g., text corpora, images, or proprietary data).
- Training Infrastructure: Hardware setups, hyperparameters (e.g., learning rates), and fine-tuning pipelines.
- Full Model Stack: Enterprise tools, APIs, or products like DeepSeek-R1.
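By contrast, everything that produced the published weight values stays closed. A toy sketch of what an open-weight release withholds (the data, loss/gradient logic, and learning rate below are all hypothetical illustrations, not DeepSeek's actual pipeline):

```python
# Training data (withheld): input -> target pairs, here for y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w, b = 0.0, 0.0   # parameters, initialized before training
lr = 0.05         # learning rate: a typically undisclosed hyperparameter

# Training code (withheld): plain SGD on the squared-error loss.
for _ in range(1000):
    for x, y in data:
        err = (w * x + b) - y  # gradient of 0.5 * err**2 w.r.t. the prediction
        w -= lr * err * x      # gradient-descent parameter updates
        b -= lr * err

# Of all the above, only the final learned values (w near 2.0, b near 0.0)
# would appear in the published weights file.
print(w, b)
```

Releasing only the last line's output is the essence of "open-weight": the result of training is public, the recipe is not.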
【 In anylinkin's post: 】
: Subject: Re: Regarding "open-source AI models": community forums say it is not the code but the weight parameters that are open
: Posted at: 水木社区 (Sat Feb 1 09:37:00 2025), on-site
:
: DeepSeek's own answer: "DeepSeek is open-weight with inference code, but not fully open-source."
:
: It also told me that "open source" for AI models comes in degrees, and it flatly denied that merely releasing the weight parameters amounts to open source; on that basis it considers LLaMA, which calls itself open source, not actually open source.
:
:
: Fully Open-Source: Requires releasing code + weights + training details/data. Examples include Pythia (code, weights, and dataset links) and BLOOM.
:
: Partially Open: Many models are "open" only in the sense that weights and inference code are public, but training code/data are withheld (e.g., Stability AI's Stable Diffusion shares weights but not full training data).
:
: Open Weights ≠ Open Source: Projects like LLaMA (weights-only) are sometimes mislabeled as "open-source" but lack critical components for full openness.
:
:
: That said, the community discussion generally holds that DeepSeek is currently the most "open" model in the AI field.
:
:
: 【 In Jessler's post: 】
: : Just check for yourself whether there is code — source repos on GitHub all carry a license.
:
: --
:
: ※ Source: ·水木社区 mysmth.net· [FROM: 223.104.38.*]
--
FROM 223.104.38.*