Stack Overflow has a post comparing C++ stackful and stackless coroutines

SMTH Community (水木社区), mobile version

OP|hgoldfish|2023-05-29 23:22:04
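Stackful coroutines are the kind of coroutine solution Boost implemented earlier, such as boost.context and boost.fiber. The basic principle is: save the registers, execute a jmp instruction, restore the registers.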

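Stackless coroutines are the co_await / co_yield syntax implemented in C++20. The compiler transforms the coroutine's code into a separate generated C++ type, somewhat like the closure type behind a lambda, and the coroutine is then driven by calling that type's methods.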

The post lays out a comparison of the two kinds of coroutine.

stackless coroutines

    stackless coroutines (C++20) do code transformation (state machine)
    stackless in this case means, that the application stack is not used to store local variables (for instance variables in your algorithm)
    otherwise the local variables of the stackless coroutine would be overwritten by invocations of ordinary functions after suspending the stackless coroutine
    stackless coroutines do need memory to store local variables too, especially if the coroutine gets suspended the local variables need to be preserved
    for this purpose stackless coroutines allocate and use a so-called activation record (equivalent to a stack frame)
    suspending from a deep call stack is only possible if all functions in between are stackless coroutines too (viral; otherwise you would get a corrupted stack)
    some clang developers are sceptical that the Heap Allocation eLision Optimization (HALO) can always be applied
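
To make "code transformation (state machine)" and "activation record" more concrete, here is a hand-written sketch of roughly what such a transformation amounts to for a simple counting loop. This is only an approximation for illustration, not what any particular compiler actually emits; `CountFrame` and `run_state_machine` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hand-written stand-in for an activation record (coroutine frame).
// Everything that must survive a suspension is a member of this struct.
struct CountFrame {
    int state = 0;    // which resume point to continue from
    int i = 0;        // the loop variable, kept off the application stack
    int n = 0;        // the coroutine's parameter
    int current = 0;  // the most recently "yielded" value

    // Returns true if a value was produced, false once the body has finished.
    bool resume() {
        switch (state) {
        case 0:  // first entry: initialise the loop
            i = 1;
            state = 1;
            [[fallthrough]];
        case 1:  // loop head, also the resume point after each yield
            if (i <= n) {
                current = i++;
                return true;  // "suspend": plain return, the frame keeps i
            }
            state = 2;
            [[fallthrough]];
        default:  // finished
            return false;
        }
    }
};

// The frame is heap-allocated here to mirror the activation record
// described in the list above.
std::vector<int> run_state_machine(int n) {
    auto frame = std::make_unique<CountFrame>();
    frame->n = n;
    std::vector<int> out;
    while (frame->resume()) out.push_back(frame->current);
    return out;
}
```

Because suspending is just a `return`, any ordinary function sitting between the caller and the suspension point would lose its own stack frame, which is why the list calls the approach viral.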

stackful coroutines

    in its essence a stackful coroutine simply switches stack and instruction pointer
    allocate a side-stack that works like an ordinary stack (storing local variables, advancing the stack pointer for called functions)
    the side-stack needs to be allocated only once (can also be pooled) and all subsequent function calls are fast (because only advancing the stack pointer)
    each stackless coroutine requires its own activation record -> called in a deep call chain, a lot of activation records have to be created/allocated
    stackful coroutines allow to suspend from a deep call chain while the functions in between can be ordinary functions (not viral)
    a stackful coroutine can outlive its caller/creator
    one version of the skynet benchmarks spawns 1 million stackful coroutines and shows that stackful coroutines are very efficient (outperforming version using threads)
    a version of the skynet benchmark using stackless coroutines was not implemented yet
    boost.context represents the thread's primary stack as a stackful coroutine/fiber - even on ARM
    boost.context supports on demand growing stacks (GCC split stacks)
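
The "switches stack and instruction pointer" idea, and suspending from inside ordinary nested functions, can be sketched with the POSIX ucontext API. This is an assumption-laden illustration, not how boost.context is implemented (boost.context uses its own assembly routines); it assumes a platform such as Linux with glibc where `<ucontext.h>` is available, and all function and variable names are hypothetical.

```cpp
#include <cassert>
#include <ucontext.h>
#include <vector>

namespace {
ucontext_t g_main_ctx;   // saved state of the caller
ucontext_t g_coro_ctx;   // saved state of the coroutine
std::vector<int> g_log;  // records the order in which code runs

// Suspend the coroutine: save its registers and stack pointer, switch back.
void yield_to_main() { swapcontext(&g_coro_ctx, &g_main_ctx); }

// An ordinary recursive function; it knows nothing about coroutines.
void deep_helper(int depth) {
    if (depth == 0) {
        g_log.push_back(1);
        yield_to_main();  // suspend from deep inside the call chain
        g_log.push_back(3);
        return;
    }
    deep_helper(depth - 1);
}

void coro_entry() {
    deep_helper(3);
    // Returning from here switches to uc_link (g_main_ctx).
}
}  // namespace

std::vector<int> run_stackful_demo() {
    g_log.clear();
    alignas(16) static char side_stack[64 * 1024];  // the side-stack

    getcontext(&g_coro_ctx);
    g_coro_ctx.uc_stack.ss_sp = side_stack;
    g_coro_ctx.uc_stack.ss_size = sizeof side_stack;
    g_coro_ctx.uc_link = &g_main_ctx;
    makecontext(&g_coro_ctx, coro_entry, 0);

    swapcontext(&g_main_ctx, &g_coro_ctx);  // run until the first yield
    g_log.push_back(2);
    swapcontext(&g_main_ctx, &g_coro_ctx);  // resume until coro_entry returns
    g_log.push_back(4);
    return g_log;
}
```

The intermediate `deep_helper` frames stay intact on the side-stack while the coroutine is suspended, which is the "not viral" property from the list.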

https://stackoverflow.com/questions/57163510/are-stackless-c20-coroutines-a-problem
--
FROM 117.24.94.*
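#1|GoGoRoger|2023-05-30 09:09:49
Stackless coroutines are too hard to use; they do nothing to reduce the programmer's mental load.
[hgoldfish wrote:]
: Stackful coroutines are the kind of coroutine solution Boost implemented earlier, such as boost.context and boost.fiber. The basic principle is: save the registers, execute a jmp instruction, restore the registers.
:
: Stackless coroutines are the co_await / co_yield syntax implemented in C++20. It transforms the
: ..................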

※ Source: SMTH Community http://www.mysmth.net [FROM: 36.112.192.*]

FROM 36.112.192.*
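#2|ylh0315|2023-05-30 17:39:34
I don't understand the stackless activation record, or how it gets created and maintained.
[hgoldfish wrote:]
: Stackful coroutines are the kind of coroutine solution Boost implemented earlier, such as boost.context and boost.fiber. The basic principle is: save the registers, execute a jmp instruction, restore the registers.
: Stackless coroutines are the co_await / co_yield syntax implemented in C++20. The compiler transforms the coroutine's code into a separate generated C++ type, somewhat like the closure type behind a lambda, and the coroutine is then driven by calling that type's methods.
: The post lays out a comparison of the two kinds of coroutine.
: ...................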
--
FROM 221.218.61.*
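#3|ylh0315|2023-05-30 17:43:00
With stackful coroutines, stack memory usage is similar to multithreading.
With 1 million coroutines, the memory footprint would be frightening.
A stack pool is worth considering, analogous to a thread pool.
I implemented one and it works reasonably well.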
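
A minimal sketch of the stack pool idea, under the assumption that finished coroutines hand their stacks back for reuse. This is not ylh0315's implementation, which is not shown in the thread; `StackPool` and its members are hypothetical names, and a real pool would likely use mmap'd memory with guard pages rather than plain heap buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical pool of fixed-size coroutine stacks (illustrative only).
class StackPool {
public:
    explicit StackPool(std::size_t stack_size) : stack_size_(stack_size) {}

    // Reuse an idle stack if one exists, otherwise allocate a new one.
    std::unique_ptr<char[]> acquire() {
        if (!free_.empty()) {
            auto stack = std::move(free_.back());
            free_.pop_back();
            return stack;
        }
        return std::make_unique<char[]>(stack_size_);
    }

    // Hand a stack back once its coroutine has finished.
    void release(std::unique_ptr<char[]> stack) {
        free_.push_back(std::move(stack));
    }

    std::size_t idle() const { return free_.size(); }
    std::size_t stack_size() const { return stack_size_; }

private:
    std::size_t stack_size_;
    std::vector<std::unique_ptr<char[]>> free_;  // stacks waiting for reuse
};
```

With such a pool the number of live stacks tracks the number of coroutines running at the same moment, not the total number ever created.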
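[hgoldfish wrote:]
: Stackful coroutines are the kind of coroutine solution Boost implemented earlier, such as boost.context and boost.fiber. The basic principle is: save the registers, execute a jmp instruction, restore the registers.
: Stackless coroutines are the co_await / co_yield syntax implemented in C++20. The compiler transforms the coroutine's code into a separate generated C++ type, somewhat like the closure type behind a lambda, and the coroutine is then driven by calling that type's methods.
: The post lays out a comparison of the two kinds of coroutine.
: ...................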
--
Edited by: ylh0315 FROM 221.218.61.*
FROM 221.218.61.*
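#4|hgoldfish|2023-05-30 18:08:50
Memory usage isn't a serious problem. Modern systems such as Linux and OpenBSD have long supported stacks that grow on demand: you start with only 4 KB, and the stack grows only as the coroutine function runs.

So even if you create 1M coroutines at once, they only take about 4 GB of memory (1M × 4 KB).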
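
The arithmetic behind that figure can be checked with a small helper. This is just a sketch: the 4 KB page size is the value assumed in the post, and `committed_bytes` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Total committed stack memory, assuming every coroutine has touched
// pages_each pages of page_size bytes.
constexpr std::uint64_t committed_bytes(std::uint64_t coroutines,
                                        std::uint64_t pages_each,
                                        std::uint64_t page_size) {
    return coroutines * pages_each * page_size;
}

// 1,000,000 coroutines with one 4 KB page each: 4,096,000,000 bytes (about 4 GB).
static_assert(committed_bytes(1'000'000, 1, 4096) == 4'096'000'000ULL);
```

The same helper shows how quickly the total rises once stacks grow: at 16 touched pages per coroutine it is already 65,536,000,000 bytes.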

Windows has this capability too, but I haven't yet found an API that creates this kind of auto-growing stack memory. I do know Windows implements it, because CreateFiber() has this behavior.

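[ylh0315 wrote:]
: With stackful coroutines, stack memory usage is similar to multithreading.
: With 1 million coroutines, the memory footprint would be frightening.
: A stack pool is worth considering, analogous to a thread pool.
: ...................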
--
FROM 59.60.25.*
#5|ylh0315|2023-05-30 18:13:32
A self-growing stack can be done with plain mmap. That's what I use too: top-down, growing on demand.
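
One way this could look is sketched below, assuming Linux. It reserves the whole range up front and relies on demand paging, so physical memory is only committed for pages that are actually touched. Whether ylh0315 does it this way or uses something like MAP_GROWSDOWN is not stated in the thread; `SideStack`, `reserve_stack`, `stack_top` and `release_stack` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical handle for an mmap'd coroutine stack.
struct SideStack {
    void* base;        // lowest address of the mapping
    std::size_t size;  // reserved size in bytes
};

// Reserve address space for a stack. Pages are backed by physical memory
// lazily, on first touch. MAP_NORESERVE is a Linux flag (an assumption of
// this sketch) that avoids reserving swap for the untouched part.
SideStack reserve_stack(std::size_t size) {
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) return {nullptr, 0};
    return {p, size};
}

// Stacks grow downward on common architectures, so the initial stack
// pointer is the highest address of the mapping.
char* stack_top(const SideStack& s) {
    return static_cast<char*>(s.base) + s.size;
}

void release_stack(SideStack& s) {
    if (s.base != nullptr) munmap(s.base, s.size);
    s = {nullptr, 0};
}
```

A coroutine that only ever uses the top page of a 1 MiB reservation costs one page of physical memory, which is the effect described in the posts above.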
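[hgoldfish wrote:]
: Memory usage isn't a serious problem. Modern systems such as Linux and OpenBSD have long supported stacks that grow on demand: you start with only 4 KB, and the stack grows only as the coroutine function runs.
: So even if you create 1M coroutines at once, they only take about 4 GB of memory (1M × 4 KB).
: Windows has this capability too, but I haven't yet found an API that creates this kind of auto-growing stack memory. I do know Windows implements it, because CreateFiber() has this behavior.
: ...................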
--
FROM 221.218.61.*
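#6|ylh0315|2023-05-30 18:15:41
But sooner or later all of those 1M coroutines will be doing work, and once they are, the memory footprint is considerable.
Since this is middleware, users can plug in any third-party application, so you cannot predict stack usage.
[ylh0315 wrote:]
: A self-growing stack can be done with plain mmap. That's what I use too: top-down, growing on demand.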
--
Edited by: ylh0315 FROM 221.218.61.*
FROM 221.218.61.*
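#7|hgoldfish|2023-05-30 18:17:07
That's not the right way to look at it. If all the stackless coroutines are doing work, they need that memory too. And since stackless coroutines have a higher memory overhead for the same job, repeatedly allocating on the heap, the total may well end up considerably higher.

[ylh0315 wrote:]
: But sooner or later all of those 1M coroutines will be doing work, and once they are, the memory footprint is considerable.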
--
FROM 59.60.25.*
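#8|ensonmj|2023-05-30 18:18:48
Isn't this how Go implements it? Dig in and see which API it calls on Windows.

[hgoldfish (老鱼) wrote:]
:  Memory usage isn't a serious problem. Modern systems such as Linux and OpenBSD have long supported stacks that grow on demand: you start with only 4 KB, and the stack grows only as the coroutine function runs.
:
:  So even if you create 1M coroutines at once, they only take about 4 GB of memory (1M × 4 KB).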
:
--
FROM 223.104.5.*
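#9|hgoldfish|2023-05-30 18:19:38
The coroutine implementations in Go, Python, Java and C# are different from C++ stackful coroutines. Only C++ can use the "save registers", "jmp", "restore registers" approach that works directly at the machine-instruction level. The other languages either have a virtual machine or a GC, so they can't do it this way.

[ensonmj wrote:]
: Isn't this how Go implements it? Dig in and see which API it calls on Windows.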
--
FROM 59.60.25.*