看别的东西给了这么一个 reddit 上的帖子,大概解释了一下 TSO
https://www.reddit.com/r/hardware/comments/i0mido/apple_silicon_has_a_runtime_toggle_for_tso_to/
TSO, aka. total store ordering, is a type of memory ordering, and affects how cores see the operations performed in other cores. Total store ordering is a strong guarantee provided by x86, that very roughtly means that all stores from other processors are ordered the same way for every processor, and in a reasonably consistent order, with exceptions for local memory.
In contrast, Arm architectures favour weaker memory models, that allows a lot of reordering of loads and stores. This has the advantage that in general there is less overhead where these guarantees are not needed, but it means that when ordering is required for correctness, you need to explicitly run instructions to ensure it. Emulating x86 would require this on practically every store instruction, which would slow emulation down a lot. That's what the hardware toggle is for.
---
In other words, Apple has, of course, been playing the very long game. TSO is quite a large benefit to emulating x86, hence why Rosetta 2 appears to put out a very decent 70% of native chip performance, that and install time translation for everything but JIT features. That's on a chip not even meant to be a mac chip, so with further expanded caches, a wider, faster engine, perhaps applying the little cores to emulation which they're not currently, and so on, x86_64 performance should be very very decent. I'm going to dare upset some folks and say perhaps even be faster in emulation than most contemporary x86 chips of the time, if you only lose 20% of native performance when it's all said and done, it doesn't take much working backwards to figure where they'd need to be, and Gurman said they were aiming for over 50% faster than Intel.
--
FROM 122.57.164.*