Your question is a little vague: You’re comparing a specific Apple product (the M1 SoC) to “Intel’s”: Intel’s what? For example, Intel’s higher-end x86-64 parts can still retire more instructions per second given the right workload (mostly because they have quite a few more cores).
Still, it’s true that for single-threaded workloads M1 sits at the top of the pile, and it also has a significant lead overall in its power-consumption category (about 10–20W of sustained TDP). I’ve seen many commentators try to attribute that lead to one particular factor (usually, the fact that it is built using TSMC’s latest 5nm process). But, IMO, that is misguided: It’s not just one tweak that gives Apple its advantage, but a whole host of factors:
• Apple has been designing for low power consumption from the start. I’ve been told that they can “power down” parts of their SoC at a finer grain in both time and space than their competition, but I’ve never seen specific support for that claim. (Although PA Semi, the chip design company they acquired many years ago, had expertise in that area, and possibly even patents.) Low power design isn’t just about getting more out of a battery: It’s about keeping the temperature of the chip within operating range. Once things get too hot, a CPU has to reduce its energy intake by slowing down, so that it can stay in a safe operating zone. So low power usage is important for all performance-critical applications.
• Apple’s design appears to be very aggressive with respect to out-of-order execution. Some tests suggest that the M1’s ROB (reorder buffer) has well over 600 entries, whereas Intel’s and AMD’s leading CPUs tend to have fewer than half that. Similarly, the M1 can handle more outstanding memory transactions than the competition.
• The ARM AArch64 ISA has some intrinsic advantages over Intel’s x86-64 ISA. I suspect AArch64 is easier to decode, for a start, which (among other things) reduces the pressure on the pipeline. It also has a weaker memory model (which allows it to retire some memory instructions more aggressively).
• The M1 also does really well with memory in general. Its caches tend to be larger and yet very fast (its level-1 instruction cache, in particular, is noticeably larger than those in Intel’s or AMD’s designs). That combines well with the aggressive out-of-order execution. Having memory in the same package as the M1 SoC probably helps a little as well.
• And yes, being on the latest 5nm fab node gives a nice push also.
• As does Apple’s ability to drop legacy support (e.g., M1 only supports AArch64; AArch32 support was dropped from Apple chips some years ago).
And that’s just talking about the CPU. The GPU, Neural Engine, and some special computational units dedicated to linear algebra provide even more benefits if exploited with adapted software.
In the end, I think the “why?” is answered by “because they attacked the problem more aggressively”.
What are the pros and cons of the new Apple M1 chip?
The benchmarks cited by Franklin Veaux seemed suspiciously high to me, so I asked my brother about it. He’s an electrical engineer working on CPU design, and he really knows his stuff.
He said that technically, yes, the M1 is much much faster, but only because Apple laptops use passive cooling. Assuming only passive cooling, power efficiency IS speed, since the processor speed is thermally limited. Intel’s x86 CPU is designed for active cooling, so of course it’s going to have its bacon roasted in a passively cooled environment. For Apple’s use cases—tightly packed little laptops—the M1 is dramatically faster, but it will still lose to the x86 if placed into an actively cooled desktop environment.
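To make the "power efficiency IS speed" point concrete, here is a toy model in Python. It is only a sketch: the two chips, their numbers, and the cooling budgets are all made up for illustration, and it assumes performance scales linearly with power, which real silicon does not do.

```python
from dataclasses import dataclass


@dataclass
class Chip:
    name: str
    peak_perf: float      # performance at full clocks, arbitrary units (made up)
    perf_per_watt: float  # performance units per watt (made up)


def sustained_perf(chip: Chip, cooling_watts: float) -> float:
    """Performance the chip can hold once it may only burn as much
    power as the cooling solution can remove (linear toy model)."""
    return min(chip.peak_perf, chip.perf_per_watt * cooling_watts)


# Hypothetical chips: one efficient, one with a higher peak but hungrier.
efficient = Chip("efficient", peak_perf=100.0, perf_per_watt=8.0)
hungry = Chip("power-hungry", peak_perf=140.0, perf_per_watt=3.0)

# Hypothetical cooling budgets in watts.
for label, budget in [("passive", 10.0), ("active", 65.0)]:
    for chip in (efficient, hungry):
        print(f"{label:8s} {chip.name:13s} {sustained_perf(chip, budget):6.1f}")
```

With these invented figures, the efficient chip holds 80 units against 30 under the passive budget, while under the active budget both reach their peaks and the power-hungry chip wins 140 to 100. That is the shape of the argument above, not a measurement of any real part.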
Regarding the onboard graphics beating out mid-range GPUs: that is utter hogwash on all metrics except perhaps for video transcoding (watching, recording, or converting video formats). The GTX 1050 Ti (released in 2016) that Apple is so proud of beating counts only as an entry-level GPU, not mid-range. It was mid-range four years ago, but four years is a long time in the world of computer hardware. A modern mid-range GPU would be something like the GTX 1650 Ti, which dramatically outperforms Apple’s M1 (second row vs. third row):
Note: This benchmark was performed with the Apple hardware emulating x86 instructions on its ARM processor, a minor handicap. There is no apples to apples comparison for these benchmarks because the x86 benchmark would have to be completely overhauled to run natively on ARM, at which point, it isn’t really the same benchmark, is it?
It’s easy to find laptops containing a GTX 1650 Ti for $1,300, the same price as an Apple MacBook.
In summary, if you want graphics processing power and you don’t care about bulky form factor, low battery life, and noisy fans, then you can beat the Apple M1 on both CPU and GPU processing power with an ordinary Intel CPU and Nvidia GPU gaming laptop.
However, if portability, battery life, and silent operation are important to you, then Apple’s M1 is leaps and bounds ahead in efficient microprocessor technology.
Addendum:
If Apple’s M1 were to use its embedded NN processor to perform super-sampling like Nvidia’s DLSS does, it may be able to improve its performance further by rendering lower-resolution images and scaling them up realistically with machine learning. I was unable to confirm whether they do anything like this in their current rendering pipeline.
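For a rough sense of why rendering at a lower resolution and scaling up could help, here is a quick Python sketch of the pixel arithmetic. The resolutions are hypothetical examples, the function name is mine, and it ignores the cost of the upscaling step itself.

```python
def shaded_pixel_ratio(render: tuple[int, int], target: tuple[int, int]) -> float:
    """Fraction of the target's pixels that actually get rendered
    when drawing at `render` resolution and upscaling to `target`."""
    render_w, render_h = render
    target_w, target_h = target
    return (render_w * render_h) / (target_w * target_h)


# Hypothetical example: render at 1280x720, upscale to 2560x1440.
ratio = shaded_pixel_ratio((1280, 720), (2560, 1440))
print(f"The GPU renders {ratio:.0%} of the output pixels")
```

Halving the resolution on each axis leaves the GPU a quarter of the pixels to render, which is where the potential speedup would come from.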