I understand (at least sort of) the difference between CPU and GPU cores, but what is the difference between GPU and NPU cores? Seems like lots of overlap.
A modern GPU is a massively parallel programmable vector processor optimized for running a large number of data-parallel programs simultaneously, with a focus on graphics tasks. The NPU (on Apple hardware at least) is a more limited-function processor optimized for calculating convolutions at a small area and power cost.
The overlap is that convolutions can be expressed as data-parallel programs. But a general-purpose vector processor like a GPU is not the most efficient way to do these kinds of calculations. That’s why GPUs that are actually fast at ML have some dedicated circuitry for these tasks.
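To make the "data-parallel" point concrete, here is a minimal sketch in plain Python. The function names (`conv2d_at`, `conv2d`) and the sample data are made up for illustration, and this says nothing about how Apple's GPU or NPU is actually implemented. It only shows that each output element of a convolution depends on its own small window of the input and on nothing else, so all of them could in principle be computed at the same time.

```python
def conv2d_at(image, kernel, row, col):
    """Compute a single output element.

    Hypothetical helper: this is roughly the work one GPU thread would do.
    """
    acc = 0.0
    for i in range(len(kernel)):
        for j in range(len(kernel[0])):
            acc += image[row + i][col + j] * kernel[i][j]
    return acc


def conv2d(image, kernel):
    """'Valid' 2D convolution without kernel flipping (cross-correlation),
    which is what most ML frameworks call a convolution."""
    out_h = len(image) - len(kernel) + 1
    out_w = len(image[0]) - len(kernel[0]) + 1
    # No (row, col) result depends on any other, so these iterations are
    # independent. Here they run one after another; a GPU would spread
    # them across many threads.
    return [[conv2d_at(image, kernel, r, c) for c in range(out_w)]
            for r in range(out_h)]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [
    [1, 0],
    [0, -1],
]
result = conv2d(image, kernel)
```

On a GPU, every call to `conv2d_at` is a separate little program instance. Dedicated hardware can skip most of that per-thread overhead because the multiply-accumulate pattern is fixed in advance, which is one plausible reading of why a fixed-function unit can be cheaper in area and power.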
The primary reason why the Apple NPU exists is efficiency. It can only perform limited types of jobs, and it’s not particularly fast, but it uses much less energy than the GPU, and it also frees the GPU up for other tasks.
I am very curious to see where Apple will take this. Last year they had a flurry of patents describing a more advanced NPU. At the same time, the GPU is the largest processor in the system (by area), and it would make sense to improve its ML capabilities (especially since Apple could achieve major speedups with only minor die area investment).