But why?
Honestly, that is where the "solution in search of a problem" aspect comes in with throwing extremely customized iPhone (or even iPad Pro) SoCs at Macs. That is a bunch of "could do that", but the "why" is quite thin.
Making a Mac-specific ARM SoC has a huge "why?" hanging over it economically. There is almost zero justification there.
There has been some movement around the T2 taking some load off the CPU and doing co-processing.
Not really much movement at all. Several mainstream SSD flash controllers have been able to do self-encryption for several years. All Apple did by enabling offloaded, self-encrypting drives was gain parity with what had already been available on competing Windows systems for years. Apple's solution is a bit more secure (so being a bit different and late is somewhat excusable), but this really isn't so much a movement as "keeping up with the Joneses".
If the T2 is primarily a security co-processor, you do not want to put random user apps/drivers onto it. As it is, the T2 has "a lot" going on and a relatively complicated stack for a security processor. Piling on more "random" stuff will only open more vectors to be exploited.
Similar issue with the T1-T2 driving the Touch Bar ... that was more of an Intel GPU output limitation, and it really isn't an "offload". The T-series GPU drives the screen by copying a framebuffer from main memory.
The T2 taking audio/video inputs is as much driven by security as it is "offloading" (again, "random" software augments can't get at the raw audio/video, which makes those inputs more secure). There is some workload shifting there, but it tends to be on the way in (picture/video pre-processed into usable form, HEVC or natural language recognition done, or eventually Face ID). Doubtful this is going to mutate into more general compute offload.
Apple is buying up parts of Dialog for power management. More of that may get woven into future T-series chips, but that processor workload isn't an x86-64 one.
But I'm not sure why you'd farm a bunch of work out to little ARM CPUs when you have a big honkin' screaming fast Xeon CPU right there. Even if each little ARM board was as fast as a few of the Xeon cores, a single Xeon CPU these days still has 20-30 cores alone.
If they were magically hooked up into a NUMA, flat memory space, you'd have something similar to what Intel tried with their "x86 GPU" and Phi series. But if that is a completely different memory space, you run into the same issue that discrete GPUs run into.
However, 4-6 iPhone SoCs slapped together on a single card... that's largely a waste of time. Each of those with its own memory is more like a cluster of Raspberry Pis than anything near competitive with any of the GPUs for GPGPU kinds of workloads. Almost everyone's GPUs beat ARM if it comes down to just doing "embarrassingly parallel math" workloads at high perf/W. Apple putting an empty x16 PCIe v3 slot in the next Mac Pro would do far more to enable that path than any of the hand waving folks are doing around stuffing multiple iPhone SoCs into a Mac Pro.
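To put the separate-memory-space problem in rough numbers, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function names are made up, the link bandwidth is a round figure in the neighborhood of an x16 PCIe v3 slot, and the throughput numbers are placeholders rather than measurements of any real Xeon, GPU, or iPhone SoC. The only point is that a card with its own memory has to pay for copying data over and back before its compute speed counts for anything.

```python
def local_time(ops, local_ops_per_s):
    """Seconds to run the job on the host CPU, with data already in main memory."""
    return ops / local_ops_per_s


def offload_time(ops, remote_ops_per_s, data_bytes, link_bytes_per_s):
    """Seconds to copy the data to the card, compute there, and copy results back.

    Assumes (for simplicity) that the result is as large as the input and that
    copies and compute do not overlap.
    """
    transfer = 2 * data_bytes / link_bytes_per_s
    compute = ops / remote_ops_per_s
    return transfer + compute


def worth_offloading(ops, local_ops_per_s, remote_ops_per_s,
                     data_bytes, link_bytes_per_s):
    """True only if the round trip plus remote compute beats staying on the host."""
    return (offload_time(ops, remote_ops_per_s, data_bytes, link_bytes_per_s)
            < local_time(ops, local_ops_per_s))


# Hypothetical job: 1e12 operations over 1 GB of data, ~16 GB/s link.
OPS = 1e12
DATA = 1e9
LINK = 16e9
HOST = 1e12          # placeholder host throughput (ops/s)

# A card slower than the host never wins, no matter how cheap the copy is.
slow_card = worth_offloading(OPS, HOST, 0.5e12, DATA, LINK)

# A card several times faster than the host can win despite the copy.
fast_card = worth_offloading(OPS, HOST, 4e12, DATA, LINK)
```

With those made-up inputs the slower card loses and the much faster card wins, which is the whole argument: offload across a separate memory space only pays when the remote side is far faster than what is already sitting on the host.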
Such a setup would make sense on a low end Intel box where you needed ARM CPUs to beef up a slow Intel CPU. But on a Mac Pro why would you want to bypass the speed of the Xeon for much slower and smaller ARM chips?
It doesn't even make sense there if they are all independent SoCs without a shared, flat memory space.
Pushing ever-expanding workloads down to the T-series makes very little sense. It is at odds with the core objectives, and it probably wouldn't buy a whole lot even if tried. Even if Apple switched to an iPad Pro SoC, leaving the T-series decoupled would still give higher security if Apple is going to enable external drives (and normal "laptop"-like multiple boot options).