It’s a really confusing subject since DDR4 works so differently from LPDDR4. I spent some time trying to understand it, and while I’m not sure I got it 100% right, the gist is that an LPDDR4 memory channel is only 32 bits wide, as opposed to DDR4’s 64 bits. So LPDDR4 has more bandwidth per bit, but fewer bits, so to speak. To get the most out of it, high-performance implementations use quad-channel configurations. This includes Intel Ice Lake and, I think, chips like the A12X. That’s why LPDDR4 can outperform DDR4, and that’s one reason why such configurations are more expensive.
It's in the GB database. Look for iPhone13
Ah, thank you. I was searching on things like "A14" and getting nowhere.
About LPDDR4 - If you look at ark.intel.com, the processors which support LPDDR4 also support 2-channel DDR4. I interpret this to mean (given Intel's norms) that these chips have a total of 128 data bits, and two command channels (one per 64-bit DDR4 DIMM).
What's tripping you up is that, unlike normal DDR4, the LPDDR4 spec focuses exclusively on the PoP packaging used in cellphones and some tablets. DDR3/4 memory chips traditionally have only one command channel per chip package. An LPDDR4 PoP package has 64 data bits and four command channels, and each command channel governs 16 data bits, not 32. SoC designers can choose from a variety of ways to utilize the DQ (data) bits and CA (command-address) channels. You can gang all the CAs together and run them in lockstep, so the package functions as a single 64-bit wide memory. You can run each CA channel independently, so it's like four 16-bit wide memories, or pair them up for 2x32, and so on. There are a total of eight ways to do it, so it's easy to get confused: you may find conflicting info depending on how each individual implementation did it.
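The "eight ways" count falls out of simple combinatorics: each of the three boundaries between adjacent 16-bit CA groups is either ganged with its neighbor or kept as a channel split, giving 2^3 = 8 orderings. A quick Python sketch (my own illustration, not from any vendor document) that enumerates them:

```python
from itertools import product

def channel_configs(groups=4, bits_per_group=16):
    """Yield every way to gang adjacent CA channels, as a tuple of
    logical channel widths in bits. With four 16-bit groups there are
    2**(4-1) = 8 configurations."""
    # Each of the (groups - 1) boundaries between adjacent CA groups
    # is either merged (ganged in lockstep) or left as a split.
    for merge in product([False, True], repeat=groups - 1):
        widths, run = [], 1
        for merged in merge:
            if merged:
                run += 1          # gang this group with the previous one
            else:
                widths.append(run * bits_per_group)
                run = 1
        widths.append(run * bits_per_group)
        yield tuple(widths)

configs = list(channel_configs())
# includes (64,), (32, 32), (32, 16, 16), (16, 16, 16, 16), ...
```

Every configuration still adds up to 64 data bits; only the number of independently commanded channels changes.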
Some of these modes reduce bandwidth per bit, but none of them can ever increase it. The purpose of more channels in a single PoP package is to increase the number of active pages (better for random access performance).
You can still make the calculation "Mbps per pin * # data pins" to figure out the theoretical max bandwidth. iFixit's teardown found this LPDDR4 PoP in the iPhone 12:
MT53D512M64D4UA-046 XT:F — Micron LPDDR4 DRAM, 32Gb, 512M x 64, FBGA QDP (listing on www.mouser.com)
It's 64 bits wide with a maximum clock frequency of 2133 MHz, which at double data rate is 4266 Mbps per pin. 64 * 4266 = 273,024 Mbps = 34.128 GB/s. GB4 measures 29.0 GB/s, which is about 85% of the theoretical max.
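The same arithmetic in a few lines of Python (figures straight from the post; the function name is my own):

```python
def theoretical_bw_gbs(data_pins, mbps_per_pin):
    """Peak bandwidth in decimal GB/s: pins * Mbps/pin / 8 bits / 1000."""
    return data_pins * mbps_per_pin / 8 / 1000

peak = theoretical_bw_gbs(64, 4266)   # 34.128 GB/s
measured = 29.0                       # GB4 result from the post
efficiency = measured / peak          # ~0.85
```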
That's actually pretty reasonable. Even doing nothing but stupidly linear access patterns, you can only achieve numbers as high as about 90% on plain old DDR3. Now, is it impressive that GB4 can do this while the entire rest of the SoC is also injecting its own memory accesses into the mix? Yes. Quite impressive. It speaks well of the hit rate of Apple's whole-SoC last-level cache.