It's not just gluing, it actually require two Max chip to manufacture and since the die size is double, the risk is also double than before. Beside, Max chip itself is already big just like RTX 4080's size. Big die means more expensive. Ultra Fusion also requires yield which cost multiple times than Max. Ultra chip is twice bigger and therefore, much harder to mass produce thanks to Ultra Fusion. Someone explained this from 2:35. Beside, big die means low yield which is a basic logic. M1,2 Ultra is way bigger than 4090.
First, doubling the die size does not double the risk of defects. If the defect rate (the chance that a chip has a critical defect) is z, and defects are uniformly distributed across the wafer, then doubling the die size increases the defect rate by a factor of 2 – z.*
Second, you don't make an Ultra by doubling the die size, you make it by fusing two Max's. And there's a fundamental mathematical difference between the risk from fusing two chips of a given die size and the risk from doubling the die size (with the attendant economic consequences).
Let's use some arbitrary numbers to illustrate:
Suppose a wafer costs $30,000, and has room for either 200 Max chips or 100 monolithic Ultras (probably a bit less than 100, since the tiling ratio for squares onto a fixed circle decreases with the size of the squares, but we'll ignore that here).
Further suppose the critical defect rate is uniform, and that 70% of the Max chips are critical-defect-free (CDF). Then to get a CDF monolithic Ultra, you'd need both "Max halves" to be CDF, and the chance of that is 70% x 70%= 49%.
Thus you can get 200 x 70% = 140 Max chips per wafer, resulting in a cost of $30,000/140 = $214/Max chip. Hence if you make an Ultra by fusing two Max's, it's 2 x $214 = $428/fused Ultra chip, plus the cost of fusing (which would need to incorporate the failure rate for this step, which is entirely separate from the chip defect rate).
But if you're making a monolithic Ultra, then you can get 100 x 49% = 49 chips/wafer, and the cost would be $30,000/49= $612/monolithic Ultra chip.
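The arithmetic above can be sketched in a few lines of Python. This is only an illustration using the same arbitrary numbers; the function and variable names are made up for the example, and the cost of the fusing step is left out because no figure is given for it.

```python
def cost_per_good_chip(wafer_cost, chips_per_wafer, good_fraction):
    """Spread the wafer cost over the chips that have no critical defect."""
    return wafer_cost / (chips_per_wafer * good_fraction)


WAFER_COST = 30_000     # dollars; arbitrary figure from the example
MAX_PER_WAFER = 200     # Max-sized chips per wafer
ULTRA_PER_WAFER = 100   # monolithic Ultras per wafer (tiling loss ignored)
MAX_GOOD = 0.70         # fraction of Max chips that are critical-defect-free

max_cost = cost_per_good_chip(WAFER_COST, MAX_PER_WAFER, MAX_GOOD)

# Fused Ultra: two good Max chips, before adding the cost of fusing.
fused_ultra_cost = 2 * max_cost

# Monolithic Ultra: both "Max halves" must be defect-free at once.
mono_good = MAX_GOOD ** 2
mono_ultra_cost = cost_per_good_chip(WAFER_COST, ULTRA_PER_WAFER, mono_good)

print(f"Max chip:         ${max_cost:.2f}")
print(f"Fused Ultra:      ${fused_ultra_cost:.2f} (plus fusing)")
print(f"Monolithic Ultra: ${mono_ultra_cost:.2f}")
```

Without intermediate rounding the fused figure comes out at about $428.57; the $428 above comes from rounding the Max cost to $214 before doubling it.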
*We see this in the above example: It went from a 1-70% = 30% chance of a critical defect at Max size, to 1-49% = 51% at monolithic Ultra size, and 51%/30% = 1.7 = 2 – 0.3.
*Here's how you derive it: If the defect rate is z, then the chance of having zero defects is 1-z. And the chance of having zero defects over double the area is (1-z)^2. Then the chance of having a defect in a double-sized chip is:
1 – (1-z)^2. Thus the ratio of the defect rate in a double-sized chip to a regular-sized chip is:
(1 – (1-z)^2)/z = 2 – z.
To those who are wondering why we can't just take the defect rate in the Max chip and square it to get the defect rate in a monolithic Ultra: That would be calculating the wrong thing. There, we'd be calculating the chance that both halves of the monolithic chip have defects. We're not looking for that; we're instead looking for the chance that either half of the monolithic chip has defects. And the simplest way to calculate that is to look at the chance of the opposite case (that neither half has defects, which is (1-z)^2) and subtract that from 1, hence 1 – (1-z)^2.
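For anyone who wants to check the 2 – z factor numerically, here is a small Python sketch. The function name is made up for the example, and it assumes, as above, that the two halves pick up defects independently at the same rate z.

```python
def defect_chance_double(z):
    """Chance that a double-sized chip has at least one critical defect."""
    return 1 - (1 - z) ** 2


for z in (0.1, 0.3, 0.5):
    either = defect_chance_double(z)  # a defect in either half (what we want)
    both = z ** 2                     # a defect in both halves (the wrong thing)
    print(f"z={z}: either={either:.2f}, ratio={either / z:.2f}, "
          f"2-z={2 - z:.2f}, both={both:.2f}")
```

For every z the ratio column matches 2 – z, while squaring z gives a much smaller number, because it only counts the cases where both halves are bad.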