Misleading article, as the yields are better at the outset than they were with the 5nm startup, where TSMC went from a 50% yield to an 80% yield within a month.

Expect that to happen also.

I still believe this is more about shifting M2 inventory than any real shortage. Apple would know from the 5nm experience that yields at the outset would be around 50%, and in fact 3nm exceeds that at 55%.

So 3nm already yields better than 5nm did at this stage, so expect a ramp-up in yields, and again, Apple would know this.
That's a bogus claim. It takes about a month to produce a chip. There is no way the yield could be improved that much in just one iteration.
 
Now they’re going to stay on the real axis and not introduce complex numbers into the mix.

If you add chiplets, especially in the 3rd dimension, you increase silicon density per unit area. I think chiplets are part of the future of silicon. Otherwise, why create UCIe?
You increase transistor density per unit package area. Not per unit wafer area.
Now that can be useful, but it describes something else. It doesn't change the cost per processed silicon wafer or transistor. (If anything, the stacking adds processing steps.)

As I remarked earlier, the strategies we have for actually increasing density require more processing steps in themselves, and thus increase costs. And they increase density only quite modestly. Btw, Anandtech just published a readable report from an ongoing symposium where TSMC demonstrated the current state of their N2 process plans, estimating an aggregate density increase from N5->N2 of just under 50%. Over the same time span, Moore's law would have predicted 800%+ ... (a more detailed, less readable account here.)
It's over, and has been for a while; it's just that the lithographic steam train grinds to a gradual halt, not a dead stop. For the more technically minded, here is an article that discusses the limitations of High-NA EUV. There are a ton of really bright minds working to extend lithographic advances just a bit more, and even more money thrown at the problem, but at some point the gains will be too marginal to make sense for a large enough portion of the market to drive the necessary investments. And as we can see right now, even with those investments the gains are quite small, not even close to Moore's law.
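For a back-of-the-envelope check of that gap, here is a quick Python sketch (the roughly six-year N5-to-N2 span and the two-year doubling cadence are assumptions for illustration, not TSMC figures):

```python
# Moore's-law expectation vs. TSMC's stated aggregate N5 -> N2 density gain.
# Assumed: ~6 years between N5 and N2 HVM, density doubling every 2 years.
years = 6
moore_factor = 2 ** (years / 2)   # predicted density multiple over that span
actual_factor = 1.5               # "just under 50%" aggregate increase

print(f"Moore's law predicts ~{moore_factor:.0f}x; TSMC projects ~{actual_factor:.1f}x")
```

On those assumptions the classic cadence would predict about an 8x density multiple, against the roughly 1.5x actually projected.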
 
3 nm is unbelievably small. It’s quite amazing that we are able to produce at that size. No wonder the yield isn’t that good yet.
 
That's a bogus claim. It takes about a month to produce a chip. There is no way the yield could be improved that much in just one iteration.
Cycle time for N3 is 4 months. That means the first quantitative measurement of final yield takes 4 months, so yield ramping will be >4 months. There are intermediate points at which they can evaluate the process so improvements are continuous during the first cycle as well.
 
Now they’re going to stay on the real axis and not introduce complex numbers into the mix.

If you add chiplets, especially in the 3rd dimension, you increase silicon density per unit area. I think chiplets are part of the future of silicon. Otherwise, why create UCIe?

There are pragmatic technical limits on what you can stack, also. For very high performance there are limiting thermal constraints (e.g. AMD's 3D cache just stacks cache over cache).

Chiplets will help where you can increasingly kick a larger fraction of the parts that don't scale anymore off the die. But that gets tricky if you don't want to take a perf/watt hit. All the workarounds have trade-offs: thermal complications, costs, etc.

UCIe is far more about controlling costs than providing density increases. If there is a standard way for package building blocks to be connected, put together by a wider set of designers/builders, then there is a bigger ecosystem to spread development costs over. It gets cheaper to build more specialized SoC packages that address narrower areas. [e.g. companies A & B build specialized 'core only' dies, company C makes a PCI-e and general I/O die, company D makes an optical networking die, and company E makes a memory + L4 cache die. A + C + E, A + D + E, B + C + D + E, and B + E + E + C are variations of packages that can be put together. The completed package still gets the "System on a Chip" effect, but uses multiple dies to get there.] As the cost of fabbing the ultra-performance core die goes up, if you make that die smaller by covering less and make the complements cheaper, then you have wrangled some control over overall system cost increases.

This sometimes happens now. Folks can license PCI-e or memory controller IP designs, and can license Arm core designs, and then there is some work to put them all on the same die. The problem is that the fabrication workarounds going forward are not universally effective. SRAM stopped scaling. Analog I/O stopped scaling. So it isn't just generic IP that a licensee can simply lay down on top of a completely general-purpose fab methodology.

UCIe has a chance to usher in a new era that shifts somewhat away from the generic, 'do everything' CPU era, where the generic general-purpose processor just got cheaper and cheaper (at higher and higher clocks), allowing it to be applied to more and more use cases. There will still be a market for generic, general-purpose machines, but their encroachment on top-end performance and/or high-value-add computation will get weakened a bit.
 
There are pragmatic technical limits on what you can stack, also. For very high performance there are limiting thermal constraints (e.g. AMD's 3D cache just stacks cache over cache).

Chiplets will help where you can increasingly kick a larger fraction of the parts that don't scale anymore off the die. But that gets tricky if you don't want to take a perf/watt hit. All the workarounds have trade-offs: thermal complications, costs, etc.

UCIe is far more about controlling costs than providing density increases. If there is a standard way for package building blocks to be connected, put together by a wider set of designers/builders, then there is a bigger ecosystem to spread development costs over. It gets cheaper to build more specialized SoC packages that address narrower areas. [e.g. companies A & B build specialized 'core only' dies, company C makes a PCI-e and general I/O die, company D makes an optical networking die, and company E makes a memory + L4 cache die. A + C + E, A + D + E, B + C + D + E, and B + E + E + C are variations of packages that can be put together. The completed package still gets the "System on a Chip" effect, but uses multiple dies to get there.] As the cost of fabbing the ultra-performance core die goes up, if you make that die smaller by covering less and make the complements cheaper, then you have wrangled some control over overall system cost increases.

This sometimes happens now. Folks can license PCI-e or memory controller IP designs, and can license Arm core designs, and then there is some work to put them all on the same die. The problem is that the fabrication workarounds going forward are not universally effective. SRAM stopped scaling. Analog I/O stopped scaling. So it isn't just generic IP that a licensee can simply lay down on top of a completely general-purpose fab methodology.

UCIe has a chance to usher in a new era that shifts somewhat away from the generic, 'do everything' CPU era, where the generic general-purpose processor just got cheaper and cheaper (at higher and higher clocks), allowing it to be applied to more and more use cases. There will still be a market for generic, general-purpose machines, but their encroachment on top-end performance and/or high-value-add computation will get weakened a bit.
Yes, but with chiplets you can use different lithography processes to make the various components. Not everything needs to be made on the same node, so for the components that aren't scaling as well, you can use older/cheaper nodes for them (the I/O chiplet, for example). The more advanced logic, e.g., general-purpose compute, and specialized compute like accelerators, can be put on the newer, more dense nodes. If you can stack additional chiplets or cache in 3 dimensions, like AMD is attempting with Zen4x3d, then you can fit other logic/memory in the 3rd dimension. Then you tie the chiplets together with a high-speed, low-latency interposer.

So from the outside looking in, in the same die area, you can fit more silicon in the same space. And by not using a pure monolithic design, you don't have to throw away the whole chip if there is a manufacturing defect in one of the chiplets. Just use another chiplet. Yields can improve.

Of course, one issue is with heat management but I think the chiplet strategy is a path forward for keeping Moore's Law somewhat alive.

It will be interesting to see if a packaging company like Intel or others use chiplets from vendors, like an Intel Core chiplet, an Arm/RISC-V chiplet, an Aquantia Ethernet controller chiplet, an accelerator for video encode/decode, etc. PCI and PCIe were among the best innovations to occur in the PC space. We shall see if and how that model scales down to chiplets with UCIe.
 
That's a bogus claim. It takes about a month to produce a chip. There is no way the yield could be improved that much in just one iteration.


TSMC N3 has been in 'at risk' production for over a year. It is a continuous feedback cycle that feeds corrections/improvements back into the pipeline of wafers being processed. There is a quality improvement in how effective the corrections are when you do a larger statistical sampling of the adjustments, in addition to the cycle length.



However,

Misleading article, as the yields are better at the outset than they were with the 5nm startup, where TSMC went from a 50% yield to an 80% yield within a month.


Even if yields were incrementally better, the dramatic increase in wafer costs means that 55% isn't as economically good as the 50% of N5. If the wafer cost is up >20%, then that incremental 5% isn't a 'winner' if you are trying to control final system costs. Part of the problem here is that what is "good enough" to tag something as ready for "high volume production" is likely going to diverge a bit between TSMC and their customers.
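To put rough numbers on that, here is a minimal sketch of the cost-per-good-die arithmetic. The wafer costs and die count are hypothetical placeholders, and it assumes the same die size on both nodes purely for comparison:

```python
# Cost per good die = wafer cost / (dies per wafer * yield).
# All figures are illustrative placeholders, not actual TSMC pricing.
dies_per_wafer = 100             # assume identical die counts on both nodes

n5_cost, n5_yield = 1.00, 0.50   # normalized N5 wafer cost, ~50% early yield
n3_cost, n3_yield = 1.20, 0.55   # >20% pricier N3 wafer, ~55% early yield

n5_per_good = n5_cost / (dies_per_wafer * n5_yield)
n3_per_good = n3_cost / (dies_per_wafer * n3_yield)

print(f"N5: {n5_per_good:.4f} per good die, N3: {n3_per_good:.4f} per good die")
```

On these placeholder numbers, the 5-point yield edge does not offset the 20% wafer-cost premium: the N3 good die still comes out roughly 9% more expensive.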

Also, you might mean a quarter. TSMC's own chart (without exact detail) reveals it wasn't a month.


[TSMC chart: yield-ramp curves from its "Manufacturing Excellence" presentation]



The bulk of the very rapid yield gains happen before TSMC tags something as "Mass Production" / "High Volume Production". The easier-to-fix stuff happens pretty early, in the early 'at risk' phase and initial ramp. N3 probably has a substantively different curve (because it is much less a straightforward adjustment on the same theme), but big chunks are still likely front-loaded multiple quarters back from MP.

Even several iterations back the bake times were inside of a month.

From the EETimes article


" ... we believe N3 yields at TSMC for A17 and M3 processors are at around 55% [a healthy level at this stage in N3 development], and TSMC looks on schedule to boost yields by around 5+ points each quarter.” ..."
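Taking the quoted figures at face value, a trivial projection sketch (illustrative only; real ramps flatten out as a process matures, and the ~95% ceiling here is an assumption):

```python
# Project N3 yield from ~55% with the reported ~5 points per quarter.
yield_pct = 55.0
for q in range(1, 5):
    yield_pct = min(yield_pct + 5.0, 95.0)   # assume ~95% as a mature ceiling
    print(f"+{q} quarter(s): ~{yield_pct:.0f}%")
```

That would put yields around 75% a year out, if the linear ramp held.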



There has been lots of "doom and gloom" painted on N3, though. It is likely the progression during the -Q4, -Q3, -Q2 period wasn't in chunks as rapid or as large as with N5 or N7. It took all the way to the end of the predicted 2H '22 window to get to 'MP' status.



I still believe this is more about shifting M2 inventory than any real shortage. Apple would know from the 5nm experience that yields at the outset would be around 50%, and in fact 3nm exceeds that at 55%.

M2 isn't made on the same production line as N3B (or N3E). N3B and N3E may have high overlap, but M2 is off in another factory. And the M2 inventory shift was probably a mix of demand changes and the M2 Pro/Max sliding about a quarter into the future. If the quarter slide was known early enough, Apple could have bulked up on plain M2 to consume the wafers their original plan had going to the M2 Pro/Max, and then killed off the plain M2 flow closer to where they ramped the M2 Pro/Max (moving part of the bubble in the wafer assignment flow to later): make too many plain M2's and then stop, to even it back out.



That N3B takes substantively longer to make creates a problem for Apple's technologically arbitrary September deadline.


So 3nm already yields better than 5nm did at this stage, so expect a ramp-up in yields, and again, Apple would know this.

The longer cycle time for N3B means that the yield adjustments will be slower than on N5. A flatter curve. But more so for Apple, it makes their "just in time" inventory control more coarse-grained. If they say 'slow down'/'speed up' and it takes 4 months for the output to slow/speed up ... that is a bigger issue. It's more like driving a very long and extremely heavy freight train than a tractor-trailer truck.


One of the primary reasons N3E is coming, and has a much higher adoption rate, is that the 'bake time' is faster and the yields will be better sooner. The trade-off is that there isn't as much of a density increase. It is easier to make (and hence cheaper to buy), but not as much of a gap in what you get.
 
Any indication anywhere that TSMC has started N3E?

'started' -- as in working on? Yes. About a year ago TSMC said N3E would arrive about a year after N3 (N3 was 2H '22, so N3E was 2H '23). A bit later they said N3E would start about a quarter earlier than N3's start. So if N3 finally went to mass production (high volume manufacturing) in Dec '22 (end of Q4), that would make end of Q3 roughly the August-September timeframe. Still in the 2H '23 timeframe, just toward the beginning of that target range rather than the end.

N3E is simpler to make, but is still pretty likely looking at about 3-month bake times. So even if wafer blanks went into an HVM production line that started in July, Apple would not have product until the October-November timeframe.

'started' -- as in high volume manufacturing status (a relatively high number of wafers)? No. TSMC has still recently stated 2H '23 (the earliest of which would be July).

'started' -- as in 'at risk', where customers are creating a substantive number of dies using more than a handful of wafers for product validation testing? Yes. (TSMC has tape-outs for N3E products.)



It looks like N3P will reverse some of the density backslide that N3E takes. I wouldn't be surprised if Apple skips N3E, at least for the relatively larger M-series dies. The 2H '24 start for N3P HVM is yet again not a good fit for the need to ramp the iPhone SoC in the 1H of the year. That leaves some hope for N3E. Apple could just do slightly more conservative rollouts:

for year n, shift to using the TSMC fab process that went to HVM in 2H of year n-1. Then they wouldn't have to worry so much if there were a delay, or if early yields were 'high enough', etc.
 
Yes, but with chiplets you can use different lithography processes to make the various components. Not everything needs to be made on the same node, so for the components that aren't scaling as well, you can use older/cheaper nodes for them (the I/O chiplet, for example). The more advanced logic, e.g., general-purpose compute, and specialized compute like accelerators, can be put on the newer, more dense nodes. If you can stack additional chiplets or cache in 3 dimensions, like AMD is attempting with Zen4x3d, then you can fit other logic/memory in the 3rd dimension. Then you tie the chiplets together with a high-speed, low-latency interposer.

So from the outside looking in, in the same die area, you can fit more silicon in the same space. And by not using a pure monolithic design, you don't have to throw away the whole chip if there is a manufacturing defect in one of the chiplets. Just use another chiplet. Yields can improve.

1. The chiplets themselves, even though they cover a subset, are still going to have defects on some dies. Chopping a 100mm^2 die up into 4-5 24mm^2 dies isn't going to have the same yield uplift as chopping some 450mm^2 die into 225mm^2 or 125mm^2 sized chunks.


2. "In the same die area" really should be "in the same package area". You really haven't 'maintained' Moore's Law if you start using more volume. The density isn't really going up; you've just shifted to consuming a different dimension. It's bad enough that "3nm" doesn't really measure actual physical features; collapsing density across multiple synthesized dimensions into a single one-dimensional number just goes further into the 'swamp'.

3. "Components that aren't scaling as well" ... again, that is totally indicative that Moore's Law has 'died' on you. If complementary chiplets stop shrinking, that will bring new problems. (If core count goes up but cache size is flat, that isn't going to be generally useful on a broad set of workloads. You'd have more cores with a higher memory bandwidth consumption threshold and relatively little more memory bandwidth. "Out of order" execution leans heavily on branch prediction, and branch prediction relies on a hyper-local cache of values. If that cache doesn't shrink well, you have problems, because the level of physical locality coupling is very high.)
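The yield point in (1) can be sketched with a simple Poisson defect model, Y = exp(-D0 * A). The defect density and the round die sizes here are arbitrary illustrative values, not real TSMC numbers:

```python
import math

def poisson_yield(area_mm2: float, d0_per_mm2: float = 0.001) -> float:
    """Fraction of defect-free dies under a Poisson model: Y = exp(-D0 * A)."""
    return math.exp(-d0_per_mm2 * area_mm2)

# Big die: 450 mm^2 monolithic vs. a ~112.5 mm^2 chiplet (one quarter of it).
big_mono, big_chiplet = poisson_yield(450), poisson_yield(112.5)
# Small die: 100 mm^2 monolithic vs. a 25 mm^2 chiplet (one quarter of it).
small_mono, small_chiplet = poisson_yield(100), poisson_yield(25)

print(f"450 mm^2: {big_mono:.2f} -> 112.5 mm^2 chiplet: {big_chiplet:.2f}")
print(f"100 mm^2: {small_mono:.2f} -> 25 mm^2 chiplet: {small_chiplet:.2f}")
```

With these illustrative numbers the relative per-die yield uplift is around 40% when splitting the big die, but under 8% for the small one: the asymmetry described in (1).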

It will be interesting to see if a packaging company like Intel or others use chiplets from vendors, like an Intel Core chiplet, an Arm/RISC-V chiplet, an Aquantia Ethernet controller chiplet, an accelerator for video encode/decode, etc. PCI and PCIe were among the best innovations to occur in the PC space. We shall see if and how that model scales down to chiplets with UCIe.

PCI/PCI-e is a very late-binding, "just in time" way to compose systems. System builders used it, and not so much chip package builders. UCIe is going to be a relatively early binding (soldered to the interposer with as small interface pads/connectors as can possibly be used).

Minimally, it won't be as broad. It will likely require far more technical support and cooperation between the components being designed.
 
Yes, but with chiplets you can use different lithography processes to make the various components. Not everything needs to be made on the same node, so for the components that aren't scaling as well, you can use older/cheaper nodes for them (the I/O chiplet, for example). The more advanced logic, e.g., general-purpose compute, and specialized compute like accelerators, can be put on the newer, more dense nodes. If you can stack additional chiplets or cache in 3 dimensions, like AMD is attempting with Zen4x3d, then you can fit other logic/memory in the 3rd dimension. Then you tie the chiplets together with a high-speed, low-latency interposer.

So from the outside looking in, in the same die area, you can fit more silicon in the same space. And by not using a pure monolithic design, you don't have to throw away the whole chip if there is a manufacturing defect in one of the chiplets. Just use another chiplet. Yields can improve.

Of course, one issue is with heat management but I think the chiplet strategy is a path forward for keeping Moore's Law somewhat alive.

It will be interesting to see if a packaging company like Intel or others use chiplets from vendors, like an Intel Core chiplet, an Arm/RISC-V chiplet, an Aquantia Ethernet controller chiplet, an accelerator for video encode/decode, etc. PCI and PCIe were among the best innovations to occur in the PC space. We shall see if and how that model scales down to chiplets with UCIe.
You do realize how the M1 Ultra works, do you not? It's not like Apple is unaware of how chiplets operate! Not to mention that Apple has been pushing fancier packaging for a decade now, at a more uniform pace than Intel.

I'm sure they will switch to options like chiplets *when it makes sense* which is a rather different point from when the internet hype machine says it should happen.
 
You do realize how the M1 Ultra works, do you not? It's not like Apple is unaware of how chiplets operate! Not to mention that Apple has been pushing fancier packaging for a decade now, at a more uniform pace than Intel.

I'm sure they will switch to options like chiplets *when it makes sense* which is a rather different point from when the internet hype machine says it should happen.
Of course I know about the Ultra. I was talking about UCIe! So not chiplets from Apple, but rather a broader chiplet strategy employed across the PC/computing space.
 
1. The chiplets themselves, even though they cover a subset, are still going to have defects on some dies. Chopping a 100mm^2 die up into 4-5 24mm^2 dies isn't going to have the same yield uplift as chopping some 450mm^2 die into 225mm^2 or 125mm^2 sized chunks.


2. "In the same die area" really should be "in the same package area". You really haven't 'maintained' Moore's Law if you start using more volume. The density isn't really going up; you've just shifted to consuming a different dimension. It's bad enough that "3nm" doesn't really measure actual physical features; collapsing density across multiple synthesized dimensions into a single one-dimensional number just goes further into the 'swamp'.

3. "Components that aren't scaling as well" ... again, that is totally indicative that Moore's Law has 'died' on you. If complementary chiplets stop shrinking, that will bring new problems. (If core count goes up but cache size is flat, that isn't going to be generally useful on a broad set of workloads. You'd have more cores with a higher memory bandwidth consumption threshold and relatively little more memory bandwidth. "Out of order" execution leans heavily on branch prediction, and branch prediction relies on a hyper-local cache of values. If that cache doesn't shrink well, you have problems, because the level of physical locality coupling is very high.)



PCI/PCI-e is a very late-binding, "just in time" way to compose systems. System builders used it, and not so much chip package builders. UCIe is going to be a relatively early binding (soldered to the interposer with as small interface pads/connectors as can possibly be used).

Minimally, it won't be as broad. It will likely require far more technical support and cooperation between the components being designed.
Very fair points. I’m just interested to see what, if anything, comes of ucie.
 