I'm having a hard time understanding how "koyoot" is not in any way suggesting the gender of the user behind this nickname.


End of off-topic.
 
Are you a puppy or a kitten? Or should we just call you "fluffy"?

http://www.urbandictionary.com/define.php?term=kyoot

Top definition
kyoot
From cute, something adorable and fluffy. Often used to refer to puppies and similarly kyoot objects.
Awww, that kitten is sooo kyoot!!

Omg! That's so kyoot!!
by wolfie November 14, 2004

I think that I'll go with "Fluffy".
 
Isn't my nick the phonetic version of this:
[Image: a coyote standing]
 
GV104 - 3072 CUDA cores, 256-bit GDDR6

At the same time, we can estimate the core count of GV107. GM107 was almost a third of GM104, and GP107 was almost a third of GP104. GV107 has to be exactly a third of GV104 because of the layout of the cores in the die. So: 1024 CUDA cores for GV107.

Current information is also not really "sure" about the memory for GV107. It could be either GDDR5 or GDDR5X.
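
A quick back-of-envelope sketch of that scaling argument (the Maxwell and Pascal core counts are the shipped parts, taking GM204/GTX 980 as the Maxwell x04-class die; the GV figures are estimates, not confirmed specs):

Code:
# "The x07 die is roughly a third of the x04 die" across the last two generations
ratios = {
    "Maxwell": 640 / 2048,   # GM107 (GTX 750 Ti) vs GM204 (GTX 980)   -> ~0.31
    "Pascal":  768 / 2560,   # GP107 (GTX 1050 Ti) vs GP104 (GTX 1080) -> 0.30
}
for gen, r in ratios.items():
    print(f"{gen}: the small die has {r:.2f} of the big die's CUDA cores")

# If GV104 really ships with 3072 CUDA cores and GV107 is exactly a third of it:
gv104_cores = 3072            # speculative figure, not a published spec
print("Estimated GV107 cores:", gv104_cores // 3)   # -> 1024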
 
In Texas, where they are plentiful, two variations exist. Go here and click the American pronunciation:

http://dictionary.cambridge.org/dictionary/english/coyote

Heck, I clicked it and it only pronounces one. You can deduce the other.


Sometimes in popular culture, hicks, hillbillies, or other "not too smart" and "countrified" folks pronounce it kai' yoot.

Sorry for the thread hijack.
 
The IPC increase from the jump from 192 cores/256 KB register file size (RFS) to 128 cores/256 KB RFS was around 33%.

The jump in IPC from 128 cores/256 KB RFS to 64 cores/256 KB RFS is around 50%.

That's why a 3072 CUDA core/GDDR6 GV104 with this layout should be around 65% faster than GP104, exceeding the performance of the GTX 1080 Ti in the same thermal envelope as the GTX 1080.

This is also the reason why a GTX 2050 Ti/GV107 should be very close in performance to the GTX 980 Ti. The performance difference between the GTX 1060 and the GTX 980 Ti in the current generation of games is 20%, and a 2050 Ti would be faster than a GTX 1060.
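
To make the arithmetic behind the 65% figure explicit, here is a sketch that works backwards from it; the core counts and the speedup are the post's own estimates, not published specs:

Code:
gp104_cores = 2560           # GTX 1080
gv104_cores = 3072           # speculative GV104
claimed_speedup = 1.65       # "around 65% faster than GP104"

core_ratio = gv104_cores / gp104_cores                # 1.2x more cores
implied_per_core_gain = claimed_speedup / core_ratio
print(f"Implied per-core throughput gain at similar clocks: {implied_per_core_gain:.2f}x")
# -> ~1.37x, which sits between the ~33% and ~50% IPC jumps quoted above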

In essence, next year we will be able to buy a 6-core/12-thread Ice Lake CPU from Intel with a 35 W TDP and a GTX 2050 Ti, together delivering the performance of an HEDT computer from 2-3 years ago at a quarter of the power consumption.

Actually pretty mind-blowing when you consider that there will be fanless designs of the GTX 2050 Ti...
 
Micron and SK Hynix are going to come up with 16 nm GDDR5 in 2018.

There will be new GPUs that use this memory, namely GV107. GDDR5X looks to be reserved for GV106.

Currently the 20 nm GDDR5 price is around $6 per chip, up from $4.50 because of demand. In the coming months we will see GPU prices increase because of this change. Let's just say that the memory subsystem cost for GP107 went from $18 to $25, for a GPU with an end price of $149.

The 16 nm process will lower manufacturing costs for all vendors. That is why I expect we will see a 192-bit memory bus on GV107, which should either maintain the $149 price tag or be $10 more expensive.

GDDR6 is twice as expensive as GDDR5 at the current stage, so it will only be seen in the highest-margin GPUs.
Eight memory chips, each costing $9, is quite expensive ;). Because of this, I expect a similar rollout for Volta as with Pascal: GV104 costing around $699 at the start, and the cut-down version around $450.
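
The cost figures above work out roughly like this (a sketch; the per-chip prices are the estimates quoted in this post, and the one-chip-per-32-bit-slice assumption is standard for GDDR5/GDDR5X/GDDR6):

Code:
def memory_cost(bus_width_bits, price_per_chip_usd, chip_width_bits=32):
    """Each GDDR5/5X/6 chip drives a 32-bit slice of the memory bus."""
    chips = bus_width_bits // chip_width_bits
    return chips, chips * price_per_chip_usd

print(memory_cost(128, 4.50))  # GP107-class board, 20 nm GDDR5 at the old price -> (4, 18.0)
print(memory_cost(128, 6.00))  # the same board at today's price                 -> (4, 24.0)
print(memory_cost(256, 9.00))  # eight GDDR6 chips on a 256-bit GV104            -> (8, 72.0)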

Nvidia has pushed itself into a similar corner as AMD did with the first GCN. The cores have such massive throughput that you need high enough bandwidth to feed them. Memory compression alone is not going to cut it, because the $149 chip will be fully ready for 1440p at Ultra settings.
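
For context on the bandwidth point: peak bandwidth scales linearly with bus width and per-pin data rate, so a narrow bus caps what the cores can be fed. A sketch (the data rates for the unreleased parts are assumptions):

Code:
def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbps_per_pin):
    return bus_width_bits * data_rate_gbps_per_pin / 8   # GB/s

print(peak_bandwidth_gb_s(128, 7))    # GTX 1050 Ti-class 128-bit GDDR5         -> 112 GB/s
print(peak_bandwidth_gb_s(192, 8))    # speculative 192-bit GDDR5 GV107         -> 192 GB/s
print(peak_bandwidth_gb_s(256, 14))   # 256-bit GDDR6 GV104, assuming 14 Gbps   -> 448 GB/s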

I mentioned Ice Lake before. So far it looks like a similar rollout to Coffee Lake: a first batch of CPUs in Q4 2018, and the rest of the stack later, in early 2019. Ice Lake and Volta will be one hell of a combo, but as of today I am getting ready to buy Coffee Lake and a Volta GV107 when it comes out. It will be one heck of an efficient machine.

From an architecture perspective, here are a few slides to understand how fine-grained Volta's scheduling is:
[Image: NVIDIA Volta V100 SM core diagram]

This is how it looks in a simpler form, without the cores. The SM is partitioned into four pieces, the sub-cores:
[Image: NVIDIA Volta GV100 SM diagram]

[Image: NVIDIA Volta V100 SM microarchitecture diagram]

This is how work then filters down into each sub-core:
[Image: NVIDIA Volta V100 sub-core diagram]

As you can see, each sub-core has its own L0 instruction cache and can work on a different part of the workload, independently of the other sub-cores. Extremely granular scheduling at a very low level will increase efficiency, because the cores are going to be fed all of the time.
And lastly: what scheduling looks like in the Volta L1 cache:
[Image: NVIDIA Volta V100 shared memory / L1 cache diagram]
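
To put rough numbers on that partitioning (a sketch using the published GV100 per-SM figures; the point is how evenly the SM's resources split across the four sub-cores):

Code:
# Published GV100 per-SM figures, divided across the four sub-cores
sm = {
    "fp32_cores": 64,
    "register_file_kb": 256,
    "warp_schedulers": 4,
}
sub_cores = 4
for resource, total in sm.items():
    print(f"per sub-core {resource}: {total // sub_cores}")
# -> 16 FP32 cores, 64 KB of register file and 1 warp scheduler per sub-core.
# Each sub-core also has its own L0 instruction cache and issues one warp
# instruction per clock, while the 128 KB L1/shared memory is shared by all four.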


Nvidia did an absolutely great job here.
 
It's a shame Apple will continue to ignore NVIDIA, despite how much better their GPUs are.
Nah. The only things Nvidia has better than AMD are Windows software and CUDA. Actual hardware IPC is lower than AMD's GCN, and with Volta Nvidia has actually only just pulled level with it. There is no point in looking for Nvidia under macOS; the software is better for AMD, and you will get better results. Unless you focus on Windows and gaming, in which case you will get better results with Nvidia.
 
Oh, so Vega can do more than 120 TFLOPs for Deep Learning? I must've missed that.

NVIDIA on macOS is a chicken-and-egg problem right now. Given the lack of official hardware over the last 4+ years, their investment in macOS software has clearly declined. If Apple were to return to NVIDIA for an official product, the software quality would improve.
 
There is only one Nvidia GPU that can do 120 TFLOPs of DL operations, and it will NEVER come to the Mac, so it is a moot point.

Apple's ML software is processed in FP16 in Metal 2. I think the amount of power available in Vega is enough for this, at this time at least. Navi will also bring DL tech to AMD GPUs, and I'm pretty sure they will have higher capabilities than Nvidia.

As for Nvidia's software, you forgot one thing. If IPC is the same for both companies, the only difference you will see in software is exactly that: software performance. If Nvidia has drivers of the same quality as AMD's in the Apple ecosystem, and if software is equally optimized for both companies, why would you pick the GPU from the vendor that is more expensive and gives you nothing in exchange?

Remember: there is no CUDA on the Mac, and there will not be. So what is the technical reason for a switch? The only understandable switch would be from Polaris 11 and 10 to GV107, 106 and 104. But the iMac Pro and Mac Pro GPUs should stay on AMD.
 
The rhetorical reply would be that there is no CUDA on Windows or on Linux either, nor will there ever be.
Does Microsoft or Linux have official support for any API?

Apple does. And that changes a lot.
 