Oh I agree, they should not have done that. In a perfect world CUDA would be available for AMD cards.

Pretty sure AMD released a tool that converts CUDA to their architecture, so it basically already is. A quick Google search reveals:

https://github.com/ROCm-Developer-Tools/HIP

Edit: Granted, code that is tuned for the NVIDIA architecture is likely not to run very well on the AMD architecture, just as code tuned for AMD's GPUs might not run as efficiently on NVIDIA.
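To give a rough idea of what the converter actually does, here is a minimal sketch (a toy example of my own, not something from AMD's docs and not benchmarked): the CUDA runtime calls get a mechanical cuda-to-hip rename and the triple-chevron kernel launch becomes hipLaunchKernelGGL, while the kernel body itself compiles unchanged.

```cpp
// Hypothetical SAXPY after running the original CUDA source through hipify.
// The only differences from the CUDA version are the header, the hip*
// runtime calls (formerly cuda*), and the launch macro.
#include <hip/hip_runtime.h>
#include <cstdio>
#include <vector>

__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // device code: unchanged
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<float> x(n, 1.0f), y(n, 2.0f);
    float *d_x = nullptr, *d_y = nullptr;

    hipMalloc(reinterpret_cast<void**>(&d_x), n * sizeof(float));      // was cudaMalloc
    hipMalloc(reinterpret_cast<void**>(&d_y), n * sizeof(float));
    hipMemcpy(d_x, x.data(), n * sizeof(float), hipMemcpyHostToDevice); // was cudaMemcpy
    hipMemcpy(d_y, y.data(), n * sizeof(float), hipMemcpyHostToDevice);

    // was: saxpy<<<blocks, threads>>>(n, 2.0f, d_x, d_y);
    hipLaunchKernelGGL(saxpy, dim3((n + 255) / 256), dim3(256), 0, 0,
                       n, 2.0f, d_x, d_y);

    hipMemcpy(y.data(), d_y, n * sizeof(float), hipMemcpyDeviceToHost);
    std::printf("y[0] = %f\n", y[0]);  // expect 4.0
    hipFree(d_x);
    hipFree(d_y);
    return 0;
}
```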
 
Just to point out that "gaming" engines are increasingly the environment on which professional applications are built: the better your "gaming" performance, the better your VR / Archviz / 3D workspace performance, which matters a lot more to the perceived speed and utility of a tool than shaving a few seconds off a queued task.
 
Say no more.... What's the point of translating code that runs poorly?

I should've said "out of the box". Once you have it running on your AMD GPU, you can of course tune it so that it runs better, without having to do all the hard work of porting the CUDA code in the first place.
 
Say no more.... What's the point of translating code that runs poorly?
AMD's HIP translates about 96% of CUDA code automatically. The remaining 4% is where your optimization effort goes. And it translates that code not just for AMD GPUs, but for ALL GPUs.
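To be fair, that remaining few percent is exactly the non-mechanical part. A typical example (a sketch of mine, not something I've benchmarked): CUDA code often hardcodes a warp size of 32, while AMD's GCN wavefronts are 64 lanes wide, so portable HIP code has to use the built-in warpSize instead.

```cpp
// Illustrative only: the kind of detail the automatic translation does not
// fix for you. warpSize is 32 on NVIDIA hardware and 64 on AMD GCN.
#include <hip/hip_runtime.h>

__global__ void laneAndWave(int* lane, int* wave, int n) {
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid < n) {
        lane[gid] = threadIdx.x % warpSize;  // 0..31 on NVIDIA, 0..63 on AMD
        wave[gid] = threadIdx.x / warpSize;  // warp/wavefront index within the block
    }
}
```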
 
Say no more.... What's the point of translating code that runs poorly?
That's the price with all of these tools: performance is usually very poor. They also don't work on every project out there, and good performance is usually only achieved on proof-of-concept projects. The rest is manual work that has to be done. This might be worth it when building a software base that will serve for years to come. Doing it for projects just to verify results or try some modifications is usually not worth it. I'd say we keep maybe 10% of the code from other projects, and the rest ends up in the trash after we've tested and played with it. The timeframe for this is usually 3 to 6 months max, and then the cycle repeats.
 
That's the price with all of these tools: performance is usually very poor. They also don't work on every project out there, and good performance is usually only achieved on proof-of-concept projects. The rest is manual work that has to be done. This might be worth it when building a software base that will serve for years to come. Doing it for projects just to verify results or try some modifications is usually not worth it. I'd say we keep maybe 10% of the code from other projects, and the rest ends up in the trash after we've tested and played with it. The timeframe for this is usually 3 to 6 months max, and then the cycle repeats.
Have you actually translated any code from CUDA to ROCm OpenCL with HIP, to be able to say something like that?

Has ANYONE here ever translated anything from CUDA to OpenCL with HIP, to talk about its performance?

I do not need your answers. I know them already.
 
I do not need your answers. I know them already.
Unfortunately, the things you know amount to very little. Drinking the Kool-Aid for any manufacturer doesn't really get you very far in the real world. I have no doubt AMD works for your specific area of application, and that's what matters for each individual case. Keep believing everyone is wrong and you're the only one who's right. If you research, publish, and have to collaborate with people from around the world on different projects, it becomes apparent that there are solutions out there that might work in some cases, but not in others. If it's so great and works without problems, why isn't everyone using it? Why are we pretty much locked into an NVIDIA ecosystem to be able to do the things we need to do without adding tons of work as overhead? Unless everyone steps out of it at the same time, nothing is going to change. AMD's GPU market share fell from 2016 to 2017. We'll have to wait for the 2018 numbers, but I doubt it'll climb.

Tell you what, why don't you write and publish a paper and speak at an HPC conference to let everyone know how to do it right? Also let people know how to save tons of money by not buying NVIDIA clusters, replacing them with AMD, and where to get support from AMD on the level that NVIDIA provides. Please let us know when and where you publish it and what conference you'll speak at. I'll try to be there, and I can guarantee a lot of people from the scientific community around the world will be very interested in your work. This will get you much further than rambling on the internet, claiming things that are clearly the opposite of what the real world is like. Don't take my word for it; visit universities and research centers and have a look for yourself.
 
Unfortunately, the things you know amount to very little. Drinking the Kool-Aid for any manufacturer doesn't really get you very far in the real world. I have no doubt AMD works for your specific area of application, and that's what matters for each individual case. Keep believing everyone is wrong and you're the only one who's right. If you research, publish, and have to collaborate with people from around the world on different projects, it becomes apparent that there are solutions out there that might work in some cases, but not in others. If it's so great and works without problems, why isn't everyone using it? Why are we pretty much locked into an NVIDIA ecosystem to be able to do the things we need to do without adding tons of work as overhead? Unless everyone steps out of it at the same time, nothing is going to change. AMD's GPU market share fell from 2016 to 2017. We'll have to wait for the 2018 numbers, but I doubt it'll climb.

Tell you what, why don't you write and publish a paper and speak at an HPC conference to let everyone know how to do it right? Also let people know how to save tons of money by not buying NVIDIA clusters, replacing them with AMD, and where to get support from AMD on the level that NVIDIA provides. Please let us know when and where you publish it and what conference you'll speak at. I'll try to be there, and I can guarantee a lot of people from the scientific community around the world will be very interested in your work. This will get you much further than rambling on the internet, claiming things that are clearly the opposite of what the real world is like. Don't take my word for it; visit universities and research centers and have a look for yourself.
I asked you a very specific, simple question.

Have you, or has anyone here, EVER ported ANYTHING from CUDA to OpenCL using HIP, such that you can say anything about the performance of the code?

You answered: "drinking some Kool-Aid for any manufacturer...".

I told you I did not need your answers, because you have NEVER PORTED ANYTHING from CUDA to OpenCL using HIP, and yet you talk about the performance of the code and how it runs.

I do not care about what HPC needs. There are tons of clueless people in the industry who believe they know everything even though they have NEVER used the other company's products, yet they talk about their usability. About performance. About code that runs inefficiently.

If you want to at least SOUND smart: port code from CUDA to OpenCL, optimize it, and test it. Then talk.

Here is a nice overview of ROCm on Linux: https://www.phoronix.com/scan.php?page=news_item&px=ROCm-Compute-Stack-Overview

And one last little bit: from the evidence gathered, it appears it is not me who drank some manufacturer's Kool-Aid...
 
And I've answered that question before. I do not comment on things I have not done, period. That is how serious scientific research works.
And please do not spread lies about other people. It's bad enough you do it about other things. Oh, how I love the internet. Your attitude here shows you know little; it might be true that you don't care about others, so speak for yourself and not for other people and their work. There are plenty of papers out there going from proof-of-concept to real-world examples. Everyone can check those out (you might have to pay for them on some portals). Again: publish, speak at a conference, let people know. I doubt it will happen, but there needs to be a separation between baseless internet ramblings and serious, applied work.

What I and many other people need are solutions that make our work easier. We don't care where they come from. The current market situation is showing a clear trend.
 
OpenCL is the solution. But the community chooses to be locked in.
That might indeed be the case, yes. But the problem is, there are good reasons to choose to be locked in (see my other posts). So AMD has to seriously up their game, and even then, it's not something that's going to happen in a year or two, or anytime soon. I see this among my colleagues (focused on big data); most are not even willing to consider anything else due to the additional work required. We need real alternatives that work at the push of a button.
 
That might indeed be the case, yes. But the problem is, there are good reasons to choose to be locked in (see my other posts). So AMD has to seriously up their game, and even then, it's not something that's going to happen in a year or two, or anytime soon. I see this among my colleagues (focused on big data); most are not even willing to consider anything else due to the additional work required. We need real alternatives that work at the push of a button.
With properly optimized code, AMD GPUs are as fast as Nvidia's, no matter what AMD does with their hardware (ALUs are just that: ALUs). If you do not know this, you haven't ported your code properly, or even used an AMD GPU with properly ported code. There are plenty of things AMD does not have a solution for yet - a lack of libraries - and those cannot be ported as easily as simple CUDA code.

You cannot say "up your game" to AMD if in the same paragraph you say that most are not even willing to consider anything else due to the additional work required. I'm sorry, but that proves the point of what I said: the HPC industry is full of clueless people.

No matter how good the ROCm platform becomes, it will always be the second choice for people in this industry because of the mindshare Nvidia has. Period. I've been here for the past 5 years. I'm done explaining the simplest things to people.

Best example? Ray tracing. AMD has a tremendous tool, the ProRender engine, which lets you do real-time ray tracing on EVERYTHING: Intel, AMD, Nvidia, PowerVR / Imagination Technologies GPUs, and so on.

NOBODY is interested. But when Nvidia announces RTX - IT'S A REVOLUTION!!!111oneoneone We need to buy more Nvidia GPUs!

Simple as it can be.
 
Just to recap (as I am about to decide which GPGPU path to go):
Historically (that is, looking back some 8 years or so) there have been two solutions: OpenCL and CUDA. Now, I looked into both without getting too involved, and from what I researched I came to the following conclusion:
CUDA seems to have won the GPGPU wars, at least for now. CUDA appears to be superior, mostly because it is way easier to program and in addition was/is better documented/supported.
This view is also corroborated by the fact that CUDA is the industry's de-facto standard, as @GrumpyCoder mentioned.

That said, AMD seems to have decided to not accept this and at least try to challenge nVidia for its ML/AI crown.

ROCm seems to be an interesting approach. Has anyone gained any experience with CUDA and ROCm? Say, if you had to develop some new piece of software (as opposed to porting some existing system) - what are the benefits / caveats of using either library?
 
I am - time to upgrade about 30 of them.
I see this a lot, and I'm curious what we'll do for our clusters. I'll try renting in the cloud for my projects in the near future, but I obviously can't speak for everyone in the world (or even for my colleagues). We just need the Quadro cards in some cases. Then again, since the whole world has no clue what they're talking about, maybe everyone should be educated and learn how to do things properly. Then NVIDIA would be bankrupt next year and we wouldn't have to buy these expensive cards. :p
 
Say, if you had to develop some new piece of software (as opposed to porting some existing system) - what are the benefits / caveats of using either library?
I briefly mentioned this: if you're starting fresh, or your code base will remain for years and you don't plan to switch, pick whatever suits you most. If you don't need to port, don't need to scale to huge clusters, and don't need support from the vendor, it doesn't really matter, assuming no platform is discontinued at some point.

You're right, CUDA is a little easier to get into, but that's maybe just for beginners. I see this with my bachelor/master computer science students. If you're a little familiar with game engines, I'd say it's like the difference between getting started with Unity vs Unreal.

As far as performance goes, some things are better in OpenCL, some in CUDA. It all depends on what you want to do. I'd recommend making a list or decision matrix based on your specific requirements and go from there. There are plenty of papers available on the subject and if you're only working on one project it's well worth the time to do the research in advance and then make a decision. Here's one I've looked at a while ago (at least that's what my reference manager tells me): https://dl.acm.org/citation.cfm?id=3110356.

But again, there's plenty more available. I usually start with a Google Scholar search for publications and go from there.
 
Thx for the link, but my student days are behind me. FYI, the article is also available here, without the paywall: https://arxiv.org/pdf/1704.05316.pdf

OpenCL is out of the question; it's either ROCm or CUDA. The availability of nVidia cards in the cloud suggests going that path; however, the vendor lock-in and the fact that my main machine is a Mac with no intention to change that (no nVidia web drivers for Mojave available as of yet) suggest AMD is the way to go.

Edit: it seems there is no ROCm support for OSes other than Linux. Guess the decision is clear then.

Thanks to all
 
I briefly mentioned this: if you're starting fresh, or your code base will remain for years and you don't plan to switch, pick whatever suits you most. If you don't need to port, don't need to scale to huge clusters, and don't need support from the vendor, it doesn't really matter, assuming no platform is discontinued at some point.

You're right, CUDA is a little easier to get into, but that's maybe just for beginners. I see this with my bachelor/master computer science students. If you're a little familiar with game engines, I'd say it's like the difference between getting started with Unity vs Unreal.

As far as performance goes, some things are better in OpenCL, some in CUDA. It all depends on what you want to do. I'd recommend making a list or decision matrix based on your specific requirements and go from there. There are plenty of papers available on the subject and if you're only working on one project it's well worth the time to do the research in advance and then make a decision. Here's one I've looked at a while ago (at least that's what my reference manager tells me): https://dl.acm.org/citation.cfm?id=3110356.

But again, there's plenty more available. I usually start with a Google Scholar search for publications and go from there.
I don't disagree - but this comment seems to assume that one is creating an independent standalone application.

In my field (ML and AI), there are huge libraries of code to help you build your application. If you need an FFT or a deep neural network - you don't code CUDA operations for the FFT or the DNN.

You use higher level functions from the supported cuFFT or cuDNN libraries. Or even higher level libraries like TensorFlow that call cu* libraries.
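To make that concrete, here's roughly what the FFT case looks like from where we sit (a simplified sketch with error checking left out, not production code): a plan, one exec call, a destroy. Nobody on my team writes the butterfly kernels hiding behind cufftExecC2C.

```cpp
// Simplified sketch of leaning on cuFFT rather than hand-writing FFT kernels.
// Error checking omitted for brevity.
#include <cuda_runtime.h>
#include <cufft.h>
#include <vector>

int main() {
    const int n = 4096;
    std::vector<cufftComplex> host(n, cufftComplex{1.0f, 0.0f});

    cufftComplex* d_data = nullptr;
    cudaMalloc(reinterpret_cast<void**>(&d_data), n * sizeof(cufftComplex));
    cudaMemcpy(d_data, host.data(), n * sizeof(cufftComplex),
               cudaMemcpyHostToDevice);

    // All of the actual FFT work lives behind these three library calls.
    cufftHandle plan;
    cufftPlan1d(&plan, n, CUFFT_C2C, 1);                // 1D complex-to-complex plan
    cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);  // in-place forward FFT
    cufftDestroy(plan);

    cudaMemcpy(host.data(), d_data, n * sizeof(cufftComplex),
               cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    return 0;
}
```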

So for us, it's irrelevant whether AMD or Nvidia hardware is faster or cheaper. The fact is that the support ecosystem for CUDA is far richer, and in the overall picture I can spend six times more for an Nvidia GPU and deliver solutions faster and cheaper than writing my own for AMD GPUs.

Sorry, Koyoot - but the cost of the GPUs is a really minor part of the cost of the project. The CUDA ecosystem saves us far more than the difference in price between a Vega and a Volta.
 
Sorry, Koyoot - but the cost of the GPUs is a really minor part of the cost of the project. The CUDA ecosystem saves us far more than the difference in price between a Vega and a Volta.
Which is why you are ultimately wasting money. Once you port your CUDA code using HIP, you are good to go with it on any other platform - everywhere (see the sketch below). AI and ML are very well supported on the AMD platform, and Vega 64 offers 95% of the performance of the GV100 chip at 1/6th the price. Buy 30 GPUs at once and you see the difference ($900,000 vs. $150,000).

At least read up on the functionality of ROCm and the AMD platform.
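To illustrate the "port once, run everywhere" point above (a sketch, untested by me): the same HIP source compiles with hipcc against either vendor's backend, and the HIP headers define platform macros for the rare places where you want a vendor-specific path.

```cpp
// Sketch only: one HIP source, two backends. The platform macros are the
// ones defined by the HIP headers; everything else here is illustrative.
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    hipDeviceProp_t prop;
    hipGetDeviceProperties(&prop, 0);
#if defined(__HIP_PLATFORM_AMD__) || defined(__HIP_PLATFORM_HCC__)
    std::printf("Built for the AMD/ROCm backend, device: %s\n", prop.name);
#else
    std::printf("Built for the NVIDIA/CUDA backend, device: %s\n", prop.name);
#endif
    return 0;
}
```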
 
I don't disagree - but this comment seems to assume that one is creating an independent standalone application.
That's what I assumed, yes. You're absolutely right about what you say. I'm active in the same field (though my original background is in graphics and image processing, with a 10+ year focus on medical image processing) and it seems like we're using the same tools. I agree with what you say; it just saves so much time.

The actual hardware cost is a fraction of what goes into projects. Every bit of time saved during development is less money spent. As far as hardware costs and discounts go, we're getting 50% off from NVIDIA for the Jetson boards we put in our drones, cars and robots. I can only imagine the discounts cloud operators get when buying cards by the thousands year after year.
 