GPU VEGA NANO

Boil

SFF Guru
Nov 11, 2015
1,253
1,094
I meant: if a full block also cools the M.2 drives, would it (most probably) restrict access to the SSDs because the waterblock is in the way?

I remember seeing a pic of a prototype with Samsung Pros in it, but apparently Google lost the image ._.

I would think that a hypothetical full-cover water block would also cover the M.2 SSDs...

Unless the M.2 drives were mounted on the rear of the PCB...?

I dunno, I guess time will tell...!
 

Boil

SFF Guru
Nov 11, 2015
1,253
1,094
Okay, this image of the Polaris-based Radeon Pro SSG GPU shows that the M.2 drives are on the front side of the board; so any waterblock would have to either cover (and hopefully also provide cooling to) the SSDs, or have openings on the block to allow access to the M.2 SSDs...

I can only imagine that the M.2 SSDs would also be in the same location on the newer Vega-based version of the GPU...

 

EdZ

Virtual Realist
May 11, 2015
1,578
2,107
In a recent LTT video (AMD let me help build their server, or something like that, on Floatplane), Linus mentioned that unlike Nvidia Quadros, which require NVLink to share vRAM, the new Vega Instinct can share and access vRAM simply over PCIe. So would it be possible that a simple PCIe SSD would be all that is required, as opposed to an on-card M.2 solution?
AMD is a strong proponent of what they call a Unified Memory Architecture where the CPUs and GPUs all share memory addresses, so any of them can access the memory attached to other devices (rather than how it is traditionally done where the blocks are copied from one memory space to another).
On the physical side, this is a case of AMD's propensity for taking an existing technique and giving it a fancy marketing brand. Existing GPUs, AMD and Nvidia, professional and consumer, have had the ability to use DMA for many years to directly access data across the PCIe bus (and even dip into main system RAM and jump over to the SATA bus) without involving the CPU. That's what DMA is.
On the logical (memory addressing) side, things are less simple. If you map memory outside the on-board vRAM transparently, applications will start dipping into pools with massive (orders of magnitude) latency penalties without knowing about it. If you rely on explicit DMA, applications may not use those pools at all [1] or may just use traditional transfer requesting because they need that data available at the lowest latency. It's only when you start adding multiple paths to the same storage (e.g. parallel PCIe and NVLink or Infinity Fabric) that using a single logical pool but with explicit compartmentalisation starts making more sense than normal DMA, by moving the memory access decisions from the application to the driver to allow more intelligent link utilisation.
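To make the distinction concrete, here is a minimal CUDA sketch (illustrative only; the kernel, buffer sizes and names are made up, not anyone's actual code) contrasting an explicit transfer, where the application knows exactly when data crosses the bus, with a transparently mapped (managed) allocation, where the driver migrates pages on demand and the latency cost is invisible to the application:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Explicit transfer: the application decides when data crosses the PCIe
    // bus, so it always knows which pool (and which latency) it is touching.
    float* h = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) h[i] = 1.0f;
    float* d = nullptr;
    cudaMalloc((void**)&d, bytes);
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);
    scale<<<(n + 255) / 256, 256>>>(d, n);
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);

    // Transparent mapping: one pointer is valid on both CPU and GPU, and the
    // driver migrates pages on demand. Convenient, but touching a page that is
    // resident on the far side of the bus pays the latency penalty invisibly,
    // which is exactly the concern above.
    float* u = nullptr;
    cudaMallocManaged((void**)&u, bytes);
    for (int i = 0; i < n; ++i) u[i] = 1.0f;  // CPU touch
    scale<<<(n + 255) / 256, 256>>>(u, n);     // GPU touch triggers migration
    cudaDeviceSynchronize();
    printf("h[0] = %f, u[0] = %f\n", h[0], u[0]);

    cudaFree(d);
    cudaFree(u);
    free(h);
    return 0;
}
```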

As an aside, you can do GPU-to-GPU over the PCIe bus for CUDA, but I'm not sure if that's exposed on the GeForce side outside of SLI.
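For reference, the GPU-to-GPU path looks roughly like this in CUDA; a hedged sketch assuming a hypothetical two-GPU box, since whether the runtime actually grants peer access depends on the platform and topology:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Ask whether device 0 can read device 1's memory directly over the bus.
    int canAccess = 0;
    cudaDeviceCanAccessPeer(&canAccess, 0, 1);
    if (!canAccess) {
        printf("No peer access between GPUs 0 and 1 on this platform\n");
        return 0;
    }

    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);  // flags argument must be 0

    float *d0 = nullptr, *d1 = nullptr;
    cudaMalloc((void**)&d0, 1024);
    cudaSetDevice(1);
    cudaMalloc((void**)&d1, 1024);

    // This copy (or a kernel on device 0 dereferencing d1) travels directly
    // across PCIe between the cards, without bouncing through system RAM.
    cudaSetDevice(0);
    cudaMemcpyPeer(d0, 0, d1, 1, 1024);

    cudaFree(d0);
    cudaSetDevice(1);
    cudaFree(d1);
    return 0;
}
```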

[1] Note that gaming is one situation where data is already cached aggressively into vRAM - to the point where the only real time you will not see your vRAM 'usage' maxed out is when the entire level/chunk contents are smaller than your available vRAM - so going "hey, you can now access system RAM and backing stores through the GPU memory pool!" is likely to be met with "OK, but we're already dumping all our stuff to vRAM anyway because we want to avoid accessing those stores directly in the first place"
 

BirdofPrey

Standards Guru
Sep 3, 2015
797
493
On the physical side, this is a case of AMD's propensity for taking an existing technique and giving it a fancy marketing brand. Existing GPUs, AMD and Nvidia, professional and consumer, have had the ability to use DMA for many years to directly access data across the PCIe bus (and even dip into main system RAM and jump over to the SATA bus) without involving the CPU. That's what DMA is.
First off, I realized I made a mistake with the name; I was referring to Heterogeneous System Architecture. The point of HSA is to allow zero-copy operations by sharing a single virtual address space rather than copying between disparate address spaces. I'm not sure you'd exactly call it AMD renaming an existing technique, though; as I mentioned, it is more or less an extension of NUMA to different types of processors at the same time (NUMA being where each CPU has its own memory pool, but all pools share the same virtual address space, so every CPU can access every bank).
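As an illustration of the zero-copy idea (in CUDA terms rather than AMD's actual HSA runtime; the kernel and sizes are made up), a pinned host buffer can be mapped straight into the GPU's address space so a kernel reads host RAM in place, with no staging copy:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void sum(const float* in, float* out, int n) {
    // Each read of in[] reaches across the PCIe bus into host RAM in place;
    // nothing was staged into vRAM beforehand.
    float acc = 0.0f;
    for (int i = 0; i < n; ++i) acc += in[i];
    *out = acc;
}

int main() {
    cudaSetDeviceFlags(cudaDeviceMapHost);  // allow mapped host allocations

    const int n = 4096;
    float *h_in = nullptr, *d_in = nullptr, *d_out = nullptr;

    // Pinned, mapped allocation: the same physical pages are visible to both
    // the CPU (via h_in) and the GPU (via d_in). One address space, zero copies.
    cudaHostAlloc((void**)&h_in, n * sizeof(float), cudaHostAllocMapped);
    cudaHostGetDevicePointer((void**)&d_in, h_in, 0);
    for (int i = 0; i < n; ++i) h_in[i] = 1.0f;

    cudaMalloc((void**)&d_out, sizeof(float));
    sum<<<1, 1>>>(d_in, d_out, n);

    float result = 0.0f;
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    printf("sum = %f\n", result);

    cudaFree(d_out);
    cudaFreeHost(h_in);
    return 0;
}
```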

On the logical (memory addressing) side, things are less simple. If you map memory outside the on-board vRAM transparently, applications will start dipping into pools with massive (orders of magnitude) latency penalties without knowing about it. If you rely on explicit DMA, applications may not use those pools at all [1] or may just use traditional transfer requesting because they need that data available at the lowest latency. It's only when you start adding multiple paths to the same storage (e.g. parallel PCIe and NVLink or Infinity Fabric) that using a single logical pool but with explicit compartmentalisation starts making more sense than normal DMA, by moving the memory access decisions from the application to the driver to allow more intelligent link utilisation.
Yeah, there are always issues when applications aren't aware of the underlying system architecture.

[1] Note that gaming is one situation where data is already cached aggressively into vRAM - to the point where the only real time you will not see your vRAM 'usage' maxed out is when the entire level/chunk contents are smaller than your available vRAM - so going "hey, you can now access system RAM and backing stores through the GPU memory pool!" is likely to be met with "OK, but we're already dumping all our stuff to vRAM anyway because we want to avoid accessing those stores directly in the first place"
Yeah, fair point. I wouldn't be surprised if that's part of the reason HSA hasn't really gone anywhere (the other part, of course, being that Nvidia tends to shun AMD stuff, and likes to push their own proprietary technologies, and Intel can also be picky).
Do remember, though, that GPUs are used for more than just gaming. Gaming, being a real-time process, basically HAS to put the required data as close to the GPU as possible, so even with MMU access to system memory it still has to cache locally. The kind of rendering done for CGI, and general video editing, isn't time-sensitive, so it doesn't need to be aggressively cached, but it tends to involve larger datasets, meaning storage in system memory or on disk. And even though it isn't time-sensitive (frames taking too long in a game cause stuttering; frames taking too long for pre-rendered work just mean the job takes longer), it's still beneficial to reduce the time taken so you can get more work done; see the streaming sketch below.
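For those larger-than-vRAM datasets, the usual pattern is to stream chunks across the bus while the previous chunk computes, hiding much of the PCIe transfer time. A rough CUDA sketch (chunk count, sizes and the kernel are illustrative):

```cuda
#include <cuda_runtime.h>

__global__ void process(float* chunk, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) chunk[i] = chunk[i] * 0.5f + 1.0f;
}

int main() {
    const int chunkElems = 1 << 20;
    const size_t chunkBytes = chunkElems * sizeof(float);
    const int numChunks = 16;

    // Pinned host memory so cudaMemcpyAsync can actually overlap with compute.
    float* h_data = nullptr;
    cudaHostAlloc((void**)&h_data, numChunks * chunkBytes, cudaHostAllocDefault);

    // Two device buffers and two streams, used round-robin, so chunk k+1
    // uploads while chunk k computes.
    float* d_buf[2];
    cudaStream_t stream[2];
    for (int s = 0; s < 2; ++s) {
        cudaMalloc((void**)&d_buf[s], chunkBytes);
        cudaStreamCreate(&stream[s]);
    }

    for (int k = 0; k < numChunks; ++k) {
        int s = k % 2;
        cudaMemcpyAsync(d_buf[s], h_data + (size_t)k * chunkElems, chunkBytes,
                        cudaMemcpyHostToDevice, stream[s]);
        process<<<(chunkElems + 255) / 256, 256, 0, stream[s]>>>(d_buf[s], chunkElems);
        cudaMemcpyAsync(h_data + (size_t)k * chunkElems, d_buf[s], chunkBytes,
                        cudaMemcpyDeviceToHost, stream[s]);
    }
    cudaDeviceSynchronize();

    for (int s = 0; s < 2; ++s) {
        cudaFree(d_buf[s]);
        cudaStreamDestroy(stream[s]);
    }
    cudaFreeHost(h_data);
    return 0;
}
```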
====
Anyways, regardless of how the data gets moved around, the PCIe bus still represents a latency bottleneck and a potential bandwidth bottleneck. I would say the Radeon SSG is something akin to when Intel separated the CPU cache from the system bus so that cached data could be accessed without interference from main memory or device access.
 

EdZ

Virtual Realist
May 11, 2015
1,578
2,107
The point of HSA is to allow zero-copy operations by sharing a single virtual address space rather than copying between disparate address spaces.
That's what DMA is: directly accessing memory/storage locations on other devices.
Yeah, fair point. I wouldn't be surprised if that's part of the reason HSA hasn't really gone anywhere (the other part, of course, being that Nvidia tends to shun AMD stuff, and likes to push their own proprietary technologies, and Intel can also be picky).
It's far from having gone nowhere; it's been in active use for years in HPC and render farms for workloads where it's of benefit. You even have crazier stuff like RDMA, where devices can query memory stored on a completely different machine, or PGAS for unified memory across a computing cluster (an old enough concept that it has Fortran implementations).
Anyways, regardless of how moving data around happens, the PCIe bus still represents a latency bottleneck and potential bandwidth bottleneck. I would say the Radeon SSG is something akin to when Intel separated the CPU cache from the system bus so that cached data could be accessed without interference from main memory or device access.
For the on-board SSDs on the Radeon SSG, those SSDs still sit on the other side of a PCIe bus; the bus just happens to route from the GPU to the SSDs directly on the card PCB rather than running across the motherboard. If the GPU and SSDs are on opposite sides of a CPU-PCH link, that could limit things to PCIe x4, but if they're on the same root (e.g. GPU and SSDs both on the CPU hub, or both on the chipset hub) then, unless that hub itself has some bizarre limitation, there should be no bottleneck. One unknown is the configuration of the PEX8747 bridge chip. That has an x16 interface on one side and two bifurcatable x16 interfaces on the other, and AMD have not documented how it is connected between the GPU, the SSDs and the PCIe card edge. I would expect the x16 side to connect to the x16 interface on the GPU, and the output to be split between the card edge (at x16) and the two SSDs (the other x16 split into two x8, with an x8 split into two x4). This would minimise any bottlenecking, beyond bandwidth used between the GPU and the host machine reducing the bandwidth available between the GPU and the SSDs, but that's nothing above what you would be limited by using DMA anyway.
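If anyone wants to check where their own bottleneck sits, a crude CUDA timing sketch like the following (sizes arbitrary) measures host-to-device throughput to compare against the roughly 15.8 GB/s theoretical ceiling of a PCIe 3.0 x16 link:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 256ull << 20;  // 256 MiB
    float *h = nullptr, *d = nullptr;
    cudaHostAlloc((void**)&h, bytes, cudaHostAllocDefault);  // pinned for full-speed DMA
    cudaMalloc((void**)&d, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    // PCIe 3.0 x16 ceiling: 16 lanes * 8 GT/s * (128/130 encoding) / 8 bits
    // per byte is roughly 15.75 GB/s.
    printf("H2D: %.2f GB/s\n", (bytes / 1e9) / (ms / 1e3));

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    cudaFreeHost(h);
    return 0;
}
```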
 

Kmpkt

Innovation through Miniaturization
KMPKT
Feb 1, 2016
3,382
5,935
GG Vega Nano, with the power consumption numbers coming out of Vega 64 and 56. There is no way this thing performs without disgusting power spiking or massive underclocking :(

http://www.tomshardware.com/reviews/amd-radeon-rx-vega-64,5173-17.html

300W peak draw with 400W spikes, and performance on par with the 1080. By comparison, the 1080 pulls 180W peak draw with spikes in the 250W range.
 

TinyHH

Efficiency Noob
Jun 6, 2017
7
16
Unfortunately, I don't think the Vega Nano will be enough faster than the R9 Nano I already enjoy to justify an upgrade. I hope they prove me wrong and take my money.
 

W1NN1NG

King of Cable Management
Jan 19, 2017
616
532
Based on the benchmarks, I wouldn't waste the money going AMD for the Vega 64 or 56. In Bitwit's benchmarks they didn't outperform either the 1070 or the 1080, and with power draws like that, it looks like Nvidia might be home free.
 

Phuncz

Lord of the Boards
SFFn Staff
May 9, 2015
5,842
4,906
Apparently the Vega 56 can be undervolted rather well, resulting in 35% lower power consumption. That would put it in a good range for the suspected 150W TDP, but it would indeed not come close to a GTX 1080 performance-wise.

I'll be hanging on to my GTX 1080 Mini for at least the next generation, even though I'll have to miss FreeSync until the next Radeon.
 

Kwirek

Cable-Tie Ninja
Nov 19, 2016
186
198
Apparently the Vega 56 can be undervolted rather well, resulting in 35% lower power consumption. That would put it in a good range for the suspected 150W TDP, but it would indeed not come close to a GTX 1080 performance-wise.

I'll be hanging on to my GTX 1080 Mini for at least the next generation, even though I'll have to miss FreeSync until the next Radeon.

Unfortunately, what I read on GamersNexus was that they couldn't properly undervolt the card outside of synthetic tests. But who knows, it might get better?
 

W1NN1NG

King of Cable Management
Jan 19, 2017
616
532
Apparently the Vega 56 can be undervolted rather well, resulting in 35% lower power consumption. That would put it in a good range for the suspected 150W TDP, but it would indeed not come close to a GTX 1080 performance-wise.

I'll be hanging on to my GTX 1080 Mini for at least the next generation, even though I'll have to miss FreeSync until the next Radeon.
Just buy a G-Sync monitor; Nvidia will always be on top, that's always the way it's been for GPUs, both power-consumption-wise and for temps.
Reviewers are trying to justify them based on price point, but I may save 50 bucks now and then spend 50 bucks later down the road with the power these things are drawing, so...
 

MarcParis

Spatial Philosopher
Apr 1, 2016
3,629
2,722
Pfff, Vega is a big failure... especially versus the R9 Fury/Nano... just so frustrating!
Two years on, and the RX Vega 64 is just +30% performance over the R9 Fury.
The RX Vega 56 is just +14% over the R9 Fury...
Worse, performance per watt is pretty close to the R9 Nano/Fury... I'm so disappointed!

AMD is now facing the exact opposite of its previous situation: a very good CPU, and a GPU below the competition...

Source : http://www.hardware.fr/articles/968-16/recapitulatif-performances.html
 

Phuncz

Lord of the Boards
SFFn Staff
May 9, 2015
5,842
4,906
Just buy a G-Sync monitor; Nvidia will always be on top, that's always the way it's been for GPUs, both power-consumption-wise and for temps.
Reviewers are trying to justify them based on price point, but I may save 50 bucks now and then spend 50 bucks later down the road with the power these things are drawing, so...
The G-Sync equivalent of the screen I have is about $/€1300; I'm not going to buy a new screen any time soon. Especially when Nvidia could just add FreeSync (or Adaptive Sync) support via a driver, while G-Sync is proprietary to Nvidia.
 

Soul_Est

SFF Guru
SFFn Staff
Feb 12, 2016
1,536
1,928
Pfff, Vega is a big failure... especially versus the R9 Fury/Nano... just so frustrating!
Two years on, and the RX Vega 64 is just +30% performance over the R9 Fury.
The RX Vega 56 is just +14% over the R9 Fury...
Worse, performance per watt is pretty close to the R9 Nano/Fury... I'm so disappointed!

AMD is now facing the exact opposite of its previous situation: a very good CPU, and a GPU below the competition...

Source : http://www.hardware.fr/articles/968-16/recapitulatif-performances.html
I read the reviews as well. It's disappointing to see that happen, although according to AnandTech it's still GCN. This is what AMD gets for botching the ATI acquisition years ago; Qualcomm has the talent they desperately need. That said, the preliminary review on Phoronix shows that it does well under Linux with the open-source driver stack and will only improve from there. For those reasons, I'll get one, or a Raven Ridge chip, when I can.
 

3lfk1ng

King of Cable Management
SFFn Staff
Bronze Supporter
Jun 3, 2016
906
1,713
www.reihengaming.com