A few nit-picks:
This black shield in the center of the card is the false cover that needs to be popped off should you ever wish to take the GPU apart.
The adhesive-attached front plate only needs to be removed to disassemble the
cooler into its individual components. To remove the cooler entirely from the card (e.g. for watercooling) you only need to remove the screws in the backplate and the whole thing comes off, same as with the previous FE and reference cards.
From
Derbauer's testing, the RTX cards respond better to OCed memory and stock GPU core clocks than to a GPU core overclock and stock memory: slightly better performance than a max power target, 100% fan core OC, but with no noticeable increase in power consumption, so likely a good choice for SFF.
Instead, most everyone prefers the use of M.2 SSDs as they create no clutter and consume no space. As a result of this decision, many users are running their PCIe 3.0 x16 lanes at x8 with the M.2 drive(s) running at x4 (you can confirm this using
GPU-Z and looking at the Bus Interface). This causes no measurable impact on performance for previous generation cards like the GeForce GTX 1080ti, but with the GeForce RTX 2080ti there are some small but measurable losses.
On Intel:
As far as I am aware, no consumer socket (Hx series, AKA LGA 115x) boards feed the m.2 slots with CPU lanes. ALL feed them with lanes from the PCH, which has its own dedicated DMI 3.0 link (similar to PCIe 3.0 x4) to the CPU that is not shared by the CPU PCIe lanes, leaving the x16 PCIe slot unencumbered. This is because Intel's PCIe RAID works with chipset lanes, but not CPU lanes.
Things are a bit different on the 'enthusiast'/HEDT X99/X299 platforms (which practically means the two ASRock ITX boards). There, there are plenty of CPU PCIe lanes to feed multiple m.2 slots along with an x16 slot for a GPU, and the CPU supports VROC (RAID on CPU PCIe lanes). However, the ASRock X299 ITX/AC has only two of its m.2 ports connected to CPU lanes. The third is connected to the PCH, so it can be used for an Optane transparent cache.
On AMD:
With Ryzen, a single m.2 slot can be fed by an extra x4 PCIe 3.0 link from the CPU in addition to the x16 PCIe 3.0 link used for a GPU (or other card). Any additional m.2 slots are instead PCIe 2.0 (not 3.0!) and fed from the chipset. The single theoretical exception is the A300/X300 'un-chipset', where there is no chipset at all and everything is connected to the CPU directly (the CPU has a handful of SATA and USB links on board). This in theory leaves the PCIe 3.0 x4 link usually occupied by the chipset free for an m.2 slot, but I have not seen a board do this in practice.
With Threadripper/Epyc, there are plenty of CPU PCIe 3.0 lanes available, but these are a pretty poor choice for gaming in the first place, so rather a moot point.
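If you want to double-check what your own card has actually negotiated without GPU-Z, the OS exposes the link width directly. A rough sketch for Linux is below (the PCI address is only an example, substitute your GPU's from `lspci`; on Windows, GPU-Z's Bus Interface field shows the same information):

```python
# Minimal sketch: read the negotiated PCIe link width/speed from Linux sysfs.
# 0000:01:00.0 is only an example address; find yours with `lspci | grep -i vga`.
from pathlib import Path

gpu = Path("/sys/bus/pci/devices/0000:01:00.0")

cur_width = (gpu / "current_link_width").read_text().strip()   # e.g. '8' if running at x8
cur_speed = (gpu / "current_link_speed").read_text().strip()   # e.g. '8.0 GT/s PCIe' for gen3
max_width = (gpu / "max_link_width").read_text().strip()
max_speed = (gpu / "max_link_speed").read_text().strip()

print(f"Running at x{cur_width} @ {cur_speed} (card supports x{max_width} @ {max_speed})")
```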
Worse yet, if 12nm is replaced by 7nm (Ampere) by the start of 2020, this generation may never truly get a chance to shine.
The chance of 7nm being able to produce large dies (like the Pascal and Turing cards) anytime soon seems pretty low. The physics issues TSMC faces are essentially the same ones Intel is facing with 10nm, and like Intel, TSMC has so far only been able to produce similarly small dies at acceptable volumes. 7nm may produce efficient small-die GPUs, but high-end GPUs are going to be sticking to larger processes or commanding very high prices (e.g. the Radeon Instinct MI60, at 330mm^2 around the size of the GTX 1070's GP104 but with a pricing target in the GV100 realm).

On top of process scale issues, demand will also provide upward pricing pressure: Apple are eating all of TSMC's limited output, and next in line are AMD for low-volume, high-margin parts (and with the departure from Zeppelin's shared-die approach, with Epyc getting a separate die from 'Zen 3' onwards, the consumer parts will likely remain on 14nm or move to Samsung's 10nm/8nm, which are of the same feature size as 14nm).
tl;dr a 7nm Ampere may pick up the 'lower end' (2060 on down) but is unlikely to occupy the higher-end range the 2070/2080 do. Nvidia are likely making thin margins on TU102 (remember their sales are mostly die-to-AIB rather than FE cards) unless yields are truly exceptional*; a similar performance class GPU (even an unchanged die-shrink) on 7nm would cost more, not less; cost/transistor has been rising since 28nm.
DLSS is being advertised as a way to make supported games run approximately 40% faster.
How? From what I understand, DLSS appears to be rendering the scene at a lower resolution, upscaling it, and using AI to make the picture closely match the resolution that it’s set to emulate.
In most cases, the quality difference won’t be noticeable, but the supporting title will run at higher framerates than it would at that higher resolution.
In theory, it sounds very compelling (an absolute dream for improved performance in 4k/VR), but there isn't much information to go off of. Nobody quite knows how this technology works, so we're left scratching our heads and guessing at this point.
How it works is (relatively) simple: Nvidia uses their big SaturnV Turing-powered supercomputer to render demo runs of a game at two resolutions simultaneously: a render-target resolution (e.g. 2560x1440, AKA 'low res') and a 'final' resolution (e.g. 3840x2160, AKA 'high res') with 64x SSAA (the SSAA here is to produce nice alias-free training images for the NN, because aliasing is high-frequency noise that the NN could mistakenly view as a desired output). An NN is then trained on that mountain of frames to go "if you see feature X in 'low res', it should look like the same location in 'high res'". With sufficient variation in training frames (i.e. the demo loop traverses all levels, views every entity from every angle in as many lighting conditions as possible, etc.) you have an NN that takes a locally rendered 'low res' frame and will spit out a 'high res' frame. In short, it's a 'smart' upscaler tuned to a specific game, but tuned by brute force rather than manual tweaking. The massive amount of work done to train the NN (and produce its training datasets) is all done offline only once, to produce a relatively lightweight 'inference' NN that can be run by everyone in real-time.
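For the curious, here is a minimal sketch of that offline-train/online-infer split, in PyTorch. Nvidia haven't published DLSS's actual network, loss, or training pipeline, so the architecture, loss, and tensor sizes below are purely illustrative assumptions (and the toy network does a clean 2x upscale rather than the 1440p-to-4K example above):

```python
# Toy stand-in for the per-game 'inference' NN: a tiny 2x super-resolution CNN.
# Everything here (layers, loss, data) is an illustrative guess, not DLSS itself.
import torch
import torch.nn as nn

class ToyUpscaler(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3 * 4, kernel_size=3, padding=1),  # predict 2x2 sub-pixels per pixel
            nn.PixelShuffle(2),                              # rearrange into a 2x larger image
        )

    def forward(self, low_res):
        return self.net(low_res)

model = ToyUpscaler()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()

# Offline, on the supercomputer: learn "feature X in 'low res' should look like Y in 'high res'"
# from paired (low res, 64x-SSAA high res) captures. Random tensors stand in for real frames.
for step in range(100):
    low = torch.rand(4, 3, 180, 320)     # captured render-target ('low res') frames
    high = torch.rand(4, 3, 360, 640)    # matching anti-aliased 'high res' frames
    optimizer.zero_grad()
    loss_fn(model(low), high).backward()
    optimizer.step()

# Online, on the gamer's GPU: only this lightweight forward pass runs per frame.
with torch.no_grad():
    frame = torch.rand(1, 3, 1080, 1920)  # locally rendered 'low res' frame
    upscaled = model(frame)               # approximated 2160p output
```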
At this point, you might be asking, “Why not go wireless?” The answer is simple: there doesn't exist a consumer-grade wireless solution that is safe to mount on your head and uses little enough power to transmit that level of information.
It's entirely down to there being no system with an acceptable combination of high bandwidth and low end-to-end latency. Safety has nothing whatsoever to do with it, as (a) RF is non-ionising, so it causes only surface heating (as would holding a warm mug to your head), and (b) the transmitter is across the room, not on your head.
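To put rough numbers on "high bandwidth": a first-generation Vive/Rift-class headset (2160x1200 combined panels at 90Hz) needs on the order of 5-6 Gbit/s for uncompressed video alone, on top of a per-frame budget of about 11ms. A quick back-of-envelope sketch (panel figures are for those first-gen headsets; higher-resolution headsets only make it worse):

```python
# Back-of-envelope: uncompressed video bandwidth for a first-gen VR headset.
width, height = 2160, 1200    # combined panel resolution (both eyes)
refresh_hz = 90
bits_per_pixel = 24           # 8-bit RGB, no compression

bandwidth_gbps = width * height * refresh_hz * bits_per_pixel / 1e9
frame_budget_ms = 1000 / refresh_hz

print(f"~{bandwidth_gbps:.1f} Gbit/s uncompressed")   # ~5.6 Gbit/s
print(f"~{frame_budget_ms:.1f} ms per frame")         # the radio link gets only a slice of this
```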
Personally, this would be a 'pro' too: it's rigid as a board, so no 'GPU sag' is possible unless your rear PCIe bracket itself is flimsy enough to bend!
----
* This is into unrelated wild-ass-guess territory, but my suspicion is that abnormally high Turing yields early in the development cycle may well be the reason for the rushed release. Nvidia may have been expecting to have only a handful of viable dies to feed the Quadro RTX cards, and to later sell off the lower-binned remnants as a 'Titan RTX', but ended up with a glut of quality dies and popped out the GeForce RTX series at short notice. This would explain why the PCBs are Quadro-grade overspecced (not enough time to design new consumer-grade PCBs, which would also explain why the RTX 2070 came out with a delay), why the coolers are overbuilt (not enough time to pare the design down for efficient manufacture), why the lower end of the range is absent (Ampere would have lacked RT and Tensor cores entirely), and why RT and DLSS features are only starting to be implemented in engines (the intention was for the Quadro RTX cards to seed development before a GeForce RTX launch in later generations). As for why they would rush them to market? The combination of high GPU pricing that could support an experimental card launch, and a complete lack of competition in the high end, lets them jump the gun on hybrid rendering. It's also a bet that's worked out well for them in the past, after all (Hardware T&L, unified shaders).