Nvidia has revealed the RTX 4060Ti and 4060 to the world today. Pricing starts at $299 for the non-Ti model and rises to $499 for the 16GB Ti variant. Nvidia says these cards target the high frame rate 1080p gaming community. The 4060Ti 8GB will be available on May 24th, with the other two cards following in July. Let’s dive into this announcement by starting with the specs.
The RTX 4060Ti is clearly going to be the more performant of the two GPUs, as it has an additional 1,280 CUDA cores, a slightly higher boost clock, and a substantially higher base clock. Interestingly, both cards have a 128-bit data bus to their GDDR6 memory. Narrower buses like this are usually reserved for sub-$200 GPUs, and the RTX 3060Ti had a 256-bit bus. However, the 4060Ti has 32MB of L2 cache, compared to just 4MB on the RTX 3060Ti.
This is hardly the first time we’ve seen a large cache used to increase performance in this way. AMD is using it on their current GPUs, and both the Xbox 360 and Xbox One S had similar setups. Nvidia is claiming that the combined and optimized memory system will exceed the 256-bit bus of the older 3060Ti, with an effective bandwidth of 554GB/s compared to the RTX 3060Ti’s 448GB/s.
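To put those bandwidth figures in context, here is a minimal sketch of the usual peak-bandwidth arithmetic: bus width in bytes multiplied by the per-pin data rate. The 18 Gbps and 14 Gbps data rates are assumptions taken from the cards' published specifications rather than from this announcement, and the function name is purely illustrative.

```python
def raw_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gbps


# Assumed per-pin data rates (not stated in the article).
rtx_4060ti_raw = raw_bandwidth_gbs(128, 18.0)  # 128-bit bus, 18 Gbps GDDR6
rtx_3060ti_raw = raw_bandwidth_gbs(256, 14.0)  # 256-bit bus, 14 Gbps GDDR6

# Nvidia's claimed effective figure for the 4060Ti, as quoted in the article.
claimed_effective = 554.0
cache_multiplier = claimed_effective / rtx_4060ti_raw

print(f"RTX 4060Ti raw bandwidth: {rtx_4060ti_raw:.0f} GB/s")
print(f"RTX 3060Ti raw bandwidth: {rtx_3060ti_raw:.0f} GB/s")
print(f"Multiplier the L2 cache must supply: {cache_multiplier:.2f}x")
```

Under these assumptions the 4060Ti's raw bandwidth works out to 288GB/s, well below the 3060Ti's 448GB/s, so the claimed 554GB/s effective figure depends on the larger cache making up roughly a 1.9x difference.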
So how does this work?
A GPU has different layers of memory it can access, with each layer sitting further away from the CUDA cores that do the work. Every step further away makes access slower. Let’s look at it from a human perspective. Think of doing research. L1 cache is the same as having the data you need sitting in front of you, at your fingertips. L2 cache is still fast, but now that same data is in a drawer in your desk. GPU VRAM is similar to having the necessary data on a bookshelf across the room. In our example, system memory is akin to having a box in another room with all the data you need. Finally, having the data on an SSD or hard drive is akin to getting up, going outside, and taking a bus to your local library to get it.
The closer the data, the faster. The bigger the desk, the more you can store close to you. Nvidia has just made your desk drawer 8 times bigger, albeit at the cost of halving how much you can carry back and forth to that bookshelf across the room. As long as you are smart about the data you keep at your desk, you can work faster.
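The trade-off in the analogy can be sketched with a deliberately simplified model: if some fraction of memory requests is served from L2 and never reaches VRAM, the narrower bus only has to carry the remainder. This is an illustrative assumption, not necessarily how Nvidia derives its effective bandwidth figure. The 288GB/s raw value assumes a 128-bit bus at 18 Gbps, and the function names are hypothetical.

```python
def effective_bandwidth_gbs(raw_gbs: float, l2_absorbed_fraction: float) -> float:
    """Bandwidth the GPU 'sees' if a fraction of traffic never leaves the L2 cache.

    Simplified model: VRAM only carries (1 - fraction) of the requests,
    so the same bus behaves as if it were 1 / (1 - fraction) times wider.
    """
    if not 0.0 <= l2_absorbed_fraction < 1.0:
        raise ValueError("l2_absorbed_fraction must be in [0, 1)")
    return raw_gbs / (1.0 - l2_absorbed_fraction)


def required_absorbed_fraction(raw_gbs: float, target_gbs: float) -> float:
    """Fraction of traffic L2 must absorb for raw_gbs to behave like target_gbs."""
    return 1.0 - raw_gbs / target_gbs


RAW_4060TI_GBS = 288.0      # assumed: 128-bit bus at 18 Gbps
CLAIMED_EFFECTIVE = 554.0   # Nvidia's claim, as quoted in the article

needed = required_absorbed_fraction(RAW_4060TI_GBS, CLAIMED_EFFECTIVE)
print(f"L2 would need to absorb about {needed:.0%} of memory traffic")
print(f"Check: {effective_bandwidth_gbs(RAW_4060TI_GBS, needed):.0f} GB/s effective")
```

Under this model, a little under half of the memory traffic would need to be served from the cache for the 4060Ti to match the claimed figure. That is why being smart about what stays close matters so much.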
One aspect that catches the attention of the SFF community is the reduced power usage. Nvidia rates the 4060Ti at 160 watts TGP, whereas the 3060Ti is rated at 200 watts. Average gaming power draw is rated at 140 watts for the 4060Ti, whereas the 3060Ti pulls 197 watts. I can attest firsthand to the power draw of the 3060Ti, as I’m typing this on a system configured with an RTX 3060Ti in the form of an MSI Aero ITX model. It certainly does draw those power levels, if not a bit more. The single-fan design, while convenient for small cases, can get a bit loud. Simply put, I can hear it through my headphones. The GTX 1070 MSI Aero ITX pulled similar power numbers to those claimed for the 4060Ti and was dead silent below 50C, and barely audible while gaming with good ventilation.
From a raw gaming performance standpoint, Nvidia is claiming 22 TFLOPs of shader compute power. This is just a bit more than the RTX 3070 has, or roughly 15% faster than the 3060Ti. It’s not the gain in raw performance that fans were hoping for, as the 3060Ti outperformed the RTX 2080 when it launched. Nvidia is counting on its frame generation technology to make up the performance difference and claims that users will experience a full 70% performance improvement over the RTX 3060Ti.
Frame generation, for those who don’t know, creates artificial frames to smooth out games. This is very similar in concept to turning on motion smoothing on your TV, which makes a cinematic movie feel like a soap opera. While this appears to work mostly OK at this point for single-player games, many players of competitive games have avoided it due to input latency. Generated frames do not reflect new inputs, as the player’s inputs only take effect on frames the game actually renders, so sudden inputs may feel delayed. In other words, while your game might be displaying 60FPS instead of 30FPS, it will still control like a 30FPS game.
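The 30FPS versus 60FPS point comes down to simple frame-time arithmetic, sketched below. This is a minimal illustration that assumes one generated frame per rendered frame and ignores any extra delay the generation step itself may add. The function and field names are hypothetical.

```python
def frame_generation_timing(rendered_fps: float, generated_per_rendered: int = 1) -> dict:
    """Compare how often the screen updates with how often input can affect a frame."""
    displayed_fps = rendered_fps * (1 + generated_per_rendered)
    return {
        "displayed_fps": displayed_fps,
        # Time between any two frames shown on screen.
        "display_interval_ms": 1000.0 / displayed_fps,
        # Time between frames the game actually rendered, the only ones
        # that can reflect new player input.
        "input_interval_ms": 1000.0 / rendered_fps,
    }


timing = frame_generation_timing(rendered_fps=30)
print(f"Displayed: {timing['displayed_fps']:.0f} FPS")
print(f"New image every {timing['display_interval_ms']:.1f} ms")
print(f"Input reflected every {timing['input_interval_ms']:.1f} ms")
```

With 30 rendered frames per second, the display updates roughly every 16.7ms, but a new input can still only show up about every 33.3ms. This is the gap competitive players notice.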
While the raw shader power may leave those hoping for an RTX 3080 killer at $400 disappointed, Nvidia is claiming that the RT cores offer nearly 60% higher performance and the Tensor cores offer about a 170% increase in performance over the RTX 3060Ti. Additionally, the RTX 4060Ti supports AV1 encoding and decoding, which can provide a big boost for those workloads.
Of course all of this has to be tested to be proven. Check back for our official review soon.
Check out Nvidia’s RTX 4060 and 4060Ti announcement by CLICKING HERE.