AMD bets purely on gaming with RDNA

With RDNA, AMD focuses on video games, in contrast to GCN, which was a hybrid design, and cuts power consumption while raising performance.

Overnight, AMD presented the new RX 5700 XT and RX 5700. The company's new graphics cards are based on the Navi architecture. Navi in turn was developed on top of RDNA, a new design with new technologies to improve performance and efficiency. Above all, this new design seeks to maximize performance in games in order to compete directly with NVIDIA.

GCN was a hybrid design that served both gaming and massively parallel compute, the latter aimed at professional segments. The problem was that the gaming versions could not keep up with NVIDIA and drew too much power. They were designed for brute force with little regard for energy efficiency, hence the high consumption. RDNA changes this, since it was designed specifically for gaming and is optimized for that market.

First iteration of Navi based on GCN + RDNA

Jumping directly from GCN to a pure RDNA design is not feasible, because the new architecture adds technologies that did not exist before. Optimization for RDNA either does not exist yet or is still at a very early stage, so a combination with GCN is used for now. The first iteration of Navi will be a hybrid of the two designs, seeking to deliver performance while keeping consumption in check. It will not be until Navi 20 that we see purely RDNA graphics cards and find out what this new architecture is capable of.

GCN is a great architecture for heavy math workloads thanks to its high TFLOPS throughput and its parallelism. Judging by its specifications, the Radeon Vega 64 should crush the GTX 1080, but it does not. That is because games do not make good use of its cores and caches.

According to AMD, Navi improves efficiency in these areas with a new Compute Unit design. It also has a new multi-level cache hierarchy and a streamlined graphics pipeline.

This is Navi internally

The company has presented two graphics cards based on Navi. The first is a full implementation of Navi, while the second is a cut-down version. The two versions differ only in the number of Compute Units.

Navi 10 has 40 Compute Units, and each Compute Unit has 64 Stream Processors. A simple multiplication (40 × 64) confirms the total of 2,560 Stream Processors in the RX 5700 XT.

That is fewer than the Vega 64 (4,096 Stream Processors) and the Vega 56 (3,584 Stream Processors). The difference is that each Navi Compute Unit has a second scalar unit to handle scalar math operations. It also has a second scheduler, which gives it twice the instruction issue capacity of the previous generation. This is what makes it so much more efficient in gaming workloads.
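The Stream Processor totals quoted here all follow from the same product of Compute Units and Stream Processors per Compute Unit. The short sketch below reproduces them. The function name is made up for illustration, and the Vega Compute Unit counts (64 and 56) are inferred from the totals in the text rather than stated in it.

```python
# Stream Processors per Compute Unit, as stated in the article.
STREAM_PROCESSORS_PER_CU = 64


def total_stream_processors(compute_units: int) -> int:
    """Return the total Stream Processor count for a number of Compute Units."""
    return compute_units * STREAM_PROCESSORS_PER_CU


# 40 CUs for Navi 10 comes from the article. The Vega CU counts are an
# assumption inferred from the Stream Processor totals the article quotes.
cards = [("Navi 10 (RX 5700 XT)", 40), ("Vega 64", 64), ("Vega 56", 56)]

for name, compute_units in cards:
    total = total_stream_processors(compute_units)
    print(f"{name}: {compute_units} CUs x {STREAM_PROCESSORS_PER_CU} = {total}")
```

The raw count is lower on Navi 10 than on either Vega card, which is why the per-unit changes described here matter more than the totals.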

Important differences with GCN

GCN has four SIMD16 units per Compute Unit. Each unit can process 16 elements simultaneously, but this comes with a problem: latency and utilization. GCN cannot issue an instruction for a full wavefront in a single clock cycle; it generally takes four clock cycles from start to finish. This is not a problem for four-step calculations, such as a fused multiply-add, but it is inefficient for simple instructions.

In that case, if the pipeline cannot make up for the lost cycles, the GPU becomes inefficient. So GCN is very good for complex workloads such as scientific calculations, but not for games unless the work is scheduled very carefully. This is where the problem lies: to reach its full potential, GCN needs complex workloads and very good scheduling.

RDNA is more optimized than GCN

To correct this problem, RDNA replaces the SIMD16 and its four-cycle instruction issue with a SIMD32. Each Compute Unit has two SIMD32 units that can issue an instruction in a single clock cycle, which makes RDNA much more efficient in games than GCN. This allows a full Wave32 to run on a smaller execution unit, going from a Wave64 on GCN to a single-cycle SIMD32.

The compiler can choose, on a per-draw-call basis, whether to run in Wave32 or Wave64 mode; either way, the work always executes on SIMD32 units. It is also possible to improve parallelism by combining two adjacent Compute Units. This creates a larger workgroup in an attempt to reduce latency.
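The cycle counts discussed here can be approximated with a deliberately simplified model: the number of cycles one instruction needs is the wavefront size divided by the SIMD width. The sketch below uses that model only. It ignores real pipeline latency, and the function names are hypothetical, not part of any AMD tool.

```python
import math


def issue_cycles(wave_size: int, simd_width: int) -> int:
    """Cycles for one instruction of a wavefront to pass through one SIMD."""
    return math.ceil(wave_size / simd_width)


def chain_cycles(dependent_instructions: int, wave_size: int, simd_width: int) -> int:
    """Cycles for a single wavefront to run a chain of dependent instructions.

    Simplifying assumption: each instruction must finish before the next one
    starts, and no other wavefront is available to fill the idle slots.
    """
    return dependent_instructions * issue_cycles(wave_size, simd_width)


configs = [
    ("GCN  Wave64 on SIMD16", 64, 16),
    ("RDNA Wave32 on SIMD32", 32, 32),
    ("RDNA Wave64 on SIMD32", 64, 32),
]

for label, wave_size, simd_width in configs:
    per_instruction = issue_cycles(wave_size, simd_width)
    chain = chain_cycles(10, wave_size, simd_width)
    print(f"{label}: {per_instruction} cycle(s) per instruction, "
          f"{chain} cycles for 10 dependent instructions")
```

Under this model a chain of simple dependent instructions, which is common in game shaders, finishes four times sooner on a Wave32 than on a GCN Wave64, while Wave64 mode on RDNA sits in between.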

Low latency, cache enhancement, and single thread enhancement

Put simply, the move from GCN to RDNA reduces latency, improves single-threaded performance, and improves cache efficiency. The result is more useful work per Compute Unit and per clock cycle. This is the reason why we should not compare 40 GCN Compute Units with 40 RDNA Compute Units.

So why focus on single-threaded performance and efficiency when gaming relies on parallelism? The reason is that even though there are tens of thousands of threads running, it is not easy to keep them all busy on GCN across a wide range of diverse workloads. This is where the big changes in RDNA become visible.

Assuming the instructions do not have too many dependencies, the additional scalar units and the dual scheduling across more than one SIMD should offer a considerable performance improvement and greater efficiency.

Radeon has also borrowed something from Ryzen: a dedicated L1 cache has been added, and the load bandwidth of the cache closest to the ALUs has been doubled. This reduces latency when accessing the cache at all levels. It also improves effective bandwidth, since requested data stays in cache instead of being fetched from slower memory.
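To see why an extra cache level can lower the average latency, consider a rough expected-access-time calculation. Every hit rate and latency in the sketch below is an invented illustrative number, not a measurement of any AMD GPU, and the function name is hypothetical.

```python
def average_access_time(levels, memory_latency):
    """Expected latency of one memory request.

    levels: list of (hit_rate, latency_cycles) ordered from the cache closest
    to the ALUs to the farthest one. Each latency is the total time to get the
    data when it is found at that level.
    """
    expected = 0.0
    reach_probability = 1.0  # chance that a request gets as far as this level
    for hit_rate, latency in levels:
        expected += reach_probability * hit_rate * latency
        reach_probability *= 1.0 - hit_rate
    # Whatever misses every cache level goes to main memory.
    expected += reach_probability * memory_latency
    return expected


# Illustrative numbers only (assumptions, not AMD figures).
MEMORY_LATENCY = 300
without_l1 = [(0.5, 20), (0.5, 100)]          # closest cache, then L2
with_l1 = [(0.5, 20), (0.5, 50), (0.5, 100)]  # a dedicated L1 added in between

print("without L1:", average_access_time(without_l1, MEMORY_LATENCY))
print("with L1:   ", average_access_time(with_l1, MEMORY_LATENCY))
```

With these made-up inputs the intermediate level catches requests that would otherwise travel to the slower L2 or to memory, which is the effect the paragraph describes.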

Optimization and improvements for greater efficiency

Many improvements have been made. In RDNA, color compression has been improved throughout the pipeline. The graphics pipeline compresses as much as it can in order to minimize the bandwidth used. The delta color compression algorithm has been improved, allowing shaders to read and write compressed color data directly. The display engine can also read the compressed data stored in the memory subsystem directly.
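AMD's delta color compression algorithm is not described in detail here, so the following is only a generic sketch of the underlying idea of delta encoding: store one base value and then small differences between neighbouring pixels. The function names and pixel values are hypothetical, and this is not AMD's actual implementation.

```python
def delta_encode(values):
    """Keep the first value as-is, then store each value as a difference."""
    if not values:
        return []
    deltas = [values[0]]
    for previous, current in zip(values, values[1:]):
        deltas.append(current - previous)
    return deltas


def delta_decode(deltas):
    """Rebuild the original values by accumulating the differences."""
    values = []
    running = 0
    for delta in deltas:
        running += delta
        values.append(running)
    return values


def bits_per_delta(deltas):
    """Bits for a signed fixed-width encoding of the deltas after the base."""
    if len(deltas) <= 1:
        return 0
    largest = max(abs(delta) for delta in deltas[1:])
    return largest.bit_length() + 1  # one extra bit for the sign


# A hypothetical row of one 8-bit color channel from a smooth gradient.
row = [200, 201, 201, 203, 202, 202, 204, 205]
encoded = delta_encode(row)

raw_bits = 8 * len(row)
packed_bits = 8 + bits_per_delta(encoded) * (len(encoded) - 1)

print("deltas:", encoded)
print("raw bits:", raw_bits, "packed bits:", packed_bits)
print("round trip ok:", delta_decode(encoded) == row)
```

Neighbouring pixels in rendered images tend to have similar colors, so the differences are small and need fewer bits than the raw values, which is where the bandwidth saving comes from.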

As already mentioned with the CU improvements, the goal is also to increase the available bandwidth and to be more power efficient than GCN.

Conclusion

RDNA is an architecture optimized by and for gaming, while GCN was a hybrid architecture. AMD has realized that it must compete with NVIDIA on its home ground, gaming, and that is why it has developed graphics cards optimized for games. The cards thus move away from HBM2 memory, which is much more expensive than GDDR memory.

AMD itself states that, at the same frequencies, RDNA achieves 25% more performance in games than its predecessor. If we also take into account the number of CUs and the 7nm lithography, the improvement can be 50% or more. It will not be until the second iteration of Navi that we see purely RDNA and find out what it is capable of.
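As a rough back-of-the-envelope check on those two figures, one can ask how much of the 50% has to come from everything other than the per-clock gain. The sketch below assumes the gains multiply together, which is a simplification and not something AMD states; the function name is made up for illustration.

```python
def remaining_factor(total_gain: float, per_clock_gain: float) -> float:
    """Factor still needed from other sources, assuming gains multiply."""
    return total_gain / per_clock_gain


PER_CLOCK_GAIN = 1.25  # 25% more performance at the same frequency (article)
TOTAL_GAIN = 1.50      # 50% overall improvement (article)

other_sources = remaining_factor(TOTAL_GAIN, PER_CLOCK_GAIN)
print(f"Factor needed from clocks, CU count and process: {other_sources:.2f}x")
```

Under that assumption, roughly a further 1.2x would have to come from the CU configuration and the 7nm process for the overall figure to reach 50%.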

Robert Sole

Director of Contents and Writing of this same website, technician in renewable energy generation systems and low voltage electrical technician. I work in front of a PC, in my free time I am in front of a PC and when I leave the house I am glued to the screen of my smartphone. Every morning when I wake up I walk across the Stargate to make some coffee and start watching YouTube videos. I once saw a dragon ... or was it a Dragonite?
