source link: https://community.arm.com/arm-community-blogs/b/announcements/posts/arm-gpus-built-on-new-fifth-gen-architecture

Arm GPUs built on new 5th Generation GPU architecture to redefine visual computing

May 29, 2023

Last year Arm launched a brand-new flagship GPU called ‘Immortalis’, which is proving to be an outstanding success. Immortalis-G715 is already leading on performance and efficiency across various benchmarks for the latest high performing smartphone devices using premium and flagship SoCs. On ray tracing, an Immortalis-G715 based SoC provides 67 and 52 percent better performance compared to SoCs with competing GPUs¹. Meanwhile, on variable rate shading (VRS) benchmarks, the flagship GPU provides between 5 and 52 percent better performance compared to the same SoC competitors².

The leading performance and efficiency of Immortalis-G715

We are continuing this tremendous momentum by launching the latest flagship Arm Immortalis-G720 GPU. It launches alongside the new Mali-G720 and Mali-G620 GPUs, which complete our portfolio of world-class GPUs targeting a broad range of consumer devices. After four generations of GPUs on the 4th Generation Valhall architecture, the latest Arm GPUs are built on a brand-new 5th Generation GPU architecture known as 5th Gen. This signals Arm’s long-standing commitment to the next generation of visual computing on consumer devices.

The new GPUs continue to represent leading performance and efficiency, providing more realistic, immersive, and advanced gaming experiences on mobile, as well as longer playing time 'on-the-go' without draining the battery. The 5th Gen architecture enables all the GPUs to improve graphics performance at a system level. This means we are not just getting the best performance from the GPU alone, but also from interactions with external memory, the latest CPU clusters and system level caches (SLCs). These are all incorporated into the new Arm Total Compute Solution (TCS23).

Addressing the demand for more graphics performance

For developers, Arm’s GPUs provide the largest target base for their applications with over 9 billion Arm GPUs shipped to date. This is 1 billion more than last year and more than 1 GPU for every person on Earth. Alongside the GPUs, Arm’s industry-leading graphics features, optimizations and development tools help developers create the very best application experiences.

Introducing the 5th Gen GPU architecture

The 5th Gen architecture will be the foundation of Arm’s future GPUs, enabling new game-changing graphics features as the world enters the next era of visual computing. In its first year, 5th Gen targets three key processing trends – scene complexity, better graphics, and memory system power.

The key graphics processing trends

Scene complexity

There has been an explosion of scene complexity on mobile, as developers strive for better quality visuals. The challenge is that the greater scene complexity means geometry-related memory accesses can dominate the available bandwidth and impact performance. Through 5th Gen’s work to improve the graphics pipeline, users will be able to run their favorite games at higher frames per second (FPS). It will also make the next generation of high geometry games and real-time 3D applications possible on mobile.

The work on the graphics pipeline to manage these more complex scenes starts with the introduction of the deferred vertex shading (DVS) pipeline. This revolutionizes the geometry dataflow in Arm GPUs. Through DVS, performance can be scaled to larger core counts, enabling Arm’s partners to reach higher performance points in the future. DVS also helps to maintain a consistent framerate across the most complex gaming scenes, while future-proofing for next-generation geometry content.

We are already seeing DVS benefit performance in some scenes across a range of popular gaming content. Examples include 33 percent less bandwidth used on Genshin Impact, 26 percent less on Fortnite and 41 percent less on Elven Ruins, a gaming scene demo from Epic Games in the Unreal Engine for game developers and creators. This architectural innovation also brings to mobile the smoother gameplay and more life-like, realistic gaming experiences commonly associated with PC and console. The benefit is not limited to gaming scenes and applications: DVS delivers 37 percent less bandwidth for a leading CAD application for architects, which turns their CAD plans for buildings into a digital reality through a real-time 3D view.
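To make the quoted percentages concrete, here is a minimal sketch that applies them to a baseline bandwidth figure. The baseline of 10 GB/s is a hypothetical number chosen only for illustration and does not come from Arm's measurements. The percentages are the ones quoted above.

```python
# Bandwidth reductions quoted in the article for DVS (percent less bandwidth).
DVS_BANDWIDTH_SAVINGS = {
    "Genshin Impact": 33,
    "Fortnite": 26,
    "Elven Ruins": 41,
    "CAD application": 37,
}


def bandwidth_after_dvs(baseline_gb_per_s: float, saving_percent: float) -> float:
    """Return the bandwidth that remains after a percentage reduction."""
    return baseline_gb_per_s * (1 - saving_percent / 100)


# Hypothetical baseline, NOT a measured value: assume each workload
# used 10 GB/s of memory bandwidth before DVS.
HYPOTHETICAL_BASELINE_GB_PER_S = 10.0

for workload, saving in DVS_BANDWIDTH_SAVINGS.items():
    remaining = bandwidth_after_dvs(HYPOTHETICAL_BASELINE_GB_PER_S, saving)
    print(f"{workload}: {remaining:.1f} GB/s after a {saving} percent reduction")
```

Under that assumed baseline, a 41 percent reduction would leave 5.9 GB/s, so the same memory system has considerably more headroom for other traffic.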

Deferred Vertex Shading (DVS)

Better graphics

As developers create more stunning visuals through their applications, the use of high dynamic range (HDR) rendering is increasing. HDR rendering is a developer trend that Arm actively supports, so we have improved the performance for this feature for better visuals. Immortalis-G720 helps to manage the performance impact of high-depth textures that are used in HDR rendering. The introduction of DVS means scenes with very complex geometry can be rendered with processing to spare. In a demo prepared by Arm engineers for the 2023 Game Developer Conference (GDC), we showed a 31 percent improvement on WRITE bandwidth and an estimated 20 percent FPS improvement with the 5th Gen-based Immortalis-G720 compared with the Valhall-based Immortalis-G715. This leaves room in the graphics pipeline to add PC quality effects, like real-time dynamic lighting, bloom effect and depth of field.

Memory System Power

We are increasingly seeing memory system power as a major contributor to thermals in processors being pushed to the limit. Looking at last year’s Arm Total Compute Solution (TCS22), we saw a large amount of memory system power being used across DRAM, Interconnect and Memory. Through 5th Gen, we want to take a significant portion of this power and allocate it to the GPU for better visuals, as well as being able to use any extra power savings to extend battery life.

Arm’s highest performing, most efficient GPUs ever

Alongside the introduction of the 5th Gen architecture, there have been a wealth of performance and efficiency improvements throughout the designs of the Immortalis and Mali GPUs. Immortalis-G720 offers support for 10 cores or more and the option of using optimized physical IP to accelerate system-on-chip (SoC) designs. The unrivaled scalability of Mali-G720 and Mali-G620 means we are bringing premium graphics features to a broad range of consumer devices, like smartphones, laptops, DTVs, set-top boxes (STBs) and wearables. Mali-G720 offers support for 6 to 9 cores, while Mali-G620 supports 5 cores or fewer to allow Arm’s premium licensees to re-use their design work quickly across additional markets.
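The core-count split between the three products can be sketched as a small lookup. The function name and the error handling are assumptions made for illustration. Only the ranges themselves (10 or more, 6 to 9, 5 or fewer) come from the text above.

```python
def gpu_for_core_count(cores: int) -> str:
    """Map a shader core count to the product name, using the ranges quoted above."""
    if cores < 1:
        raise ValueError("a GPU configuration needs at least one core")
    if cores >= 10:
        return "Immortalis-G720"  # 10 cores or more
    if cores >= 6:
        return "Mali-G720"  # 6 to 9 cores
    return "Mali-G620"  # 5 cores or fewer


for cores in (4, 6, 9, 10, 16):
    print(f"{cores} cores -> {gpu_for_core_count(cores)}")
```

Because the three ranges are contiguous, a licensee scaling one design up or down in core count moves between product names without gaps.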

Arm's most efficient GPUs ever

We have delivered widespread improvements to reduce GPU power consumption, leading to more time to work and play on-the-go for the end-user. Each Immortalis and Mali GPU provides the highest levels of energy efficiency ever, 15 percent more energy efficient on average than the previous generation. This means more performance per watt and greater sustained performance. There has also been a major increase in system-level efficiency, with an up to 40 percent reduction in memory bandwidth usage and reduced CPU load. This has been achieved through adding new GPU instructions and driver optimisations.
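As a rough worked example of what the efficiency figure implies, the sketch below converts an efficiency gain into the energy needed for a fixed workload. It assumes that "15 percent more energy efficient" means performance per watt is multiplied by 1.15. That reading is an assumption and not a definition given in the article.

```python
def relative_energy_for_same_work(efficiency_gain_percent: float) -> float:
    """Energy needed for a fixed workload, relative to the previous generation.

    Assumption: "X percent more energy efficient" means work per joule
    is multiplied by (1 + X / 100).
    """
    return 1.0 / (1.0 + efficiency_gain_percent / 100.0)


relative = relative_energy_for_same_work(15)
saving_percent = (1.0 - relative) * 100.0
print(f"Relative energy: {relative:.3f}, saving: {saving_percent:.1f} percent")
```

Under that assumption, the same workload would need about 13 percent less energy, while the same energy budget would deliver 15 percent more work.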

On performance, the new GPUs deliver our highest levels yet, with peak performance improving by 15 percent on average. We are also doubling the architectural throughput for 64bpp texturing to improve the handling of high dynamic range textures.

For all the GPUs, we provide higher performing shading rates for VRS, a graphics feature that we introduced in last year’s Arm GPUs. For developers, this means that the performance of the 4x2 and 4x4 shading rates has increased. VRS helps to mitigate the increased shading costs that come from complex shading, ensuring that high quality graphics are delivered where it matters.
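To illustrate why coarser shading rates reduce shading cost, here is an idealized sketch that counts fragment shader invocations for a full-screen pass. It assumes one invocation per block of rate_x by rate_y pixels, with the first number taken as the horizontal size, and a 1920x1080 render target chosen only as an example. Real workloads mix shading rates across the frame and depend on geometry coverage and overdraw, so these numbers are a simplified estimate, not a performance prediction.

```python
import math


def shading_invocations(width: int, height: int, rate_x: int, rate_y: int) -> int:
    """Idealized invocation count: one shader invocation per rate_x by rate_y pixel block."""
    return math.ceil(width / rate_x) * math.ceil(height / rate_y)


# Example render target size (an assumption for illustration).
WIDTH, HEIGHT = 1920, 1080

for rate_x, rate_y in ((1, 1), (4, 2), (4, 4)):
    count = shading_invocations(WIDTH, HEIGHT, rate_x, rate_y)
    print(f"{rate_x}x{rate_y}: {count} invocations")
```

In this simplified model, a 4x4 rate shades one sixteenth as many fragments as 1x1, which is why developers apply it to regions where the loss of detail is least visible.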

Alongside improvements to our VRS offering, we are adding greater flexibility to the developer experience through enabling optimized 2x Multi-Sampling Anti-Aliasing (MSAA). This removes the need to escalate to 4x MSAA for developers who want 2x MSAA for their applications. For applications that ask for 2x MSAA, we have measured a 7 percent performance uplift compared to 4x MSAA. The new GPUs also offer improved support for Vulkan dynamic buffers.

Continuous commitment to machine learning

Arm remains committed to developing and testing our GPUs against new applications for machine learning (ML). One key ML use case is 3D scene reconstruction, which renders novel views of real-world scenes on the mobile GPU. Exploring this particular use case, we have seen the Immortalis-G720 achieve 25 percent more peak performance and consume 22 percent less memory bandwidth compared with Immortalis-G715.

We are also balancing power across our CPUs and GPUs for a consistent user experience. Using Unity’s ML-Agents toolkit, which allows developers to train intelligent agents within games and simulations, we demonstrated at GDC 2023 that 100 such agents can now be processed alongside complex graphics on mobile SoCs.

Key Machine Learning use cases on the GPU

Wide ecosystem support

We maintain our ecosystem support for game developers through a wide range of GPU tools and resources. Arm Mobile Studio, which is free to download, provides a range of profiling, performance analysis and debugging tools, so developers can optimize the performance and efficiency of their applications.

Soon, we will be launching Arm Frame Advisor, a frame-based profiler for games supporting OpenGL ES 3.2 and Vulkan 1.1. Frame Advisor uses a layer driver to capture all API calls in a frame, with the analysis engine providing contextual feedback to the developers. This feedback identifies opportunities for developers to increase the performance of their applications through providing:

  • Visualizations of the render graph and frame data flow;
  • Information about best practice violations; and
  • Information about budget violations, such as exceeding a GPU cycle count or GPU power budget.

Early release tests of Frame Advisor are planned with selected game studios, with public availability planned for the end of 2023.

We are also working with selected partners across our ecosystem to develop new technologies. The Arm, Google, and Unity collaboration on adaptive performance is a good example: it optimizes GPU utilization within fixed power and thermal constraints.

Ray tracing continues to be a very popular graphics feature. We have been working closely with partners who are adopting this technology on silicon and in devices, and using ray tracing techniques for gaming applications. For example, we are working in collaboration with Tencent Games and MediaTek to drive industry adoption through Smart Global Illumination (SmartGI) and also developing best practice documentation to support game developers.

The benefits of the Unreal Engine 5

Finally, we are working with Epic Games to enable its Unreal Engine 5 desktop renderer on Android. This will ensure that desktop quality rendering and graphics are delivered by Immortalis GPUs. We have created the Steel Arms demo to test the developer experience with our GPU products and demonstrate how the renderer can enable high-quality graphics. These include rich bloom effects, high-quality physically-based shading, vivid blur effects, and detailed real-time reflections.

A summary of the new Arm GPUs

More efficiency, more performance, more for developers

Our objective each year is to deliver industry-leading performance and efficiency on our GPUs, and then give developers the tools and resources to create stunning visual experiences. This year is no different, with the Immortalis-G720 and family of Mali GPUs delivering significantly more efficiency, more performance and ultimately more for developers. With the introduction of the 5th Gen architecture, we now have the foundation for the next generation of visual computing that will enable new game-changing graphical capabilities on mobile devices. All of this means that users will continue to see the very best visual experiences run on Arm.

Footnotes

¹ Ray Tracing benchmark is ‘Basemark® GPUScore: In Vitro’. Arm measured data as of 12th April 2023 measured on flagship Android handsets.
² VRS benchmark is ‘Basemark® GPUScore: The Expedition VRS’. Arm measured data as of 12th April 2023 measured on flagship Android handsets.
