Cornell Virtual Workshop > Understanding GPU Architecture > GPU Memory (2024)

The picture on the preceding page is more complex than it would be for a CPU, because the GPU reserves certain areas of memory for specialized use during rendering. Here, we summarize the roles of each type of GPU memory for doing GPGPU computations.

The first list covers the on-chip memory areas that are closest to the CUDA cores. They are part of every SM.

  • Register File - denotes the area of memory that feeds directly into the CUDA cores. Accordingly, it is organized into 32 banks, matching the 32 threads in a warp. Think of the register file as a big matrix of 4-byte elements, having many rows and 32 columns. A warp operates on full rows; within a given row, each thread (CUDA core) operates on a different column (bank).
  • L1 Cache - refers to the usual on-chip storage location providing fast access to data that are recently read from, or written to, main memory (RAM). Additionally, L1 serves as the overflow region when the amount of active data exceeds what an SM's register file can hold, a condition which is termed "register spilling". In L1, the cache lines and spilled registers are organized into banks, just as in the register file.
  • Shared Memory - is a memory area that physically resides in the same memory as the L1 cache, but differs from L1 in that all its data may be accessed by any thread in a thread block. This allows threads to communicate and share data with each other. Variables that occupy it must be declared explicitly by an application. The application can also set the dividing line between L1 and shared memory.
  • Constant Caches - are special caches pertaining to variables declared as read-only constants in global memory. Such variables can be read by any thread in a thread block. The main and best use of these caches is to broadcast a single constant value to all the threads in a warp.
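
As a rough illustration of the shared memory and constant cache entries above, here is a minimal CUDA sketch. The kernel name `reverse_and_scale`, the helper `run_reverse_and_scale`, the block size of 256, and the 50% carveout are all assumptions made for this example; they do not come from the workshop. The kernel stages data in an explicitly declared `__shared__` array so that each thread can read an element written by another thread in its block, and every thread reads the same `__constant__` value, which is the broadcast case. The call to `cudaFuncSetAttribute` is only a hint about the split between L1 and shared memory, and the driver may pick a different ratio.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int N = 256;      // threads per block and array length (illustrative choice)

__constant__ float scale;   // read-only on the device, served through the constant cache

// Reverses one block-sized array in place and multiplies each element by `scale`.
__global__ void reverse_and_scale(float *data)
{
    __shared__ float tile[N];   // shared memory must be declared explicitly

    int t = threadIdx.x;
    tile[t] = data[t];          // each thread stages one element
    __syncthreads();            // wait until the whole block has written its elements

    // Each thread reads an element written by a different thread. All threads
    // read the same address of `scale`, so the value is broadcast within a warp.
    data[t] = scale * tile[N - 1 - t];
}

// Hypothetical helper: runs the kernel on one block and checks the result on the host.
bool run_reverse_and_scale()
{
    float host[N];
    for (int i = 0; i < N; ++i) host[i] = static_cast<float>(i);
    const float h_scale = 2.0f;

    float *dev = nullptr;
    if (cudaMalloc(&dev, sizeof(host)) != cudaSuccess) return false;

    // Copy the constant into constant memory on the device.
    cudaMemcpyToSymbol(scale, &h_scale, sizeof(float));

    // Ask for roughly half of the combined L1/shared storage as shared memory.
    // This is a preference only, so the return value is deliberately ignored.
    cudaFuncSetAttribute(reverse_and_scale,
                         cudaFuncAttributePreferredSharedMemoryCarveout, 50);

    cudaMemcpy(dev, host, sizeof(host), cudaMemcpyHostToDevice);
    reverse_and_scale<<<1, N>>>(dev);
    cudaError_t err = cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return false;

    // Element i should now hold scale * (N - 1 - i).
    for (int i = 0; i < N; ++i)
        if (host[i] != h_scale * static_cast<float>(N - 1 - i)) return false;
    return true;
}

int main()
{
    bool ok = run_reverse_and_scale();
    printf("%s\n", ok ? "ok" : "failed");
    return ok ? 0 : 1;
}
```

The `__syncthreads()` barrier is what makes the exchange safe: without it, a thread could read `tile[N - 1 - t]` before the thread responsible for that element has written it.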

The second list pertains to the more distant, larger memory areas that are shared by all the SMs.

  • L2 Cache - is a further on-chip cache for retaining copies of the data that travel back and forth between the SMs and main memory. Like the L1, the L2 cache is intended to speed up subsequent reloads. But unlike the L1 cache(s), there is just one L2 that is shared by all the SMs. The L2 cache is also situated in the path of data moving on or off the device via PCIe or NVLink.
  • Global Memory - represents the bulk of the main memory of the device, equivalent to RAM in a CPU-based processor. For performance reasons, the Tesla V100 has special HBM2 high-bandwidth memory, while the Quadro RTX 5000 has fast GDDR6 graphics memory.
  • Local Memory - corresponds to specially mapped regions of main memory that are assigned to each SM. Whenever "register spilling" overflows the L1 cache on a particular SM, the excess data are further offloaded to L2, then to "local memory". The performance penalty for reloading a spilled register becomes steeper for every memory level that must be traversed in order to retrieve it.
  • Texture and Constant Memory - are regions of main memory that are treated as read-only by the device. When fetched to an SM, variables with a "texture" or "constant" declaration can be read by any thread in a thread block, much like shared memory. Texture memory is cached in L1, while constant memory is cached in the constant caches.
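
To see how these memory areas are sized on a particular device, one option is to query the CUDA runtime. The sketch below is an illustration under stated assumptions: the kernel `scratch_kernel` and the helper `report_memory_sizes` are hypothetical names invented for this example. It prints several fields of `cudaDeviceProp`, then uses `cudaFuncGetAttributes` to report how many registers and how much local memory the compiler assigned per thread to a kernel with a large, dynamically indexed per-thread array. Such arrays are typically placed in local memory, but the compiler is free to optimize the array away, in which case the reported local memory size may be zero.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: a large per-thread array indexed at run time is a
// common reason for per-thread data to be placed in local memory.
__global__ void scratch_kernel(float *out, int k)
{
    float scratch[256];
    for (int i = 0; i < 256; ++i) scratch[i] = i * 0.5f + threadIdx.x;
    out[threadIdx.x] = scratch[k & 255];
}

// Hypothetical helper: prints memory sizes for one device. Returns 0 on success.
int report_memory_sizes(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return 1;

    printf("device:                  %s\n", prop.name);
    printf("SM count:                %d\n", prop.multiProcessorCount);
    printf("registers per SM:        %d (32-bit)\n", prop.regsPerMultiprocessor);
    printf("shared memory per SM:    %zu bytes\n", prop.sharedMemPerMultiprocessor);
    printf("shared memory per block: %zu bytes\n", prop.sharedMemPerBlock);
    printf("constant memory:         %zu bytes\n", prop.totalConstMem);
    printf("L2 cache:                %d bytes\n", prop.l2CacheSize);
    printf("global memory:           %zu bytes\n", prop.totalGlobalMem);

    // Per-thread resources the compiler assigned to the sample kernel.
    cudaFuncAttributes attr;
    if (cudaFuncGetAttributes(&attr, scratch_kernel) != cudaSuccess) return 1;

    printf("scratch_kernel registers per thread:    %d\n", attr.numRegs);
    printf("scratch_kernel local memory per thread: %zu bytes\n", attr.localSizeBytes);
    return 0;
}

int main()
{
    return report_memory_sizes(0);   // device 0 is an assumption
}
```

Compiling with `nvcc -Xptxas -v` prints similar per-kernel figures at build time, including any spill stores and loads, which is another way to check whether register spilling is occurring.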