SemiAccurate Forums  

 
  #1341  
Old 04-21-2017, 11:37 AM
livebriand livebriand is online now
8-bit overflow
 
Join Date: Jan 2013
Location: Sweden
Posts: 514

Well apparently there is now a growing email chain at AMD trying to find out the truth, so hopefully we'll find out soon.

It is odd. I guess there could be a couple of reasons: maybe there are 8 ACEs but 4 were disabled for power or some other reason, or maybe they just wanted to send Nvidia the wrong message.

Last edited by livebriand; 04-21-2017 at 11:39 AM.
  #1342  
Old 04-21-2017, 11:56 AM
Dezeer Dezeer is online now
itanic
 
Join Date: Sep 2011
Posts: 83

I do remember reading a news article about how Fiji had 4 ACE units and 2 HWS units (per the Hot Chips slides) instead of the original 8 ACE units shown in the launch slides. There was also some speculation that one HWS unit might work as two ACEs, but I don't remember any further clarification about it.


Quote:
Not all ACE's are equal.
https://forum.beyond3d.com/threads/a...6#post-1868282
  #1343  
Old 04-21-2017, 12:38 PM
livebriand livebriand is online now
8-bit overflow
 
Join Date: Jan 2013
Location: Sweden
Posts: 514

Here's the answer.

Quote:
In Hawaii and earlier GCN GPUs, the hardware was designed to support a fixed number of compute queues (up to 8 per ACE). Starting with 3rd generation GCN, we added a Hardware Scheduler (HWS) capability that made it possible to virtualize the compute queues. This meant that any number of queues could be supported, and the HWS would assign them to the available ACEs as slots became available. Once we had enabled the HWS, the old representation we had with 8 ACEs handling up to 8 queues each no longer made sense, so we decided to update the block diagrams.

Each ACE block now represents a single wavefront/workgroup dispatcher. So Fiji & Polaris can dispatch up to 4 wavefronts/workgroups to the shader engines at a time, from any compute queue. Also the hardware scheduling capabilities are now shown separately, and there are two of these blocks because the HWS is actually a dual threaded microprocessor (i.e. it can run two scheduling threads concurrently).

Pretty big change actually, surprised more wasn't made of it during the Fiji launch.
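
For anyone who wants to see the difference spelled out, here is a toy Python model of what the quote describes. It is only a sketch and not AMD's actual implementation: the function names, the round-based timing, and the "one queue per ACE per round" simplification are all assumptions. Only the 8 ACEs x 8 queues figure and the 4 dispatchers come from the quote.

```python
from collections import deque


def fixed_queue_limit(num_aces=8, queues_per_ace=8):
    """Pre-HWS model from the quote: a hard cap on supported compute queues."""
    return num_aces * queues_per_ace


def hws_schedule(queue_ids, num_aces=4):
    """Toy HWS: hand waiting queues to ACEs as slots become free.

    Assumption for the sketch: each ACE serves one queue per round.
    Returns a list of rounds, each mapping ACE index -> queue id.
    """
    waiting = deque(queue_ids)
    rounds = []
    while waiting:
        assignment = {}
        for ace in range(num_aces):
            if not waiting:
                break
            assignment[ace] = waiting.popleft()
        rounds.append(assignment)
    return rounds


# Old representation: the queue count is bounded by the hardware.
print(fixed_queue_limit())  # 64

# Virtualized: any number of queues, served a few at a time.
for i, assignment in enumerate(hws_schedule(range(10))):
    print(i, assignment)
```

With 10 queues and 4 dispatchers, the toy scheduler needs three rounds, and the last round only occupies two ACEs. The point is that the number of queues is no longer tied to the number of ACE blocks.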
  #1344  
Old 04-21-2017, 04:20 PM
sirroman sirroman is offline
2^10
 
Join Date: Jul 2011
Posts: 1,967

Quote:
Originally Posted by livebriand View Post
Here's the answer.

Pretty big change actually, surprised more wasn't made of it during the Fiji launch.

Wait, so it was ->ACE->SE and now it's ->HWS->ACE->SE? The diagram clearly doesn't represent that.

In the end it just "unbounds" the compute queues? Then I imagine it simplifies the connections between all those ACEs from before and the SEs. Each ACE could even be hardwired to an SE?

ACEs are going the way of the dodo... (since they no longer need to manage queues and only dispatch whatever the HWS tells them)
  #1345  
Old 04-22-2017, 03:06 AM
Stuckey Stuckey is offline
>intel 4004
 
Join Date: May 2012
Location: 173000G
Posts: 4,431

Quote:
Originally Posted by sirroman View Post
Wait, so it was ->ACE->SE and now it's ->HWS->ACE->SE? The diagram clearly doesn't represent that.

In the end it just "unbounds" the compute queues? Then I imagine it simplifies the connections between all those ACEs from before and the SEs. Each ACE could even be hardwired to an SE?

ACEs are going the way of the dodo... (since they no longer need to manage queues and only dispatch whatever the HWS tells them)

I think (iirc) the ACEs also handle the hardware prioritization? Basically, whenever there is a free slot they report back to the HWS, which allows the HWS to assign high-priority work to the first available slot, no matter which ACE becomes available first.

I'd guess they could change it so the HWS handles all parts of the process? But I think the way they've done it should scale more easily. If you incorporated everything into the HWS, I'd presume you would need a different-sized 'super HWS' for each size of GPU, whereas with the HWS->ACE system you only need one HWS and can alter the number of ACEs.

I think having HWS->ACE is more to do with scalability than integration. Could be wrong, but I think it's likely.
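
To make the "free slot reports back, highest priority goes first" idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `ToyHWS` class, its method names, and the numeric priorities are invented for illustration, and nothing here is confirmed as how the real hardware works.

```python
import heapq


class ToyHWS:
    """Hypothetical scheduler: the highest-priority pending work goes to
    whichever ACE reports a free slot first."""

    def __init__(self):
        self._pending = []  # min-heap of (-priority, sequence, name)
        self._seq = 0       # tie-breaker: FIFO order within one priority

    def submit(self, name, priority):
        heapq.heappush(self._pending, (-priority, self._seq, name))
        self._seq += 1

    def on_slot_free(self, ace_id):
        """An ACE reports a free slot; return (ace_id, work) or None."""
        if not self._pending:
            return None
        _, _, name = heapq.heappop(self._pending)
        return (ace_id, name)


hws = ToyHWS()
hws.submit("low", priority=0)
hws.submit("high", priority=10)
hws.submit("mid", priority=5)

# ACE 2 happens to free up first, so it gets the high-priority work.
print(hws.on_slot_free(2))  # (2, 'high')
print(hws.on_slot_free(0))  # (0, 'mid')
```

Note that the scheduler never cares which ACE calls in. Adding more ACEs just means more callers of `on_slot_free`, which is roughly the scalability argument made above.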
__________________
To find the right answers you must ask the right questions.
  #1346  
Old 04-22-2017, 05:04 AM
GeorgiD GeorgiD is offline
2^11
 
Join Date: Feb 2010
Location: Sofia Area, Bulgaria
Posts: 2,793

As far as the AMD Radeon RX Vega release date is concerned, the high-end graphics card may be unveiled at Computex on May 30. There are also reports claiming that AMD may debut the RX Vega at a special event before the Electronic Entertainment Expo (E3 2017), which kicks off on June 13.

http://www.scienceworldreport.com/ar...most-human.htm
  #1347  
Old 04-22-2017, 09:16 AM
sirroman sirroman is offline
2^10
 
Join Date: Jul 2011
Posts: 1,967

Quote:
Originally Posted by Stuckey View Post
I think (iirc) the ACEs also handle the hardware prioritization? Basically, whenever there is a free slot they report back to the HWS, which allows the HWS to assign high-priority work to the first available slot, no matter which ACE becomes available first.

Can an ACE dispatch to any SE? I don't think this was ever asked before.

I believe it now makes more sense for each ACE to manage the utilization of "its own" SE (after all, asynchronous compute is all about filling gaps) while the HWS handles priority and locality among ACEs.

Quote:
Originally Posted by Stuckey View Post
I'd guess they could change it so the HWS handles all parts of the process? But I think the way they've done it should scale more easily. If you incorporated everything into the HWS, I'd presume you would need a different-sized 'super HWS' for each size of GPU, whereas with the HWS->ACE system you only need one HWS and can alter the number of ACEs.

I think having HWS->ACE is more to do with scalability than integration. Could be wrong, but I think it's likely.

We need to wait for Vega to see how it evolved, but it seems like the ratios are 1:2 (HWS:ACE) and 1:1 (ACE:SE). Btw, they pointed out that each has dual cores.
  #1348  
Old 04-22-2017, 09:24 AM
livebriand livebriand is online now
8-bit overflow
 
Join Date: Jan 2013
Location: Sweden
Posts: 514

Yes, based on this info, ACEs being locked to their own SE is what makes the most sense. Not sure how that would work with Hawaii/Grenada, but I guess that would be two ACEs per SE.

I wouldn't be surprised to see more changes in Vega either.
  #1349  
Old 04-22-2017, 01:10 PM
Stuckey Stuckey is offline
>intel 4004
 
Join Date: May 2012
Location: 173000G
Posts: 4,431

Quote:
Originally Posted by sirroman View Post
Can an ACE dispatch to any SE? I don't think this was ever asked before.

I believe it now makes more sense for each ACE to manage the utilization of "its own" SE (after all, asynchronous compute is all about filling gaps) while the HWS handles priority and locality among ACEs.

We need to wait for Vega to see how it evolved, but it seems like the ratios are 1:2 (HWS:ACE) and 1:1 (ACE:SE). Btw, they pointed out that each has dual cores.

It's actually 1:4 (HWS:ACE). As the info livebriand obtained says, the block diagram shows two HWSs, but in reality it's a single dual-threaded unit.

I'd imagine that should they ever need to increase the number of ACEs, they can do what they seem to have done with Vega's SEs, i.e. buff the HWS without having to do a major redesign?
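
The 1:2 versus 1:4 confusion is just a question of whether you count diagram blocks or physical units. A small Python sketch of the arithmetic, using only the block counts from the quoted AMD explanation (the variable and function names are made up for the example):

```python
from math import gcd


def ratio(a, b):
    """Reduce a:b to lowest terms."""
    g = gcd(a, b)
    return (a // g, b // g)


DIAGRAM_HWS_BLOCKS = 2  # HWS blocks drawn in the Fiji/Polaris diagram
THREADS_PER_HWS = 2     # the HWS is described as dual-threaded
ACE_BLOCKS = 4          # ACE dispatchers drawn in the diagram

# Reading the diagram literally: two HWS blocks to four ACEs.
as_drawn = ratio(DIAGRAM_HWS_BLOCKS, ACE_BLOCKS)

# Counting physical units: the two blocks are threads of one HWS.
physical_hws = DIAGRAM_HWS_BLOCKS // THREADS_PER_HWS
as_built = ratio(physical_hws, ACE_BLOCKS)

print(as_drawn)  # (1, 2)
print(as_built)  # (1, 4)
```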
  #1350  
Old 04-22-2017, 05:26 PM
sirroman sirroman is offline
2^10
 
Join Date: Jul 2011
Posts: 1,967

Quote:
Originally Posted by Stuckey View Post
It's actually 1:4 (HWS:ACE). As the info livebriand obtained says, the block diagram shows two HWSs, but in reality it's a single dual-threaded unit.

I'd imagine that should they ever need to increase the number of ACEs, they can do what they seem to have done with Vega's SEs, i.e. buff the HWS without having to do a major redesign?

You are right, I misread the quote from adored. That diagram is a mess...

Now scaling the HWS is just as difficult as scaling the Graphics Processor... There goes the whole idea that it was integrated to increase scalability...

Tags
amd, vega

