(RUMOR warning) Sony graphics engineer reveals that PS5 is an architecture somewhere between RDNA1 and RDNA2

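wudizhikun · Posted on 2020-7-21 07:38 · Shanxi

不吃鸡蛋 posted on 2020-7-21 07:22
Count them yourself: how many posts are about the architecture, and how many are dissing Microsoft and Microsoft fans.

I don't know who is the one selectively ignoring things. ...

Roughly 17 posts are about the architecture; 11 or 12 are about Microsoft.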

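Posted on 2020-7-21 07:44 · Australia

wudizhikun posted on 2020-7-21 07:38
Roughly 17 posts are about the architecture; 11 or 12 are about Microsoft.

Apart from the third post, the first page is nothing but platform battles.

Five of the posts are sneering at Microsoft fans.

You really are giving a personal demonstration of what selective ignoring looks like.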

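mr.n · Posted on 2020-7-21 07:48 · Canada

诺基亚8800 posted on 2020-7-19 23:07
Isn't a customized RDNA2 just a matter of which features you keep and which you drop? PS5 did not take some of RDNA2's built-in features and designed its own instead. ...

Can you yourself even make sense of a sentence that is this hard to follow and contradicts itself?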

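诺基亚8800 · Posted on 2020-7-21 08:03 · Shanghai

mr.n posted on 2020-7-21 07:48
Can you yourself even make sense of a sentence that is this hard to follow and contradicts itself?

Can't understand it? Read the author's original words in my post #37.

If you still can't understand it, there is nothing more I can do.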

我是差PS3君 · Posted on 2020-7-21 08:07 · Beijing

不吃鸡蛋 posted on 2020-7-21 07:31
Why are you using 竹笋炒肉's avatar?

Sorry~ I was using it first~ His is different from mine~

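GloryEvi · Posted on 2020-7-21 08:59 · Unknown

It's a pity there is no ML. DLSS 2.0 in Death Stranding is stunning; even a 2060 can run 4K at 60 fps. With the CPU and GPU both already behind, and ML features missing on top of that, I feel frame rates in third-party games will be left far behind.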

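wshlanlan · Posted on 2020-7-21 09:23 · Hubei

Sony fans' attention is all on FF7, The Last of Us Part II, and Ghost of Tsushima. How many of them care about these specs? Probably only Microsoft fans, who have no games to play, spend every day playing with specs.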

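Posted on 2020-7-21 09:26 · Guangxi

Radeon excels at compute workloads

AMD's GCN (Graphics Core Next) architecture, the predecessor of RDNA, is also particularly strong at machine learning (ML) workloads. We know that AI is a hot topic for smartphone processors and is likely to become more common over the next five years.

RDNA retains its high-performance machine learning credentials, supporting 64-, 32-, 16-, 8-, and even 4-bit operations in parallel. RDNA's vector ALUs are twice as wide as the previous generation's for faster number crunching, and they also perform fused multiply-accumulate (FMA) operations at lower power than previous generations. FMA math is common in machine learning applications, which is why Arm's Mali-G77 has a dedicated hardware block for it.

Figure: RDNA supports up to eight 4-bit parallel operations and mixed-precision FMA for machine learning tasks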

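In addition, RDNA introduces Asynchronous Compute Tunneling to manage compute shader workloads. AMD claims this "enables compute and graphics workloads to co-exist harmoniously on GPUs." In other words, RDNA handles ML and graphics workloads more efficiently, which may reduce the need for a dedicated AI chip.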

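Arguably, if you want to spend silicon area on both graphics and ML workloads, RDNA certainly looks attractive in terms of features. AMD also promises higher performance per watt from the 7nm+ process and its upcoming "next-gen" RDNA, which is exactly where Samsung's strengths lie.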

https://www.eefocus.com/mcu-dsp/451770

Posted on 2020-7-21 09:27 · Guangxi

When you look at the AMD RDNA white paper, it states that some ALUs will support 8-bit and 4-bit integer operations. So it's not an automatic feature of RDNA. It's a hardware feature that must be added.

Some variants of the dual compute unit expose additional mixed-precision dot-product modes in the ALUs, primarily for accelerating machine learning inference. A mixed-precision FMA dot2 will compute two half-precision multiplications and then add the results to a single-precision accumulator. For even greater throughput, some ALUs will support 8-bit integer dot4 operations and 4-bit dot8 operations, all of which use 32-bit accumulators to avoid any overflows.

https://www.amd.com/system/files/documents/rdna-whitepaper.pdf
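
To make the quoted dot4 description concrete, here is a minimal Python sketch of an 8-bit integer dot4 with a 32-bit accumulator. It only emulates the arithmetic. The function names, the lane order (lane 0 in the low byte), and the wrap-around behaviour of the accumulator are assumptions for illustration and are not taken from the whitepaper or from any AMD ISA document.

```python
def to_signed8(byte):
    """Interpret an unsigned byte (0..255) as a two's-complement int8."""
    return byte - 256 if byte >= 128 else byte


def pack_int8x4(values):
    """Pack four signed 8-bit integers into one 32-bit word.

    Lane 0 goes into the low byte; this ordering is an assumption.
    """
    if len(values) != 4:
        raise ValueError("expected exactly four lanes")
    word = 0
    for i, v in enumerate(values):
        if not -128 <= v <= 127:
            raise ValueError("value does not fit in int8")
        word |= (v & 0xFF) << (8 * i)
    return word


def unpack_int8x4(word):
    """Split a 32-bit word back into four signed 8-bit lanes."""
    return [to_signed8((word >> (8 * i)) & 0xFF) for i in range(4)]


def wrap_int32(x):
    """Reduce a Python int to the signed 32-bit range (assumed wrap-around)."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x


def dot4_i8(a_word, b_word, acc):
    """Hypothetical dot4: four int8 products summed into a 32-bit accumulator."""
    a = unpack_int8x4(a_word)
    b = unpack_int8x4(b_word)
    return wrap_int32(acc + sum(x * y for x, y in zip(a, b)))


a = pack_int8x4([1, -2, 3, -4])
b = pack_int8x4([5, 6, -7, 8])
print(dot4_i8(a, b, 100))  # 5 - 12 - 21 - 32 + 100 = 40
```

The worst case for a single dot4 is four products of (-128) * (-128), which sum to 65536. That already exceeds the signed 16-bit range, which illustrates why a wider accumulator is needed for this kind of operation.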

Posted on 2020-7-21 09:28 · Guangxi

Vector Execution

The superb performance and efficiency of modern graphics processors is derived from the parallel computing capabilities of vector execution units. As Figure 8 illustrates, one of the biggest improvements in the compute unit is doubling the size of the SIMDs and enabling back-to-back execution. When using the more efficient wave32 wavefronts, the new SIMDs boost IPC and cut latency by 4X.

[...] handling mixed precision. For larger 64-bit (or double precision) FP data, adjacent registers are combined to hold a full wavefront of data. More importantly, the compute unit vector registers natively support packed data including two half-precision (16-bit) FP values, four 8-bit integers, or eight 4-bit integers.
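
As a rough illustration of the packed formats listed in the quote, the Python sketch below packs two half-precision floats, four 8-bit integers, or eight 4-bit integers into a single 32-bit word. The helper names and the lane order (lane 0 in the lowest bits) are assumptions made for the example, not something the whitepaper specifies.

```python
import struct


def pack_fp16x2(lo, hi):
    """Pack two half-precision floats into one 32-bit word (lo in bits 0-15)."""
    (lo_bits,) = struct.unpack("<H", struct.pack("<e", lo))
    (hi_bits,) = struct.unpack("<H", struct.pack("<e", hi))
    return lo_bits | (hi_bits << 16)


def unpack_fp16x2(word):
    """Recover the two half-precision floats from a 32-bit word."""
    (lo,) = struct.unpack("<e", struct.pack("<H", word & 0xFFFF))
    (hi,) = struct.unpack("<e", struct.pack("<H", (word >> 16) & 0xFFFF))
    return lo, hi


def pack_ints(values, bits):
    """Pack signed integers of width `bits` (8 or 4) into one 32-bit word."""
    lanes = 32 // bits
    if len(values) != lanes:
        raise ValueError(f"expected {lanes} lanes for {bits}-bit packing")
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    mask = (1 << bits) - 1
    word = 0
    for i, v in enumerate(values):
        if not lowest <= v <= highest:
            raise ValueError(f"{v} does not fit in {bits} bits")
        word |= (v & mask) << (bits * i)
    return word


def unpack_ints(word, bits):
    """Split a 32-bit word into signed lanes of width `bits`."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    lanes = []
    for i in range(32 // bits):
        raw = (word >> (bits * i)) & mask
        lanes.append(raw - (1 << bits) if raw & sign_bit else raw)
    return lanes


print(hex(pack_fp16x2(1.0, -2.0)))           # 0xc0003c00
print(hex(pack_ints([1, 2, 3, 4], 8)))       # 0x4030201
print(unpack_ints(pack_ints([-8, 7, 0, 1, -1, 3, -4, 5], 4), 4))
```

The same 32 bits hold two, four, or eight values depending on the element width, which is where the higher per-instruction throughput of the narrower formats comes from.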
