[Recommended] Cache Knowledge Summary
Posted: 2009-6-25 22:12 7279
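CAM: Content Addressable Memory.
1. For a CPU that uses a virtual cache, if there is no particularly slow second-level storage behind it (such as a disk drive), a write-through policy is recommended for the cache rather than a write-back policy.
2. By position, caches are divided into logical (virtual) caches and physical caches. Among processor families that have both a cache and an MMU, ARM7 through ARM10, including the Intel StrongARM and Intel XScale processors, all use logical caches. ARM11 processors (the ARMv6 architecture family) use physical caches. A cache exploits both temporal and spatial locality.
3. There are two cache bus architectures: von Neumann and Harvard (that is, a unified cache and a split cache). They differ in whether the instruction and data buses between the core and main memory are separated.
The size of a cache is defined by the amount of actual code and data from main memory that it can hold. The part of the cache memory that stores the cache tags and status bits is not counted when the cache capacity is calculated.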
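To make the point about capacity concrete, here is a minimal sketch that computes how much storage goes to tags and status bits on top of the nominal size. The struct `cache_geom`, its field names, and the choice of exactly two status bits per line (valid and dirty) are assumptions made for illustration, not details from the post.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache geometry; every field is an illustrative assumption. */
struct cache_geom {
    uint32_t data_bytes;  /* nominal cache size: code/data storage only */
    uint32_t line_bytes;  /* bytes per cache line (power of two) */
    uint32_t ways;        /* associativity */
    uint32_t addr_bits;   /* width of the address that is tagged */
};

/* Integer log2 for powers of two. */
static uint32_t ilog2(uint32_t x) {
    uint32_t n = 0;
    while (x > 1) { x >>= 1; n++; }
    return n;
}

uint32_t cache_lines(const struct cache_geom *g) {
    assert(g->line_bytes && (g->line_bytes & (g->line_bytes - 1)) == 0);
    return g->data_bytes / g->line_bytes;
}

/* Tag width = address bits minus set-index bits minus line-offset bits. */
uint32_t tag_bits(const struct cache_geom *g) {
    uint32_t sets = cache_lines(g) / g->ways;
    return g->addr_bits - ilog2(sets) - ilog2(g->line_bytes);
}

/* Bits spent on tag + valid + dirty per line; not part of the nominal size. */
uint64_t overhead_bits(const struct cache_geom *g) {
    return (uint64_t)cache_lines(g) * (tag_bits(g) + 2);
}
```

For a 16 KB, 4-way cache with 32-byte lines and 32-bit addresses, this gives 512 lines with 20-bit tags, so the cache is still described as "16 KB" even though it carries extra tag and status storage.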
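4. The write buffer is a very small FIFO memory buffer. Dirty cache lines evicted from the cache are placed in this FIFO instead of being written directly to main memory.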
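A write buffer of this kind can be modelled as a small ring buffer. The sketch below is only an illustration: the depth `WB_DEPTH`, the one-word entries, and the function names are assumptions, and main memory is modelled as a plain word array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WB_DEPTH 4  /* assumed depth; the post only says "very small" */

struct wb_entry { uint32_t addr; uint32_t data; };

struct write_buffer {
    struct wb_entry slot[WB_DEPTH];
    unsigned head;   /* index of the oldest entry */
    unsigned count;  /* number of queued entries */
};

/* Queue an evicted dirty word. Returns false when the buffer is full
 * (a real core would stall until an entry drains). */
bool wb_push(struct write_buffer *wb, uint32_t addr, uint32_t data) {
    if (wb->count == WB_DEPTH) return false;
    unsigned tail = (wb->head + wb->count) % WB_DEPTH;
    wb->slot[tail] = (struct wb_entry){ addr, data };
    wb->count++;
    return true;
}

/* Drain the oldest entry to main memory, in FIFO order. */
bool wb_drain_one(struct write_buffer *wb, uint32_t *mem, uint32_t mem_words) {
    if (wb->count == 0) return false;
    struct wb_entry e = wb->slot[wb->head];
    assert(e.addr / 4 < mem_words);  /* byte address must fall inside the model */
    mem[e.addr / 4] = e.data;
    wb->head = (wb->head + 1) % WB_DEPTH;
    wb->count--;
    return true;
}
```

The core calls `wb_push` at eviction time and continues; `wb_drain_one` stands in for the memory controller emptying the FIFO in the background.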
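5. Main-memory write policies: write-back and write-through. Allocation policies on a cache miss: read-allocate and read/write-allocate.
a). Read-allocate: on a cache miss, a cache line is allocated only for a memory read. If the line being replaced holds valid data that is dirty, its original contents must be written to main memory before the line is filled with new data. Under read-allocate, a memory write does not update a cache line unless the relevant line happens to have been allocated by an earlier main-memory read.
b). Read/write-allocate: a cache line is allocated on a miss for both memory reads and memory writes. For a memory write that misses, a cache line is allocated. If the line being replaced holds valid data that is dirty, the controller first writes that line to main memory, then overwrites the line with data read from main memory, and finally writes the core's data into the line. If the cache is write-through, the core's data is written to main memory at the same time.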
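The interaction between the write policy and the allocation policy can be sketched with a toy model. Everything here is an assumption made for illustration: a direct-mapped cache with `LINES` one-word lines, word-addressed memory of `MEM_WORDS` words, and the enum and function names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LINES 4        /* assumed: tiny direct-mapped cache, one word per line */
#define MEM_WORDS 64   /* assumed size of the modelled main memory */

enum alloc_policy { READ_ALLOCATE, READ_WRITE_ALLOCATE };
enum write_policy { WRITE_THROUGH, WRITE_BACK };

struct line { bool valid, dirty; uint32_t tag, data; };

struct cache {
    enum alloc_policy alloc;
    enum write_policy wr;
    struct line line[LINES];
    uint32_t mem[MEM_WORDS];
};

static struct line *lookup(struct cache *c, uint32_t word) {
    struct line *l = &c->line[word % LINES];
    return (l->valid && l->tag == word / LINES) ? l : 0;
}

/* Evict the victim (writing it back if dirty), then fill from main memory. */
static struct line *allocate(struct cache *c, uint32_t word) {
    struct line *l = &c->line[word % LINES];
    if (l->valid && l->dirty)
        c->mem[l->tag * LINES + word % LINES] = l->data;
    l->valid = true;
    l->dirty = false;
    l->tag = word / LINES;
    l->data = c->mem[word];
    return l;
}

uint32_t cache_read(struct cache *c, uint32_t word) {
    assert(word < MEM_WORDS);
    struct line *l = lookup(c, word);
    if (!l) l = allocate(c, word);   /* both policies allocate on a read miss */
    return l->data;
}

void cache_write(struct cache *c, uint32_t word, uint32_t value) {
    assert(word < MEM_WORDS);
    struct line *l = lookup(c, word);
    if (!l && c->alloc == READ_WRITE_ALLOCATE)
        l = allocate(c, word);       /* only this policy allocates on a write miss */
    if (l) {
        l->data = value;
        l->dirty = (c->wr == WRITE_BACK);
    }
    if (!l || c->wr == WRITE_THROUGH)
        c->mem[word] = value;        /* write-through, or miss without allocation */
}
```

With `READ_ALLOCATE`, a write miss goes straight to memory and leaves the cache untouched; with `READ_WRITE_ALLOCATE` plus `WRITE_BACK`, memory is only updated later, when the dirty line is evicted.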
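6. Cache line replacement policies: round-robin and pseudo-random. Pseudo-random replacement picks the line to evict at random from within a given location. The algorithm uses a victim counter that does not increment sequentially: the controller generates a random increment and adds it to the victim counter. The victim counter uses modular addition, so it wraps around. Intel XScale supports only round-robin.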
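Both policies reduce to how a victim counter is advanced. The sketch below assumes a 4-way set (`WAYS`) and uses a 16-bit Galois LFSR as a stand-in for the controller's random source; the post does not say how real hardware generates the increment, so that part is purely illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define WAYS 4  /* assumed associativity */

/* Round-robin: the victim counter steps through the ways in order. */
unsigned rr_next_victim(unsigned *counter) {
    unsigned victim = *counter;
    *counter = (*counter + 1) % WAYS;
    return victim;
}

/* 16-bit Galois LFSR (taps 0xB400), an assumed random source.
 * The state must be seeded with a nonzero value. */
static uint16_t lfsr_step(uint16_t *state) {
    unsigned lsb = *state & 1u;
    *state >>= 1;
    if (lsb) *state ^= 0xB400u;
    return *state;
}

/* Pseudo-random: add a random increment to the victim counter, modulo WAYS. */
unsigned prand_next_victim(unsigned *counter, uint16_t *lfsr) {
    assert(*lfsr != 0);
    unsigned inc = lfsr_step(lfsr) % WAYS;
    *counter = (*counter + inc) % WAYS;
    return *counter;
}
```

Round-robin is fully predictable, which is convenient for worst-case timing analysis; the pseudo-random variant trades that predictability for less pathological behaviour on cyclic access patterns.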
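7. For a cache line, flush (also called clear or invalidate) and clean are different operations. A flush only marks the cache line invalid, regardless of whether its dirty bit is set. Under a write-back policy, a clean writes the lines whose dirty bit is set back to main memory. (Note: on non-ARM systems, "flush" often means clean.) On the IXP42X, both flush and clean operate on the entire cache; there is no per-line operation.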
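The difference between the two operations is easiest to see on a single line. This is a minimal sketch using the ARM meaning of the terms; the struct layout and function names are assumptions, and memory is modelled as a word array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One-word cache line; addr_word is the memory word it caches (assumed layout). */
struct cache_line { bool valid, dirty; uint32_t addr_word, data; };

/* Clean: write a dirty line back to memory; the line stays valid. */
void line_clean(struct cache_line *l, uint32_t *mem) {
    assert(mem);
    if (l->valid && l->dirty) {
        mem[l->addr_word] = l->data;
        l->dirty = false;
    }
}

/* Flush (invalidate): drop the line without writing it back,
 * even if it is dirty, so unsaved data is lost. */
void line_flush(struct cache_line *l) {
    l->valid = false;
    l->dirty = false;
}

/* The usual safe sequence on a write-back cache: clean first, then flush. */
void line_clean_and_flush(struct cache_line *l, uint32_t *mem) {
    line_clean(l, mem);
    line_flush(l);
}
```

Calling `line_flush` alone on a dirty line silently discards the modified data, which is why code for write-back caches normally cleans before it flushes.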
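8. Cache lockdown
Code and data locked in the cache are not replaced. However, if the cache is flushed, the locked contents are lost, and the locked cache lines still cannot be used for ordinary code and data afterwards; the lockdown routine must be run again to lock new code and data.
The software that performs the lockdown must itself reside in non-cacheable main memory. The code and data to be locked must reside in cacheable main memory. The locked code and data must not exist anywhere else in the cache, so the cache must be cleaned and flushed before locking.
There are three lockdown methods: 1. way addressing; 2. set lock bits; 3. a special allocate command combined with reading a specific block of main memory (used by Intel XScale).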
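One way to picture way-based lockdown (method 1) is as a restriction on victim selection: locked ways are simply never chosen for replacement. The sketch below assumes round-robin replacement, a 4-way set, and that the locked ways are the lowest-numbered ones; none of this is taken from a specific core's manual.

```c
#include <assert.h>

#define WAYS 4  /* assumed associativity */

/* Ways [0, locked_ways) are locked. The round-robin victim counter cycles
 * only through the remaining ways, so locked lines are never evicted. */
unsigned next_victim_with_lock(unsigned *counter, unsigned locked_ways) {
    assert(locked_ways < WAYS);          /* at least one way must stay usable */
    if (*counter < locked_ways)
        *counter = locked_ways;          /* skip past the locked region */
    unsigned victim = *counter;
    *counter = (victim + 1 == WAYS) ? locked_ways : victim + 1;
    return victim;
}
```

With one locked way, successive victims are 1, 2, 3, 1, 2, and so on; the effective cache available to ordinary code shrinks by a quarter.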
Latest replies (10)
2
2009-6-26 01:44
3
There are currently two kinds of cache-based timing attacks: one targets the D-cache, the other targets the I-cache.
2009-6-26 08:50
4
There are several angles from which to discuss cache issues.

1) From software or application programs. Some software can access the cache directly.
If the sysop does not set the priority, this may become a system leak.

2) From the operating system. This is similar to the software case: it also concerns access control, priority, or the monitoring abilities of a role in the operating system. If a programmer can write a program that does something such as accessing memory while he is not the owner or the sysop, that is a vulnerability.

Both of the above are related to memory management problems.

3) From system utilities, such as a compiler or assembler. A weakness may exist when you use certain functions.

4) Others. I will list some information later.
2009-6-26 17:26
5
I still don't understand: what is the relationship between the cache and the branch prediction unit in the CPU?
2009-6-26 20:53
6
Branch prediction unit? What's that?
2009-6-27 17:59
7
Branch Prediction
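Many CPUs today are superscalar, superpipelined designs.
The CPU judges which way a program branch will go,
predicts the instructions most likely to be executed, and loads them into the cache for execution.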
2009-6-27 18:16
8
Branch Prediction?
Okay~~ I see.
It is related to CISC and RISC architecture.
2009-6-27 18:25
9
How can one obtain its instruction execution flow?
2009-6-28 00:13
10
Reducing branch mispredicts with branch hint
General-purpose processors have typically addressed branch prediction by supporting hardware look-asides with branch history tables (BHT), branch target address caches (BTAC), or branch target instruction caches (BTIC).

The SPU addresses branch prediction through a set of hint for branch (HBR) instructions that facilitate efficient branch processing by allowing programs to avoid the penalty of taken branches.
If a branch hint is provided, software speculates that the instruction branches to the target path.
If a hint is not provided, software speculates that the instruction does not branch to a new location (that is, it stays inline).
If speculation is incorrect, the speculated branch is flushed and refetched.
It is possible to sequence multiple hints in advance of multiple branches. As with all programmer-provided hints, care must be exercised when using branch hints because, if the information provided is incorrect, performance might degrade.

Branch-hint instructions can provide three kinds of advance knowledge about future branches:
Address of the branch target (that is, where will the branch take the flow of control)
Address of the actual branch instruction (known as the hint-trigger address )
Prefetch schedule (when to initiate prefetching instructions at the branch target)
Branch-hint instructions load a branch-target buffer (BTB) in the SPU. When the BTB is loaded with a branch target, the hint-trigger address and branch address are also loaded into the BTB. After loading, the BTB monitors the instruction stream as it goes into the issue stage of the pipeline. When the address of the instruction going into issue matches the hint trigger address, the hint is triggered, and the SPU speculates to the target address in the hint buffer.
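The trigger mechanism described above can be sketched as a one-entry buffer that is compared against each issued instruction address. The struct and function names are hypothetical, and the model assumes 4-byte instructions and a single active hint; it illustrates the matching logic only, not the SPU pipeline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One-entry hint buffer (hypothetical model of the BTB contents). */
struct hint_buffer {
    bool valid;
    uint32_t trigger_addr;  /* address of the hinted branch instruction */
    uint32_t target_addr;   /* where the branch is expected to go */
};

/* Loading a new hint replaces whatever hint was there before. */
void hint_load(struct hint_buffer *hb, uint32_t trigger, uint32_t target) {
    assert(hb);
    hb->valid = true;
    hb->trigger_addr = trigger;
    hb->target_addr = target;
}

/* Speculated next fetch address for the instruction entering issue:
 * the hinted target on a trigger match, otherwise the inline successor
 * (assuming 4-byte instructions). */
uint32_t next_fetch(const struct hint_buffer *hb, uint32_t issue_addr) {
    if (hb->valid && issue_addr == hb->trigger_addr)
        return hb->target_addr;
    return issue_addr + 4;
}
```

After `hint_load(&hb, 0x100, 0x200)`, an instruction issued at 0x100 causes fetch to be redirected to 0x200, while any other address falls through inline.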

Branch-hint instructions have no program-visible effects. They provide a hint to the SPE architecture about a future branch instruction, with the intention that the information be used to improve performance by prefetching the branch target. The SPE branch-hint instructions are shown in Table 1. There are immediate and indirect forms for this instruction class. The location of the branch is always specified by an immediate operand in the instruction.
Table 1. Branch-Hint Instructions

Instruction: hbr s11, ra
Description: Hint for branch (r-form). Hint that the instruction addressed by the sum of the address of the current instruction and the sign-extended, 11-bit value s11 will branch to the address contained in word element 0 of register ra. This form is used to hint function returns, pointer function calls, and other situations that give rise to indirect branches.

Instruction: hbra s11, s18
Description: Hint for branch (a-form). Hint that the instruction addressed by the sum of the address of the current instruction and the sign-extended, 11-bit value s11 will branch to the address specified by the sign-extended, 18-bit value s18.

Instruction: hbrr s11, s18
Description: Hint for branch relative. Hint that the instruction addressed by the sum of the address of the current instruction and the sign-extended, 11-bit value s11 will branch to the address specified by the sum of the address of the current instruction and the sign-extended, 18-bit value s18.
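The address arithmetic in the hbrr description comes down to sign-extending two immediate fields and adding them to the current instruction address. The sketch below follows the table's wording literally; whether the hardware additionally scales the fields (for example by the instruction size) is not stated in this text, so the offsets are applied unscaled, and the names `sign_extend`, `hbrr_hint`, and `hbrr_decode` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of v to a 32-bit signed value. */
int32_t sign_extend(uint32_t v, unsigned bits) {
    assert(bits >= 1 && bits < 32);
    uint32_t sign = 1u << (bits - 1);
    v &= (1u << bits) - 1;            /* keep only the field */
    return (int32_t)((v ^ sign) - sign);
}

struct hbrr_hint {
    uint32_t branch_addr;  /* address of the hinted branch instruction */
    uint32_t target_addr;  /* predicted branch target */
};

/* hbrr as worded in the table: both addresses are relative to the current
 * instruction address. Assumption: fields are applied unscaled. */
struct hbrr_hint hbrr_decode(uint32_t pc, uint32_t s11, uint32_t s18) {
    struct hbrr_hint h;
    h.branch_addr = pc + (uint32_t)sign_extend(s11, 11);
    h.target_addr = pc + (uint32_t)sign_extend(s18, 18);
    return h;
}
```

For example, an 11-bit field of 0x7FF sign-extends to -1, so the top bit of each field decides whether the hinted branch or its target lies before or after the hint instruction.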

The following rules apply to the hint for branch (HBR) instructions:

An HBR instruction should be placed at least 11 cycles followed by four instruction pairs before the branch instructions being hinted by the HBR instruction. In other words, an HBR instruction must be followed by at least 11 cycles of instructions, followed by eight instructions aligned on an even address boundary. More separation between the hint and branch improves the performance of applications on future SPU implementations.
If an HBR instruction is placed too close to the branch, then a hint stall will result. This results in the branch instruction stalling until the timing requirement of the HBR instruction is satisfied.
If an HBR instruction is placed closer to the hint-trigger address than four instruction pairs plus one cycle, then the hint stall does not occur and the HBR is not used.
Only one HBR instruction can be active at a time. Issuing another HBR cancels the current one.
An HBR instruction can be moved outside of a loop and will be effective on each loop iteration as long as another HBR instruction is not executed.
The HBR instruction must be placed within 255 instructions of the branch instruction.
The HBR instruction only affects performance.
The HBR instructions can be used to support multiple strategies of branch prediction. These include:

Static Branch Prediction — Prediction based upon branch type or displacement, and prediction based upon profiling or linguistic hints.
Dynamic Branch Prediction — Software caching of branch-target addresses, and using control flow to record branching history.
A common approach to generating static branch prediction is to use expert knowledge that is obtained either by feedback-directed optimization techniques or using linguistic hints supplied by the programmer.

The document C/C++ Language Extensions for Cell Broadband Engine Architecture defines a mechanism for directing branch prediction. The __builtin_expect directive allows programmers to predict conditional program statements. The following example demonstrates how a programmer can predict that a conditional statement is false (a is not larger than b).
        if(__builtin_expect((a>b),0))
          c += a;
        else
          d += 1;
Not only can the __builtin_expect directive be used for static branch prediction, it can be used for dynamic branch prediction.
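A slightly fuller use of `__builtin_expect` might look like the sketch below. The `likely`/`unlikely` macro names are a common convention rather than something defined in the quoted document, the function `sum_checked` is hypothetical, and the builtin itself is specific to GCC-compatible compilers.

```c
#include <assert.h>

/* Conventional wrappers; !! normalises any truthy value to 1. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sum the non-negative values in v. Negative values are counted as errors
 * and are assumed to be rare, so that branch is marked unlikely. */
long sum_checked(const int *v, int n, int *errors) {
    assert(n == 0 || v);
    long sum = 0;
    for (int i = 0; i < n; i++) {
        if (unlikely(v[i] < 0))
            (*errors)++;      /* cold path */
        else
            sum += v[i];      /* hot path, laid out as the fall-through */
    }
    return sum;
}
```

The hint changes only code layout and speculation, never the result: if the prediction is wrong, the function still returns the same sum, just more slowly.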
2009-6-28 01:00
11
First of all, thanks to the poster above! Could you also give everyone a brief introduction?
2009-6-28 01:32