版本
3.0-776
分支
master
发布于
12 years, 2 months ago
Windows x64 Windows x86 Mac OS X
提交(Commit)
39900022508121f7f2412b928989ee34f40ee069
修改者
Pierre Bourdon
修改说明
Optimize JitCache::InvalidateICache by maintaining a "valid blocks" bitset

Most of the InvalidateICache calls are for a 32 bytes block: this is the
number of bytes invalidated by PowerPC dcb*/icb* instructions. Profiling
shows that a lot of CPU time is spent checking if there are any JIT blocks
covered by these 32 bytes (using std::map::lower_bound).

This patch adds a bitset containing the state of every 32 bytes block in
RAM (JIT cached/not JIT cached). Using that, a 32 bytes InvalidateICache
can check in the bitset if any JIT block might be invalidated. A bitset
check is a lot faster than an std::map::lower_bound operation, improving
performance of JitCache::InvalidateICache by more than 100%.

Some practical numbers:

* Xenoblade Chronicles (PAL)
  56.04FPS -> 59.28FPS (+5.78%)
* The Last Story (PAL)
  30.9FPS -> 32.83FPS (+6.25%)
* Super Mario Galaxy (PAL)
  59.76FPS -> 62.46FPS (+4.52%)

This function still takes more time than it should - more optimization in
this area might be possible (specializing for 32 bytes blocks to avoid
useless memcpy, for example).