I know that uarch-bench can use perf, but because I always get jevents errors, I wrote a simple implementation of this part of uarch-bench. The code is here and can be run directly with cont.sh.
The experiments were run on my Intel i7-10700. The data is shown below.
The three PMU events are explained as follows:
r1d1: Retired load uops with L1 cache hits as data source
r2d1: Retired load uops with L2 cache hits as data source
r4d1: Retired load uops with L3 cache hits as data source
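For readers unfamiliar with perf's raw event syntax, here is a minimal sketch of how codes such as r1d1 split into an event select and a unit mask. It assumes the common x86 layout in which the low byte is the event select and the next byte is the unit mask. The helper name decode_raw_event is made up for illustration and is not part of perf or uarch-bench.

```python
def decode_raw_event(raw: str) -> dict:
    """Split a perf raw event such as 'r1d1' into event select and unit mask.

    Assumes the usual x86 encoding: bits 0-7 are the event select,
    bits 8-15 are the unit mask. Other fields (cmask, inv, ...) are ignored.
    """
    if not raw.startswith("r"):
        raise ValueError(f"not a raw event: {raw!r}")
    value = int(raw[1:], 16)
    return {"event": value & 0xFF, "umask": (value >> 8) & 0xFF}


if __name__ == "__main__":
    for raw in ("r1d1", "r2d1", "r4d1"):
        fields = decode_raw_event(raw)
        print(f"{raw}: event=0x{fields['event']:02x} umask=0x{fields['umask']:02x}")
```

All three codes share event select 0xd1 and differ only in the unit mask (0x01, 0x02, 0x04), which is what selects the L1, L2, or L3 hit variant.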
The first column, size, is the size of the cache region to be traversed, in KiB; each test runs 500000 iterations.
The number of hits per cycle for the L1 and L2 caches is not quite the same as stated in the article. Am I doing something wrong here?
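When reading the size column, it can help to know which cache level a region of that size could fit in. The sketch below assumes 64-byte cache lines and per-core capacities of 32 KiB L1D, 256 KiB L2, and a 16 MiB shared L3. These are my assumptions from the published i7-10700 specifications, not values taken from the post, and the function names are hypothetical.

```python
LINE_BYTES = 64  # assumed cache line size

# Assumed capacities in KiB (published i7-10700 specs, not from the post).
CACHE_KIB = {"L1D": 32, "L2": 256, "L3": 16 * 1024}


def smallest_fitting_level(size_kib: int) -> str:
    """Return the smallest cache level whose capacity covers size_kib."""
    for level, capacity in CACHE_KIB.items():  # dict keeps insertion order
        if size_kib <= capacity:
            return level
    return "DRAM"


def lines_in_region(size_kib: int) -> int:
    """Number of cache lines covered by a region of size_kib KiB."""
    return size_kib * 1024 // LINE_BYTES


if __name__ == "__main__":
    for size in (16, 128, 4096, 65536):
        print(size, smallest_fitting_level(size), lines_in_region(size))
```

This only gives an upper bound on where the data can live. The hit counts actually measured also depend on the access pattern, prefetching, and other data sharing the cache.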
I have also re-run the code you provided in the article on the i5-8265U, with the prefetchers both enabled and disabled. Enabling the prefetchers does not seem to cause any performance loss; instead, performance drops when they are disabled. It seems that Intel has made some improvements here.
I tried to reproduce the results in this article.