Replies: 6 comments 2 replies
-
To get this to build in Yosys, I had to mock fake_pll_clk, whereas Chipyard has this clock coming out of a PLL. I don't know what the relationships between these clocks are. This is the first-cut .sdc file that I used: https://github.com/The-OpenROAD-Project/megaboom/blob/main/constraints-chiptop.sdc
-
I've reached out to Chipyard to get some help with the .sdc file for the clocks: https://groups.google.com/g/chipyard/c/PzVcnnqRwf8/m/t1pIDPnuAwAJ
-
Failed in global routing after ca. 7000 seconds.
I suppose I could try to flatten the design for the two macros with contention, but if macro placement can't place these two bigger macros, what are the chances it can place many smaller macros? (There are SRAMs inside the branch predictor and the data cache, and those have all the congestion on top.)
BranchPredictor intentionally has all its pins on the left side. This breaks the rotational and mirroring symmetry so that mpl2 can try different orientations and find one that is better.
This is odd... Why is the macro rotated R180? That puts the pins on the right-hand side. Looking at the floorplan, I would have thought that having the pins on the left side was better, given the placement.
-
You can zoom in with the mouse wheel in the CTS view. If you do so on the odd branch on the right, you can select the end points and see what they are. That would be helpful. The setup paths with tons of hold buffers are quite odd; I suspect an SDC issue. You might lower the placement density slightly to see if that helps with congestion. A test case for the flipping is needed before much can be said about it. Perhaps the connections go to pins on the right?
-
So, I think I understand more or less what is going on. The Verilog is not the way I want it yet. It contains a fake PLL, controlled via the TileLink interface at the top level, that multiplies the top-level clock up to the core frequency. This PLL is set up outside the bounds of a single core.
-
Looks much more reasonable now. There are H-clock trees after CTS.
make cts_issue; CTS takes ca. 25000 seconds on my machine: https://drive.google.com/file/d/13eenpP2JgzXJD0uP3KIpX4QLArBVjzT2/view?usp=sharing
This is an apples-and-oranges comparison, but ... the minimum clock period is 5000 ps, whereas I've seen claims on the order of 1000 ps with commercial tools for the most similar design to MegaBoom that I know of: https://carrv.github.io/2020/papers/CARRV2020_paper_15_Zhao.pdf
Yellow are hold cells, very reasonable... The clock tree looks odd. Those leaf nodes on the right side are flip-flops. It would be nice if there were some clear visual indication of macros vs. flip-flop endpoints in the CTS view.
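To put the 5000 ps vs. 1000 ps comparison in frequency terms, here is a minimal Python sketch of the arithmetic. The function and variable names are made up for illustration; only the two period values come from this thread, and the 1000 ps figure is an order-of-magnitude claim, not a measured result.

```python
def period_ps_to_mhz(period_ps: float) -> float:
    """Convert a clock period in picoseconds to a frequency in MHz."""
    if period_ps <= 0:
        raise ValueError("clock period must be positive")
    # f = 1 / T; with T in ps, f = 1e12 / T Hz = 1e6 / T MHz
    return 1e6 / period_ps


# Values quoted in this thread (hypothetical variable names).
openroad_min_period_ps = 5000.0  # minimum clock period reported after CTS
commercial_period_ps = 1000.0    # rough order of magnitude claimed for commercial tools

openroad_mhz = period_ps_to_mhz(openroad_min_period_ps)
commercial_mhz = period_ps_to_mhz(commercial_period_ps)
ratio = openroad_min_period_ps / commercial_period_ps

print(f"{openroad_mhz:.0f} MHz vs. {commercial_mhz:.0f} MHz ({ratio:.0f}x gap)")
```

So the gap is roughly 200 MHz against roughly 1 GHz, with the caveat already mentioned that the two numbers come from different designs and flows.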