Can the compiler merge distinct tail call sites in template-generated functions?


I am writing my first interpreter and am interested in using tail calls to improve branch prediction.

Suppose I have something like this:

template <void (*Handler)(Cpu&)>  // function pointer, so Handler(cpu) is callable
void wrapper(Cpu& cpu) {
    Handler(cpu);
    cpu.post_instruction();
    [[clang::musttail]] return dispatch_table[cpu.get_next_instr_token()](cpu);
}

Can the compiler fold or merge the tail-call sites of the different template instantiations? If so, the whole point of having distinct branch sites that the branch predictor can learn from is defeated.
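To make the question concrete, here is a minimal self-contained sketch of the setup that can be compiled and inspected. The `Cpu` fields, the handlers `op_inc`, `op_double`, and `op_halt`, the `MUSTTAIL` macro, and the `run` helper are all made up for illustration and are not from my real interpreter. The macro falls back to nothing on compilers without `[[clang::musttail]]`, in which case the tail call is no longer guaranteed.

```cpp
#include <cstddef>
#include <cstdint>

// Use clang's musttail where available; otherwise expand to nothing
// (the call is then an ordinary call that the optimizer may or may not
// turn into a jump).
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

// Hypothetical CPU state: a bytecode stream, a program counter, an
// accumulator, and a count of retired instructions.
struct Cpu {
    const std::uint8_t* code;
    std::size_t pc = 0;
    long acc = 0;
    long retired = 0;

    std::uint8_t get_next_instr_token() { return code[pc++]; }
    void post_instruction() { ++retired; }
};

using Fn = void (*)(Cpu&);
extern const Fn dispatch_table[];

// One instantiation per handler, so each one carries its own
// indirect tail call at the end.
template <void (*Handler)(Cpu&)>
void wrapper(Cpu& cpu) {
    Handler(cpu);
    cpu.post_instruction();
    MUSTTAIL return dispatch_table[cpu.get_next_instr_token()](cpu);
}

void op_inc(Cpu& cpu) { cpu.acc += 1; }
void op_double(Cpu& cpu) { cpu.acc *= 2; }
void op_halt(Cpu&) {}  // does not dispatch further, so the chain ends here

// Token 0 = inc, 1 = double, 2 = halt.
const Fn dispatch_table[] = {wrapper<op_inc>, wrapper<op_double>, op_halt};

// Runs a program that must end with the halt token and returns the final state.
Cpu run(const std::uint8_t* code) {
    Cpu cpu{code};
    dispatch_table[cpu.get_next_instr_token()](cpu);
    return cpu;
}
```

Compiling this with something like `clang++ -O2 -S` and checking whether `wrapper<op_inc>` and `wrapper<op_double>` each end in their own indirect jump, or instead branch to a shared one, would show what happens for this particular case. It would not tell me whether that is guaranteed in general, which is what I am asking.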
