An example is pr58726.c in ctorture. The .mpl generated by mplfe has these questionable cvt's:
cvt i32 i16 (dread i32 %p),
dassign %levVar_1 0 (cvt i32 i16 (dread i32 %p))
return (cvt i16 i32 (dread i32 %levVar_1))
dassign %d_19_3 0 (cvt u16 i32 (dread i32 %levVar_9))
callassigned &foo (cvt i16 u16 (dread u32 %d_19_3)) { dassign %retVar_11 0 }
dassign $c 0 (cvt i32 i16 (dread i32 %retVar_11))
cvt i32 i16 (cvt i16 i32 (constval i32 0xdc36)))) {
Because the smallest register size is 32 bits, cvt's whose result type or operand type is narrower than 32 bits are not well-defined, and their semantics are unclear. It is preferable, and much more semantically explicit, to use OP_zext and OP_sext instructions instead.
In Maple IR, regardless of the opcode, the result PrimType should never be less than 32 bits, because the result must be stored in a register and the smallest register is 32 bits.
ok, I'll fix this issue
dassign %d_19_3 0 (zext i32 16 (dread i32 %levVar_9))
callassigned &foo (sext u32 16 (dread u32 %d_19_3)) { dassign %retVar_11 0 }
Some fixes have been merged.
Do the zext/sext's primtype and opnd0's primtype stay the same? In the example above, zext is used with i32 and sext with u32.
"stays the same" does not refer to the operand's primtype. sext/zext does not care the actual signedness of the operand. For sext, the result must be either i32 or i64. For zext, the result must be either u32 or u64.
"zext i32 16" should be "zext u32 16"
"sext u32 16" should be "sext i32 16"
For sext and zext, the signedness of their operand does not affect the semantics of the sext/zext operations.
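As a rough illustration of the semantics described above, here is a minimal C sketch of how "zext u32 N" and "sext i32 N" could be modeled on a 32-bit register value. The function names `model_zext_u32` and `model_sext_i32` are hypothetical and are not part of Maple IR or any Maple tool. The sketch assumes the operand is treated purely as a bit pattern, which is why its declared signedness does not matter.

```c
#include <stdint.h>

/* Hypothetical model of "zext u32 N (x)": keep the low N bits of x and
   clear all higher bits. Assumes 1 <= nbits <= 32. */
uint32_t model_zext_u32(uint32_t x, unsigned nbits) {
    if (nbits >= 32) {
        return x;
    }
    return x & ((UINT32_C(1) << nbits) - 1u);
}

/* Hypothetical model of "sext i32 N (x)": keep the low N bits of x and
   copy bit N-1 into all higher bits. Assumes 1 <= nbits <= 31, so every
   intermediate value fits in int32_t. */
int32_t model_sext_i32(uint32_t x, unsigned nbits) {
    uint32_t low = model_zext_u32(x, nbits);
    uint32_t sign = UINT32_C(1) << (nbits - 1);
    /* Flip the sign bit, then subtract its weight to replicate it upward. */
    return (int32_t)(low ^ sign) - (int32_t)sign;
}
```

With the constant 0xdc36 from the first example, the zero-extended 16-bit value is 0xdc36 (56374) and the sign-extended value is -9162. Passing the same low 16 bits from a register that holds a negative i32 gives the same two results, because only the low 16 bits are read.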
Here is an example from S_regmatch in 600.perlbench's regexec.c (regexec.c:5321):
scan = ST.me + ((ST.jump && ST.jump[ST.nextword])
? ST.jump[ST.nextword]
: NEXT_OFF(ST.me));
preprocessed source
scan = st->u.trie.me + ((st->u.trie.jump && st->u.trie.jump[st->u.trie.nextword])
? st->u.trie.jump[st->u.trie.nextword]
: ((st->u.trie.me)->next_off));
after mplfe, only partially shown
dassign %levVar_22924 0 (cvt i32 u16 (iread u32 <* u16> 0 (add ptr (
iread ptr <* <$regmatch_state>> 40 (dread ptr %st_4826_5),
mul ptr (
cvt ptr i32 (iread u32 <* <$regmatch_state>> 44 (dread ptr %st_4826_5)),
constval ptr 2)))))
The value in %levVar_22924 is converted from u16 to i32 and later used in address computation:
dassign %scan_4828_5 0 (add ptr (
iread ptr <* <$regmatch_state>> 41 (dread ptr %st_4826_5),
cvt ptr i32 (mul i32 (dread i32 %levVar_22924, constval i32 4))))
"cvt i32 u16 (iread u32 <* u16> 0 (add" expression cannot be changed to "cvt i32 u32 (iread u32 <* u16> 0 (add", can't use sext/zext either. Because some cases in HW fail, clang2mpl also fails to run these cases. I think this just shows the conversion from u16 to i32, which is not fundamentally different from clang2mpl's "cvt i32 u32" because the actual variable is the type of u16.
For example, SUP01220-3-2-1-2-a-t228 passes at O0 and fails at O2; clang2mpl behaves the same way.
Again, please avoid generating "cvt i32 i16", "cvt i32 u16", "cvt u32 i16", "cvt u32 u16" because nobody knows the exact semantics of these instructions.
If the change to use sext/zext causes some tests to fail, we need to fix the other bugs that show up.
By the way, "cvt i32 u32" or "cvt u32 i32" are no-op's, and if they are generated, the optimizer will delete them. Same for "cvt i64 u64" or "cvt u64 i64".
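To illustrate why it matters which extension a narrow conversion such as "cvt i32 u16" is taken to mean, here is a small C sketch modeled loosely on the regexec.c case, where a 16-bit value is widened and then multiplied by 4 for address computation. The function names are hypothetical, and the sketch only contrasts the two possible readings; it does not claim which one any Maple tool actually implements.

```c
#include <stdint.h>

/* Byte offset if the 16-bit value is zero-extended before the multiply,
   as an explicit zext of 16 bits would do. */
int32_t offset_if_zero_extended(uint16_t raw) {
    int32_t idx = (int32_t)raw;      /* 0..65535, always representable */
    return idx * 4;
}

/* Byte offset if the same 16 bits are sign-extended instead,
   as an explicit sext of 16 bits would do. */
int32_t offset_if_sign_extended(uint16_t raw) {
    int32_t idx = (int32_t)raw;
    if (idx >= 0x8000) {
        idx -= 0x10000;              /* replicate bit 15 into the upper bits */
    }
    return idx * 4;
}
```

For values below 0x8000 the two readings agree. For 0x8000 the zero-extended offset is 131072 and the sign-extended offset is -131072, so an ambiguous cvt can silently change the computed address.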
In perlbench's regexec.c around line 6590:
@shortCircuit_label_29950 if (ne u1 u32 (dread u32 %shortCircuit, constval u1 0)) {
dassign %levVar_25371 0 (cvt i32 u1 (cvt u1 i32 (constval i32 1)))
}
else {
dassign %levVar_25371 0 (cvt i32 u1 (cvt u1 i32 (constval i32 0)))
}
dassign %sw_33_14 0 (cvt u1 i32 (dread i32 %levVar_25371))
The u1 primtype should not be used for C code, because C does not have boolean type. All the "cvt u1 i32" and "cvt i32 u1" can be omitted.
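As a small supporting sketch, the C code below checks that the result of a short-circuit expression such as `a && b` already has type int with value 0 or 1, so no narrowing to a 1-bit type is needed to represent it. The function names are hypothetical, and `_Generic` requires a C11 compiler.

```c
#include <stdint.h>

/* The && operator yields an int that is exactly 0 or 1. */
int short_circuit_value(int a, int b) {
    return a && b;
}

/* Returns 1 if the static type of a logical expression is int (C11 _Generic). */
int logical_result_is_int(void) {
    return _Generic((1 && 0), int: 1, default: 0);
}
```

Under this reading, storing the result directly in an i32 temporary such as %levVar_25371 loses nothing.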
Except for u1, all fixes have been merged in HW and will be synchronized to the community later today. As for u1: initially the value was int, not u1. It was changed to u1 for me (the middle end) to improve performance.
@Leo Young i32 is changed to U1. The performance is improved. Let's talk to fred.
The Maple code I showed above is from mplfe (before running mplme). Those cvt's that have u1 are from mplfe.
I did check that I'm using the latest mplfe.
Sorry, there was a misunderstanding. mplfe does generate u1 (changed from i32), in cooperation with me (the middle end), to improve performance. In addition, I will change mplfe to generate i32 to check the performance difference.
When u1 is optimized, the temporary variable levVar is also optimized, and the patch will be merged as soon as possible.
In addition, cvt has been replaced with sext/zext. However, unlike in clang2mpl, sext/zext is not omitted for conversions from small types to large types, because BE cannot handle this implicit conversion and many cases, such as bior, fail to run.