Ark Compiler / OpenArkCompiler / Details

mplfe should avoid using OP_cvt for integer types less than 32 bits

Completed
Member
Created 2021-09-22 23:06

An example is pr58726.c in ctorture. The .mpl generated by mplfe has these questionable cvt's:

    cvt i32 i16 (dread i32 %p),
    dassign %levVar_1 0 (cvt i32 i16 (dread i32 %p))
  return (cvt i16 i32 (dread i32 %levVar_1))
  dassign %d_19_3 0 (cvt u16 i32 (dread i32 %levVar_9))
  callassigned &foo (cvt i16 u16 (dread u32 %d_19_3)) { dassign %retVar_11 0 }
  dassign $c 0 (cvt i32 i16 (dread i32 %retVar_11))
    cvt i32 i16 (cvt i16 i32 (constval i32 0xdc36)))) {

Because the smallest register size is 32 bits, cvt's whose result type or operand type is less than 32 bits are not well-defined, and their semantics are not clear. It is preferred, and would be much more semantically explicit, to use OP_zext and OP_sext instructions instead.

In Maple IR, regardless of the opcode, the result PrimType should never be less than 32 bits, because the result must be stored in a register and the smallest register is 32 bits.
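To illustrate why a sub-32-bit cvt is ambiguous, here is a minimal Python sketch that models a 32-bit register whose upper 16 bits are not a clean extension of the lower 16, and applies three plausible readings of `cvt i32 i16` to it. The register value and the three readings are assumptions made for illustration only; they are not taken from the Maple IR specification.

```python
MASK32 = 0xFFFFFFFF

def as_is(reg):
    """Reading A: treat the cvt as a no-op and trust the register contents."""
    return reg & MASK32

def sign_extend_16(reg):
    """Reading B: keep the low 16 bits and replicate bit 15 into the upper half."""
    low = reg & 0xFFFF
    return (low | 0xFFFF0000) if low & 0x8000 else low

def zero_extend_16(reg):
    """Reading C: keep the low 16 bits and clear the upper half."""
    return reg & 0xFFFF

# Hypothetical register value: the low half is 0xdc36 (the constant from the
# example above), the upper half is leftover data.
reg = 0x1234DC36
readings = {
    "as-is": as_is(reg),
    "sext 16": sign_extend_16(reg),
    "zext 16": zero_extend_16(reg),
}
for name, value in readings.items():
    print(f"{name:8s} -> 0x{value:08x}")
```

The three readings produce three different 32-bit values, which is the kind of ambiguity that an explicit sext or zext removes.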

Comments (18)

fredchow created the issue

ok, I'll fix this issue

dassign %d_19_3 0 (cvt u16 i32 (dread i32 %levVar_9))
callassigned &foo (cvt i16 u16 (dread u32 %d_19_3)) { dassign %retVar_11 0 }

dassign %d_19_3 0 (zext i32 16 (dread i32 %levVar_9))
callassigned &foo (sext u32 16 (dread u32 %d_19_3)) { dassign %retVar_11 0 }

Some fixes have been merged.

Do the zext/sext's primtype and opnd0 stay the same, as in the example above (zext with i32 and sext with u32)?

"stays the same" does not refer to the operand's primtype. sext/zext do not care about the actual signedness of the operand. For sext, the result must be either i32 or i64. For zext, the result must be either u32 or u64.

"zext i32 16" should be "zext u32 16"

"sext u32 16" should be "sext i32 16"

For sext and zext, the signedness of their operand does not affect the semantics of the sext/zext operations.
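As a rough illustration of this point, the Python sketch below models sext and zext as pure bit operations on the low N bits of the operand. The helper names `sext` and `zext` and their signatures are hypothetical and only mirror the opcode names; this is not code from the Maple toolchain. The same 32-bit pattern is fed in once as a signed value and once as an unsigned value.

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of `value`; the result is a signed integer."""
    low = value & ((1 << bits) - 1)
    if low & (1 << (bits - 1)):
        low -= 1 << bits
    return low

def zext(value, bits):
    """Zero-extend the low `bits` bits of `value`; the result is an unsigned integer."""
    return value & ((1 << bits) - 1)

# One 32-bit pattern (0xffffdc36), viewed as i32 and as u32.
signed_view = -9162
unsigned_view = 0xFFFFDC36

print(sext(signed_view, 16), sext(unsigned_view, 16))
print(hex(zext(signed_view, 16)), hex(zext(unsigned_view, 16)))
```

Both views give the same answer for each operation, because only the low 16 bits of the operand take part in the computation.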

ok, i'll correct it

Here is an example from S_regmatch in 600.perlbench's regexec.c (regexec.c:5321):

            scan = ST.me + ((ST.jump && ST.jump[ST.nextword])
                            ? ST.jump[ST.nextword]
                            : NEXT_OFF(ST.me));

Preprocessed source:

     scan = st->u.trie.me + ((st->u.trie.jump && st->u.trie.jump[st->u.trie.nextword])
       ? st->u.trie.jump[st->u.trie.nextword]
       : ((st->u.trie.me)->next_off));

After mplfe (only partially shown):

      dassign %levVar_22924 0 (cvt i32 u16 (iread u32 <* u16> 0 (add ptr (
          iread ptr <* <$regmatch_state>> 40 (dread ptr %st_4826_5),
          mul ptr (
            cvt ptr i32 (iread u32 <* <$regmatch_state>> 44 (dread ptr %st_4826_5)),
            constval ptr 2)))))

The value in %levVar_22924 is converted from u16 to i32 and later used in address computation.

    dassign %scan_4828_5 0 (add ptr (
        iread ptr <* <$regmatch_state>> 41 (dread ptr %st_4826_5),
        cvt ptr i32 (mul i32 (dread i32 %levVar_22924, constval i32 4))))

The expression "cvt i32 u16 (iread u32 <* u16> 0 (add" cannot be changed to "cvt i32 u32 (iread u32 <* u16> 0 (add", and sext/zext cannot be used either, because some test cases in HW then fail; clang2mpl also fails to run these cases. I think this cvt just expresses the conversion from u16 to i32, which is not fundamentally different from clang2mpl's "cvt i32 u32", because the actual variable has type u16.

jinzhu (member), in reply to jinzhu (member):

For example, SUP01220-3-2-1-2-a-t228 passes at O0 but fails at O2; clang2mpl behaves the same way.

Again, please avoid generating "cvt i32 i16", "cvt i32 u16", "cvt u32 i16", "cvt u32 u16" because nobody knows the exact semantics of these instructions.

If the change to use sext/zext causes some tests to fail, we need to fix the other bugs that show up.

By the way, "cvt i32 u32" or "cvt u32 i32" are no-op's, and if they are generated, the optimizer will delete them. Same for "cvt i64 u64" or "cvt u64 i64".
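The no-op claim can be checked with a small Python sketch that models a 32-bit register as four little-endian bytes via the standard `struct` module. The byte-level model and the helper names are assumptions for illustration; the optimizer itself is not shown here.

```python
import struct

def bits_i32(value):
    """Raw bytes of a signed 32-bit value."""
    return struct.pack("<i", value)

def bits_u32(value):
    """Raw bytes of an unsigned 32-bit value."""
    return struct.pack("<I", value)

def cvt_u32_from_i32(value):
    """Reinterpret a signed 32-bit value as unsigned, modelling `cvt u32 i32`."""
    return struct.unpack("<I", struct.pack("<i", value))[0]

signed_value = -9162
unsigned_value = cvt_u32_from_i32(signed_value)

print(hex(unsigned_value))
# The register contents are identical before and after the conversion.
print(bits_i32(signed_value) == bits_u32(unsigned_value))
```

Only the interpretation of the bits changes, not the bits themselves, which is why such a cvt can be deleted without affecting the generated code.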

OK, I'll fix the other bugs first to do the above.

In perlbench's regexec.c around line 6590:

@shortCircuit_label_29950     if (ne u1 u32 (dread u32 %shortCircuit, constval u1 0)) {
      dassign %levVar_25371 0 (cvt i32 u1 (cvt u1 i32 (constval i32 1)))
    }
    else {
      dassign %levVar_25371 0 (cvt i32 u1 (cvt u1 i32 (constval i32 0)))
    }
    dassign %sw_33_14 0 (cvt u1 i32 (dread i32 %levVar_25371))

The u1 primtype should not be used for C code, because C does not have a boolean type. All the "cvt u1 i32" and "cvt i32 u1" can be omitted.

Except for u1, all fixes have been merged in HW and will be synchronized to the community later today. As for u1: initially the value is int, not u1. The ME changes it to u1 to improve performance.

@Leo Young i32 is changed to u1, and the performance is improved. Let's talk to Fred.

The Maple code I showed above is from mplfe (before running mplme). Those cvt's that have u1 are from mplfe.

I did check that I'm using the latest mplfe.

Sorry, there was a misunderstanding. mplfe does generate u1 (changed from i32), in cooperation with the ME, to improve performance. In addition, I will change mplfe to generate i32 to check the performance difference.

When u1 is optimized, the temporary variable levVar is also optimized, and the patch will be merged as soon as possible.

jinzhu (member), in reply to jinzhu (member):

All fixes have been merged in HW and will be synchronized to the community later today. @fredchow

In addition, cvt is now replaced with sext/zext. However, unlike clang2mpl, mplfe does not omit the sext/zext when converting from small types to large types, because the BE cannot handle this implicit conversion; many cases, such as bior, would fail to run.

fredchow changed the issue status from To Do to Completed

https://gitee.com/openarkcompiler/OpenArkCompiler.git