Message ID | 1513001933-17348-1-git-send-email-james.greenhalgh@arm.com |
---|---|
State | New |
Headers | show |
Series | [combine] Don't create vector mode ZERO_EXTEND from subregs | expand |
On 12/11/2017 07:18 AM, James Greenhalgh wrote: > > Hi, > > In simplify_set we try transforming the paradoxical subreg expression: > > (set FOO (subreg:M (mem:N BAR) 0)) > > in to: > > (set FOO (zero_extend:M (mem:N BAR))) > > However, this code does not consider the case where M is a vector > mode, allowing it to construct (for example): > > (zero_extend:V4SI (mem:SI)) > > This would clearly have the wrong semantics, but fortunately we fail long > before then in expand_compound_operation. As we really don't want a vector > zero_extend of a scalar value. > > We need to explicitly reject vector modes from this transformation. > > This fixes a failure I'm seeing on a branch in which I'm trying to > tackle some performance regressions, so I have no live testcase for > this, but it is wrong by observation. > > Tested on aarch64-none-elf and bootstrapped on aarch64-none-linux-gnu with > no issues. > > OK? > > Thanks, > James > > --- > 2017-12-11 James Greenhalgh <james.greenhalgh@arm.com> > > * combine.c (simplify_set): Do not transform subregs to zero_extends > if the destination mode is a vector mode. > OK. Ideally you'd have a test for the testsuite as well, but I won't stress without it :-) jeff
I was happily going to close PR 55549, but it looks like we are trying the same transformation in a second place in that file :-( On Mon, 11 Dec 2017, James Greenhalgh wrote: > Hi, > > In simplify_set we try transforming the paradoxical subreg expression: > > (set FOO (subreg:M (mem:N BAR) 0)) > > in to: > > (set FOO (zero_extend:M (mem:N BAR))) > > However, this code does not consider the case where M is a vector > mode, allowing it to construct (for example): > > (zero_extend:V4SI (mem:SI)) > > This would clearly have the wrong semantics, but fortunately we fail long > before then in expand_compound_operation. As we really don't want a vector > zero_extend of a scalar value. > > We need to explicitly reject vector modes from this transformation. > > This fixes a failure I'm seeing on a branch in which I'm trying to > tackle some performance regressions, so I have no live testcase for > this, but it is wrong by observation. > > Tested on aarch64-none-elf and bootstrapped on aarch64-none-linux-gnu with > no issues. > > OK? > > Thanks, > James > > --- > 2017-12-11 James Greenhalgh <james.greenhalgh@arm.com> > > * combine.c (simplify_set): Do not transform subregs to zero_extends > if the destination mode is a vector mode. > > -- Marc Glisse
Hi! On Mon, Dec 11, 2017 at 02:18:53PM +0000, James Greenhalgh wrote: > > In simplify_set we try transforming the paradoxical subreg expression: > > (set FOO (subreg:M (mem:N BAR) 0)) > > in to: > > (set FOO (zero_extend:M (mem:N BAR))) > > However, this code does not consider the case where M is a vector > mode, allowing it to construct (for example): > > (zero_extend:V4SI (mem:SI)) > > This would clearly have the wrong semantics, but fortunately we fail long > before then in expand_compound_operation. As we really don't want a vector > zero_extend of a scalar value. > > We need to explicitly reject vector modes from this transformation. It does not consider any other modes either. Both modes involved are required to be SCALAR_INT_MODE_P (for zero_extend to be valid at all and to be equivalent to the subreg); could you test for that instead please? Segher
On Sun, Dec 17, 2017 at 03:14:08AM +0000, Segher Boessenkool wrote: > Hi! > > On Mon, Dec 11, 2017 at 02:18:53PM +0000, James Greenhalgh wrote: > > > > In simplify_set we try transforming the paradoxical subreg expression: > > > > (set FOO (subreg:M (mem:N BAR) 0)) > > > > in to: > > > > (set FOO (zero_extend:M (mem:N BAR))) > > > > However, this code does not consider the case where M is a vector > > mode, allowing it to construct (for example): > > > > (zero_extend:V4SI (mem:SI)) > > > > This would clearly have the wrong semantics, but fortunately we fail long > > before then in expand_compound_operation. As we really don't want a vector > > zero_extend of a scalar value. > > > > We need to explicitly reject vector modes from this transformation. > > It does not consider any other modes either. Both modes involved are > required to be SCALAR_INT_MODE_P (for zero_extend to be valid at all and > to be equivalent to the subreg); could you test for that instead please? Makes sense. I've committed this as obvious after an AArch64 bootstrap and test as revision 255945. Thanks, James --- gcc/ 2017-12-21 James Greenhalgh <james.greenhalgh@arm.com> * combine.c (simplify_set): Do not transform subregs to zero_extends if the destination is not a scalar int mode. diff --git a/gcc/combine.c b/gcc/combine.c index 9f19ee4..91bed06 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -6967,12 +6967,13 @@ simplify_set (rtx x) /* If we have (set FOO (subreg:M (mem:N BAR) 0)) with M wider than N, this would require a paradoxical subreg. Replace the subreg with a zero_extend to avoid the reload that would otherwise be required. - Don't do this for vector modes, as the transformation is incorrect. */ + Don't do this unless we have a scalar integer mode, otherwise the + transformation is incorrect. */ enum rtx_code extend_op; if (paradoxical_subreg_p (src) && MEM_P (SUBREG_REG (src)) - && !VECTOR_MODE_P (GET_MODE (src)) + && SCALAR_INT_MODE_P (GET_MODE (src)) && (extend_op = load_extend_op (GET_MODE (SUBREG_REG (src)))) != UNKNOWN) { SUBST (SET_SRC (x),
diff --git a/gcc/combine.c b/gcc/combine.c index 786a840..562eae6 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -6962,11 +6962,13 @@ simplify_set (rtx x) /* If we have (set FOO (subreg:M (mem:N BAR) 0)) with M wider than N, this would require a paradoxical subreg. Replace the subreg with a - zero_extend to avoid the reload that would otherwise be required. */ + zero_extend to avoid the reload that would otherwise be required. + Don't do this for vector modes, as the transformation is incorrect. */ enum rtx_code extend_op; if (paradoxical_subreg_p (src) && MEM_P (SUBREG_REG (src)) + && !VECTOR_MODE_P (GET_MODE (src)) && (extend_op = load_extend_op (GET_MODE (SUBREG_REG (src)))) != UNKNOWN) { SUBST (SET_SRC (x),