Message ID: 552EED9F.3060901@linaro.org
State: New
> On Apr 18, 2015, at 8:21 PM, Richard Earnshaw <Richard.Earnshaw@foss.arm.com> wrote:
>
> On 18/04/15 16:13, Jakub Jelinek wrote:
>> On Sat, Apr 18, 2015 at 03:07:16PM +0100, Richard Earnshaw wrote:
>>> You need to ensure that your scratch register cannot overlap op1, since
>>> the scratch is written before op1 is read.
>>
>> -   (clobber (match_scratch:QI 3 "=X,w,X"))]
>> +   (clobber (match_scratch:QI 3 "=X,&w,X"))]
>>
>> incremental diff should ensure that, right?
>>
>> 	Jakub
>
> Sorry, where in the patch is that hunk?

Jakub's suggestion is an incremental patch on top of Kugan's.

> I see just:
>
> +   (clobber (match_scratch:QI 3 "=X,w,X"))]
>
> And why would early clobbering the scratch be notably better than the
> original?

It will still be better.  With this patch we want to allow RA freedom to optimally handle both of the following cases:

1. operand[1] dies after the instruction.  In this case we want operand[0] and operand[1] to be assigned to the same reg, and operand[3] to be assigned to a different register to provide a temporary.  In this case we don't care whether operand[3] is early-clobber or not.  This case is not optimally handled with the current insn patterns.

2. operand[1] lives on after the instruction.  In this case we want operand[0] and operand[3] to be assigned to the same reg, and not clobber operand[1].  By marking operand[3] early-clobber we ensure that operand[1] is in a different register from what operand[0] and operand[3] were assigned to.  This case should be handled equally well before and after the patch.

My understanding is that Kugan's patch with Jakub's fix on top satisfies both of these cases.

--
Maxim Kuvyrkov
www.linaro.org
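For readers less familiar with constraint syntax, the two cases above can be read directly off the clobber line from Jakub's incremental fix. The fragment below is illustrative only (the constraint strings come from this thread; the rest of the pattern is elided), annotated with how each alternative behaves:

```
;; Illustrative fragment, not the full aarch64 pattern.
;;
;;   (clobber (match_scratch:QI 3 "=X,&w,X"))
;;                                 |  | |
;;   alternative 1 (immediate) ----+  | +---- alternative 3 (integer regs):
;;     scratch unused ("X")          |          scratch unused ("X")
;;   alternative 2 (SISD reg shift)--+
;;     scratch in a FP/SIMD reg, early-clobber ("&w"):
;;     written before operand 1 is read, so the RA must not assign
;;     operand 3 and operand 1 to the same register (case 2 above),
;;     while operand 0 and operand 3 may still share a register.
(clobber (match_scratch:QI 3 "=X,&w,X"))
```

Without the "&", the register allocator would be free to put the scratch in the same register as operand 1, which is exactly the overlap Richard flagged.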
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 534a862..72a9f05 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3277,6 +3277,14 @@
           DONE;
         }
     }
+
+    if (<CODE> == LSHIFTRT)
+      {
+        emit_insn (gen_aarch64_lshr_sisd_or_int_<mode>3 (operands[0],
+                                                         operands[1],
+                                                         operands[2]));
+        DONE;
+      }
   }
 )
 
@@ -3361,11 +3369,13 @@
 )
 
 ;; Logical right shift using SISD or Integer instruction
-(define_insn "*aarch64_lshr_sisd_or_int_<mode>3"
-  [(set (match_operand:GPI 0 "register_operand" "=w,&w,r")
+(define_insn "aarch64_lshr_sisd_or_int_<mode>3"
+  [(set (match_operand:GPI 0 "register_operand" "=w,w,r")
         (lshiftrt:GPI
           (match_operand:GPI 1 "register_operand" "w,w,r")
-          (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>" "Us<cmode>,w,rUs<cmode>")))]
+          (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>"
+                               "Us<cmode>,w,rUs<cmode>")))
+   (clobber (match_scratch:QI 3 "=X,w,X"))]
   ""
   "@
    ushr\t%<rtn>0<vas>, %<rtn>1<vas>, %2
@@ -3379,30 +3389,28 @@
   [(set (match_operand:DI 0 "aarch64_simd_register")
         (lshiftrt:DI
           (match_operand:DI 1 "aarch64_simd_register")
-          (match_operand:QI 2 "aarch64_simd_register")))]
+          (match_operand:QI 2 "aarch64_simd_register")))
+   (clobber (match_scratch:QI 3))]
   "TARGET_SIMD && reload_completed"
   [(set (match_dup 3)
         (unspec:QI [(match_dup 2)] UNSPEC_SISD_NEG))
    (set (match_dup 0)
         (unspec:DI [(match_dup 1) (match_dup 3)] UNSPEC_SISD_USHL))]
-  {
-    operands[3] = gen_lowpart (QImode, operands[0]);
-  }
+  ""
 )
 
 (define_split
   [(set (match_operand:SI 0 "aarch64_simd_register")
         (lshiftrt:SI
           (match_operand:SI 1 "aarch64_simd_register")
-          (match_operand:QI 2 "aarch64_simd_register")))]
+          (match_operand:QI 2 "aarch64_simd_register")))
+   (clobber (match_scratch:QI 3))]
   "TARGET_SIMD && reload_completed"
   [(set (match_dup 3)
         (unspec:QI [(match_dup 2)] UNSPEC_SISD_NEG))
    (set (match_dup 0)
         (unspec:SI [(match_dup 1) (match_dup 3)] UNSPEC_USHL_2S))]
-  {
-    operands[3] = gen_lowpart (QImode, operands[0]);
-  }
+  ""
 )
 
 ;; Arithmetic right shift using SISD or Integer instruction