[arm] Fix PR target/50305 (arm_legitimize_reload_address problem)

Hello,

the problem in PR 50305 turned out to be caused by the ARM back-end
LEGITIMIZE_RELOAD_ADDRESS implementation.

One of the arguments to the inline asm ("+Qo" (perf_event_id)) has
the form
   (mem/c/i:DI (plus:SI (reg/f:SI 152)
                        (const_int 1200 [0x4b0])) [5 perf_event_id+0 S8 A64])
before reload, where reg 152 holds the section anchor:
(insn 23 21 29 3 (set (reg/f:SI 152)
        (symbol_ref:SI ("*.LANCHOR0") [flags 0x182])) pr50305.c:36 176 {*arm_movsi_insn}
     (expr_list:REG_EQUAL (symbol_ref:SI ("*.LANCHOR0") [flags 0x182])
        (nil)))

The displacement is considered out of range for a DImode MEM, and therefore
reload attempts to reload the address.  The ARM LEGITIMIZE_RELOAD_ADDRESS
routine then attempts to optimize this by converting the address to:

    (mem/c/i:DI (plus:SI (plus:SI (reg/f:SI 3 r3 [152])
                                  (const_int 1024 [0x400]))
                          (const_int 176 [0xb0])) [5 perf_event_id+0 S8 A64])

and pushing reloads:

Reload 0: reload_in (SI) = (plus:SI (reg/f:SI 3 r3 [152])
                                                    (const_int 1024 [0x400]))
        CORE_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 2)
        reload_in_reg: (plus:SI (reg/f:SI 3 r3 [152])
                                                    (const_int 1024 [0x400]))
Reload 1: reload_in (SI) = (plus:SI (reg/f:SI 3 r3 [152])
                                                    (const_int 1024 [0x400]))
        CORE_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 5)
        reload_in_reg: (plus:SI (reg/f:SI 3 r3 [152])
                                                    (const_int 1024 [0x400]))
Reload 2: reload_in (SI) = (plus:SI (plus:SI (reg/f:SI 3 r3 [152])
                                                        (const_int 1024 [0x400]))
                                                    (const_int 176 [0xb0]))
        CORE_REGS, RELOAD_FOR_INPUT (opnum = 2), inc by 8
        reload_in_reg: (plus:SI (plus:SI (reg/f:SI 3 r3 [152])
                                                        (const_int 1024 [0x400]))
                                                    (const_int 176 [0xb0]))
Reload 3: reload_in (SI) = (plus:SI (plus:SI (reg/f:SI 3 r3 [152])
                                                        (const_int 1024 [0x400]))
                                                    (const_int 176 [0xb0]))
        CORE_REGS, RELOAD_FOR_INPUT (opnum = 5), inc by 8
        reload_in_reg: (plus:SI (plus:SI (reg/f:SI 3 r3 [152])
                                                        (const_int 1024 [0x400]))
                                                    (const_int 176 [0xb0]))

(Note that the reloads are duplicated because the "+" operand has been
implicitly converted to an input and an output operand.  Reloads 2/3
are there because reload is not sure that the result of LEGITIMIZE_RELOAD_ADDRESS
is offsettable, and therefore reloads the whole thing anyway.)
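
The 1024/176 split in the rewritten address above is what one gets by
keeping the low 8 bits of the displacement in the MEM and moving the rest
into the base register.  The following is only a minimal sketch of that
arithmetic, not the code in arm.c: the helper name, the struct, and the
8-bit width are assumptions chosen to match the values in the dump.

```c
#include <assert.h>

/* Hypothetical helper, not GCC code: split a displacement into a low
   part small enough to stay in the address and a high part that is
   added to the base register by a reload.  The 8-bit width is an
   assumption that matches the 1024/176 split shown in the dump.  */
struct split_offset
{
  long high;  /* part added to the base register */
  long low;   /* part left as the displacement in the MEM */
};

static struct split_offset
split_displacement (long offset)
{
  struct split_offset s;
  long mag = offset < 0 ? -offset : offset;
  long low = mag & 0xff;        /* low 8 bits of the magnitude */

  s.low = offset < 0 ? -low : low;
  s.high = offset - s.low;
  assert (s.high + s.low == offset);
  return s;
}
```

With this sketch, split_displacement (1200) yields high = 1024 and
low = 176, i.e. the two constants that appear in the nested plus.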

Now the problem is that the other arguments of the asm do not all fit into
registers, and therefore we get a second pass through find_reloads.  At this
point, the insn stream has already been modified, so LEGITIMIZE_RELOAD_ADDRESS
this time around sees the RTL it generated itself in the first pass.
However, it is not able to recognize this, and therefore doesn't re-generate
the required reloads.  Instead, generic code attempts to handle the
nested plus, and creates somewhat unfortunate reloads:

Reload 0: reload_in (SI) = (plus:SI (reg/f:SI 3 r3 [152])
                                                    (const_int 176 [0xb0]))
        CORE_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 2)
        reload_in_reg: (plus:SI (reg/f:SI 3 r3 [152])
                                                    (const_int 176 [0xb0]))
Reload 1: reload_in (SI) = (plus:SI (reg/f:SI 3 r3 [152])
                                                    (const_int 176 [0xb0]))
        CORE_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 5)
        reload_in_reg: (plus:SI (reg/f:SI 3 r3 [152])
                                                    (const_int 176 [0xb0]))
Reload 2: reload_out (SI) = (reg:SI 151 [ tmp ])
        GENERAL_REGS, RELOAD_FOR_INSN (opnum = 1)
        reload_out_reg: (reg:SI 151 [ tmp ])
Reload 3: reload_in (SI) = (plus:SI (plus:SI (reg/f:SI 3 r3 [152])
                                                        (const_int 176 [0xb0]))
                                                    (const_int 1024 [0x400]))
        CORE_REGS, RELOAD_FOR_INPUT (opnum = 2), inc by 8
        reload_in_reg: (plus:SI (plus:SI (reg/f:SI 3 r3 [152])
                                                        (const_int 176 [0xb0]))
                                                    (const_int 1024 [0x400]))
Reload 4: reload_in (SI) = (plus:SI (plus:SI (reg/f:SI 3 r3 [152])
                                                        (const_int 176 [0xb0]))
                                                    (const_int 1024 [0x400]))
        CORE_REGS, RELOAD_FOR_INPUT (opnum = 5), inc by 8
        reload_in_reg: (plus:SI (plus:SI (reg/f:SI 3 r3 [152])
                                                        (const_int 176 [0xb0]))
                                                    (const_int 1024 [0x400]))

This can be fixed by having LEGITIMIZE_RELOAD_ADDRESS recognize RTL it has
generated itself in a prior pass (note that several other back-ends already
have a corresponding fix).
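
To illustrate what "recognize RTL it has generated itself" means, here is
a small sketch of the shape test on a toy expression tree.  The toy_rtx
type and the function name are hypothetical stand-ins; the real check in
arm_legitimize_reload_address would presumably use GCC's own RTL
accessors rather than anything shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for RTL expressions; not GCC's rtx.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS };

struct toy_rtx
{
  enum toy_code code;
  long value;                  /* register number or constant value */
  const struct toy_rtx *op0;   /* operands, used only for TOY_PLUS */
  const struct toy_rtx *op1;
};

/* Return nonzero if X has the shape
     (plus (plus (reg) (const_int)) (const_int))
   i.e. the form produced by an earlier pass that split the offset.  */
static int
is_previous_pass_output (const struct toy_rtx *x)
{
  assert (x != NULL);
  if (x->code != TOY_PLUS)
    return 0;
  assert (x->op0 != NULL && x->op1 != NULL);
  if (x->op0->code != TOY_PLUS || x->op1->code != TOY_CONST_INT)
    return 0;
  assert (x->op0->op0 != NULL && x->op0->op1 != NULL);
  return (x->op0->op0->code == TOY_REG
          && x->op0->op1->code == TOY_CONST_INT);
}
```

When the test succeeds, the hook can push the same reload of the inner
(plus (reg) (const_int)) as in the first pass, instead of leaving the
nested plus to generic code.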

However, even then, the test case still fails.  This is because we then
emit reloads that compute (plus:SI (reg 152) (const_int 1024)), but
reg 152 was marked as equivalent to a constant (the LANCHOR address).

In addition, reload needed another register, and decided to spill reg 152
from its preliminary home in reg 3.  Since the register was equivalent to a
constant, it is not spilled on the stack; instead, reload assumes that
all uses will get reloaded to a rematerialization of the equivalent
constant.  This is in fact what reload does for the reloads it generates
itself; but it does not happen if the reload is generated by
LEGITIMIZE_RELOAD_ADDRESS.

In theory, LEGITIMIZE_RELOAD_ADDRESS could attempt to handle such registers
by substituting the equivalent constant and then reloading the result.
However, this might need additional steps (pushing to the constant pool,
reloading the constant pool address, ...) which would lead to significant
duplication of code from core reload.  This doesn't seem worthwhile
at this point ...

Therefore, the patch below fixes this second issue by simply not handling
addresses based on a register equivalent to a constant at all in
LEGITIMIZE_RELOAD_ADDRESS.  In general, common code should do a good
enough job for those anyway ...
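
The decision described above amounts to an early bail-out keyed on the
base register.  The sketch below shows only that control flow; the table
type and function name are hypothetical, and the way the real patch
looks up the equivalence inside reload is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for reload's per-pseudo equivalence data:
   equiv_constant[i] is non-NULL when pseudo I is known to be equal
   to a constant (here just the constant's name).  */
struct toy_equiv_table
{
  const char *const *equiv_constant;
  size_t n_regs;
};

/* Return true if the target hook may split the address based on
   BASE_REGNO, false if it should leave the address to generic code
   because the base register may be rematerialized as a constant.  */
static bool
target_may_split_address (const struct toy_equiv_table *t,
                          size_t base_regno)
{
  assert (t != NULL && base_regno < t->n_regs);
  return t->equiv_constant[base_regno] == NULL;
}
```

For a base register like reg 152, whose entry names the section anchor,
the function returns false and the address is left to common code.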

Tested on arm-linux-gnueabi with no regressions; the patch fixes the testcase.

OK for mainline?

Bye,
Ulrich

ChangeLog:

	gcc/
	PR target/50305
	* config/arm/arm.c (arm_legitimize_reload_address): Recognize
	output of a previous pass through legitimize_reload_address.
	Do not attempt to optimize addresses if the base register is
	equivalent to a constant.

	gcc/testsuite/
	PR target/50305
	* gcc.target/arm/pr50305.c: New test.
