[ARM] Thumb2 replicated constants

This patch is a repost of the one I previously posted here:

   http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html

As requested, I've broken out the other parts of the original patch, and 
those have already been reposted yesterday (and one committed also).

This (final) part is support for using Thumb2's replicated constants and 
addw/subw instructions as part of split constant loads. Previously the 
compiler could use these constants, but only where they would be loaded 
in a single instruction.

This patch must be applied on top of the addw/subw patch I posted yesterday.

The patch also optimizes the use of inverted or negated constants as a 
short-cut to the final value. The previous code did this in some cases, 
but could not be easily adapted to replicated constants.

The previous code also had a bug that prevented optimal use of shifted 
constants in Thumb code by imposing the same restrictions as ARM code. 
This has been fixed.

Example 1: addw as part of a split constant load

a + 0xfffff

    Before:
          movw    r3, #65535       ; 0x0ffff
          movt    r3, 15           ; 0xf0000
          adds    r3, r0, r3
    After:
          add     r0, r0, #1044480 ; 0xff000
          addw    r0, r0, #4095    ; 0x00fff

Example 2: arbitrary shifts bug fix

a - 0xfff1

    Before:
          sub     r0, r0, #65024   ; 0xfe00
          sub     r0, r0, #496     ; 0x01f0
          sub     r0, r0, #1       ; 0x0001
    After:
          sub     r0, r0, #65280   ; 0xff00
          sub     r0, r0, #241     ; 0x00f1

Example 3: 16-bit replicated patterns

a + 0x44004401

    Before:
          movw    r3, #17409          ; 0x00004401
          movt    r3, 17408           ; 0x44000000
          adds    r3, r0, r3
    After:
          add     r0, r0, #1140868096 ; 0x44004400
          adds    r0, r0, #1          ; 0x00000001

Example 4: 32-bit replicated patterns

a & 0xaaaaaa00

    Before:
          mov     r3, #43520           ; 0x0000aa00
          movt    r3, 43690            ; 0xaaaa0000
          and     r3, r0, r3
    After:
          and     r0, r0, #-1431655766 ; 0xaaaaaaaa
          bic     r0, r0, #170         ; 0x000000aa

The constant splitting code was duplicated in two places, and I would 
have needed to modify both quite heavily, so I have taken the 
opportunity to unify the two, and hopefully reduce the future 
maintenance burden.

Let me respond to a point Richard Earnshaw raised following the original 
posting:

 > A final note is that you may have missed some cases.  Now that we have
 > movw,
 > 	reg&  ~(16-bit const)
 > can now be done in at most 2 insns:
 > 	movw t1, #16-bit const
 > 	bic  Rd, reg, t1

Actually, I think we can do better than that for a 16-bit constant.

Given:

    a & ~(0xabcd)

Before my changes, GCC gave:

         bic     r0, r0, #43520
         bic     r0, r0, #460
         bic     r0, r0, #1

and after applying my patch:

         bic     r0, r0, #43776
         bic     r0, r0, #205

Two instructions and no temporary register.

 > On thumb-2 you can also use ORN that way as well.

It turns out that my previous patch was broken for ORN. I traced the 
problem to some confusing code already in arm.c that set can_invert for 
IOR, but then explicitly ignored it later (I had removed the second 
part, but not the first). I posted, and committed a patch to fix this 
yesterday.

In fact ORN is only of limited use for this kind of thing. Like AND, you 
can't use multiple ORNs to build a constant. The compiler already does 
use ORN in some circumstances, and this patch has not changed that.

Is the patch OK?

Andrew

[ARM] Thumb2 replicated constants

Commit Message

Comments

Patch