[rs6000] Add support for signed overflow arithmetic

Message ID 9218493.4xTCRBTuI6@polaris
State New

Commit Message

Eric Botcazou Oct. 21, 2016, 11:03 p.m. UTC
Hi,

this implements support for signed overflow arithmetic on PowerPC.  It's an
implementation for Power ISA v2.0x, i.e. it doesn't take into account the new
OV32 flag introduced in v3.0.  It doesn't implement unsigned overflow
arithmetic because my understanding is that the generic support already
generates optimal code in most cases on PowerPC for unsigned.
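
For context, the operations in question are the ones GCC exposes in C through the `__builtin_*_overflow` built-ins. A minimal sketch of the semantics being implemented follows; the wrapper names `checked_add` and `checked_mul` are hypothetical and only serve the illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical wrappers around the GCC built-ins.  Each returns true when
   the mathematically exact result does not fit in an int; *res receives
   the wrapped (truncated) result in either case.  */
static bool checked_add(int a, int b, int *res)
{
    return __builtin_add_overflow(a, b, res);
}

static bool checked_mul(int a, int b, int *res)
{
    return __builtin_mul_overflow(a, b, res);
}
```

How the compiler lowers these calls on a given target is exactly what the rest of this thread is about.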

It introduces a new MODE_CC mode (CCVmode) which represents the OV flag of the 
XER, and the overflow arithmetic instructions are paired with a mcrxr.  The 
comparisons are written in terms of UNSPECs because I used that for Visium and 
SPARC, but I can rewrite them a la x86/ARM if requested.

There is also a tweak to expand_arith_overflow, because it would otherwise
"promote" signed multiplication to unsigned multiplication in some cases and
this badly pessimizes the code on PowerPC.

Tested on PowerPC/Linux and PowerPC64/Linux, OK for the mainline?


2016-10-21  Eric Botcazou  <ebotcazou@adacore.com>

	* internal-fn.c (expand_arith_overflow): Do not promote a signed
	multiplication done in hardware to an unsigned open-coded one.
	* config/rs6000/rs6000-modes.def (CCV): New.
	* config/rs6000/rs6000-protos.h (rs6000_select_cc_mode): Declare.
	* config/rs6000/rs6000.h (SELECT_CC_MODE): Call it.
	* config/rs6000/rs6000.c (rs6000_debug_reg_global): Handle CCVmode.
	(validate_condition_mode): Likewise.
	(print_operand): Handle %C modifier.
	(rs6000_select_cc_mode): New function.
	(output_cbranch): Handle CCVmode.  Tidy up.
	* config/rs6000/rs6000.md (UNSPEC_{ADD,SUB,NEG,MUL}V): New constants.
	(addv<mode>4): New expander.
	(add<mode>3_overflow): New instruction.
	(add<mode>3_overflow_carry_in): New expander.
	(add<mode>3_overflow_carry_in_internal): New instruction.
	(add<mode>3_overflow_carry_in_0): Likewise.
	(add<mode>3_overflow_carry_in_m1): Likewise.
	(subv<mode>4): New expander.
	(subf<mode>3_overflow): New instruction.
	(subf<mode>3_overflow_carry_in): New expander.
	(sub<mode>3_overflow_carry_in_internal): New instruction.
	(subf<mode>3_overflow_carry_in_0): Likewise.
	(subf<mode>3_overflow_carry_in_m1): Likewise.
	(negv<mode>3): New expander.
	(neg<mode>2_overflow): New instruction.
	(mulv<mode>4): New expander.
	(mulv<mode>3_overflow): New instruction.
testsuite/
	* gcc.target/powerpc/overflow-1.c: New test.
	* gcc.target/powerpc/overflow-2.c: Likewise.
	* gcc.target/powerpc/overflow-3.c: Likewise.
	* gcc.target/powerpc/overflow-4.c: Likewise.

-- 
Eric Botcazou

Comments

Segher Boessenkool Oct. 22, 2016, 10:56 a.m. UTC | #1
Hi Eric,

Thanks for the patch.  Unfortunately there is a big problem with it :-(

On Sat, Oct 22, 2016 at 01:03:33AM +0200, Eric Botcazou wrote:
> this implements support for signed overflow arithmetic on PowerPC.  It's an
> implementation for Power ISA v2.0x, i.e. it doesn't take into account the new
> OV32 flag introduced in v3.0.  It doesn't implement unsigned overflow
> arithmetic because my understanding is that the generic support already
> generates optimal code in most cases on PowerPC for unsigned.
>
> It introduces a new MODE_CC mode (CCVmode) which represents the OV flag of the
> XER, and the overflow arithmetic instructions are paired with a mcrxr.

mcrxr does not exist anymore.  It is not implemented in any IBM non-embedded
CPU since POWER4.  PA6T does not have it either.  In some versions of the
2.0x ISA it does exist, but only in the optional "embedded" category.

Linux emulates mcrxr, but that is very slow.  You could use mfxer, but
that is a slow instruction as well (it is microcoded).

> The
> comparisons are written in terms of UNSPECs because I used that for Visium and
> SPARC, but I can rewrite them a la x86/ARM if requested.

That may work better.  It also may be better if you expose the OV bit
as a separate reg (just like we have CA), instead of putting two machine
insns in each template.

> @@ -21863,6 +21865,14 @@ print_operand (FILE *file, rtx x, int co
>        /* %c is output_addr_const if a CONSTANT_ADDRESS_P, otherwise
>  	 output_operand.  */
>
> +    case 'C':
> +      /* X is a CR register.  Print the index number of the CR.  */
> +      if (GET_CODE (x) != REG || ! CR_REGNO_P (REGNO (x)))
> +	output_operand_lossage ("invalid %%E value");

"%%C value", no space after "!".

> +      else
> +	fputs (reg_names[REGNO (x)], file);
> +      return;

Why is this needed?  Do you have an assembler that wants register names
but not for mcrxr?


Segher

Eric Botcazou Oct. 24, 2016, 9:54 a.m. UTC | #2
> mcrxr does not exist anymore.  It is not implemented in any IBM non-embedded
> CPU since POWER4.  PA6T does not have it either.  In some versions of the
> 2.0x ISA it does exist, but only in the optional "embedded" category.

Yes, it exists in all (public) versions of the unified 2.0x ISA.

> Linux emulates mcrxr, but that is very slow.  You could use mfxer, but
> that is a slow instruction as well (it is microcoded).

So there is no efficient way of accessing the OV flag in recent non-embedded
CPUs (except for the new mcrxrx instruction in the 3.0 version)?

> That may work better.  It also may be better if you expose the OV bit
> as a separate reg (just like we have CA), instead of putting two machine
> insns in each template.

Yes, although CA can be seen as behaving like a register to some extent,
OV far less so IMO.

-- 
Eric Botcazou

Segher Boessenkool Oct. 24, 2016, 10:59 a.m. UTC | #3
On Mon, Oct 24, 2016 at 11:54:27AM +0200, Eric Botcazou wrote:
> > mcrxr does not exist anymore.  It is not implemented in any IBM non-embedded
> > CPU since POWER4.  PA6T does not have it either.  In some versions of the
> > 2.0x ISA it does exist, but only in the optional "embedded" category.
>
> Yes, it exists in all (public) versions of the unified 2.0x ISA.

But not in the "Base" category.  GCC should not generate mcrxr unless some
flag enabling the "Embedded" category is set (this can of course be set
via -mcpu=), or we have ISA 1.xx.  It is probably nicer if we treat
mcrxr separately, with an -mmcrxr flag, with TARGET_MCRXR.

It is "phased out" in 2.02, btw; I cannot find earlier versions right now.
2.02 says:
	Warning: This instruction has been phased out of the architecture.
	Attempting to execute this instruction will cause the system illegal
	instruction error handler to be invoked.
<http://www.ibm.com/developerworks/systems/library/es-archguide-v2.html>

> > Linux emulates mcrxr, but that is very slow.  You could use mfxer, but
> > that is a slow instruction as well (it is microcoded).
>
> So there is no efficient way of accessing the OV flag in recent non-embedded
> CPUs (except for the new mcrxrx instruction in the 3.0 version)?

You can use SO instead, but then you have to make sure it is clear before
you run the OE=1 instruction.  SO of course is the cheapest way to access
overflow (it is copied to the CR field on any compare or recording insn).

And yeah something like mcrxrx should have existed long ago :-(

> > That may work better.  It also may be better if you expose the OV bit
> > as a separate reg (just like we have CA), instead of putting two machine
> > insns in each template.
>
> Yes, although CA can be seen as behaving like a register to some extent,
> OV far less so IMO.

How so?  It behaves exactly like CA, i.e. exactly like a register (it _is_
a register!)


Segher

Eric Botcazou Oct. 24, 2016, 11:20 a.m. UTC | #4
> But not in the "Base" category.  GCC should not generate mcrxr unless some
> flag enabling the "Embedded" category is set (this can of course be set
> via -mcpu=), or we have ISA 1.xx.  It is probably nicer if we treat
> mcrxr separately, with an -mmcrxr flag, with TARGET_MCRXR.

OK, that makes sense indeed.

> You can use SO instead, but then you have to make sure it is clear before
> you run the OE=1 instruction.  SO of course is the cheapest way to access
> overflow (it is copied to the CR field on any compare or recording insn).

Is there an efficient way of clearing the XER then?  Because this kind of
sticky flag is not usable with the current infrastructure.

> How so?  It behaves exactly like CA, i.e. exactly like a register (it _is_
> a register!)

You can do arithmetical operations with CA by means of the extended form 
whereas you cannot with OV, you really need to manipulate it like a flag.

-- 
Eric Botcazou

Segher Boessenkool Oct. 24, 2016, 11:40 a.m. UTC | #5
On Mon, Oct 24, 2016 at 01:20:09PM +0200, Eric Botcazou wrote:
> > You can use SO instead, but then you have to make sure it is clear before
> > you run the OE=1 instruction.  SO of course is the cheapest way to access
> > overflow (it is copied to the CR field on any compare or recording insn).
>
> Is there an efficient way of clearing the XER then?  Because this kind of
> sticky flag is not usable with the current infrastructure.

mtxer, maybe together with mfxer if you care about the other bits in there.
Efficient?  Nope.

Maybe the best you can do is generate the double-width result, and then
check if the upper half is the sign extension of the lower half.  Maybe
some trickery can help (for add/sub/neg at least).
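
As a rough illustration of the double-width idea, here is a minimal C sketch for a 32-bit multiplication. The function name `mul32_overflows` is hypothetical, and the final conversion to `int32_t` assumes the modular behaviour that GCC documents for out-of-range conversions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: multiply in 64 bits, then check that the upper
   32 bits are the sign extension of the lower 32 bits.  */
static bool mul32_overflows(int32_t a, int32_t b, int32_t *res)
{
    int64_t wide = (int64_t)a * (int64_t)b;   /* exact, cannot overflow */
    uint32_t lo = (uint32_t)(uint64_t)wide;   /* low word */
    uint32_t hi = (uint32_t)((uint64_t)wide >> 32);   /* high word */
    /* Sign extension of the low word: all ones if its top bit is set.  */
    uint32_t sign_ext = (lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;

    *res = (int32_t)lo;   /* assumes GCC's modular conversion */
    return hi != sign_ext;
}
```

The unsigned intermediate values keep the shift and comparison free of implementation-defined behaviour; only the last conversion relies on the compiler.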

You can also just FAIL the expander if !TARGET_MCRXR.  I wonder just how
bad the generic code is.

> > How so?  It behaves exactly like CA, i.e. exactly like a register (it _is_
> > a register!)
>
> You can do arithmetical operations with CA by means of the extended form
> whereas you cannot with OV, you really need to manipulate it like a flag.

Yes, you cannot cheaply read from it :-(  That doesn't matter for how we
model it in GCC though.  We cannot cheaply read from CA either.


Segher

Eric Botcazou Oct. 24, 2016, 4:14 p.m. UTC | #6
> Maybe the best you can do is generate the double-width result, and then
> check if the upper half is the sign extension of the lower half.  Maybe
> some trickery can help (for add/sub/neg at least).

That's inefficient, even for additive operations.

> You can also just FAIL the expander if !TARGET_MCRXR.  I wonder just how
> bad the generic code is.

It is branchy.  Here's a 32-bit overflow addition at -O2:

	cmpwi 7,4,0
	add 4,3,4
	blt- 7,.L4
	cmpw 7,4,3
	blt- 7,.L3
.L5:
	mr 3,4
	blr
.L4:
	cmpw 7,4,3
	ble+ 7,.L5
.L3:
	<overflow>

You can do it manually with just one branch:

	add 10,4,3
	srwi 4,4,31
	cmpw 7,10,3
	mfcr 9
	rlwinm 9,9,29,1
	cmpw 7,9,4
	bne- 7,.L5
	mr 3,10
	blr
.L5:
	<overflow>

and of course with -mmcrxr:

	addo 3,3,4
	mcrxr 7
	bgt- 7,.L10
	blr
.L10:
	<overflow>

-- 
Eric Botcazou

Segher Boessenkool Oct. 24, 2016, 5:31 p.m. UTC | #7
On Mon, Oct 24, 2016 at 06:14:48PM +0200, Eric Botcazou wrote:
> > Maybe the best you can do is generate the double-width result, and then
> > check if the upper half is the sign extension of the lower half.  Maybe
> > some trickery can help (for add/sub/neg at least).
>
> That's inefficient, even for additive operations.

It's better than the generic branch sequence below, or yours.  It still
sucks, obviously.

Let's see.  Completely untested.  Inputs in regs 3 and 4, output in reg 3.
32-bit code all the way.

add:
	eqv 9,3,4
	add 3,3,4
	xor 4,3,4
	and. 4,9,4
	blt <overflow>

sub:
	xor 9,3,4
	sub 3,3,4
	eqv 4,3,4
	and. 4,9,4
	blt <overflow>

neg:
	neg 3,3
	xoris. 9,3,0x8000
	beq <overflow>

mul:
	mulhw 9,3,4
	mullw 3,3,4
	srawi 4,9,31
	cmpw 4,9
	bne <overflow>

> > You can also just FAIL the expander if !TARGET_MCRXR.  I wonder just how
> > bad the generic code is.
>
> It is branchy.  Here's a 32-bit overflow addition at -O2:
>
> 	cmpwi 7,4,0
> 	add 4,3,4
> 	blt- 7,.L4
> 	cmpw 7,4,3
> 	blt- 7,.L3
> .L5:
> 	mr 3,4
> 	blr
> .L4:
> 	cmpw 7,4,3
> 	ble+ 7,.L5
> .L3:
> 	<overflow>
>
> You can do it manually with just one branch:
>
> 	add 10,4,3
> 	srwi 4,4,31
> 	cmpw 7,10,3
> 	mfcr 9
> 	rlwinm 9,9,29,1
> 	cmpw 7,9,4
> 	bne- 7,.L5
> 	mr 3,10
> 	blr
> .L5:
> 	<overflow>
>
> and of course with -mmcrxr:
>
> 	addo 3,3,4
> 	mcrxr 7
> 	bgt- 7,.L10
> 	blr
> .L10:
> 	<overflow>

Or using mcrxr (or mtxer) and SO:

	mcrxr 0 # clear XER[SO], can use mtxer instead
	...
	addo. 3,3,4
	bso .L10
	blr
.L10:
	etc.

(but keeping track of when your SO flag is clear is a pain, and if you
have to reset it all the time there is no big win).
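
To make the bookkeeping problem concrete, here is a small C model of the two XER bits involved. It is only a sketch: the `xer_model` struct and the `model_addo` and `model_clear_so` helpers are hypothetical stand-ins for the hardware behaviour (OV reflects the last OE=1 instruction, SO stays set until it is explicitly cleared).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the OV and SO bits of the XER.  */
struct xer_model {
    bool ov;   /* overflow of the most recent OE=1 instruction */
    bool so;   /* summary overflow: sticky until explicitly cleared */
};

/* Models "addo": OV is overwritten, SO only ever gets set.  */
static int32_t model_addo(struct xer_model *xer, int32_t a, int32_t b)
{
    int32_t r;
    bool overflow = __builtin_add_overflow(a, b, &r);

    xer->ov = overflow;
    xer->so = xer->so || overflow;
    return r;
}

/* Models the explicit clear (mcrxr or mtxer in the sequence above).  */
static void model_clear_so(struct xer_model *xer)
{
    xer->ov = false;
    xer->so = false;
}
```

After one overflowing `model_addo`, every later call still sees `so` set, which is why a branch on SO is only meaningful if the flag is known to be clear beforehand.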


Segher

Eric Botcazou Oct. 24, 2016, 9:03 p.m. UTC | #8
> It's better than the generic branch sequence below, or yours.  It still
> sucks, obviously.

OK, I thought you were talking about the double-width result.  The non-branch
sequence I posted is the generic non-branch sequence (that Ada was using).

> Let's see.  Completely untested.  Inputs in regs 3 and 4, output in reg 3.
> 32-bit code all the way.
>
> add:
> 	eqv 9,3,4
> 	add 3,3,4
> 	xor 4,3,4
> 	and. 4,9,4
> 	blt <overflow>
>
> sub:
> 	xor 9,3,4
> 	sub 3,3,4
> 	eqv 4,3,4
> 	and. 4,9,4
> 	blt <overflow>

These ones (if correct) are quite a bit better than the generic code!

> neg:
> 	neg 3,3
> 	xoris. 9,3,0x8000
> 	beq <overflow>
>
> mul:
> 	mulhw 9,3,4
> 	mullw 3,3,4
> 	srawi 4,9,31
> 	cmpw 4,9
> 	bne <overflow>

These ones are essentially equivalent to the generic code.

-- 
Eric Botcazou

Segher Boessenkool Oct. 24, 2016, 9:30 p.m. UTC | #9
On Mon, Oct 24, 2016 at 11:03:25PM +0200, Eric Botcazou wrote:
> > Let's see.  Completely untested.  Inputs in regs 3 and 4, output in reg 3.
> > 32-bit code all the way.
> >
> > add:
> > 	eqv 9,3,4
> > 	add 3,3,4
> > 	xor 4,3,4
> > 	and. 4,9,4
> > 	blt <overflow>
> >
> > sub:
> > 	xor 9,3,4
> > 	sub 3,3,4
> > 	eqv 4,3,4
> > 	and. 4,9,4
> > 	blt <overflow>
>
> These ones (if correct) are quite a bit better than the generic code!

It is nicely generic as well, but requires more insns than this if your
ISA does not have a full complement of logical ops.  Well, just an
"andnot" is enough, you don't actually need eqv here.

Addition has a signed overflow if and only if the two inputs have the
same sign, and the result has the opposite sign.  Subtraction overflows
if and only if the two inputs have opposite sign, and the subtrahend has
the same sign as the result.  Here 0 is counted as positive.
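
The sign rules just stated translate directly into bit operations on the sign bit. The following C sketch mirrors the add, sub and neg sequences quoted above; the function names are hypothetical, the arithmetic is done on unsigned values to get wraparound, and the conversion back to `int32_t` assumes GCC's documented modular behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Overflow iff both inputs have the same sign and the result's sign
   differs from it (eqv, add, xor, and. in the sequence above).  */
static bool add32_overflows(int32_t a, int32_t b, int32_t *res)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t sum = ua + ub;                  /* wraps modulo 2^32 */
    uint32_t ov = ~(ua ^ ub) & (sum ^ ub);   /* sign bit set on overflow */

    *res = (int32_t)sum;
    return (ov >> 31) != 0;
}

/* Overflow iff the inputs have opposite signs and the subtrahend has
   the same sign as the result (xor, sub, eqv, and.).  */
static bool sub32_overflows(int32_t a, int32_t b, int32_t *res)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;
    uint32_t ov = (ua ^ ub) & ~(diff ^ ub);

    *res = (int32_t)diff;
    return (ov >> 31) != 0;
}

/* Negation overflows only for the most negative value, whose negation
   wraps back to itself (neg, xoris., beq).  */
static bool neg32_overflows(int32_t a, int32_t *res)
{
    uint32_t r = 0u - (uint32_t)a;

    *res = (int32_t)r;
    return (r ^ 0x80000000u) == 0;
}
```

Only the top bit of `ov` matters, which is what the record form `and.` followed by `blt` tests in the assembly.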

> > neg:
> > 	neg 3,3
> > 	xoris. 9,3,0x8000
> > 	beq <overflow>
> >
> > mul:
> > 	mulhw 9,3,4
> > 	mullw 3,3,4
> > 	srawi 4,9,31
> > 	cmpw 4,9
> > 	bne <overflow>
>
> These ones are essentially equivalent to the generic code.

And the generic one for div is as good as it gets as well I suppose?
We cannot use the result of the divide insn if it overflows (the result
is undefined), so there isn't much at all we can do.


Segher

Eric Botcazou Oct. 24, 2016, 10:29 p.m. UTC | #10
> And the generic one for div is as good as it gets as well I suppose?

There is no support for div at all, only add/sub/neg/mul.

-- 
Eric Botcazou

Eric Botcazou Oct. 25, 2016, 10:09 a.m. UTC | #11
> It is nicely generic as well, but requires more insns than this if your
> ISA does not have a full complement of logical ops.  Well, just an
> "andnot" is enough, you don't actually need eqv here.

Indeed, and it's rather spectacular for 64-bit operations on 32-bit machines 
because the sign bit trick can be done solely on the upper word, whereas a 
fully-fledged store-flag sequence is heavyweight in this case.  Here's what 
the patched generic code yields at -O2:

op__add32:
	add 10,3,4
	xor 9,10,4
	eqv 4,3,4
	and. 8,9,4
	blt- <overflow>
	mr 3,10
	blr

op__add64:
	addc 4,4,6
	adde 10,3,5
	xor 9,10,5
	eqv 5,3,5
	and. 8,9,5
	blt- <overflow>
	mr 3,10
	blr

This should help 32-bit x86 too, which doesn't have 64-bit operations AFAICS.

Thanks for the tip, I'll submit the change to the generic code separately.
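
As a sketch of why the double-word case works out so well, the C model below performs a 64-bit addition on 32-bit halves, the way op__add64 does with addc/adde, and applies the sign-bit test to the upper words only. The function name and the split-operand signature are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit signed add on a 32-bit machine.  The
   operands and the result are passed as (high word, low word) pairs.  */
static bool add64_overflows(uint32_t a_hi, uint32_t a_lo,
                            uint32_t b_hi, uint32_t b_lo,
                            uint32_t *r_hi, uint32_t *r_lo)
{
    uint32_t lo = a_lo + b_lo;              /* addc: low words */
    uint32_t carry = lo < a_lo;             /* carry out of the low add */
    uint32_t hi = a_hi + b_hi + carry;      /* adde: high words + carry */
    /* The signs live in the high words, so the test ignores lo.  */
    uint32_t ov = ~(a_hi ^ b_hi) & (hi ^ b_hi);

    *r_hi = hi;
    *r_lo = lo;
    return (ov >> 31) != 0;
}
```

The low words contribute only through the carry, so the overflow check costs the same three logical operations as in the single-word case.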

-- 
Eric Botcazou
Patch

Index: internal-fn.c
===================================================================
--- internal-fn.c	(revision 241379)
+++ internal-fn.c	(working copy)
@@ -1772,10 +1772,23 @@  expand_arith_overflow (enum tree_code co
   int prec1 = TYPE_PRECISION (TREE_TYPE (arg1));
   int precres = TYPE_PRECISION (type);
   location_t loc = gimple_location (stmt);
-  if (!uns0_p && get_range_pos_neg (arg0) == 1)
-    uns0_p = true;
-  if (!uns1_p && get_range_pos_neg (arg1) == 1)
-    uns1_p = true;
+  /* Try to promote to unsigned since unsigned overflow is easier to open
+     code than signed overflow, but not for multiplication if that would
+     mean not using the hardware because this would very likely result in
+     doing 2 multiplications instead of only 1, e.g. on PowerPC.  */
+  if (code == MULT_EXPR
+      && !unsr_p
+      && precres <= BITS_PER_WORD
+      && optab_handler (mulv4_optab, TYPE_MODE (type)) != CODE_FOR_nothing
+      && optab_handler (umulv4_optab, TYPE_MODE (type)) == CODE_FOR_nothing)
+    ;
+  else
+    {
+      if (!uns0_p && get_range_pos_neg (arg0) == 1)
+	uns0_p = true;
+      if (!uns1_p && get_range_pos_neg (arg1) == 1)
+	uns1_p = true;
+    }
   int pr = get_min_precision (arg0, uns0_p ? UNSIGNED : SIGNED);
   prec0 = MIN (prec0, pr);
   pr = get_min_precision (arg1, uns1_p ? UNSIGNED : SIGNED);
Index: config/rs6000/rs6000-modes.def
===================================================================
--- config/rs6000/rs6000-modes.def	(revision 241313)
+++ config/rs6000/rs6000-modes.def	(working copy)
@@ -32,13 +32,15 @@  FLOAT_MODE (TF, 16, ieee_quad_format);
 /* Add any extra modes needed to represent the condition code.
 
    For the RS/6000, we need separate modes when unsigned (logical) comparisons
-   are being done and we need a separate mode for floating-point.  We also
-   use a mode for the case when we are comparing the results of two
-   comparisons, as then only the EQ bit is valid in the register.  */
+   are being done and we need a separate mode for floating-point.  We also use
+   a mode for the case when we are comparing the results of two comparisons,
+   as then only the EQ bit is valid in the register.  We also use a mode for
+   detecting signed overflow, as only the GT bit is valid in the register.  */
 
 CC_MODE (CCUNS);
 CC_MODE (CCFP);
 CC_MODE (CCEQ);
+CC_MODE (CCV);
 
 /* Vector modes.  */
 VECTOR_MODES (INT, 8);        /*       V8QI  V4HI V2SI */
Index: config/rs6000/rs6000-protos.h
===================================================================
--- config/rs6000/rs6000-protos.h	(revision 241313)
+++ config/rs6000/rs6000-protos.h	(working copy)
@@ -127,8 +127,8 @@  extern int ccr_bit (rtx, int);
 extern void rs6000_output_function_entry (FILE *, const char *);
 extern void print_operand (FILE *, rtx, int);
 extern void print_operand_address (FILE *, rtx);
-extern enum rtx_code rs6000_reverse_condition (machine_mode,
-					       enum rtx_code);
+extern machine_mode rs6000_select_cc_mode (enum rtx_code, rtx, rtx);
+extern enum rtx_code rs6000_reverse_condition (machine_mode, enum rtx_code);
 extern rtx rs6000_emit_eqne (machine_mode, rtx, rtx, rtx);
 extern void rs6000_emit_sISEL (machine_mode, rtx[]);
 extern void rs6000_emit_sCOND (machine_mode, rtx[]);
Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c	(revision 241313)
+++ config/rs6000/rs6000.c	(working copy)
@@ -2374,6 +2374,7 @@  rs6000_debug_reg_global (void)
     CCmode,
     CCUNSmode,
     CCEQmode,
+    CCVmode
   };
 
   /* Virtual regs we are interested in.  */
@@ -19334,6 +19335,7 @@  validate_condition_mode (enum rtx_code c
 
   /* These are invalid; the information is not there.  */
   gcc_assert (mode != CCEQmode || code == EQ || code == NE);
+  gcc_assert (mode != CCVmode || code == EQ || code == NE);
 }
 
 
@@ -21863,6 +21865,14 @@  print_operand (FILE *file, rtx x, int co
       /* %c is output_addr_const if a CONSTANT_ADDRESS_P, otherwise
 	 output_operand.  */
 
+    case 'C':
+      /* X is a CR register.  Print the index number of the CR.  */
+      if (GET_CODE (x) != REG || ! CR_REGNO_P (REGNO (x)))
+	output_operand_lossage ("invalid %%E value");
+      else
+	fputs (reg_names[REGNO (x)], file);
+      return;
+
     case 'D':
       /* Like 'J' but get to the GT bit only.  */
       gcc_assert (REG_P (x));
@@ -22638,6 +22648,34 @@  rs6000_assemble_visibility (tree decl, i
 }
 #endif
 
+machine_mode
+rs6000_select_cc_mode (enum rtx_code op, rtx x, rtx y)
+{
+  /* For floating-point, CCFPmode should be used.  CCUNSmode should be used
+     for unsigned comparisons.  CCEQmode should be used when we are doing an
+     inequality comparison on the result of a comparison.  CCVmode should be
+     used for the special overflow comparisons.  CCmode should be used in all
+     the other cases.  */
+
+  if (SCALAR_FLOAT_MODE_P (GET_MODE (x)))
+    return CCFPmode;
+
+  if ((op == GTU || op == LTU || op == GEU || op == LEU))
+    return CCUNSmode;
+
+  if ((op == EQ || op == NE) && COMPARISON_P (x))
+    return CCEQmode;
+
+  if (GET_CODE (y) == UNSPEC
+      && (XINT (y, 1) == UNSPEC_ADDV
+	 || XINT (y, 1) == UNSPEC_SUBV
+	 || XINT (y, 1) == UNSPEC_NEGV
+	 || XINT (y, 1) == UNSPEC_MULV))
+    return CCVmode;
+
+  return CCmode;
+}
+
 enum rtx_code
 rs6000_reverse_condition (machine_mode mode, enum rtx_code code)
 {
@@ -23558,7 +23596,6 @@  output_cbranch (rtx op, const char *labe
   enum rtx_code code = GET_CODE (op);
   rtx cc_reg = XEXP (op, 0);
   machine_mode mode = GET_MODE (cc_reg);
-  int cc_regno = REGNO (cc_reg) - CR0_REGNO;
   int need_longbranch = label != NULL && get_attr_length (insn) == 8;
   int really_reversed = reversed ^ need_longbranch;
   char *s = string;
@@ -23601,6 +23638,24 @@  output_cbranch (rtx op, const char *labe
 	}
     }
 
+  if (mode == CCVmode)
+    {
+      switch (code)
+	{
+	case EQ:
+	  /* Opposite of GT.  */
+	  code = LE;
+	  break;
+
+	case NE:
+	  code = GT;
+	  break;
+
+	default:
+	  gcc_unreachable ();
+	}
+    }
+
   switch (code)
     {
       /* Not all of these are actually distinct opcodes, but
@@ -23659,9 +23714,9 @@  output_cbranch (rtx op, const char *labe
 
   /* We need to escape any '%' characters in the reg_names string.
      Assume they'd only be the first character....  */
-  if (reg_names[cc_regno + CR0_REGNO][0] == '%')
+  if (reg_names[REGNO (cc_reg)][0] == '%')
     *s++ = '%';
-  s += sprintf (s, "%s", reg_names[cc_regno + CR0_REGNO]);
+  s += sprintf (s, "%s", reg_names[REGNO (cc_reg)]);
 
   if (label != NULL)
     {
Index: config/rs6000/rs6000.h
===================================================================
--- config/rs6000/rs6000.h	(revision 241313)
+++ config/rs6000/rs6000.h	(working copy)
@@ -2225,17 +2225,8 @@  extern unsigned rs6000_pmode;
 /* #define ADJUST_INSN_LENGTH(X,LENGTH) */
 
 /* Given a comparison code (EQ, NE, etc.) and the first operand of a
-   COMPARE, return the mode to be used for the comparison.  For
-   floating-point, CCFPmode should be used.  CCUNSmode should be used
-   for unsigned comparisons.  CCEQmode should be used when we are
-   doing an inequality comparison on the result of a
-   comparison.  CCmode should be used in all other cases.  */
-
-#define SELECT_CC_MODE(OP,X,Y) \
-  (SCALAR_FLOAT_MODE_P (GET_MODE (X)) ? CCFPmode	\
-   : (OP) == GTU || (OP) == LTU || (OP) == GEU || (OP) == LEU ? CCUNSmode \
-   : (((OP) == EQ || (OP) == NE) && COMPARISON_P (X)			  \
-      ? CCEQmode : CCmode))
+   COMPARE, return the mode to be used for the comparison.  */
+#define SELECT_CC_MODE(OP,X,Y) rs6000_select_cc_mode (OP, X, Y)
 
 /* Can the condition code MODE be safely reversed?  This is safe in
    all cases on this port, because at present it doesn't use the
Index: config/rs6000/rs6000.md
===================================================================
--- config/rs6000/rs6000.md	(revision 241313)
+++ config/rs6000/rs6000.md	(working copy)
@@ -149,6 +149,10 @@  (define_c_enum "unspec"
    UNSPEC_IEEE128_CONVERT
    UNSPEC_SIGNBIT
    UNSPEC_DOLOOP
+   UNSPEC_ADDV
+   UNSPEC_SUBV
+   UNSPEC_NEGV
+   UNSPEC_MULV
   ])
 
 ;;
@@ -1636,6 +1640,39 @@  (define_expand "add<mode>3"
     }
 })
 
+(define_expand "addv<mode>4"
+  [(match_operand:SDI 0 "register_operand")
+   (match_operand:SDI 1 "register_operand")
+   (match_operand:SDI 2 "register_operand")
+   (match_operand 3 "")]
+  "!(<MODE>mode == SImode && TARGET_POWERPC64)"
+{
+  rtx cc_reg = gen_reg_rtx (CCVmode);
+
+  if (<MODE>mode == DImode && !TARGET_POWERPC64)
+    {
+      rtx lo0 = gen_lowpart (SImode, operands[0]);
+      rtx lo1 = gen_lowpart (SImode, operands[1]);
+      rtx lo2 = gen_lowpart (SImode, operands[2]);
+      rtx hi0 = gen_highpart (SImode, operands[0]);
+      rtx hi1 = gen_highpart (SImode, operands[1]);
+      rtx hi2 = gen_highpart (SImode, operands[2]);
+
+      emit_insn (gen_addsi3_carry (lo0, lo1, lo2));
+      emit_insn (gen_addsi3_overflow_carry_in (hi0, hi1, hi2, cc_reg));
+    }
+  else
+    emit_insn (gen_add<mode>3_overflow (operands[0], operands[1], operands[2],
+					cc_reg));
+
+  rtx cond = gen_rtx_NE (VOIDmode, cc_reg, const0_rtx);
+  rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[3]);
+  emit_jump_insn (gen_rtx_SET (pc_rtx,
+			       gen_rtx_IF_THEN_ELSE (VOIDmode, cond,
+						    loc_ref, pc_rtx)));
+  DONE;
+})
+
 (define_insn "*add<mode>3"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
 	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
@@ -1837,6 +1874,21 @@  (define_insn "*add<mode>3_imm_carry_neg"
   [(set_attr "type" "add")])
 
 
+;; The OV flag is set on overflow in Pmode only.
+(define_insn "add<mode>3_overflow"
+  [(set (match_operand:CCV 3 "cc_reg_operand" "=y")
+	(compare:CCV (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
+			     (match_operand:P 2 "gpc_reg_operand" "r"))
+		     (unspec:P [(match_dup 1) (match_dup 2)]
+			       UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+	(plus:P (match_dup 1) (match_dup 2)))]
+  ""
+  "addo %0,%1,%2\;mcrxr %C3"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+
 (define_expand "add<mode>3_carry_in"
   [(parallel [
      (set (match_operand:GPR 0 "gpc_reg_operand")
@@ -1888,6 +1940,83 @@  (define_insn "add<mode>3_carry_in_m1"
   [(set_attr "type" "add")])
 
 
+(define_expand "add<mode>3_overflow_carry_in"
+  [(parallel [
+     (set (match_operand:CCV 3 "cc_reg_operand")
+	  (compare:CCV (plus:P (plus:P (match_operand:P 1 "gpc_reg_operand")
+				       (match_operand:P 2 "adde_operand"))
+				(reg:P CA_REGNO))
+		       (unspec:P [(plus:P (match_dup 1) (match_dup 2))
+				  (reg:P CA_REGNO)] UNSPEC_ADDV)))
+     (set (match_operand:P 0 "gpc_reg_operand")
+	  (plus:P (plus:P (match_dup 1) (match_dup 2))
+		  (reg:P CA_REGNO)))
+     (clobber (reg:P CA_REGNO))])]
+  ""
+{
+  if (operands[2] == const0_rtx)
+    {
+      emit_insn (gen_add<mode>3_overflow_carry_in_0 (operands[0],
+						     operands[1],
+						     operands[3]));
+      DONE;
+    }
+  if (operands[2] == constm1_rtx)
+    {
+      emit_insn (gen_add<mode>3_overflow_carry_in_m1 (operands[0],
+						      operands[1],
+						      operands[3]));
+      DONE;
+    }
+})
+
+(define_insn "*add<mode>3_overflow_carry_in_internal"
+  [(set (match_operand:CCV 3 "cc_reg_operand" "=y")
+	(compare:CCV (plus:P (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
+				     (match_operand:P 2 "gpc_reg_operand" "r"))
+			     (reg:P CA_REGNO))
+		     (unspec:P [(plus:P (match_dup 1) (match_dup 2))
+				(reg:P CA_REGNO)] UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+	(plus:P (plus:P (match_dup 1) (match_dup 2))
+		(reg:P CA_REGNO)))
+   (clobber (reg:GPR CA_REGNO))]
+  ""
+  "addeo %0,%1,%2\;mcrxr %C3"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+(define_insn "add<mode>3_overflow_carry_in_0"
+  [(set (match_operand:CCV 2 "cc_reg_operand" "=y")
+	(compare:CCV (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
+			     (reg:P CA_REGNO))
+		     (unspec:P [(plus:P (match_dup 1) (reg:P CA_REGNO))]
+			       UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+	(plus:P (match_dup 1) (reg:P CA_REGNO)))
+   (clobber (reg:P CA_REGNO))]
+  ""
+  "addzeo %0,%1\;mcrxr %C2"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+(define_insn "add<mode>3_overflow_carry_in_m1"
+  [(set (match_operand:CCV 2 "cc_reg_operand" "=y")
+	(compare:CCV (plus:P (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
+				     (reg:P CA_REGNO))
+			     (const_int -1))
+		     (unspec:P [(plus:P (match_dup 1) (reg:P CA_REGNO))
+				(const_int -1)] UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+	(plus:P (plus:P (match_dup 1) (reg:P CA_REGNO))
+		(const_int -1)))
+   (clobber (reg:P CA_REGNO))]
+  ""
+  "addmeo %0,%1\;mcrxr %C2"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+
 (define_expand "one_cmpl<mode>2"
   [(set (match_operand:SDI 0 "gpc_reg_operand" "")
 	(not:SDI (match_operand:SDI 1 "gpc_reg_operand" "")))]
@@ -1980,6 +2109,39 @@  (define_expand "sub<mode>3"
     }
 })
 
+(define_expand "subv<mode>4"
+  [(match_operand:SDI 0 "register_operand")
+   (match_operand:SDI 1 "register_operand")
+   (match_operand:SDI 2 "register_operand")
+   (match_operand 3 "")]
+  "!(<MODE>mode == SImode && TARGET_POWERPC64)"
+{
+  rtx cc_reg = gen_reg_rtx (CCVmode);
+
+  if (<MODE>mode == DImode && !TARGET_POWERPC64)
+    {
+      rtx lo0 = gen_lowpart (SImode, operands[0]);
+      rtx lo1 = gen_lowpart (SImode, operands[1]);
+      rtx lo2 = gen_lowpart (SImode, operands[2]);
+      rtx hi0 = gen_highpart (SImode, operands[0]);
+      rtx hi1 = gen_highpart (SImode, operands[1]);
+      rtx hi2 = gen_highpart (SImode, operands[2]);
+
+      emit_insn (gen_subfsi3_carry (lo0, lo2, lo1));
+      emit_insn (gen_subfsi3_overflow_carry_in (hi0, hi2, hi1, cc_reg));
+    }
+  else
+    emit_insn (gen_subf<mode>3_overflow (operands[0], operands[2], operands[1],
+				         cc_reg));
+
+  rtx cond = gen_rtx_NE (VOIDmode, cc_reg, const0_rtx);
+  rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[3]);
+  emit_jump_insn (gen_rtx_SET (pc_rtx,
+			       gen_rtx_IF_THEN_ELSE (VOIDmode, cond,
+						     loc_ref, pc_rtx)));
+  DONE;
+})
+
 (define_insn "*subf<mode>3"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
 	(minus:GPR (match_operand:GPR 2 "gpc_reg_operand" "r")
@@ -2075,6 +2237,21 @@  (define_insn "*subf<mode>3_imm_carry_m1"
   [(set_attr "type" "add")])
 
 
+;; The OV flag is set on overflow in Pmode only.
+(define_insn "subf<mode>3_overflow"
+  [(set (match_operand:CCV 3 "cc_reg_operand" "=y")
+	(compare:CCV (minus:P (match_operand:P 2 "gpc_reg_operand" "r")
+			      (match_operand:P 1 "gpc_reg_operand" "r"))
+		     (unspec:P [(match_dup 2) (match_dup 1)]
+			       UNSPEC_SUBV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+	(minus:P (match_dup 2) (match_dup 1)))]
+  ""
+  "subfo %0,%1,%2\;mcrxr %C3"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+
 (define_expand "subf<mode>3_carry_in"
   [(parallel [
      (set (match_operand:GPR 0 "gpc_reg_operand")
@@ -2135,6 +2312,87 @@  (define_insn "subf<mode>3_carry_in_xx"
   [(set_attr "type" "add")])
 
 
+(define_expand "subf<mode>3_overflow_carry_in"
+  [(parallel [
+     (set (match_operand:CCV 3 "cc_reg_operand")
+	  (compare:CCV
+	    (plus:P (plus:P (not:P (match_operand:P 1 "gpc_reg_operand"))
+			    (reg:P CA_REGNO))
+		    (match_operand:P 2 "adde_operand"))
+	    (unspec:P [(plus:P (not:P (match_dup 1)) (reg:P CA_REGNO))
+		       (match_dup 2)] UNSPEC_ADDV)))
+     (set (match_operand:P 0 "gpc_reg_operand")
+	  (plus:P (plus:P (not:P (match_dup 1)) (reg:P CA_REGNO))
+		  (match_dup 2)))
+     (clobber (reg:P CA_REGNO))])]
+  ""
+{
+  if (operands[2] == const0_rtx)
+    {
+      emit_insn (gen_subf<mode>3_overflow_carry_in_0 (operands[0],
+						      operands[1],
+						      operands[3]));
+      DONE;
+    }
+  if (operands[2] == constm1_rtx)
+    {
+      emit_insn (gen_subf<mode>3_overflow_carry_in_m1 (operands[0],
+						       operands[1],
+						       operands[3]));
+      DONE;
+    }
+})
+
+(define_insn "*subf<mode>3_overflow_carry_in_internal"
+  [(set (match_operand:CCV 3 "cc_reg_operand" "=y")
+	(compare:CCV
+	  (plus:P (plus:P (not:P (match_operand:P 1 "gpc_reg_operand" "r"))
+			  (reg:P CA_REGNO))
+		  (match_operand:P 2 "gpc_reg_operand" "r"))
+	  (unspec:P [(plus:P (not:P (match_dup 1)) (reg:P CA_REGNO))
+		     (match_dup 2)] UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+	(plus:P (plus:P (not:P (match_dup 1)) (reg:P CA_REGNO))
+	        (match_dup 2)))
+   (clobber (reg:P CA_REGNO))]
+  ""
+  "subfeo %0,%1,%2\;mcrxr %C3"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+(define_insn "subf<mode>3_overflow_carry_in_0"
+  [(set (match_operand:CCV 2 "cc_reg_operand" "=y")
+	(compare:CCV
+	  (plus:P (not:P (match_operand:P 1 "gpc_reg_operand" "r"))
+		  (reg:P CA_REGNO))
+	  (unspec:P [(not:P (match_dup 1)) (reg:P CA_REGNO)]
+		    UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+	(plus:P (not:P (match_dup 1)) (reg:P CA_REGNO)))
+   (clobber (reg:P CA_REGNO))]
+  ""
+  "subfzeo %0,%1\;mcrxr %C2"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+(define_insn "subf<mode>3_overflow_carry_in_m1"
+  [(set (match_operand:CCV 2 "cc_reg_operand" "=y")
+	(compare:CCV
+	  (plus:P (minus:P (reg:P CA_REGNO)
+			   (match_operand:P 1 "gpc_reg_operand" "r"))
+		  (const_int -2))
+	  (unspec:P [(minus:P (reg:P CA_REGNO) (match_dup 1))
+		     (const_int -2)] UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+	(plus:P (minus:P (reg:P CA_REGNO) (match_dup 1))
+		(const_int -2)))
+   (clobber (reg:P CA_REGNO))]
+  ""
+  "subfmeo %0,%1\;mcrxr %C2"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+
 (define_insn "neg<mode>2"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
 	(neg:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")))]
@@ -2142,6 +2400,35 @@  (define_insn "neg<mode>2"
   "neg %0,%1"
   [(set_attr "type" "add")])
 
+(define_expand "negv<mode>3"
+  [(match_operand:SDI 0 "register_operand")
+   (match_operand:SDI 1 "register_operand")
+   (match_operand 2 "")]
+  "!(<MODE>mode == SImode && TARGET_POWERPC64)"
+{
+  rtx cc_reg = gen_reg_rtx (CCVmode);
+
+  if (<MODE>mode == DImode && !TARGET_POWERPC64)
+    {
+      rtx lo0 = gen_lowpart (SImode, operands[0]);
+      rtx lo1 = gen_lowpart (SImode, operands[1]);
+      rtx hi0 = gen_highpart (SImode, operands[0]);
+      rtx hi1 = gen_highpart (SImode, operands[1]);
+
+      emit_insn (gen_subfsi3_carry (lo0, lo1, const0_rtx));
+      emit_insn (gen_subfsi3_overflow_carry_in_0 (hi0, hi1, cc_reg));
+    }
+  else
+    emit_insn (gen_neg<mode>2_overflow (operands[0], operands[1], cc_reg));
+
+  rtx cond = gen_rtx_NE (VOIDmode, cc_reg, const0_rtx);
+  rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[2]);
+  emit_jump_insn (gen_rtx_SET (pc_rtx,
+			       gen_rtx_IF_THEN_ELSE (VOIDmode, cond,
+						     loc_ref, pc_rtx)));
+  DONE;
+})
+
 (define_insn_and_split "*neg<mode>2_dot"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y")
 	(compare:CC (neg:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r"))
@@ -2184,6 +2471,18 @@  (define_insn_and_split "*neg<mode>2_dot2
    (set_attr "length" "4,8")])
 
 
+(define_insn "neg<mode>2_overflow"
+  [(set (match_operand:CCV 2 "cc_reg_operand" "=y")
+	(compare:CCV (neg:P (match_operand:P 1 "gpc_reg_operand" "r"))
+		     (unspec:P [(match_dup 1)] UNSPEC_NEGV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+	(neg:P (match_dup 1)))]
+  ""
+  "nego %0,%1\;mcrxr %C2"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+
 (define_insn "clz<mode>2"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
 	(clz:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")))]
@@ -2759,6 +3058,24 @@  (define_insn "mul<mode>3"
 		(const_string "16")]
 	(const_string "<bits>")))])
 
+(define_expand "mulv<mode>4"
+  [(match_operand:SDI 0 "register_operand")
+   (match_operand:SDI 1 "register_operand")
+   (match_operand:SDI 2 "register_operand")
+   (match_operand 3 "")]
+  "<MODE>mode == SImode || TARGET_POWERPC64"
+{
+  rtx cc_reg = gen_reg_rtx (CCVmode);
+  emit_insn (gen_mul<mode>3_overflow (operands[0], operands[1], operands[2],
+				      cc_reg));
+  rtx cond = gen_rtx_NE (VOIDmode, cc_reg, const0_rtx);
+  rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[3]);
+  emit_jump_insn (gen_rtx_SET (pc_rtx,
+			       gen_rtx_IF_THEN_ELSE (VOIDmode, cond,
+						     loc_ref, pc_rtx)));
+  DONE;
+})
+
 (define_insn_and_split "*mul<mode>3_dot"
   [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y")
 	(compare:CC (mult:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
@@ -2807,6 +3124,20 @@  (define_insn_and_split "*mul<mode>3_dot2
    (set_attr "dot" "yes")
    (set_attr "length" "4,8")])
 
+;; The OV flag is set on overflow in SImode for mullw and DImode for mulld.
+(define_insn "mul<mode>3_overflow"
+  [(set (match_operand:CCV 3 "cc_reg_operand" "=y")
+	(compare:CCV (mult:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
+			       (match_operand:GPR 2 "gpc_reg_operand" "r"))
+		     (unspec:GPR [(match_dup 1) (match_dup 2)]
+			         UNSPEC_MULV)))
+   (set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+	(mult:GPR (match_dup 1) (match_dup 2)))]
+  ""
+  "mull<wd>o %0,%1,%2\;mcrxr %C3"
+  [(set_attr "type" "mul")
+   (set_attr "length" "8")])
+
 
 (define_expand "<su>mul<mode>3_highpart"
   [(set (match_operand:GPR 0 "gpc_reg_operand")
Index: testsuite/gcc.target/powerpc/overflow-1.c
===================================================================
--- testsuite/gcc.target/powerpc/overflow-1.c	(revision 0)
+++ testsuite/gcc.target/powerpc/overflow-1.c	(working copy)
@@ -0,0 +1,33 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+/* { dg-require-effective-target ilp32 } */
+
+#include <stdbool.h>
+#include <stdint.h>
+
+bool my_add_overflow (int32_t a, int32_t b, int32_t *res)
+{
+  return __builtin_add_overflow (a, b, res);
+}
+
+bool my_sub_overflow (int32_t a, int32_t b, int32_t *res)
+{
+  return __builtin_sub_overflow (a, b, res);
+}
+
+bool my_neg_overflow (int32_t a, int32_t *res)
+{
+  return __builtin_sub_overflow (0, a, res);
+}
+
+bool my_mul_overflow (int32_t a, int32_t b, int32_t *res)
+{
+  return __builtin_mul_overflow (a, b, res);
+}
+
+/* { dg-final { scan-assembler-times "addo" 1 } } */
+/* { dg-final { scan-assembler-times "subfo" 1 } } */
+/* { dg-final { scan-assembler-times "nego" 1 } } */
+/* { dg-final { scan-assembler-times "mullwo" 1 } } */
+/* { dg-final { scan-assembler-times "mcrxr" 4 } } */
+/* { dg-final { scan-assembler-not "cmp" } } */
Index: testsuite/gcc.target/powerpc/overflow-2.c
===================================================================
--- testsuite/gcc.target/powerpc/overflow-2.c	(revision 0)
+++ testsuite/gcc.target/powerpc/overflow-2.c	(working copy)
@@ -0,0 +1,32 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+/* { dg-require-effective-target lp64 } */
+
+#include <stdbool.h>
+#include <stdint.h>
+
+bool my_add_overflow (int32_t a, int32_t b, int32_t *res)
+{
+  return __builtin_add_overflow (a, b, res);
+}
+
+bool my_sub_overflow (int32_t a, int32_t b, int32_t *res)
+{
+  return __builtin_sub_overflow (a, b, res);
+}
+
+bool my_neg_overflow (int32_t a, int32_t *res)
+{
+  return __builtin_sub_overflow (0, a, res);
+}
+
+bool my_mul_overflow (int32_t a, int32_t b, int32_t *res)
+{
+  return __builtin_mul_overflow (a, b, res);
+}
+
+/* { dg-final { scan-assembler-not "addo" } } */
+/* { dg-final { scan-assembler-not "subfo" } } */
+/* { dg-final { scan-assembler-not "nego" } } */
+/* { dg-final { scan-assembler-times "mullwo" 1 } } */
+/* { dg-final { scan-assembler-times "mcrxr" 1 } } */
Index: testsuite/gcc.target/powerpc/overflow-3.c
===================================================================
--- testsuite/gcc.target/powerpc/overflow-3.c	(revision 0)
+++ testsuite/gcc.target/powerpc/overflow-3.c	(working copy)
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+/* { dg-require-effective-target ilp32 } */
+
+#include <stdbool.h>
+#include <stdint.h>
+
+bool my_add_overflow (int64_t a, int64_t b, int64_t *res)
+{
+  return __builtin_add_overflow (a, b, res);
+}
+
+bool my_sub_overflow (int64_t a, int64_t b, int64_t *res)
+{
+  return __builtin_sub_overflow (a, b, res);
+}
+
+bool my_neg_overflow (int64_t a, int64_t *res)
+{
+  return __builtin_sub_overflow (0, a, res);
+}
+
+/* { dg-final { scan-assembler-times "addeo" 1 } } */
+/* { dg-final { scan-assembler-times "subfeo" 1 } } */
+/* { dg-final { scan-assembler-times "subfzeo" 1 } } */
+/* { dg-final { scan-assembler-times "mcrxr" 3 } } */
+/* { dg-final { scan-assembler-not "cmp" } } */
Index: testsuite/gcc.target/powerpc/overflow-4.c
===================================================================
--- testsuite/gcc.target/powerpc/overflow-4.c	(revision 0)
+++ testsuite/gcc.target/powerpc/overflow-4.c	(working copy)
@@ -0,0 +1,33 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+/* { dg-require-effective-target lp64 } */
+
+#include <stdbool.h>
+#include <stdint.h>
+
+bool my_add_overflow (int64_t a, int64_t b, int64_t *res)
+{
+  return __builtin_add_overflow (a, b, res);
+}
+
+bool my_sub_overflow (int64_t a, int64_t b, int64_t *res)
+{
+  return __builtin_sub_overflow (a, b, res);
+}
+
+bool my_neg_overflow (int64_t a, int64_t *res)
+{
+  return __builtin_sub_overflow (0, a, res);
+}
+
+bool my_mul_overflow (int64_t a, int64_t b, int64_t *res)
+{
+  return __builtin_mul_overflow (a, b, res);
+}
+
+/* { dg-final { scan-assembler-times "addo" 1 } } */
+/* { dg-final { scan-assembler-times "subfo" 1 } } */
+/* { dg-final { scan-assembler-times "nego" 1 } } */
+/* { dg-final { scan-assembler-times "mulldo" 1 } } */
+/* { dg-final { scan-assembler-times "mcrxr" 4 } } */
+/* { dg-final { scan-assembler-not "cmp" } } */
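
The four tests above are compile-only: they scan the generated assembly for the overflow-setting instructions but never execute them. As a rough runtime cross-check of the semantics the new patterns have to preserve, something like the following sketch could be used. It is not part of the patch; count_mismatches is a hypothetical helper, and the expected values assume the documented behaviour of the __builtin_*_overflow functions (the stored result is the wrapped two's complement value, and the return value is true on overflow).

```c
#include <stdbool.h>
#include <stdint.h>

/* Same wrappers as in overflow-1.c and overflow-2.c.  */
static bool my_add_overflow (int32_t a, int32_t b, int32_t *res)
{
  return __builtin_add_overflow (a, b, res);
}

static bool my_sub_overflow (int32_t a, int32_t b, int32_t *res)
{
  return __builtin_sub_overflow (a, b, res);
}

static bool my_neg_overflow (int32_t a, int32_t *res)
{
  return __builtin_sub_overflow (0, a, res);
}

static bool my_mul_overflow (int32_t a, int32_t b, int32_t *res)
{
  return __builtin_mul_overflow (a, b, res);
}

/* Hypothetical driver (not in the patch): returns the number of cases
   whose overflow flag or wrapped result differs from what is expected.  */
static int count_mismatches (void)
{
  int32_t r;
  int bad = 0;

  /* INT32_MAX + 1 overflows and wraps to INT32_MIN.  */
  bad += !(my_add_overflow (INT32_MAX, 1, &r) && r == INT32_MIN);
  /* INT32_MIN - 1 overflows and wraps to INT32_MAX.  */
  bad += !(my_sub_overflow (INT32_MIN, 1, &r) && r == INT32_MAX);
  /* 0 - INT32_MIN overflows and wraps back to INT32_MIN.  */
  bad += !(my_neg_overflow (INT32_MIN, &r) && r == INT32_MIN);
  /* 65536 * 65536 == 2^32 overflows and wraps to 0.  */
  bad += !(my_mul_overflow (65536, 65536, &r) && r == 0);

  /* Cases that must not report overflow.  */
  bad += !(!my_add_overflow (1, 2, &r) && r == 3);
  bad += !(!my_sub_overflow (5, 7, &r) && r == -2);
  bad += !(!my_neg_overflow (INT32_MAX, &r) && r == -INT32_MAX);
  bad += !(!my_mul_overflow (-3, 7, &r) && r == -21);

  return bad;
}
```

Calling count_mismatches () from a dg-do run test and aborting on a nonzero result would exercise both the OV path and the non-overflow path of each pattern; the same idea carries over to int64_t for the overflow-3.c and overflow-4.c configurations.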