[PATCHv3] Improve fpclassify w.r.t. IEEE-like numbers in GIMPLE.

Message ID HE1PR0801MB2027D1CA7B116B16DD1E5641FFBB0@HE1PR0801MB2027.eurprd08.prod.outlook.com
State Superseded

Commit Message

Tamar Christina Nov. 11, 2016, 5:26 p.m. UTC
Hi All,

This is v3 of the patch, which adds an optimized route to the fpclassify
builtin for floating-point numbers whose format is similar to IEEE 754.

The patch has been rewritten to do the lowering in GIMPLE instead of as a
fold.  As part of the implementation, optimized versions of is_normal,
is_subnormal, is_nan,
is_infinite and is_zero have been created. This patch also introduces two new
intrinsics __builtin_iszero and __builtin_issubnormal.
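
As a usage sketch of the two new builtins (the wrapper function here is
just an illustration, not part of the patch):

int
classify (double x)
{
  if (__builtin_iszero (x))        /* true for +0.0 and -0.0 */
    return 0;
  if (__builtin_issubnormal (x))   /* true for 0 < |x| < least normal */
    return 1;
  return 2;
}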

NOTE: the old code for ISNORMAL, ISSUBNORMAL, ISNAN and ISINFINITE had a
      special case for ibm_extended_format which dropped the second part
      of the number (the format is represented as two numbers internally).
      fpclassify did not have such a case.  I have dropped it since I am
      under the impression that the format is deprecated, which would make
      the optimization less important.  If this is wrong it would be easy
      to add it back in.

Should ISFINITE be changed as well? Also should it be SUBNORMAL or DENORMAL?
And what should I do about Documentation? I'm not sure how to document a new
BUILTIN.

The goal is to make it faster by:
1. Trying to determine the most common case first
   (e.g. the float is a normal number) and then the
   rest.  The amount of code generated at -O2 is
   about the same, +/- 1 instruction, but the code
   is much better.
2. Using integer operations in the optimized path.

At a high level, the optimized path uses integer operations
to perform the following checks in the given order:

  - normal
  - zero
  - nan
  - infinite
  - subnormal

The checks are ordered by how frequently the corresponding values are
expected to occur.

In case the optimization can't be applied, a fall-back method is used which
is similar to the existing implementation using FP instructions.  However,
the operations now also follow the same order as described above, which
means there should be some slight benefit there as well.
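
For illustration, the check order corresponds to a chain like the following
(a plain C sketch of the fall-back path, not the GIMPLE the patch actually
emits; it assumes HUGE_VAL is infinity, as it is for IEEE formats):

#include <float.h>
#include <math.h>

int
fpclassify_sketch (double x)
{
  double ax = __builtin_fabs (x);
  if (ax >= DBL_MIN && ax < HUGE_VAL)
    return FP_NORMAL;     /* most common case, checked first */
  if (x == 0.0)
    return FP_ZERO;
  if (x != x)
    return FP_NAN;        /* only a NaN compares unequal to itself */
  if (ax == HUGE_VAL)
    return FP_INFINITE;
  return FP_SUBNORMAL;
}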

A limitation of this new approach is that the exponent
of the floating-point number has to fit in 32 bits and the floating
point has to have an IEEE-like format and values for NaN and INF
(e.g. for NaN and INF all bits of the exponent must be set).

To determine this IEEE likeness, a new boolean was added to real_format.
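
As a sketch of the integer path for the "normal" check on IEEE binary64
(the bit positions are assumptions from the standard layout, not code
taken from the patch):

#include <stdint.h>
#include <string.h>

static int
is_normal_binary64 (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);      /* reinterpret the float as int */
  uint32_t exp = (bits >> 52) & 0x7ff;  /* the 11 exponent bits */
  /* Normal iff the exponent is neither all 0s (zero/subnormal) nor all
     1s (inf/NaN); (exp + 1) & 0x7fe is non-zero exactly in that case,
     which is the trick visible in the AArch64 output below
     (add w2, w2, 1; tst w2, 2046).  */
  return ((exp + 1) & 0x7fe) != 0;
}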

As an example, AArch64 now generates for classification of doubles:

f:
	fmov	x1, d0
	mov	w0, 7
	ubfx	x2, x1, 52, 11
	add	w2, w2, 1
	tst	w2, 2046
	bne	.L1
	lsl	x1, x1, 1
	mov	w0, 13
	cbz	x1, .L1
	mov	x2, -9007199254740992
	cmp	x1, x2
	mov	w0, 5
	mov	w3, 11
	csel	w0, w0, w3, eq
	mov	w1, 3
	csel	w0, w0, w1, ls
.L1:
	ret

and for the floating-point fall-back version:

f:
	adrp	x2, .LC0
	fabs	d1, d0
	adrp	x1, .LC1
	mov	w0, 7
	ldr	d3, [x2, #:lo12:.LC0]
	ldr	d2, [x1, #:lo12:.LC1]
	fcmpe	d1, d3
	fccmpe	d1, d2, 2, ge
	bls	.L1
	fcmp	d0, #0.0
	mov	w0, 13
	beq	.L1
	fcmp	d1, d1
	bvs	.L5
	fcmpe	d1, d2
	mov	w0, 5
	mov	w1, 11
	csel	w0, w0, w1, gt
.L1:
	ret
.L5:
	mov	w0, 3
	ret

One new test checks that the integer version does not generate FP code;
correctness is tested using the existing test code for fpclassify.

Glibc benchmarks were run against the built-in; they show the following
performance gains on AArch64 using the integer code:

* zero: 0%
* inf/nan: 29%
* normal: 69.1%

On x86_64:

* zero: 0%
* inf/nan: 89.9%
* normal: 4.7%

Regression tests were run on aarch64-none-linux and arm-none-linux-gnueabi
with no regressions.  x86_64 bootstrapped successfully as well.

Ok for trunk?

Thanks,
Tamar

gcc/
2016-11-11  Tamar Christina  <tamar.christina@arm.com>

	* gcc/builtins.c (fold_builtin_fpclassify): Removed.
	(fold_builtin_interclass_mathfn): Use get_min_float instead.
	(expand_builtin): Added builtins to lowering list.
	(fold_builtin_n): Removed call to fold_builtin_varargs.
	(fold_builtin_varargs): Removed.
	* gcc/builtins.def (BUILT_IN_ISZERO, BUILT_IN_ISSUBNORMAL): Added.
	* gcc/real.h (get_min_float): Added.
	* gcc/real.c (get_min_float): Added.
	* gcc/gimple-low.c (lower_stmt): Handle BUILT_IN_FPCLASSIFY,
	CASE_FLT_FN (BUILT_IN_ISINF), BUILT_IN_ISINFD32, BUILT_IN_ISINFD64,
	BUILT_IN_ISINFD128, BUILT_IN_ISNAND32, BUILT_IN_ISNAND64,
	BUILT_IN_ISNAND128, BUILT_IN_ISNAN, BUILT_IN_ISNORMAL, BUILT_IN_ISZERO,
	BUILT_IN_ISSUBNORMAL.
	(lower_builtin_fpclassify, is_nan, is_normal, is_infinity): Added.
	(is_zero, is_subnormal, use_ieee_int_mode): Likewise.
	(lower_builtin_isnan, lower_builtin_isinfinite): Likewise.
	(lower_builtin_isnormal, lower_builtin_iszero): Likewise.
	(lower_builtin_issubnormal): Likewise.
	(emit_tree_cond, get_num_as_int, emit_tree_and_return_var): Added.
	* gcc/real.h (real_format): Added is_binary_ieee_compatible field.
	* gcc/real.c (ieee_single_format): Set is_binary_ieee_compatible flag.
	(mips_single_format): Likewise.
	(motorola_single_format): Likewise.
	(spu_single_format): Likewise.
	(ieee_double_format): Likewise.
	(mips_double_format): Likewise.
	(motorola_double_format): Likewise.
	(ieee_extended_motorola_format): Likewise.
	(ieee_extended_intel_96_format): Likewise.
	(ieee_extended_intel_128_format): Likewise.
	(ieee_extended_intel_96_round_53_format): Likewise.
	(ibm_extended_format): Likewise.
	(mips_extended_format): Likewise.
	(ieee_quad_format): Likewise.
	(mips_quad_format): Likewise.
	(vax_f_format): Likewise.
	(vax_d_format): Likewise.
	(vax_g_format): Likewise.
	(decimal_single_format): Likewise.
	(decimal_double_format): Likewise.
	(decimal_quad_format): Likewise.
	(ieee_half_format): Likewise.
	(arm_half_format): Likewise.
	(real_internal_format): Likewise.

gcc/testsuite/
2016-11-11  Tamar Christina  <tamar.christina@arm.com>

	* gcc.target/aarch64/builtin-fpclassify.c: New codegen test.
	* gcc.dg/fold-notunord.c: Removed.

Comments

Joseph Myers Nov. 11, 2016, 10:05 p.m. UTC | #1
On Fri, 11 Nov 2016, Tamar Christina wrote:

> is_infinite and is_zero have been created. This patch also introduces two new
> intrinsics __builtin_iszero and __builtin_issubnormal.

And so the ChangeLog entry needs to include:

	PR middle-end/77925
	PR middle-end/77926

(in addition to PR references for whatever PRs for signaling NaN issues 
are partly addressed by the patch - those can't be closed however until 
fully fixed for all formats with signaling NaNs).

> NOTE: the old code for ISNORMAL, ISSUBNORMAL, ISNAN and ISINFINITE had a
>       special case for ibm_extended_format which dropped the second part
>       of the number (the format is represented as two numbers internally).
>       fpclassify did not have such a case.  I have dropped it since I am
>       under the impression that the format is deprecated, which would make
>       the optimization less important.  If this is wrong it would be easy
>       to add it back in.

It's not simply an optimization when it involves comparison against the 
largest normal value (to determine whether something is finite / infinite) 
- in that case it's needed to avoid bad results on the true maximum value 
(mantissa 53 1s, 0, 53 1s) which can occur at runtime but GCC can't 
represent internally because it internally treats this format as having a 
fixed width of 106 bits.

> Should ISFINITE be changed as well? Also should it be SUBNORMAL or DENORMAL?

It's subnormal.  (Denormal was the old name in IEEE 754-1985.)

> And what should I do about Documentation? I'm not sure how to document a new
> BUILTIN.

Document them in the "Other Builtins" node in extend.texi, like the 
existing classification built-in functions.

> diff --git a/gcc/builtins.def b/gcc/builtins.def
> index 219feebd3aebefbd079bf37cc801453cd1965e00..e3d12eccfed528fd6df0570b65f8aef42494d675 100644
> --- a/gcc/builtins.def
> +++ b/gcc/builtins.def
> @@ -831,6 +831,8 @@ DEF_EXT_LIB_BUILTIN    (BUILT_IN_ISINFL, "isinfl", BT_FN_INT_LONGDOUBLE, ATTR_CO
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_ISINFD32, "isinfd32", BT_FN_INT_DFLOAT32, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_ISINFD64, "isinfd64", BT_FN_INT_DFLOAT64, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_ISINFD128, "isinfd128", BT_FN_INT_DFLOAT128, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_C99_C90RES_BUILTIN (BUILT_IN_ISZERO, "iszero", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
> +DEF_C99_C90RES_BUILTIN (BUILT_IN_ISSUBNORMAL, "issubnormal", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)

No, these should be DEF_GCC_BUILTIN, so that only the __builtin_* names 
exist.
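
That is, something along these lines (a sketch of the corrected entries):

DEF_GCC_BUILTIN        (BUILT_IN_ISZERO, "iszero", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
DEF_GCC_BUILTIN        (BUILT_IN_ISSUBNORMAL, "issubnormal", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)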

> +static tree
> +is_zero (gimple_seq *seq, tree arg, location_t loc)
> +{
> +  tree type = TREE_TYPE (arg);
> +
> +  /* If not using optimized route then exit early.  */
> +  if (!use_ieee_int_mode (arg))
> +  {
> +    tree arg_p
> +      = emit_tree_and_return_var (seq, fold_build1_loc (loc, ABS_EXPR, type,
> +							arg));
> +    tree res = fold_build2_loc (loc, EQ_EXPR, boolean_type_node, arg_p,
> +				build_real (type, dconst0));

There is no need to take the absolute value before comparing with constant 
0.
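
(Under IEEE comparison semantics the signed zeros compare equal, so the
comparison can be done on ARG directly; as a one-line illustration,
assert (-0.0 == 0.0) holds.)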

I think tests for the new built-in functions should be added for the 
_Float* types (so add gcc.dg/torture/float-tg-4.h to test them and 
associated float*-tg-4.c files using that header to instantiate the tests 
for each type).

-- 
Joseph S. Myers
joseph@codesourcery.com

Patch

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 3ac2d44148440b124559ba7cd3de483b7a74b72d..fb09d342c836d68ef40a90fca803dbd496407ecb 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -160,7 +160,6 @@  static tree fold_builtin_0 (location_t, tree);
 static tree fold_builtin_1 (location_t, tree, tree);
 static tree fold_builtin_2 (location_t, tree, tree, tree);
 static tree fold_builtin_3 (location_t, tree, tree, tree, tree);
-static tree fold_builtin_varargs (location_t, tree, tree*, int);
 
 static tree fold_builtin_strpbrk (location_t, tree, tree, tree);
 static tree fold_builtin_strstr (location_t, tree, tree, tree);
@@ -5998,10 +5997,8 @@  expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
       if (! flag_unsafe_math_optimizations)
 	break;
       gcc_fallthrough ();
-    CASE_FLT_FN (BUILT_IN_ISINF):
     CASE_FLT_FN (BUILT_IN_FINITE):
     case BUILT_IN_ISFINITE:
-    case BUILT_IN_ISNORMAL:
       target = expand_builtin_interclass_mathfn (exp, target);
       if (target)
 	return target;
@@ -6281,8 +6278,20 @@  expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 	}
       break;
 
+    CASE_FLT_FN (BUILT_IN_ISINF):
+    case BUILT_IN_ISNAND32:
+    case BUILT_IN_ISNAND64:
+    case BUILT_IN_ISNAND128:
+    case BUILT_IN_ISNAN:
+    case BUILT_IN_ISINFD32:
+    case BUILT_IN_ISINFD64:
+    case BUILT_IN_ISINFD128:
+    case BUILT_IN_ISNORMAL:
+    case BUILT_IN_ISZERO:
+    case BUILT_IN_ISSUBNORMAL:
+    case BUILT_IN_FPCLASSIFY:
     case BUILT_IN_SETJMP:
-      /* This should have been lowered to the builtins below.  */
+      /* These should have been lowered to the builtins below.  */
       gcc_unreachable ();
 
     case BUILT_IN_SETJMP_SETUP:
@@ -7646,30 +7655,6 @@  fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
   switch (DECL_FUNCTION_CODE (fndecl))
     {
       tree result;
-
-    CASE_FLT_FN (BUILT_IN_ISINF):
-      {
-	/* isinf(x) -> isgreater(fabs(x),DBL_MAX).  */
-	tree const isgr_fn = builtin_decl_explicit (BUILT_IN_ISGREATER);
-	tree type = TREE_TYPE (arg);
-	REAL_VALUE_TYPE r;
-	char buf[128];
-
-	if (is_ibm_extended)
-	  {
-	    /* NaN and Inf are encoded in the high-order double value
-	       only.  The low-order value is not significant.  */
-	    type = double_type_node;
-	    mode = DFmode;
-	    arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
-	  }
-	get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
-	real_from_string (&r, buf);
-	result = build_call_expr (isgr_fn, 2,
-				  fold_build1_loc (loc, ABS_EXPR, type, arg),
-				  build_real (type, r));
-	return result;
-      }
     CASE_FLT_FN (BUILT_IN_FINITE):
     case BUILT_IN_ISFINITE:
       {
@@ -7701,79 +7686,6 @@  fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
 				  result);*/
 	return result;
       }
-    case BUILT_IN_ISNORMAL:
-      {
-	/* isnormal(x) -> isgreaterequal(fabs(x),DBL_MIN) &
-	   islessequal(fabs(x),DBL_MAX).  */
-	tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
-	tree type = TREE_TYPE (arg);
-	tree orig_arg, max_exp, min_exp;
-	machine_mode orig_mode = mode;
-	REAL_VALUE_TYPE rmax, rmin;
-	char buf[128];
-
-	orig_arg = arg = builtin_save_expr (arg);
-	if (is_ibm_extended)
-	  {
-	    /* Use double to test the normal range of IBM extended
-	       precision.  Emin for IBM extended precision is
-	       different to emin for IEEE double, being 53 higher
-	       since the low double exponent is at least 53 lower
-	       than the high double exponent.  */
-	    type = double_type_node;
-	    mode = DFmode;
-	    arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
-	  }
-	arg = fold_build1_loc (loc, ABS_EXPR, type, arg);
-
-	get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
-	real_from_string (&rmax, buf);
-	sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (orig_mode)->emin - 1);
-	real_from_string (&rmin, buf);
-	max_exp = build_real (type, rmax);
-	min_exp = build_real (type, rmin);
-
-	max_exp = build_call_expr (isle_fn, 2, arg, max_exp);
-	if (is_ibm_extended)
-	  {
-	    /* Testing the high end of the range is done just using
-	       the high double, using the same test as isfinite().
-	       For the subnormal end of the range we first test the
-	       high double, then if its magnitude is equal to the
-	       limit of 0x1p-969, we test whether the low double is
-	       non-zero and opposite sign to the high double.  */
-	    tree const islt_fn = builtin_decl_explicit (BUILT_IN_ISLESS);
-	    tree const isgt_fn = builtin_decl_explicit (BUILT_IN_ISGREATER);
-	    tree gt_min = build_call_expr (isgt_fn, 2, arg, min_exp);
-	    tree eq_min = fold_build2 (EQ_EXPR, integer_type_node,
-				       arg, min_exp);
-	    tree as_complex = build1 (VIEW_CONVERT_EXPR,
-				      complex_double_type_node, orig_arg);
-	    tree hi_dbl = build1 (REALPART_EXPR, type, as_complex);
-	    tree lo_dbl = build1 (IMAGPART_EXPR, type, as_complex);
-	    tree zero = build_real (type, dconst0);
-	    tree hilt = build_call_expr (islt_fn, 2, hi_dbl, zero);
-	    tree lolt = build_call_expr (islt_fn, 2, lo_dbl, zero);
-	    tree logt = build_call_expr (isgt_fn, 2, lo_dbl, zero);
-	    tree ok_lo = fold_build1 (TRUTH_NOT_EXPR, integer_type_node,
-				      fold_build3 (COND_EXPR,
-						   integer_type_node,
-						   hilt, logt, lolt));
-	    eq_min = fold_build2 (TRUTH_ANDIF_EXPR, integer_type_node,
-				  eq_min, ok_lo);
-	    min_exp = fold_build2 (TRUTH_ORIF_EXPR, integer_type_node,
-				   gt_min, eq_min);
-	  }
-	else
-	  {
-	    tree const isge_fn
-	      = builtin_decl_explicit (BUILT_IN_ISGREATEREQUAL);
-	    min_exp = build_call_expr (isge_fn, 2, arg, min_exp);
-	  }
-	result = fold_build2 (BIT_AND_EXPR, integer_type_node,
-			      max_exp, min_exp);
-	return result;
-      }
     default:
       break;
     }
@@ -7794,12 +7706,6 @@  fold_builtin_classify (location_t loc, tree fndecl, tree arg, int builtin_index)
 
   switch (builtin_index)
     {
-    case BUILT_IN_ISINF:
-      if (!HONOR_INFINITIES (arg))
-	return omit_one_operand_loc (loc, type, integer_zero_node, arg);
-
-      return NULL_TREE;
-
     case BUILT_IN_ISINF_SIGN:
       {
 	/* isinf_sign(x) -> isinf(x) ? (signbit(x) ? -1 : 1) : 0 */
@@ -7838,100 +7744,11 @@  fold_builtin_classify (location_t loc, tree fndecl, tree arg, int builtin_index)
 	return omit_one_operand_loc (loc, type, integer_one_node, arg);
 
       return NULL_TREE;
-
-    case BUILT_IN_ISNAN:
-      if (!HONOR_NANS (arg))
-	return omit_one_operand_loc (loc, type, integer_zero_node, arg);
-
-      {
-	bool is_ibm_extended = MODE_COMPOSITE_P (TYPE_MODE (TREE_TYPE (arg)));
-	if (is_ibm_extended)
-	  {
-	    /* NaN and Inf are encoded in the high-order double value
-	       only.  The low-order value is not significant.  */
-	    arg = fold_build1_loc (loc, NOP_EXPR, double_type_node, arg);
-	  }
-      }
-      arg = builtin_save_expr (arg);
-      return fold_build2_loc (loc, UNORDERED_EXPR, type, arg, arg);
-
     default:
       gcc_unreachable ();
     }
 }
 
-/* Fold a call to __builtin_fpclassify(int, int, int, int, int, ...).
-   This builtin will generate code to return the appropriate floating
-   point classification depending on the value of the floating point
-   number passed in.  The possible return values must be supplied as
-   int arguments to the call in the following order: FP_NAN, FP_INFINITE,
-   FP_NORMAL, FP_SUBNORMAL and FP_ZERO.  The ellipses is for exactly
-   one floating point argument which is "type generic".  */
-
-static tree
-fold_builtin_fpclassify (location_t loc, tree *args, int nargs)
-{
-  tree fp_nan, fp_infinite, fp_normal, fp_subnormal, fp_zero,
-    arg, type, res, tmp;
-  machine_mode mode;
-  REAL_VALUE_TYPE r;
-  char buf[128];
-
-  /* Verify the required arguments in the original call.  */
-  if (nargs != 6
-      || !validate_arg (args[0], INTEGER_TYPE)
-      || !validate_arg (args[1], INTEGER_TYPE)
-      || !validate_arg (args[2], INTEGER_TYPE)
-      || !validate_arg (args[3], INTEGER_TYPE)
-      || !validate_arg (args[4], INTEGER_TYPE)
-      || !validate_arg (args[5], REAL_TYPE))
-    return NULL_TREE;
-
-  fp_nan = args[0];
-  fp_infinite = args[1];
-  fp_normal = args[2];
-  fp_subnormal = args[3];
-  fp_zero = args[4];
-  arg = args[5];
-  type = TREE_TYPE (arg);
-  mode = TYPE_MODE (type);
-  arg = builtin_save_expr (fold_build1_loc (loc, ABS_EXPR, type, arg));
-
-  /* fpclassify(x) ->
-       isnan(x) ? FP_NAN :
-         (fabs(x) == Inf ? FP_INFINITE :
-	   (fabs(x) >= DBL_MIN ? FP_NORMAL :
-	     (x == 0 ? FP_ZERO : FP_SUBNORMAL))).  */
-
-  tmp = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg,
-		     build_real (type, dconst0));
-  res = fold_build3_loc (loc, COND_EXPR, integer_type_node,
-		     tmp, fp_zero, fp_subnormal);
-
-  sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (mode)->emin - 1);
-  real_from_string (&r, buf);
-  tmp = fold_build2_loc (loc, GE_EXPR, integer_type_node,
-		     arg, build_real (type, r));
-  res = fold_build3_loc (loc, COND_EXPR, integer_type_node, tmp, fp_normal, res);
-
-  if (HONOR_INFINITIES (mode))
-    {
-      real_inf (&r);
-      tmp = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg,
-			 build_real (type, r));
-      res = fold_build3_loc (loc, COND_EXPR, integer_type_node, tmp,
-			 fp_infinite, res);
-    }
-
-  if (HONOR_NANS (mode))
-    {
-      tmp = fold_build2_loc (loc, ORDERED_EXPR, integer_type_node, arg, arg);
-      res = fold_build3_loc (loc, COND_EXPR, integer_type_node, tmp, res, fp_nan);
-    }
-
-  return res;
-}
-
 /* Fold a call to an unordered comparison function such as
    __builtin_isgreater().  FNDECL is the FUNCTION_DECL for the function
    being called and ARG0 and ARG1 are the arguments for the call.
@@ -8243,30 +8060,9 @@  fold_builtin_1 (location_t loc, tree fndecl, tree arg0)
 	  return ret;
 	return fold_builtin_interclass_mathfn (loc, fndecl, arg0);
       }
-
-    CASE_FLT_FN (BUILT_IN_ISINF):
-    case BUILT_IN_ISINFD32:
-    case BUILT_IN_ISINFD64:
-    case BUILT_IN_ISINFD128:
-      {
-	tree ret = fold_builtin_classify (loc, fndecl, arg0, BUILT_IN_ISINF);
-	if (ret)
-	  return ret;
-	return fold_builtin_interclass_mathfn (loc, fndecl, arg0);
-      }
-
-    case BUILT_IN_ISNORMAL:
-      return fold_builtin_interclass_mathfn (loc, fndecl, arg0);
-
     case BUILT_IN_ISINF_SIGN:
       return fold_builtin_classify (loc, fndecl, arg0, BUILT_IN_ISINF_SIGN);
 
-    CASE_FLT_FN (BUILT_IN_ISNAN):
-    case BUILT_IN_ISNAND32:
-    case BUILT_IN_ISNAND64:
-    case BUILT_IN_ISNAND128:
-      return fold_builtin_classify (loc, fndecl, arg0, BUILT_IN_ISNAN);
-
     case BUILT_IN_FREE:
       if (integer_zerop (arg0))
 	return build_empty_stmt (loc);
@@ -8465,7 +8261,11 @@  fold_builtin_n (location_t loc, tree fndecl, tree *args, int nargs, bool)
       ret = fold_builtin_3 (loc, fndecl, args[0], args[1], args[2]);
       break;
     default:
-      ret = fold_builtin_varargs (loc, fndecl, args, nargs);
+      /* There used to be a call to fold_builtin_varargs here, but with the
+         lowering of fpclassify, which was its only member, the function
+         became redundant and has therefore been removed.  Its default case
+         did the same as the code below this switch, so the function could
+         be removed safely.  */
       break;
     }
   if (ret)
@@ -9422,37 +9222,6 @@  fold_builtin_object_size (tree ptr, tree ost)
   return NULL_TREE;
 }
 
-/* Builtins with folding operations that operate on "..." arguments
-   need special handling; we need to store the arguments in a convenient
-   data structure before attempting any folding.  Fortunately there are
-   only a few builtins that fall into this category.  FNDECL is the
-   function, EXP is the CALL_EXPR for the call.  */
-
-static tree
-fold_builtin_varargs (location_t loc, tree fndecl, tree *args, int nargs)
-{
-  enum built_in_function fcode = DECL_FUNCTION_CODE (fndecl);
-  tree ret = NULL_TREE;
-
-  switch (fcode)
-    {
-    case BUILT_IN_FPCLASSIFY:
-      ret = fold_builtin_fpclassify (loc, args, nargs);
-      break;
-
-    default:
-      break;
-    }
-  if (ret)
-    {
-      ret = build1 (NOP_EXPR, TREE_TYPE (ret), ret);
-      SET_EXPR_LOCATION (ret, loc);
-      TREE_NO_WARNING (ret) = 1;
-      return ret;
-    }
-  return NULL_TREE;
-}
-
 /* Initialize format string characters in the target charset.  */
 
 bool
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 219feebd3aebefbd079bf37cc801453cd1965e00..e3d12eccfed528fd6df0570b65f8aef42494d675 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -831,6 +831,8 @@  DEF_EXT_LIB_BUILTIN    (BUILT_IN_ISINFL, "isinfl", BT_FN_INT_LONGDOUBLE, ATTR_CO
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_ISINFD32, "isinfd32", BT_FN_INT_DFLOAT32, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_ISINFD64, "isinfd64", BT_FN_INT_DFLOAT64, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_ISINFD128, "isinfd128", BT_FN_INT_DFLOAT128, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_C99_C90RES_BUILTIN (BUILT_IN_ISZERO, "iszero", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
+DEF_C99_C90RES_BUILTIN (BUILT_IN_ISSUBNORMAL, "issubnormal", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_ISNAN, "isnan", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_ISNANF, "isnanf", BT_FN_INT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_ISNANL, "isnanl", BT_FN_INT_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
diff --git a/gcc/gimple-low.c b/gcc/gimple-low.c
index 64752b67b86b3d01df5f5661e4666df98b7b91d1..ac9dd6cb319eba25402a68fd4ffecfdc8f0d2118 100644
--- a/gcc/gimple-low.c
+++ b/gcc/gimple-low.c
@@ -30,6 +30,8 @@  along with GCC; see the file COPYING3.  If not see
 #include "calls.h"
 #include "gimple-iterator.h"
 #include "gimple-low.h"
+#include "stor-layout.h"
+#include "target.h"
 
 /* The differences between High GIMPLE and Low GIMPLE are the
    following:
@@ -72,6 +74,12 @@  static void lower_gimple_bind (gimple_stmt_iterator *, struct lower_data *);
 static void lower_try_catch (gimple_stmt_iterator *, struct lower_data *);
 static void lower_gimple_return (gimple_stmt_iterator *, struct lower_data *);
 static void lower_builtin_setjmp (gimple_stmt_iterator *);
+static void lower_builtin_fpclassify (gimple_stmt_iterator *);
+static void lower_builtin_isnan (gimple_stmt_iterator *);
+static void lower_builtin_isinfinite (gimple_stmt_iterator *);
+static void lower_builtin_isnormal (gimple_stmt_iterator *);
+static void lower_builtin_iszero (gimple_stmt_iterator *);
+static void lower_builtin_issubnormal (gimple_stmt_iterator *);
 static void lower_builtin_posix_memalign (gimple_stmt_iterator *);
 
 
@@ -330,19 +338,61 @@  lower_stmt (gimple_stmt_iterator *gsi, struct lower_data *data)
 	if (decl
 	    && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL)
 	  {
-	    if (DECL_FUNCTION_CODE (decl) == BUILT_IN_SETJMP)
-	      {
-		lower_builtin_setjmp (gsi);
-		data->cannot_fallthru = false;
-		return;
-	      }
-	    else if (DECL_FUNCTION_CODE (decl) == BUILT_IN_POSIX_MEMALIGN
-		     && flag_tree_bit_ccp
-		     && gimple_builtin_call_types_compatible_p (stmt, decl))
-	      {
-		lower_builtin_posix_memalign (gsi);
-		return;
-	      }
+	    switch (DECL_FUNCTION_CODE (decl))
+	      {
+	      case BUILT_IN_SETJMP:
+		lower_builtin_setjmp (gsi);
+		data->cannot_fallthru = false;
+		return;
+
+	      case BUILT_IN_POSIX_MEMALIGN:
+		if (flag_tree_bit_ccp
+		    && gimple_builtin_call_types_compatible_p (stmt, decl))
+		  {
+		    lower_builtin_posix_memalign (gsi);
+		    return;
+		  }
+		break;
+
+	      case BUILT_IN_FPCLASSIFY:
+		lower_builtin_fpclassify (gsi);
+		data->cannot_fallthru = false;
+		return;
+
+	      CASE_FLT_FN (BUILT_IN_ISINF):
+	      case BUILT_IN_ISINFD32:
+	      case BUILT_IN_ISINFD64:
+	      case BUILT_IN_ISINFD128:
+		lower_builtin_isinfinite (gsi);
+		data->cannot_fallthru = false;
+		return;
+
+	      case BUILT_IN_ISNAND32:
+	      case BUILT_IN_ISNAND64:
+	      case BUILT_IN_ISNAND128:
+	      CASE_FLT_FN (BUILT_IN_ISNAN):
+		lower_builtin_isnan (gsi);
+		data->cannot_fallthru = false;
+		return;
+
+	      case BUILT_IN_ISNORMAL:
+		lower_builtin_isnormal (gsi);
+		data->cannot_fallthru = false;
+		return;
+
+	      case BUILT_IN_ISZERO:
+		lower_builtin_iszero (gsi);
+		data->cannot_fallthru = false;
+		return;
+
+	      case BUILT_IN_ISSUBNORMAL:
+		lower_builtin_issubnormal (gsi);
+		data->cannot_fallthru = false;
+		return;
+
+	      default:
+		break;
+	      }
 	  }
 
 	if (decl && (flags_from_decl_or_type (decl) & ECF_NORETURN))
@@ -822,6 +872,580 @@  lower_builtin_setjmp (gimple_stmt_iterator *gsi)
   gsi_remove (gsi, false);
 }
 
+static tree
+emit_tree_and_return_var (gimple_seq *seq, tree arg)
+{
+  tree tmp = create_tmp_reg (TREE_TYPE (arg));
+  gassign *stm = gimple_build_assign (tmp, arg);
+  gimple_seq_add_stmt (seq, stm);
+  return tmp;
+}
+
+/* This function builds an if statement that ends up using explicit branches
+   instead of becoming a csel.  It assumes that the false branch falls
+   through to the statements following this condition.  */
+static void
+emit_tree_cond (gimple_seq *seq, tree result_variable, tree exit_label,
+		tree cond, tree true_branch)
+{
+  /* Create labels for the fall-through.  */
+  tree true_label = create_artificial_label (UNKNOWN_LOCATION);
+  tree false_label = create_artificial_label (UNKNOWN_LOCATION);
+  gcond *stmt = gimple_build_cond_from_tree (cond, true_label, false_label);
+  gimple_seq_add_stmt (seq, stmt);
+
+  /* Build the true case.  */
+  gimple_seq_add_stmt (seq, gimple_build_label (true_label));
+  tree value = TREE_CONSTANT (true_branch) 
+	     ? true_branch
+	     : emit_tree_and_return_var (seq, true_branch);
+  gimple_seq_add_stmt (seq, gimple_build_assign (result_variable, value));
+  gimple_seq_add_stmt (seq, gimple_build_goto (exit_label));
+
+  /* Build the false case. */
+  gimple_seq_add_stmt (seq, gimple_build_label (false_label));
+}
+
+static tree
+get_num_as_int (gimple_seq *seq, tree arg, location_t loc)
+{
+  tree type = TREE_TYPE (arg);
+
+  machine_mode mode = TYPE_MODE (type);
+  const real_format *format = REAL_MODE_FORMAT (mode);
+  const HOST_WIDE_INT type_width = TYPE_PRECISION (type);
+
+  gcc_assert (format->b == 2);
+
+  /* Re-interpret the float as an unsigned integer type
+     with equal precision.  */
+  tree int_arg_type = build_nonstandard_integer_type (type_width, true);
+  tree conv_arg = fold_build1_loc (loc, VIEW_CONVERT_EXPR, int_arg_type, arg);
+  return emit_tree_and_return_var (seq, conv_arg);
+}
+
+/* Check if the number that is being classified is close enough to IEEE 754
+   format to be able to go in the early exit code.  */
+static bool
+use_ieee_int_mode (tree arg)
+{
+  tree type = TREE_TYPE (arg);
+
+  machine_mode mode = TYPE_MODE (type);
+
+  const real_format *format = REAL_MODE_FORMAT (mode);
+  const HOST_WIDE_INT type_width = TYPE_PRECISION (type);
+  return (format->is_binary_ieee_compatible
+	  && FLOAT_WORDS_BIG_ENDIAN == WORDS_BIG_ENDIAN
+	  /* We explicitly disable quad float support on 32 bit systems.  */
+	  && !(UNITS_PER_WORD == 4 && type_width == 128)
+	  && targetm.scalar_mode_supported_p (mode));
+}
+
+static tree
+is_normal (gimple_seq *seq, tree arg, location_t loc)
+{
+  tree type = TREE_TYPE (arg);
+
+  machine_mode mode = TYPE_MODE (type);
+  const real_format *format = REAL_MODE_FORMAT (mode);
+  const tree bool_type = boolean_type_node;
+
+  /* If not using optimized route then exit early.  */
+  if (!use_ieee_int_mode (arg))
+  {
+    REAL_VALUE_TYPE rinf, rmin;
+    tree arg_p
+      = emit_tree_and_return_var (seq, fold_build1_loc (loc, ABS_EXPR, type,
+							arg));
+    char buf[128];
+    real_inf (&rinf);
+    get_min_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
+    real_from_string (&rmin, buf);
+
+    tree inf_exp = fold_build2_loc (loc, LT_EXPR, bool_type, arg_p,
+				    build_real (type, rinf));
+
+    tree min_exp = fold_build2_loc (loc, GE_EXPR, bool_type, arg_p,
+				    build_real (type, rmin));
+
+    tree res
+      = fold_build2_loc (loc, BIT_AND_EXPR, bool_type,
+			 emit_tree_and_return_var (seq, min_exp),
+			 emit_tree_and_return_var (seq, inf_exp));
+
+    return emit_tree_and_return_var (seq, res);
+  }
+
+  gcc_assert (format->b == 2);
+
+  const tree int_type = unsigned_type_node;
+  const int exp_bits  = (GET_MODE_SIZE (mode) * BITS_PER_UNIT) - format->p;
+  const int exp_mask  = (1 << exp_bits) - 1;
+
+  /* Get the number reinterpreted as an integer.  */
+  tree int_arg = get_num_as_int (seq, arg, loc);
+
+  /* Extract exp bits from the float, where we expect the exponent to be.
+     We create a new type because BIT_FIELD_REF does not allow you to
+     extract less bits than the precision of the storage variable.  */
+  tree exp_tmp
+    = fold_build3_loc (loc, BIT_FIELD_REF,
+		       build_nonstandard_integer_type (exp_bits, true),
+		       int_arg,
+		       build_int_cstu (int_type, exp_bits),
+		       build_int_cstu (int_type, format->p - 1));
+  tree exp_bitfield = emit_tree_and_return_var (seq, exp_tmp);
+
+  /* Re-interpret the extracted exponent bits as a 32 bit int.
+     This allows us to continue doing operations as int_type.  */
+  tree exp
+    = emit_tree_and_return_var (seq, fold_build1_loc (loc, NOP_EXPR, int_type,
+						       exp_bitfield));
+
+  /* exp_mask & ~1.  */
+  tree mask_check
+     = fold_build2_loc (loc, BIT_AND_EXPR, int_type,
+			build_int_cstu (int_type, exp_mask),
+			fold_build1_loc (loc, BIT_NOT_EXPR, int_type,
+					 build_int_cstu (int_type, 1)));
+
+  /* (exp + 1) & mask_check.
+     Check to see if exp is not all 0 or all 1.  */
+  tree exp_check
+    = fold_build2_loc (loc, BIT_AND_EXPR, int_type,
+		       emit_tree_and_return_var (seq,
+				fold_build2_loc (loc, PLUS_EXPR, int_type, exp,
+						 build_int_cstu (int_type, 1))),
+		       mask_check);
+
+  tree res = fold_build2_loc (loc, NE_EXPR, boolean_type_node,
+			      build_int_cstu (int_type, 0),
+			      emit_tree_and_return_var (seq, exp_check));
+
+  return emit_tree_and_return_var (seq, res);
+}
+
+static tree
+is_zero (gimple_seq *seq, tree arg, location_t loc)
+{
+  tree type = TREE_TYPE (arg);
+
+  /* If not using optimized route then exit early.  */
+  if (!use_ieee_int_mode (arg))
+  {
+    tree arg_p
+      = emit_tree_and_return_var (seq, fold_build1_loc (loc, ABS_EXPR, type,
+							arg));
+    tree res = fold_build2_loc (loc, EQ_EXPR, boolean_type_node, arg_p,
+				build_real (type, dconst0));
+    return emit_tree_and_return_var (seq, res);
+  }
+
+  machine_mode mode = TYPE_MODE (type);
+  const real_format *format = REAL_MODE_FORMAT (mode);
+  const HOST_WIDE_INT type_width = TYPE_PRECISION (type);
+
+  gcc_assert (format->b == 2);
+
+  tree int_arg_type = build_nonstandard_integer_type (type_width, true);
+
+  /* Get the number reinterpreted as an integer.
+     Shift left to remove the sign. */
+  tree int_arg
+    = fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type,
+		       get_num_as_int (seq, arg, loc),
+		       build_int_cstu (int_arg_type, 1));
+
+  /* num << 1 == 0.
+     This checks to see if the number is zero.  */
+  tree zero_check
+    = fold_build2_loc (loc, EQ_EXPR, boolean_type_node,
+		       build_int_cstu (int_arg_type, 0),
+		       emit_tree_and_return_var (seq, int_arg));
+
+  return emit_tree_and_return_var (seq, zero_check);
+}
+
+static tree
+is_subnormal (gimple_seq *seq, tree arg, location_t loc)
+{
+  const tree bool_type = boolean_type_node;
+
+  tree type = TREE_TYPE (arg);
+
+  machine_mode mode = TYPE_MODE (type);
+  const real_format *format = REAL_MODE_FORMAT (mode);
+  const HOST_WIDE_INT type_width = TYPE_PRECISION (type);
+
+  tree int_arg_type = build_nonstandard_integer_type (type_width, true);
+
+
+
+  /* If not using optimized route then exit early.  */
+  if (!use_ieee_int_mode (arg))
+  {
+    tree arg_p
+      = emit_tree_and_return_var (seq, fold_build1_loc (loc, ABS_EXPR, type,
+							arg));
+    REAL_VALUE_TYPE r;
+    char buf[128];
+    sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (mode)->emin - 1);
+    real_from_string (&r, buf);
+    tree subnorm = fold_build2_loc (loc, LT_EXPR, bool_type,
+				    arg_p, build_real (type, r));
+
+    tree zero = fold_build2_loc (loc, GT_EXPR, bool_type, arg_p,
+				 build_real (type, dconst0));
+
+    tree res
+      = fold_build2_loc (loc, BIT_AND_EXPR, bool_type,
+			 emit_tree_and_return_var (seq, subnorm),
+			 emit_tree_and_return_var (seq, zero));
+
+    return emit_tree_and_return_var (seq, res);
+  }
+  gcc_assert (format->b == 2);
+  /* Get the number reinterpreted as an integer.
+     Shift left to remove the sign. */
+  tree int_arg
+    = fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type,
+		       get_num_as_int (seq, arg, loc),
+		       build_int_cstu (int_arg_type, 1));
+
+  /* Check for a zero exponent and a non-zero mantissa.
+     This can be done with a single comparison by first
+     removing the sign bit and then checking whether the
+     value is no larger than the mantissa mask.  */
+
+  /* This creates a mask to be used to check the mantissa value in the shifted
+     integer representation of the fpnum.  */
+  tree significant_bit = build_int_cstu (int_arg_type, format->p - 1);
+  tree mantissa_mask
+    = fold_build2_loc (loc, MINUS_EXPR, int_arg_type,
+		       fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type,
+					build_int_cstu (int_arg_type, 2),
+					significant_bit),
+		       build_int_cstu (int_arg_type, 1));
+
+  /* Check if exponent is zero and mantissa is not. */
+  tree subnorm_check
+    = emit_tree_and_return_var (seq,
+	fold_build2_loc (loc, LE_EXPR, bool_type,
+			 emit_tree_and_return_var (seq, int_arg),
+			 mantissa_mask));
+
+  return emit_tree_and_return_var (seq, subnorm_check);
+}
+
+static tree
+is_infinity (gimple_seq *seq, tree arg, location_t loc)
+{
+  tree type = TREE_TYPE (arg);
+
+  machine_mode mode = TYPE_MODE (type);
+  const tree bool_type = boolean_type_node;
+
+  if (!HONOR_INFINITIES (mode))
+  {
+    return build_int_cst (bool_type, 0);
+  }
+
+  /* If not using optimized route then exit early.  */
+  if (!use_ieee_int_mode (arg))
+  {
+    tree arg_p
+      = emit_tree_and_return_var (seq, fold_build1_loc (loc, ABS_EXPR, type,
+							arg));
+    REAL_VALUE_TYPE r;
+    real_inf (&r);
+    tree res = fold_build2_loc (loc, EQ_EXPR, bool_type, arg_p,
+				build_real (type, r));
+
+    return emit_tree_and_return_var (seq, res);
+  }
+
+  const real_format *format = REAL_MODE_FORMAT (mode);
+  const HOST_WIDE_INT type_width = TYPE_PRECISION (type);
+
+  gcc_assert (format->b == 2);
+
+  tree int_arg_type = build_nonstandard_integer_type (type_width, true);
+
+  /* This creates a mask to be used to check the exp value in the shifted
+     integer representation of the fpnum.  */
+  const int exp_bits  = (GET_MODE_SIZE (mode) * BITS_PER_UNIT) - format->p;
+  gcc_assert (format->p > 0);
+
+  tree significant_bit = build_int_cstu (int_arg_type, format->p);
+  tree exp_mask
+    = fold_build2_loc (loc, MINUS_EXPR, int_arg_type,
+		       fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type,
+					build_int_cstu (int_arg_type, 2),
+					build_int_cstu (int_arg_type, exp_bits - 1)),
+		       build_int_cstu (int_arg_type, 1));
+
+  /* Get the number reinterpreted as an integer.
+     Shift left to remove the sign. */
+  tree int_arg
+    = fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type,
+		       get_num_as_int (seq, arg, loc),
+		       build_int_cstu (int_arg_type, 1));
+
+  /* This mask checks to see if the exp has all bits set and mantissa no
+     bits set.  */
+  tree inf_mask
+    = fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type, exp_mask, significant_bit);
+
+  /* Check if exponent has all bits set and mantissa is 0. */
+  tree inf_check
+    = emit_tree_and_return_var (seq,
+	fold_build2_loc (loc, EQ_EXPR, bool_type,
+			 emit_tree_and_return_var (seq, int_arg),
+			 inf_mask));
+
+  return emit_tree_and_return_var (seq, inf_check);
+}
+
+/* Determines if the given number is a NaN value.
+   This function is the last in the chain and only has to
+   check that its preconditions are true.  */
+static tree
+is_nan (gimple_seq *seq, tree arg, location_t loc)
+{
+  tree type = TREE_TYPE (arg);
+
+  machine_mode mode = TYPE_MODE (type);
+  const real_format *format = REAL_MODE_FORMAT (mode);
+  const tree bool_type = boolean_type_node;
+
+  if (!HONOR_NANS (mode))
+  {
+    return build_int_cst (bool_type, 0);
+  }
+
+  /* If not using optimized route then exit early.  */
+  if (!use_ieee_int_mode (arg))
+  {
+    tree arg_p
+      = emit_tree_and_return_var (seq, fold_build1_loc (loc, ABS_EXPR, type,
+							arg));
+    tree eq_check
+      = fold_build2_loc (loc, ORDERED_EXPR, bool_type, arg_p, arg_p);
+
+    tree res
+      = fold_build1_loc (loc, BIT_NOT_EXPR, bool_type,
+			 emit_tree_and_return_var (seq, eq_check));
+
+    return emit_tree_and_return_var (seq, res);
+  }
+
+  const HOST_WIDE_INT type_width = TYPE_PRECISION (type);
+  tree int_arg_type = build_nonstandard_integer_type (type_width, true);
+
+  /* This creates a mask to be used to check the exp value in the shifted
+     integer representation of the fpnum.  */
+  const int exp_bits  = (GET_MODE_SIZE (mode) * BITS_PER_UNIT) - format->p;
+  tree significant_bit = build_int_cstu (int_arg_type, format->p);
+  tree exp_mask
+    = fold_build2_loc (loc, MINUS_EXPR, int_arg_type,
+		       fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type,
+					build_int_cstu (int_arg_type, 2),
+					build_int_cstu (int_arg_type, exp_bits - 1)),
+		       build_int_cstu (int_arg_type, 1));
+
+  /* Get the number reinterpreted as an integer.
+     Shift left to remove the sign. */
+  tree int_arg
+    = fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type,
+		       get_num_as_int (seq, arg, loc),
+		       build_int_cstu (int_arg_type, 1));
+
+  /* This mask checks to see if the exp has all bits set and mantissa no
+     bits set.  */
+  tree inf_mask
+    = fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type, exp_mask, significant_bit);
+
+  /* Check if exponent has all bits set and mantissa is not 0. */
+  tree nan_check
+    = emit_tree_and_return_var (seq,
+	fold_build2_loc (loc, GT_EXPR, bool_type,
+			 emit_tree_and_return_var (seq, int_arg),
+			 inf_mask));
+
+  return emit_tree_and_return_var (seq, nan_check);
+}
+
+/* Validate a single argument ARG against a tree code CODE representing
+   a type.  */
+static bool
+gimple_validate_arg (gimple* call, int index, enum tree_code code)
+{
+  const tree arg = gimple_call_arg (call, index);
+  if (!arg)
+    return false;
+  else if (code == POINTER_TYPE)
+    return POINTER_TYPE_P (TREE_TYPE (arg));
+  else if (code == INTEGER_TYPE)
+    return INTEGRAL_TYPE_P (TREE_TYPE (arg));
+  return code == TREE_CODE (TREE_TYPE (arg));
+}
+
+/* Lowers calls to __builtin_fpclassify to
+   fpclassify (x) ->
+     isnormal(x) ? FP_NORMAL :
+       iszero (x) ? FP_ZERO :
+	 isnan (x) ? FP_NAN :
+	   isinfinite (x) ? FP_INFINITE :
+	     FP_SUBNORMAL.
+
+   The code may use integer arithmetic if it decides
+   that the produced assembly would be faster. This can only be done
+   for numbers that are similar to IEEE-754 in format.
+
+   This builtin will generate code to return the appropriate floating
+   point classification depending on the value of the floating point
+   number passed in.  The possible return values must be supplied as
+   int arguments to the call in the following order: FP_NAN, FP_INFINITE,
+   FP_NORMAL, FP_SUBNORMAL and FP_ZERO.  The ellipses is for exactly
+   one floating point argument which is "type generic".
+*/
+static void
+lower_builtin_fpclassify (gimple_stmt_iterator *gsi)
+{
+  gimple *call = gsi_stmt (*gsi);
+  location_t loc = gimple_location (call);
+
+  /* Verify the required arguments in the original call.  */
+  if (gimple_call_num_args (call) != 6
+      || !gimple_validate_arg (call, 0, INTEGER_TYPE)
+      || !gimple_validate_arg (call, 1, INTEGER_TYPE)
+      || !gimple_validate_arg (call, 2, INTEGER_TYPE)
+      || !gimple_validate_arg (call, 3, INTEGER_TYPE)
+      || !gimple_validate_arg (call, 4, INTEGER_TYPE)
+      || !gimple_validate_arg (call, 5, REAL_TYPE))
+    return;
+
+  /* Collect the arguments from the call.  */
+  tree fp_nan = gimple_call_arg (call, 0);
+  tree fp_infinite = gimple_call_arg (call, 1);
+  tree fp_normal = gimple_call_arg (call, 2);
+  tree fp_subnormal = gimple_call_arg (call, 3);
+  tree fp_zero = gimple_call_arg (call, 4);
+  tree arg = gimple_call_arg (call, 5);
+
+  gimple_seq body = NULL;
+
+  /* Create the label to jump to when done.  */
+  tree done_label = create_artificial_label (UNKNOWN_LOCATION);
+  tree dest;
+  tree orig_dest = dest = gimple_call_lhs (call);
+  if (orig_dest && TREE_CODE (orig_dest) == SSA_NAME)
+    dest = create_tmp_reg (TREE_TYPE (orig_dest));
+
+  emit_tree_cond (&body, dest, done_label,
+		  is_normal (&body, arg, loc), fp_normal);
+  emit_tree_cond (&body, dest, done_label,
+		  is_zero (&body, arg, loc), fp_zero);
+  emit_tree_cond (&body, dest, done_label,
+		  is_nan (&body, arg, loc), fp_nan);
+  emit_tree_cond (&body, dest, done_label,
+		  is_infinity (&body, arg, loc), fp_infinite);
+
+  /* And finally, emit the default case if nothing else matches.
+     This replaces the call to is_subnormal.  */
+  gimple_seq_add_stmt (&body, gimple_build_assign (dest, fp_subnormal));
+  gimple_seq_add_stmt (&body, gimple_build_label (done_label));
+
+  /* Build orig_dest = dest if necessary.  */
+  if (dest != orig_dest)
+  {
+    gimple_seq_add_stmt (&body, gimple_build_assign (orig_dest, dest));
+  }
+
+  gsi_insert_seq_before (gsi, body, GSI_SAME_STMT);
+
+
+  /* Remove the call to __builtin_fpclassify.  */
+  gsi_remove (gsi, false);
+}
+
+static void
+gen_call_fp_builtin (gimple_stmt_iterator *gsi,
+		     tree (*fndecl)(gimple_seq *, tree, location_t))
+{
+  gimple *call = gsi_stmt (*gsi);
+  location_t loc = gimple_location (call);
+
+  /* Verify the required arguments in the original call.  */
+  if (gimple_call_num_args (call) != 1
+      || !gimple_validate_arg (call, 0, REAL_TYPE))
+    return;
+
+  tree arg = gimple_call_arg (call, 0);
+  gimple_seq body = NULL;
+
+  /* Create the label to jump to when done.  */
+  tree done_label = create_artificial_label (UNKNOWN_LOCATION);
+  tree dest;
+  tree orig_dest = dest = gimple_call_lhs (call);
+  tree type = TREE_TYPE (orig_dest);
+  if (orig_dest && TREE_CODE (orig_dest) == SSA_NAME)
+    dest = create_tmp_reg (type);
+
+  tree t_true = build_int_cst (type, true);
+  tree t_false = build_int_cst (type, false);
+
+  emit_tree_cond (&body, dest, done_label,
+		  fndecl (&body, arg, loc), t_true);
+
+  /* And finally, emit the default case if nothing else matches.
+     This yields the false result.  */
+  gimple_seq_add_stmt (&body, gimple_build_assign (dest, t_false));
+  gimple_seq_add_stmt (&body, gimple_build_label (done_label));
+
+  /* Build orig_dest = dest if necessary.  */
+  if (dest != orig_dest)
+  {
+    gimple_seq_add_stmt (&body, gimple_build_assign (orig_dest, dest));
+  }
+
+  gsi_insert_seq_before (gsi, body, GSI_SAME_STMT);
+
+  /* Remove the call to the builtin.  */
+  gsi_remove (gsi, false);
+}
+
+static void
+lower_builtin_isnan (gimple_stmt_iterator *gsi)
+{
+  gen_call_fp_builtin (gsi, &is_nan);
+}
+
+static void
+lower_builtin_isinfinite (gimple_stmt_iterator *gsi)
+{
+  gen_call_fp_builtin (gsi, &is_infinity);
+}
+
+static void
+lower_builtin_isnormal (gimple_stmt_iterator *gsi)
+{
+  gen_call_fp_builtin (gsi, &is_normal);
+}
+
+static void
+lower_builtin_iszero (gimple_stmt_iterator *gsi)
+{
+  gen_call_fp_builtin (gsi, &is_zero);
+}
+
+static void
+lower_builtin_issubnormal (gimple_stmt_iterator *gsi)
+{
+  gen_call_fp_builtin (gsi, &is_subnormal);
+}
+
 /* Lower calls to posix_memalign to
      res = posix_memalign (ptr, align, size);
      if (res == 0)
diff --git a/gcc/real.h b/gcc/real.h
index 59af580e78f2637be84f71b98b45ec6611053222..30604adf0f7d4ca4257ed92f6d019b52a52db6c5 100644
--- a/gcc/real.h
+++ b/gcc/real.h
@@ -161,6 +161,19 @@  struct real_format
   bool has_signed_zero;
   bool qnan_msb_set;
   bool canonical_nan_lsbs_set;
+
+  /* This flag indicates whether the format is suitable for the optimized
+     code paths for the __builtin_fpclassify function and friends.  For
+     this, the format must be a base 2 representation with the sign bit as
+     the most-significant bit followed by (exp <= 32) exponent bits
+     followed by the mantissa bits.  It must be possible to interpret the
+     bits of the floating-point representation as an integer.  NaNs and
+     INFs (if available) must be represented by the same schema used by
+     IEEE 754.  (NaNs must be represented by an exponent with all bits 1,
+     any mantissa except all bits 0 and any sign bit.  +INF and -INF must be
+     represented by an exponent with all bits 1, a mantissa with all bits 0 and
+     a sign bit of 0 and 1 respectively.)  */
+  bool is_binary_ieee_compatible;
   const char *name;
 };
 
@@ -511,6 +524,11 @@  extern bool real_isinteger (const REAL_VALUE_TYPE *, HOST_WIDE_INT *);
    float string.  BUF must be large enough to contain the result.  */
 extern void get_max_float (const struct real_format *, char *, size_t);
 
+/* Write into BUF the smallest positive normalized floating-point
+   number for the given format, b**(emin - 1).
+   BUF must be large enough to contain the result.  */
+extern void get_min_float (const struct real_format *, char *, size_t);
+
 #ifndef GENERATOR_FILE
 /* real related routines.  */
 extern wide_int real_to_integer (const REAL_VALUE_TYPE *, bool *, int);
diff --git a/gcc/real.c b/gcc/real.c
index 66e88e2ad366f7848609d157074c80420d778bcf..20c907a6d543c73ba62aa9a8ddf6973d82de7832 100644
--- a/gcc/real.c
+++ b/gcc/real.c
@@ -3052,6 +3052,7 @@  const struct real_format ieee_single_format =
     true,
     true,
     false,
+    true,
     "ieee_single"
   };
 
@@ -3075,6 +3076,7 @@  const struct real_format mips_single_format =
     true,
     false,
     true,
+    true,
     "mips_single"
   };
 
@@ -3098,6 +3100,7 @@  const struct real_format motorola_single_format =
     true,
     true,
     true,
+    true,
     "motorola_single"
   };
 
@@ -3132,6 +3135,7 @@  const struct real_format spu_single_format =
     true,
     false,
     false,
+    false,
     "spu_single"
   };
 
@@ -3343,6 +3347,7 @@  const struct real_format ieee_double_format =
     true,
     true,
     false,
+    true,
     "ieee_double"
   };
 
@@ -3366,6 +3371,7 @@  const struct real_format mips_double_format =
     true,
     false,
     true,
+    true,
     "mips_double"
   };
 
@@ -3389,6 +3395,7 @@  const struct real_format motorola_double_format =
     true,
     true,
     true,
+    true,
     "motorola_double"
   };
 
@@ -3735,6 +3742,7 @@  const struct real_format ieee_extended_motorola_format =
     true,
     true,
     true,
+    false,
     "ieee_extended_motorola"
   };
 
@@ -3758,6 +3766,7 @@  const struct real_format ieee_extended_intel_96_format =
     true,
     true,
     false,
+    false,
     "ieee_extended_intel_96"
   };
 
@@ -3781,6 +3790,7 @@  const struct real_format ieee_extended_intel_128_format =
     true,
     true,
     false,
+    false,
     "ieee_extended_intel_128"
   };
 
@@ -3806,6 +3816,7 @@  const struct real_format ieee_extended_intel_96_round_53_format =
     true,
     true,
     false,
+    false,
     "ieee_extended_intel_96_round_53"
   };
 
@@ -3896,6 +3907,7 @@  const struct real_format ibm_extended_format =
     true,
     true,
     false,
+    false,
     "ibm_extended"
   };
 
@@ -3919,6 +3931,7 @@  const struct real_format mips_extended_format =
     true,
     false,
     true,
+    false,
     "mips_extended"
   };
 
@@ -4184,6 +4197,7 @@  const struct real_format ieee_quad_format =
     true,
     true,
     false,
+    true,
     "ieee_quad"
   };
 
@@ -4207,6 +4221,7 @@  const struct real_format mips_quad_format =
     true,
     false,
     true,
+    true,
     "mips_quad"
   };
 
@@ -4509,6 +4524,7 @@  const struct real_format vax_f_format =
     false,
     false,
     false,
+    false,
     "vax_f"
   };
 
@@ -4532,6 +4548,7 @@  const struct real_format vax_d_format =
     false,
     false,
     false,
+    false,
     "vax_d"
   };
 
@@ -4555,6 +4572,7 @@  const struct real_format vax_g_format =
     false,
     false,
     false,
+    false,
     "vax_g"
   };
 
@@ -4633,6 +4651,7 @@  const struct real_format decimal_single_format =
     true,
     true,
     false,
+    false,
     "decimal_single"
   };
 
@@ -4657,6 +4676,7 @@  const struct real_format decimal_double_format =
     true,
     true,
     false,
+    false,
     "decimal_double"
   };
 
@@ -4681,6 +4701,7 @@  const struct real_format decimal_quad_format =
     true,
     true,
     false,
+    false,
     "decimal_quad"
   };
 
@@ -4820,6 +4841,7 @@  const struct real_format ieee_half_format =
     true,
     true,
     false,
+    true,
     "ieee_half"
   };
 
@@ -4846,6 +4868,7 @@  const struct real_format arm_half_format =
     true,
     false,
     false,
+    false,
     "arm_half"
   };
 
@@ -4893,6 +4916,7 @@  const struct real_format real_internal_format =
     true,
     true,
     false,
+    false,
     "real_internal"
   };
 
@@ -5080,6 +5104,16 @@  get_max_float (const struct real_format *fmt, char *buf, size_t len)
   gcc_assert (strlen (buf) < len);
 }
 
+/* Write into BUF the smallest positive normalized floating-point
+   number for the given format, b**(emin - 1).
+   BUF must be large enough to contain the result.  */
+void
+get_min_float (const struct real_format *fmt, char *buf, size_t len)
+{
+  sprintf (buf, "0x1p%d", fmt->emin - 1);
+  gcc_assert (strlen (buf) < len);
+}
+
 /* True if mode M has a NaN representation and
    the treatment of NaN operands is important.  */
 
diff --git a/gcc/testsuite/gcc.dg/c99-builtins.c b/gcc/testsuite/gcc.dg/c99-builtins.c
new file mode 100644
index 0000000000000000000000000000000000000000..3ca3ed43e7a69a266467ae2a9aa738ce2d15afb9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c99-builtins.c
@@ -0,0 +1,131 @@ 
+/* { dg-options "-O2" } */
+/* { dg-do run } */
+
+#include <assert.h>
+#include <math.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+int
+main(void)
+{
+
+	/* Test FP Classify as a whole.  */
+
+	assert(fpclassify((float)0) == FP_ZERO);
+	assert(fpclassify((float)-0.0) == FP_ZERO);
+	printf("PASS fpclassify<FP_ZERO>(float)\n");
+
+	assert(fpclassify((double)0) == FP_ZERO);
+	assert(fpclassify((double)-0) == FP_ZERO);
+	printf("PASS fpclassify<FP_ZERO>(double)\n");
+
+	assert(fpclassify((long double)0) == FP_ZERO);
+	assert(fpclassify((long double)-0.0) == FP_ZERO);
+
+	printf("PASS fpclassify<FP_ZERO>(long double)\n");
+
+	assert(fpclassify((float)1) == FP_NORMAL);
+	assert(fpclassify((float)1000) == FP_NORMAL);
+	printf("PASS fpclassify<FP_NORMAL>(float)\n");
+
+	assert(fpclassify((double)1) == FP_NORMAL);
+	assert(fpclassify((double)1000) == FP_NORMAL);
+	printf("PASS fpclassify<FP_NORMAL>(double)\n");
+
+	assert(fpclassify((long double)1) == FP_NORMAL);
+	assert(fpclassify((long double)1000) == FP_NORMAL);
+	printf("PASS fpclassify<FP_NORMAL>(long double)\n");
+
+	assert(fpclassify(0x1.2p-150f) == FP_SUBNORMAL);
+	printf("PASS fpclassify<FP_SUBNORMAL>(float)\n");
+
+	assert(fpclassify(0x1.2p-1075) == FP_SUBNORMAL);
+	printf("PASS fpclassify<FP_SUBNORMAL>(double)\n");
+
+	assert(fpclassify(0x1.2p-16383L) == FP_SUBNORMAL);
+	printf("PASS fpclassify<FP_SUBNORMAL>(long double)\n");
+
+	assert(fpclassify(HUGE_VALF) == FP_INFINITE);
+	assert(fpclassify((float)HUGE_VAL) == FP_INFINITE);
+	assert(fpclassify((float)HUGE_VALL) == FP_INFINITE);
+	printf("PASS fpclassify<FP_INFINITE>(float)\n");
+
+	assert(fpclassify(HUGE_VAL) == FP_INFINITE);
+	assert(fpclassify((double)HUGE_VALF) == FP_INFINITE);
+	assert(fpclassify((double)HUGE_VALL) == FP_INFINITE);
+	printf("PASS fpclassify<FP_INFINITE>(double)\n");
+
+	assert(fpclassify(HUGE_VALL) == FP_INFINITE);
+	assert(fpclassify((long double)HUGE_VALF) == FP_INFINITE);
+	assert(fpclassify((long double)HUGE_VAL) == FP_INFINITE);
+	printf("PASS fpclassify<FP_INFINITE>(long double)\n");
+
+	assert(fpclassify(NAN) == FP_NAN);
+	printf("PASS fpclassify<FP_NAN>(float)\n");
+
+	assert(fpclassify((double)NAN) == FP_NAN);
+	printf("PASS fpclassify<FP_NAN>(double)\n");
+
+	assert(fpclassify((long double)NAN) == FP_NAN);
+	printf("PASS fpclassify<FP_NAN>(long double)\n");
+
+	/* Test if individual builtins work.  */
+
+	assert(__builtin_iszero((float)0));
+	assert(__builtin_iszero((float)-0.0));
+	printf("PASS __builtin_iszero(float)\n");
+
+	assert(__builtin_iszero((double)0));
+	assert(__builtin_iszero((double)-0));
+	printf("PASS __builtin_iszero(double)\n");
+
+	assert(__builtin_iszero((long double)0));
+	assert(__builtin_iszero((long double)-0.0));
+	printf("PASS __builtin_iszero(long double)\n");
+
+	assert(__builtin_isnormal((float)1));
+	assert(__builtin_isnormal((float)1000));
+	printf("PASS __builtin_isnormal(float)\n");
+
+	assert(__builtin_isnormal((double)1));
+	assert(__builtin_isnormal((double)1000));
+	printf("PASS __builtin_isnormal(double)\n");
+
+	assert(__builtin_isnormal((long double)1));
+	assert(__builtin_isnormal((long double)1000));
+	printf("PASS __builtin_isnormal(long double)\n");
+
+	assert(__builtin_issubnormal(0x1.2p-150f));
+	printf("PASS __builtin_issubnormal(float)\n");
+
+	assert(__builtin_issubnormal(0x1.2p-1075));
+	printf("PASS __builtin_issubnormal(double)\n");
+
+	assert(__builtin_issubnormal(0x1.2p-16383L));
+	printf("PASS __builtin_issubnormal(long double)\n");
+
+	assert(__builtin_isinf(HUGE_VALF));
+	assert(__builtin_isinf((float)HUGE_VAL));
+	assert(__builtin_isinf((float)HUGE_VALL));
+	printf("PASS __builtin_isinf(float)\n");
+
+	assert(__builtin_isinf(HUGE_VAL));
+	assert(__builtin_isinf((double)HUGE_VALF));
+	assert(__builtin_isinf((double)HUGE_VALL));
+	printf("PASS __builtin_isinf(double)\n");
+	
+	assert(__builtin_isinf(HUGE_VALL));
+	assert(__builtin_isinf((long double)HUGE_VALF));
+	assert(__builtin_isinf((long double)HUGE_VAL));
+	printf("PASS __builtin_isinf(long double)\n");
+
+	assert(__builtin_isnan(NAN));
+	printf("PASS __builtin_isnan(float)\n");
+	assert(__builtin_isnan((double)NAN));
+	printf("PASS __builtin_isnan(double)\n");
+	assert(__builtin_isnan((long double)NAN));
+	printf("PASS __builtin_isnan(long double)\n");
+
+	exit(0);
+}
diff --git a/gcc/testsuite/gcc.dg/fold-notunord.c b/gcc/testsuite/gcc.dg/fold-notunord.c
deleted file mode 100644
index ca345154ac204cb5f380855828421b7f88d49052..0000000000000000000000000000000000000000
--- a/gcc/testsuite/gcc.dg/fold-notunord.c
+++ /dev/null
@@ -1,9 +0,0 @@ 
-/* { dg-do compile } */
-/* { dg-options "-O -ftrapping-math -fdump-tree-optimized" } */
-
-int f (double d)
-{
-  return !__builtin_isnan (d);
-}
-
-/* { dg-final { scan-tree-dump " ord " "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/builtin-fpclassify.c b/gcc/testsuite/gcc.target/aarch64/builtin-fpclassify.c
new file mode 100644
index 0000000000000000000000000000000000000000..84a73a6483780dac2347e72fa7d139545d2087eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/builtin-fpclassify.c
@@ -0,0 +1,22 @@ 
+/* This file checks the code generation for the new __builtin_fpclassify.
+   Because checking the exact assembly isn't very useful, we just check
+   for the presence of certain instructions and the omission of others. */
+/* { dg-options "-O2" } */
+/* { dg-do compile } */
+/* { dg-final { scan-assembler-not "\[ \t\]?fabs\[ \t\]?" } } */
+/* { dg-final { scan-assembler-not "\[ \t\]?fcmp\[ \t\]?" } } */
+/* { dg-final { scan-assembler-not "\[ \t\]?fcmpe\[ \t\]?" } } */
+/* { dg-final { scan-assembler "\[ \t\]?sbfx\[ \t\]?" } } */
+
+#include <stdio.h>
+#include <math.h>
+
+/*
+ fp_nan = args[0];
+ fp_infinite = args[1];
+ fp_normal = args[2];
+ fp_subnormal = args[3];
+ fp_zero = args[4];
+*/
+
+int f(double x) { return __builtin_fpclassify(0, 1, 4, 3, 2, x); }