diff mbox

[0/7] Type promotion pass and elimination of zext/sext

Message ID 5628BF86.6070503@linaro.org
State Superseded
Headers show

Commit Message

Kugan Vivekanandarajah Oct. 22, 2015, 10:50 a.m. UTC
On 21/10/15 23:45, Richard Biener wrote:
> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>> On 07/09/15 12:53, Kugan wrote:
>>>
>>> This a new version of the patch posted in
>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>> more testing and spitted the patch to make it more easier to review.
>>> There are still couple of issues to be addressed and I am working on them.
>>>
>>> 1. AARCH64 bootstrap now fails with the commit
>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>> time being, I am using  patch
>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>> workaround. This meeds to be fixed before the patches are ready to be
>>> committed.
>>>
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>
>> Hi Richard,
>>
>> Now that stage 1 is going to close, I would like to get these patches
>> accepted for stage1. I will try my best to address your review comments
>> ASAP.
> 
> Ok, can you make the whole patch series available so I can poke at the
> implementation a bit?  Please state the revision it was rebased on
> (or point me to a git/svn branch the work resides on).
> 

Thanks. Please find the patched rebated against trunk@229156. I have
skipped the test-case readjustment patches.


Thanks,
Kugan

Comments

Richard Biener Oct. 22, 2015, 2:23 p.m. UTC | #1
On Thu, Oct 22, 2015 at 12:50 PM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 21/10/15 23:45, Richard Biener wrote:
>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>
>>> On 07/09/15 12:53, Kugan wrote:
>>>>
>>>> This a new version of the patch posted in
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>> more testing and spitted the patch to make it more easier to review.
>>>> There are still couple of issues to be addressed and I am working on them.
>>>>
>>>> 1. AARCH64 bootstrap now fails with the commit
>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>> time being, I am using  patch
>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>> committed.
>>>>
>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>>
>>> Hi Richard,
>>>
>>> Now that stage 1 is going to close, I would like to get these patches
>>> accepted for stage1. I will try my best to address your review comments
>>> ASAP.
>>
>> Ok, can you make the whole patch series available so I can poke at the
>> implementation a bit?  Please state the revision it was rebased on
>> (or point me to a git/svn branch the work resides on).
>>
>
> Thanks. Please find the patched rebated against trunk@229156. I have
> skipped the test-case readjustment patches.

Some quick observations.  On x86_64 when building

short bar (short y);
int foo (short x)
{
  short y = bar (x) + 15;
  return y;
}

with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
I get

  <bb 2>:
  _1 = (int) x_10(D);
  _2 = (_1) sext (16);
  _11 = bar (_2);
  _5 = (int) _11;
  _12 = (unsigned int) _5;
  _6 = _12 & 65535;
  _7 = _6 + 15;
  _13 = (int) _7;
  _8 = (_13) sext (16);
  _9 = (_8) sext (16);
  return _9;

which looks fine but the VRP optimization doesn't trigger for the redundant sext
(ranges are computed correctly but the 2nd extension is not removed).

This also makes me notice trivial match.pd patterns are missing, like
for example

(simplify
 (sext (sext@2 @0 @1) @3)
 (if (tree_int_cst_compare (@1, @3) <= 0)
  @2
  (sext @0 @3)))

as VRP doesn't run at -O1 we must rely on those to remove rendudant extensions,
otherwise generated code might get worse compared to without the pass(?)

I also notice that the 'short' argument does not get it's sign-extension removed
as redundand either even though we have

_1 = (int) x_8(D);
Found new range for _1: [-32768, 32767]

In the end I suspect that keeping track of the "simple" cases in the promotion
pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do
its work).  In some way whether the ABI guarantees promoted argument
registers might need some other target hook queries.

Now onto the 0002 patch.

+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)  == 8
+         || TYPE_PRECISION (type) == 16
+         || TYPE_PRECISION (type) == 32);
+}

that's a weird function to me.  You probably want
TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?

+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;

I think what you want to verify is that TYPE_PRECISION (promoted_type)
== GET_MODE_PRECISION (mode).
And to not even bother with this simply use

promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns);

You use a domwalk but also might create new basic-blocks during it
(insert_on_edge_immediate), that's a
no-no, commit edge inserts after the domwalk.
ssa_sets_higher_bits_bitmap looks unused and
we generally don't free dominance info, so please don't do that.

I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc with

/abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
-B/usr/local/powerpc64-unknown-linux-gnu/bin/
-B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
/usr/local/powerpc64-unknown-linux-gnu/include -isystem
/usr/local/powerpc64-unknown-linux-gnu/sys-include    -g -O2 -O2  -g
-O2 -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
-Wno-format -Wstrict-prototypes -Wmissing-prototypes
-Wold-style-definition  -isystem ./include   -fPIC -mlong-double-128
-mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
-fno-stack-protector   -fPIC -mlong-double-128 -mno-minimal-toc -I.
-I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
-I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
-I../../../trunk/libgcc/../libdecnumber/dpd
-I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS  -o _divdi3.o
-MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
../../../trunk/libgcc/libgcc2.c \
          -fexceptions -fnon-call-exceptions -fvisibility=hidden -DHIDE_EXPORTS
In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
expand_debug_locations, at cfgexpand.c:5277

as hinted at above a bootstrap on i?86 (yes, 32bit) with
--with-tune=pentiumpro might be another good testing candidate.

+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+       promote_def_and_uses (def);

it looks like you are doing some redundant work by walking both defs
and uses of each stmt.  I'd say you should separate
def and use processing and use

  FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
    promote_use (use);
  FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
    promote_def (def);

this should make processing more efficient (memory local) compared to
doing the split handling
in promote_def_and_uses.

I think it will be convenient to have a SSA name info structure where
you can remember the original
type a name was promoted from as well as whether it was promoted or
not.  This way adjusting
debug uses should be "trivial":

+static unsigned int
+fixup_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple *stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      switch (gimple_code (stmt))
+       {
+       case GIMPLE_DEBUG:
+           {
+             gsi = gsi_for_stmt (stmt);
+             gsi_remove (&gsi, true);

rather than doing the above you'd do sth like

  SET_USE (use, fold_convert (old_type, new_def));
  update_stmt (stmt);

note that while you may not be able to use promoted regs at all uses
(like calls or asms) you can promote all defs, if only with a compensation
statement after the original def.  The SSA name info struct can be used
to note down the actual SSA name holding the promoted def.

The pass looks a lot better than last time (it's way smaller!) but
still needs some
improvements.  There are some more fishy details with respect to how you
allocate/change SSA names but I think those can be dealt with once the
basic structure looks how I like it to be.

Thanks,
Richard.



>
> Thanks,
> Kugan
Kugan Vivekanandarajah Oct. 27, 2015, 12:50 a.m. UTC | #2
On 23/10/15 01:23, Richard Biener wrote:
> On Thu, Oct 22, 2015 at 12:50 PM, Kugan

> <kugan.vivekanandarajah@linaro.org> wrote:

>>

>>

>> On 21/10/15 23:45, Richard Biener wrote:

>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan

>>> <kugan.vivekanandarajah@linaro.org> wrote:

>>>>

>>>>

>>>> On 07/09/15 12:53, Kugan wrote:

>>>>>

>>>>> This a new version of the patch posted in

>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done

>>>>> more testing and spitted the patch to make it more easier to review.

>>>>> There are still couple of issues to be addressed and I am working on them.

>>>>>

>>>>> 1. AARCH64 bootstrap now fails with the commit

>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled

>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent

>>>>> issue which gets exposed my patch. I can also reproduce this in x86_64

>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the

>>>>> time being, I am using  patch

>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a

>>>>> workaround. This meeds to be fixed before the patches are ready to be

>>>>> committed.

>>>>>

>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with

>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works

>>>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.

>>>>

>>>> Hi Richard,

>>>>

>>>> Now that stage 1 is going to close, I would like to get these patches

>>>> accepted for stage1. I will try my best to address your review comments

>>>> ASAP.

>>>

>>> Ok, can you make the whole patch series available so I can poke at the

>>> implementation a bit?  Please state the revision it was rebased on

>>> (or point me to a git/svn branch the work resides on).

>>>

>>

>> Thanks. Please find the patched rebated against trunk@229156. I have

>> skipped the test-case readjustment patches.

>

> Some quick observations.  On x86_64 when building


Hi Richard,

Thanks for the review.
>

> short bar (short y);

> int foo (short x)

> {

>    short y = bar (x) + 15;

>    return y;

> }

>

> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)

> I get

>

>    <bb 2>:

>    _1 = (int) x_10(D);

>    _2 = (_1) sext (16);

>    _11 = bar (_2);

>    _5 = (int) _11;

>    _12 = (unsigned int) _5;

>    _6 = _12 & 65535;

>    _7 = _6 + 15;

>    _13 = (int) _7;

>    _8 = (_13) sext (16);

>    _9 = (_8) sext (16);

>    return _9;

>

> which looks fine but the VRP optimization doesn't trigger for the redundant sext

> (ranges are computed correctly but the 2nd extension is not removed).

>

> This also makes me notice trivial match.pd patterns are missing, like

> for example

>

> (simplify

>   (sext (sext@2 @0 @1) @3)

>   (if (tree_int_cst_compare (@1, @3) <= 0)

>    @2

>    (sext @0 @3)))

>

> as VRP doesn't run at -O1 we must rely on those to remove rendudant extensions,

> otherwise generated code might get worse compared to without the pass(?)


Do you think that we should enable this pass only when vrp is enabled. 
Otherwise, even when we do the simple optimizations you mentioned below, 
we might not be able to remove all the redundancies.

>

> I also notice that the 'short' argument does not get it's sign-extension removed

> as redundand either even though we have

>

> _1 = (int) x_8(D);

> Found new range for _1: [-32768, 32767]

>


I am looking into it.

> In the end I suspect that keeping track of the "simple" cases in the promotion

> pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do

> its work).  In some way whether the ABI guarantees promoted argument

> registers might need some other target hook queries.

>

> Now onto the 0002 patch.

>

> +static bool

> +type_precision_ok (tree type)

> +{

> +  return (TYPE_PRECISION (type)  == 8

> +         || TYPE_PRECISION (type) == 16

> +         || TYPE_PRECISION (type) == 32);

> +}

>

> that's a weird function to me.  You probably want

> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))

> here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?

>


I will change this. (I have a patch which I am testing with other 
changes you have asked for)

> +/* Return the promoted type for TYPE.  */

> +static tree

> +get_promoted_type (tree type)

> +{

> +  tree promoted_type;

> +  enum machine_mode mode;

> +  int uns;

> +  if (POINTER_TYPE_P (type)

> +      || !INTEGRAL_TYPE_P (type)

> +      || !type_precision_ok (type))

> +    return type;

> +

> +  mode = TYPE_MODE (type);

> +#ifdef PROMOTE_MODE

> +  uns = TYPE_SIGN (type);

> +  PROMOTE_MODE (mode, uns, type);

> +#endif

> +  uns = TYPE_SIGN (type);

> +  promoted_type = lang_hooks.types.type_for_mode (mode, uns);

> +  if (promoted_type

> +      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))

> +    type = promoted_type;

>

> I think what you want to verify is that TYPE_PRECISION (promoted_type)

> == GET_MODE_PRECISION (mode).

> And to not even bother with this simply use

>

> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns);

>


I am changing this too.

> You use a domwalk but also might create new basic-blocks during it

> (insert_on_edge_immediate), that's a

> no-no, commit edge inserts after the domwalk.


I am sorry, I dont understand "commit edge inserts after the domwalk" Is 
there a way to do this in the current implementation?

> ssa_sets_higher_bits_bitmap looks unused and

> we generally don't free dominance info, so please don't do that.

>

> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc with

>

> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/

> -B/usr/local/powerpc64-unknown-linux-gnu/bin/

> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem

> /usr/local/powerpc64-unknown-linux-gnu/include -isystem

> /usr/local/powerpc64-unknown-linux-gnu/sys-include    -g -O2 -O2  -g

> -O2 -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual

> -Wno-format -Wstrict-prototypes -Wmissing-prototypes

> -Wold-style-definition  -isystem ./include   -fPIC -mlong-double-128

> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc

> -fno-stack-protector   -fPIC -mlong-double-128 -mno-minimal-toc -I.

> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.

> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include

> -I../../../trunk/libgcc/../libdecnumber/dpd

> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS  -o _divdi3.o

> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c

> ../../../trunk/libgcc/libgcc2.c \

>            -fexceptions -fnon-call-exceptions -fvisibility=hidden -DHIDE_EXPORTS

> In file included from ../../../trunk/libgcc/libgcc2.c:56:0:

> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:

> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in

> expand_debug_locations, at cfgexpand.c:5277

>


I am testing on gcc computefarm. I will get it to bootstrap and will do 
the regression testing before posting the next version.

> as hinted at above a bootstrap on i?86 (yes, 32bit) with

> --with-tune=pentiumpro might be another good testing candidate.

>

> +      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)

> +       promote_def_and_uses (def);

>

> it looks like you are doing some redundant work by walking both defs

> and uses of each stmt.  I'd say you should separate

> def and use processing and use

>

>    FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)

>      promote_use (use);

>    FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)

>      promote_def (def);

>


Name promote_def_and_uses in my implementation is a bit confusing. It is 
  promoting the SSA_NAMEs. We only have to do that for the definitions 
if we can do the SSA_NAMEs defined by parameters.

I also have a bitmap to see if we have promoted a variable and avoid 
doing it again. I will try to improve this.


> this should make processing more efficient (memory local) compared to

> doing the split handling

> in promote_def_and_uses.

>

> I think it will be convenient to have a SSA name info structure where

> you can remember the original

> type a name was promoted from as well as whether it was promoted or

> not.  This way adjusting

> debug uses should be "trivial":

>

> +static unsigned int

> +fixup_uses (tree use, tree promoted_type, tree old_type)

> +{

> +  gimple *stmt;

> +  imm_use_iterator ui;

> +  gimple_stmt_iterator gsi;

> +  use_operand_p op;

> +

> +  FOR_EACH_IMM_USE_STMT (stmt, ui, use)

> +    {

> +      bool do_not_promote = false;

> +      switch (gimple_code (stmt))

> +       {

> +       case GIMPLE_DEBUG:

> +           {

> +             gsi = gsi_for_stmt (stmt);

> +             gsi_remove (&gsi, true);

>

> rather than doing the above you'd do sth like

>

>    SET_USE (use, fold_convert (old_type, new_def));

>    update_stmt (stmt);

>


We do have these information (original type a name was promoted from as 
well as whether it was promoted or not). To make it easy to review, in 
the patch that adds the pass,I am removing these debug stmts. But in 
patch 4, I am trying to handle this properly. Maybe  I should combine them.

> note that while you may not be able to use promoted regs at all uses

> (like calls or asms) you can promote all defs, if only with a compensation

> statement after the original def.  The SSA name info struct can be used

> to note down the actual SSA name holding the promoted def.

>

> The pass looks a lot better than last time (it's way smaller!) but

> still needs some

> improvements.  There are some more fishy details with respect to how you

> allocate/change SSA names but I think those can be dealt with once the

> basic structure looks how I like it to be.

>


I will post an updated patch in a day or two.

Thanks again,
Kugan
diff mbox

Patch

From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:51:42 +1100
Subject: [PATCH 1/4] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 12 ++++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 65 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index eaad859..aeb64bb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5054,6 +5054,18 @@  expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
+
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index da68870..c2f535f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9318,6 +9318,26 @@  expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 602ea24..a149bad 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -987,6 +987,10 @@  int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 8e3e810..d18b3f7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,18 @@  verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b8269ef..e61c200 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3893,6 +3893,7 @@  estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 11f90051..bec9082 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1923,6 +1923,14 @@  dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3561,6 +3569,9 @@  op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@  DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/*  Sign-extend operation.  It will sign extend first operand from
+ the sign bit specified by the second operand.  The type of the
+ result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1