
Make ivopts handle calls to internal functions

Message ID 87h8ttymjo.fsf@linaro.org
State New

Commit Message

Richard Sandiford Nov. 17, 2017, 3:03 p.m. UTC
ivopts previously treated pointer arguments to internal functions
like IFN_MASK_LOAD and IFN_MASK_STORE as normal gimple values.
This patch makes it treat them as addresses instead.  This makes
a significant difference to the code quality for SVE loops,
since we can then use loads and stores with scaled indices.

The patch also adds support for ADDR_EXPRs of TARGET_MEM_REFs,
which are the natural way of representing the result of the
ivopts transformation.
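As a quick illustration (not part of the patch; the function name and values are made up for the example), loops like the one below are the kind that benefit: when vectorized for SVE, the loads and stores become IFN_MASK_LOAD/IFN_MASK_STORE internal calls, and their pointer arguments are what ivopts can now rewrite into scaled-index addresses.

```c
#include <assert.h>

/* With something like -O2 -ftree-vectorize -march=armv8-a+sve, a loop
   of this shape is typically vectorized using IFN_MASK_LOAD and
   IFN_MASK_STORE internal calls.  Their pointer arguments can now be
   treated as address uses by ivopts, allowing addressing modes such as
   [x0, x2, lsl 3].  The function itself is ordinary C and behaves the
   same on any target; only the generated code differs.  */
void
add_one (double *restrict out, const double *restrict in, int n)
{
  for (int i = 0; i < n; ++i)
    out[i] = in[i] + 1.0;
}
```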

Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
and powerpc64le-linux-gnu.  OK to install?

Richard


2017-11-17  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* expr.c (expand_expr_addr_expr_1): Handle ADDR_EXPRs of
	TARGET_MEM_REFs.
	* gimple-expr.h (is_gimple_addressable): Likewise.
	* gimple-expr.c (is_gimple_address): Likewise.
	* internal-fn.c (expand_call_mem_ref): New function.
	(expand_mask_load_optab_fn): Use it.
	(expand_mask_store_optab_fn): Likewise.
	* tree-ssa-loop-ivopts.c (USE_ADDRESS): Split into...
	(USE_REF_ADDRESS, USE_PTR_ADDRESS): ...these new use types.
	(dump_groups): Update accordingly.
	(iv_use::mem_type): New member variable.
	(address_p): New function.
	(record_use): Add a mem_type argument and initialize the new
	mem_type field.
	(record_group_use): Add a mem_type argument.  Use address_p.
	Update call to record_use.
	(find_interesting_uses_op): Update call to record_group_use.
	(find_interesting_uses_cond): Likewise.
	(find_interesting_uses_address): Likewise.
	(get_mem_type_for_internal_fn): New function.
	(find_address_like_use): Likewise.
	(find_interesting_uses_stmt): Try find_address_like_use before
	calling find_interesting_uses_op.
	(addr_offset_valid_p): Use the iv mem_type field as the type
	of the addressed memory.
	(add_autoinc_candidates): Likewise.
	(get_address_cost): Likewise.
	(split_small_address_groups_p): Use address_p.
	(split_address_groups): Likewise.
	(add_iv_candidate_for_use): Likewise.
	(autoinc_possible_for_pair): Likewise.
	(rewrite_groups): Likewise.
	(get_use_type): Check for USE_REF_ADDRESS instead of USE_ADDRESS.
	(determine_group_iv_cost): Update after split of USE_ADDRESS.
	(get_alias_ptr_type_for_ptr_address): New function.
	(rewrite_use_address): Rewrite address uses in calls that were
	identified by find_address_like_use.

gcc/testsuite/
	* gcc.dg/tree-ssa/scev-9.c: Expect REFERENCE ADDRESS
	instead of just ADDRESS.
	* gcc.dg/tree-ssa/scev-10.c: Likewise.
	* gcc.dg/tree-ssa/scev-11.c: Likewise.
	* gcc.dg/tree-ssa/scev-12.c: Likewise.
	* gcc.target/aarch64/sve_index_offset_1.c: New test.
	* gcc.target/aarch64/sve_index_offset_1_run.c: Likewise.
	* gcc.target/aarch64/sve_loop_add_2.c: Likewise.
	* gcc.target/aarch64/sve_loop_add_3.c: Likewise.
	* gcc.target/aarch64/sve_while_1.c: Check for indexed addressing modes.
	* gcc.target/aarch64/sve_while_2.c: Likewise.
	* gcc.target/aarch64/sve_while_3.c: Likewise.
	* gcc.target/aarch64/sve_while_4.c: Likewise.

Comments

Bin.Cheng Nov. 20, 2017, 11:31 a.m. UTC | #1
On Fri, Nov 17, 2017 at 3:03 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> ivopts previously treated pointer arguments to internal functions
> like IFN_MASK_LOAD and IFN_MASK_STORE as normal gimple values.
> This patch makes it treat them as addresses instead.  This makes
> a significant difference to the code quality for SVE loops,
> since we can then use loads and stores with scaled indices.

Thanks for working on this.  This can be extended to other internal
functions that are eventually expanded into memory references.  I
believe (at least) both x86 and AArch64 have such a requirement.

>
> The patch also adds support for ADDR_EXPRs of TARGET_MEM_REFs,
> which are the natural way of representing the result of the
> ivopts transformation.
>
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.  OK to install?
>
> Richard
>
>
> 2017-11-17  Richard Sandiford  <richard.sandiford@linaro.org>
>             Alan Hayward  <alan.hayward@arm.com>
>             David Sherwood  <david.sherwood@arm.com>
>
> gcc/
>         * expr.c (expand_expr_addr_expr_1): Handle ADDR_EXPRs of
>         TARGET_MEM_REFs.
>         * gimple-expr.h (is_gimple_addressable): Likewise.
>         * gimple-expr.c (is_gimple_address): Likewise.
>         * internal-fn.c (expand_call_mem_ref): New function.
>         (expand_mask_load_optab_fn): Use it.
>         (expand_mask_store_optab_fn): Likewise.
>         * tree-ssa-loop-ivopts.c (USE_ADDRESS): Split into...
>         (USE_REF_ADDRESS, USE_PTR_ADDRESS): ...these new use types.
>         (dump_groups): Update accordingly.
>         (iv_use::mem_type): New member variable.
>         (address_p): New function.
>         (record_use): Add a mem_type argument and initialize the new
>         mem_type field.
>         (record_group_use): Add a mem_type argument.  Use address_p.
>         Update call to record_use.
>         (find_interesting_uses_op): Update call to record_group_use.
>         (find_interesting_uses_cond): Likewise.
>         (find_interesting_uses_address): Likewise.
>         (get_mem_type_for_internal_fn): New function.
>         (find_address_like_use): Likewise.
>         (find_interesting_uses_stmt): Try find_address_like_use before
>         calling find_interesting_uses_op.
>         (addr_offset_valid_p): Use the iv mem_type field as the type
>         of the addressed memory.
>         (add_autoinc_candidates): Likewise.
>         (get_address_cost): Likewise.
>         (split_small_address_groups_p): Use address_p.
>         (split_address_groups): Likewise.
>         (add_iv_candidate_for_use): Likewise.
>         (autoinc_possible_for_pair): Likewise.
>         (rewrite_groups): Likewise.
>         (get_use_type): Check for USE_REF_ADDRESS instead of USE_ADDRESS.
>         (determine_group_iv_cost): Update after split of USE_ADDRESS.
>         (get_alias_ptr_type_for_ptr_address): New function.
>         (rewrite_use_address): Rewrite address uses in calls that were
>         identified by find_address_like_use.
>
> gcc/testsuite/
>         * gcc.dg/tree-ssa/scev-9.c: Expect REFERENCE ADDRESS
>         instead of just ADDRESS.
>         * gcc.dg/tree-ssa/scev-10.c: Likewise.
>         * gcc.dg/tree-ssa/scev-11.c: Likewise.
>         * gcc.dg/tree-ssa/scev-12.c: Likewise.
>         * gcc.target/aarch64/sve_index_offset_1.c: New test.
>         * gcc.target/aarch64/sve_index_offset_1_run.c: Likewise.
>         * gcc.target/aarch64/sve_loop_add_2.c: Likewise.
>         * gcc.target/aarch64/sve_loop_add_3.c: Likewise.
>         * gcc.target/aarch64/sve_while_1.c: Check for indexed addressing modes.
>         * gcc.target/aarch64/sve_while_2.c: Likewise.
>         * gcc.target/aarch64/sve_while_3.c: Likewise.
>         * gcc.target/aarch64/sve_while_4.c: Likewise.
>

> Index: gcc/expr.c
> ===================================================================
> --- gcc/expr.c  2017-11-17 09:49:36.191354637 +0000
> +++ gcc/expr.c  2017-11-17 15:02:12.868132458 +0000
> @@ -7814,6 +7814,9 @@ expand_expr_addr_expr_1 (tree exp, rtx t
>         return expand_expr (tem, target, tmode, modifier);
>        }
>
> +    case TARGET_MEM_REF:
> +      return addr_for_mem_ref (exp, as, true);
> +
>      case CONST_DECL:
>        /* Expand the initializer like constants above.  */
>        result = XEXP (expand_expr_constant (DECL_INITIAL (exp),
> Index: gcc/gimple-expr.h
> ===================================================================
> --- gcc/gimple-expr.h   2017-11-17 09:40:43.520567009 +0000
> +++ gcc/gimple-expr.h   2017-11-17 15:02:12.868132458 +0000
> @@ -119,6 +119,7 @@ virtual_operand_p (tree op)
>  is_gimple_addressable (tree t)
>  {
>    return (is_gimple_id (t) || handled_component_p (t)
> +         || TREE_CODE (t) == TARGET_MEM_REF
>           || TREE_CODE (t) == MEM_REF);
>  }
>
> Index: gcc/gimple-expr.c
> ===================================================================
> --- gcc/gimple-expr.c   2017-10-13 10:23:39.845432950 +0100
> +++ gcc/gimple-expr.c   2017-11-17 15:02:12.868132458 +0000
> @@ -631,7 +631,9 @@ is_gimple_address (const_tree t)
>        op = TREE_OPERAND (op, 0);
>      }
>
> -  if (CONSTANT_CLASS_P (op) || TREE_CODE (op) == MEM_REF)
> +  if (CONSTANT_CLASS_P (op)
> +      || TREE_CODE (op) == TARGET_MEM_REF
> +      || TREE_CODE (op) == MEM_REF)
>      return true;
>
>    switch (TREE_CODE (op))
> Index: gcc/internal-fn.c
> ===================================================================
> --- gcc/internal-fn.c   2017-11-17 14:57:36.436527536 +0000
> +++ gcc/internal-fn.c   2017-11-17 15:02:12.869042409 +0000
> @@ -2367,15 +2367,47 @@ expand_LOOP_DIST_ALIAS (internal_fn, gca
>    gcc_unreachable ();
>  }
>
> +/* Return a memory reference of type TYPE for argument INDEX of STMT.
> +   Use argument INDEX + 1 to derive the second (TBAA) operand.  */
> +
> +static tree
> +expand_call_mem_ref (tree type, gcall *stmt, int index)
> +{
> +  tree addr = gimple_call_arg (stmt, index);
> +  tree alias_ptr_type = TREE_TYPE (gimple_call_arg (stmt, index + 1));
> +  unsigned int align = tree_to_shwi (gimple_call_arg (stmt, index + 1));
> +  if (TYPE_ALIGN (type) != align)
> +    type = build_aligned_type (type, align);
> +
> +  tree tmp = addr;
> +  if (TREE_CODE (tmp) == SSA_NAME)
> +    {
> +      gimple *def = SSA_NAME_DEF_STMT (tmp);
> +      if (gimple_assign_single_p (def))
> +       tmp = gimple_assign_rhs1 (def);
> +    }
> +
> +  if (TREE_CODE (tmp) == ADDR_EXPR)
> +    {
> +      tree mem = TREE_OPERAND (tmp, 0);
> +      if (TREE_CODE (mem) == TARGET_MEM_REF
> +         && types_compatible_p (TREE_TYPE (mem), type)
> +         && alias_ptr_type == TREE_TYPE (TMR_OFFSET (mem))
> +         && integer_zerop (TMR_OFFSET (mem)))
> +       return mem;
> +    }
> +
> +  return fold_build2 (MEM_REF, type, addr, build_int_cst (alias_ptr_type, 0));
> +}
> +
>  /* Expand MASK_LOAD{,_LANES} call STMT using optab OPTAB.  */
>
>  static void
>  expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
>  {
>    struct expand_operand ops[3];
> -  tree type, lhs, rhs, maskt, ptr;
> +  tree type, lhs, rhs, maskt;
>    rtx mem, target, mask;
> -  unsigned align;
>    insn_code icode;
>
>    maskt = gimple_call_arg (stmt, 2);
> @@ -2383,11 +2415,7 @@ expand_mask_load_optab_fn (internal_fn,
>    if (lhs == NULL_TREE)
>      return;
>    type = TREE_TYPE (lhs);
> -  ptr = build_int_cst (TREE_TYPE (gimple_call_arg (stmt, 1)), 0);
> -  align = tree_to_shwi (gimple_call_arg (stmt, 1));
> -  if (TYPE_ALIGN (type) != align)
> -    type = build_aligned_type (type, align);
> -  rhs = fold_build2 (MEM_REF, type, gimple_call_arg (stmt, 0), ptr);
> +  rhs = expand_call_mem_ref (type, stmt, 0);
>
>    if (optab == vec_mask_load_lanes_optab)
>      icode = get_multi_vector_move (type, optab);
> @@ -2413,19 +2441,14 @@ #define expand_mask_load_lanes_optab_fn
>  expand_mask_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
>  {
>    struct expand_operand ops[3];
> -  tree type, lhs, rhs, maskt, ptr;
> +  tree type, lhs, rhs, maskt;
>    rtx mem, reg, mask;
> -  unsigned align;
>    insn_code icode;
>
>    maskt = gimple_call_arg (stmt, 2);
>    rhs = gimple_call_arg (stmt, 3);
>    type = TREE_TYPE (rhs);
> -  ptr = build_int_cst (TREE_TYPE (gimple_call_arg (stmt, 1)), 0);
> -  align = tree_to_shwi (gimple_call_arg (stmt, 1));
> -  if (TYPE_ALIGN (type) != align)
> -    type = build_aligned_type (type, align);
> -  lhs = fold_build2 (MEM_REF, type, gimple_call_arg (stmt, 0), ptr);
> +  lhs = expand_call_mem_ref (type, stmt, 0);
>
>    if (optab == vec_mask_store_lanes_optab)
>      icode = get_multi_vector_move (type, optab);

The support for TARGET_MEM_REF and the refactoring of expand_call_* in
the above code look independent of the IVOPTs change?  If so, could you
split it into two patches?  I can approve the IVOPTs patch with the
below comments.

> Index: gcc/tree-ssa-loop-ivopts.c
> ===================================================================
> --- gcc/tree-ssa-loop-ivopts.c  2017-11-17 09:05:59.900349210 +0000
> +++ gcc/tree-ssa-loop-ivopts.c  2017-11-17 15:02:12.870862310 +0000
> @@ -166,7 +166,11 @@ struct version_info
>  enum use_type
>  {
>    USE_NONLINEAR_EXPR,  /* Use in a nonlinear expression.  */
> -  USE_ADDRESS,         /* Use in an address.  */
> +  USE_REF_ADDRESS,     /* Use is an address for an explicit memory
> +                          reference.  */
> +  USE_PTR_ADDRESS,     /* Use is a pointer argument to a function in
> +                          cases where the expansion of the function
> +                          will turn the argument into a normal address.  */
>    USE_COMPARE          /* Use is a compare.  */
>  };
>
> @@ -362,6 +366,9 @@ struct iv_use
>    unsigned id;         /* The id of the use.  */
>    unsigned group_id;   /* The group id the use belongs to.  */
>    enum use_type type;  /* Type of the use.  */
> +  tree mem_type;       /* The memory type to use when testing whether an
> +                          address is legitimate, and what the address's
> +                          cost is.  */
>    struct iv *iv;       /* The induction variable it is based on.  */
>    gimple *stmt;                /* Statement in that it occurs.  */
>    tree *op_p;          /* The place where it occurs.  */
> @@ -506,6 +513,14 @@ struct iv_inv_expr_hasher : free_ptr_has
>    static inline bool equal (const iv_inv_expr_ent *, const iv_inv_expr_ent *);
>  };
>
> +/* Return true if uses of type TYPE represent some form of address.  */
> +
> +inline bool
> +address_p (use_type type)
> +{
> +  return type == USE_REF_ADDRESS || type == USE_PTR_ADDRESS;
> +}
> +
>  /* Hash function for loop invariant expressions.  */
>
>  inline hashval_t
> @@ -768,8 +783,10 @@ dump_groups (FILE *file, struct ivopts_d
>        fprintf (file, "Group %d:\n", group->id);
>        if (group->type == USE_NONLINEAR_EXPR)
>         fprintf (file, "  Type:\tGENERIC\n");
> -      else if (group->type == USE_ADDRESS)
> -       fprintf (file, "  Type:\tADDRESS\n");
> +      else if (group->type == USE_REF_ADDRESS)
> +       fprintf (file, "  Type:\tREFERENCE ADDRESS\n");
> +      else if (group->type == USE_PTR_ADDRESS)
> +       fprintf (file, "  Type:\tPOINTER ARGUMENT ADDRESS\n");
>        else
>         {
>           gcc_assert (group->type == USE_COMPARE);
> @@ -1502,19 +1519,21 @@ find_induction_variables (struct ivopts_
>
>  /* Records a use of TYPE at *USE_P in STMT whose value is IV in GROUP.
>     For address type use, ADDR_BASE is the stripped IV base, ADDR_OFFSET
> -   is the const offset stripped from IV base; for other types use, both
> -   are zero by default.  */
> +   is the const offset stripped from IV base and MEM_TYPE is the type
> +   of the memory being addressed.  For uses of other types, ADDR_BASE
> +   and ADDR_OFFSET are zero by default and MEM_TYPE is NULL_TREE.  */
>
>  static struct iv_use *
>  record_use (struct iv_group *group, tree *use_p, struct iv *iv,
> -           gimple *stmt, enum use_type type, tree addr_base,
> -           poly_uint64 addr_offset)
> +           gimple *stmt, enum use_type type, tree mem_type,
> +           tree addr_base, poly_uint64 addr_offset)
>  {
>    struct iv_use *use = XCNEW (struct iv_use);
>
>    use->id = group->vuses.length ();
>    use->group_id = group->id;
>    use->type = type;
> +  use->mem_type = mem_type;
>    use->iv = iv;
>    use->stmt = stmt;
>    use->op_p = use_p;
> @@ -1569,18 +1588,21 @@ record_group (struct ivopts_data *data,
>  }
>
>  /* Record a use of TYPE at *USE_P in STMT whose value is IV in a group.
> -   New group will be created if there is no existing group for the use.  */
> +   New group will be created if there is no existing group for the use.
> +   MEM_TYPE is the type of memory being addressed, or NULL if this
> +   isn't an address reference.  */
>
>  static struct iv_use *
>  record_group_use (struct ivopts_data *data, tree *use_p,
> -                 struct iv *iv, gimple *stmt, enum use_type type)
> +                 struct iv *iv, gimple *stmt, enum use_type type,
> +                 tree mem_type)
>  {
>    tree addr_base = NULL;
>    struct iv_group *group = NULL;
>    poly_uint64 addr_offset = 0;
>
>    /* Record non address type use in a new group.  */
> -  if (type == USE_ADDRESS && iv->base_object)
> +  if (address_p (type) && iv->base_object)

I forgot to simplify this condition, given that (address_p (type) &&
!iv->base_object) is not allowed now.  You can simplify this after the
below comment on base_object if you want.
>      {
>        unsigned int i;
>
> @@ -1591,7 +1613,7 @@ record_group_use (struct ivopts_data *da
>
>           group = data->vgroups[i];
>           use = group->vuses[0];
> -         if (use->type != USE_ADDRESS || !use->iv->base_object)

And here.

> +         if (!address_p (use->type) || !use->iv->base_object)
>             continue;
>
>           /* Check if it has the same stripped base and step.  */
> @@ -1607,7 +1629,8 @@ record_group_use (struct ivopts_data *da
>    if (!group)
>      group = record_group (data, type);
>
> -  return record_use (group, use_p, iv, stmt, type, addr_base, addr_offset);
> +  return record_use (group, use_p, iv, stmt, type, mem_type,
> +                    addr_base, addr_offset);
>  }
>
>  /* Checks whether the use OP is interesting and if so, records it.  */
> @@ -1641,7 +1664,7 @@ find_interesting_uses_op (struct ivopts_
>    stmt = SSA_NAME_DEF_STMT (op);
>    gcc_assert (gimple_code (stmt) == GIMPLE_PHI || is_gimple_assign (stmt));
>
> -  use = record_group_use (data, NULL, iv, stmt, USE_NONLINEAR_EXPR);
> +  use = record_group_use (data, NULL, iv, stmt, USE_NONLINEAR_EXPR, NULL_TREE);
>    iv->nonlin_use = use;
>    return use;
>  }
> @@ -1757,10 +1780,10 @@ find_interesting_uses_cond (struct ivopt
>        return;
>      }
>
> -  record_group_use (data, var_p, var_iv, stmt, USE_COMPARE);
> +  record_group_use (data, var_p, var_iv, stmt, USE_COMPARE, NULL_TREE);
>    /* Record compare type iv_use for iv on the other side of comparison.  */
>    if (ret == COMP_IV_EXPR_2)
> -    record_group_use (data, bound_p, bound_iv, stmt, USE_COMPARE);
> +    record_group_use (data, bound_p, bound_iv, stmt, USE_COMPARE, NULL_TREE);
>  }
>
>  /* Returns the outermost loop EXPR is obviously invariant in
> @@ -2375,7 +2398,7 @@ find_interesting_uses_address (struct iv
>    if (civ->base_object == NULL_TREE)
>      goto fail;
>
> -  record_group_use (data, op_p, civ, stmt, USE_ADDRESS);
> +  record_group_use (data, op_p, civ, stmt, USE_REF_ADDRESS, TREE_TYPE (*op_p));
>    return;
>
>  fail:
> @@ -2398,6 +2421,51 @@ find_invariants_stmt (struct ivopts_data
>      }
>  }
>
> +/* CALL calls an internal function.  If operand *OP_P will become an
> +   address when the call is expanded, return the type of the memory
> +   being addressed, otherwise return null.  */
> +
> +static tree
> +get_mem_type_for_internal_fn (gcall *call, tree *op_p)
> +{
> +  switch (gimple_call_internal_fn (call))
> +    {
> +    case IFN_MASK_LOAD:
> +      if (op_p == gimple_call_arg_ptr (call, 0))
> +       return TREE_TYPE (gimple_call_lhs (call));
> +      return NULL_TREE;
> +
> +    case IFN_MASK_STORE:
> +      if (op_p == gimple_call_arg_ptr (call, 0))
> +       return TREE_TYPE (gimple_call_arg (call, 3));
> +      return NULL_TREE;
> +
> +    default:
> +      return NULL_TREE;
> +    }
> +}
> +
> +/* IV is a (non-address) iv that describes operand *OP_P of STMT.
> +   Return true if the operand will become an address when STMT
> +   is expanded and record the associated address use if so.  */
> +
> +static bool
> +find_address_like_use (struct ivopts_data *data, gimple *stmt, tree *op_p,
> +                      struct iv *iv)
> +{
> +  tree mem_type = NULL_TREE;
> +  if (gcall *call = dyn_cast <gcall *> (stmt))
> +    if (gimple_call_internal_p (call))
> +      mem_type = get_mem_type_for_internal_fn (call, op_p);
> +  if (mem_type)
> +    {
> +      iv = alloc_iv (data, iv->base, iv->step);

We now don't allow address type iv_uses without a base_object, so
checking code like the below is needed here:

  /* Fail if base object of this memory reference is unknown.  */
  if (iv->base_object == NULL_TREE)
    return false;

The IVOPTs part is OK with this change.

Thanks,
bin
> +      record_group_use (data, op_p, iv, stmt, USE_PTR_ADDRESS, mem_type);
> +      return true;
> +    }
> +  return false;
> +}
> +
>  /* Finds interesting uses of induction variables in the statement STMT.  */
>
>  static void
> @@ -2482,7 +2550,8 @@ find_interesting_uses_stmt (struct ivopt
>        if (!iv)
>         continue;
>
> -      find_interesting_uses_op (data, op);
> +      if (!find_address_like_use (data, stmt, use_p->use, iv))
> +       find_interesting_uses_op (data, op);
>      }
>  }
>
> @@ -2516,7 +2585,7 @@ addr_offset_valid_p (struct iv_use *use,
>    rtx reg, addr;
>    unsigned list_index;
>    addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (use->iv->base));
> -  machine_mode addr_mode, mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));
> +  machine_mode addr_mode, mem_mode = TYPE_MODE (use->mem_type);
>
>    list_index = (unsigned) as * MAX_MACHINE_MODE + (unsigned) mem_mode;
>    if (list_index >= vec_safe_length (addr_list))
> @@ -2573,7 +2642,7 @@ split_small_address_groups_p (struct ivo
>        if (group->vuses.length () == 1)
>         continue;
>
> -      gcc_assert (group->type == USE_ADDRESS);
> +      gcc_assert (address_p (group->type));
>        if (group->vuses.length () == 2)
>         {
>           if (compare_sizes_for_sort (group->vuses[0]->addr_offset,
> @@ -2625,7 +2694,7 @@ split_address_groups (struct ivopts_data
>        if (group->vuses.length () == 1)
>         continue;
>
> -      gcc_assert (group->type == USE_ADDRESS);
> +      gcc_assert (address_p (use->type));
>
>        for (j = 1; j < group->vuses.length ();)
>         {
> @@ -3145,7 +3214,7 @@ add_autoinc_candidates (struct ivopts_da
>
>    cstepi = int_cst_value (step);
>
> -  mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));
> +  mem_mode = TYPE_MODE (use->mem_type);
>    if (((USE_LOAD_PRE_INCREMENT (mem_mode)
>         || USE_STORE_PRE_INCREMENT (mem_mode))
>         && must_eq (GET_MODE_SIZE (mem_mode), cstepi))
> @@ -3436,7 +3505,7 @@ add_iv_candidate_for_use (struct ivopts_
>    /* At last, add auto-incremental candidates.  Make such variables
>       important since other iv uses with same base object may be based
>       on it.  */
> -  if (use != NULL && use->type == USE_ADDRESS)
> +  if (use != NULL && address_p (use->type))
>      add_autoinc_candidates (data, iv->base, iv->step, true, use);
>  }
>
> @@ -3903,7 +3972,7 @@ get_use_type (struct iv_use *use)
>    tree base_type = TREE_TYPE (use->iv->base);
>    tree type;
>
> -  if (use->type == USE_ADDRESS)
> +  if (use->type == USE_REF_ADDRESS)
>      {
>        /* The base_type may be a void pointer.  Create a pointer type based on
>          the mem_ref instead.  */
> @@ -4331,7 +4400,7 @@ get_address_cost (struct ivopts_data *da
>    struct mem_address parts = {NULL_TREE, integer_one_node,
>                               NULL_TREE, NULL_TREE, NULL_TREE};
>    machine_mode addr_mode = TYPE_MODE (type);
> -  machine_mode mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));
> +  machine_mode mem_mode = TYPE_MODE (use->mem_type);
>    addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (use->iv->base));
>    /* Only true if ratio != 1.  */
>    bool ok_with_ratio_p = false;
> @@ -5220,7 +5289,8 @@ determine_group_iv_cost (struct ivopts_d
>      case USE_NONLINEAR_EXPR:
>        return determine_group_iv_cost_generic (data, group, cand);
>
> -    case USE_ADDRESS:
> +    case USE_REF_ADDRESS:
> +    case USE_PTR_ADDRESS:
>        return determine_group_iv_cost_address (data, group, cand);
>
>      case USE_COMPARE:
> @@ -5238,7 +5308,7 @@ determine_group_iv_cost (struct ivopts_d
>  autoinc_possible_for_pair (struct ivopts_data *data, struct iv_use *use,
>                            struct iv_cand *cand)
>  {
> -  if (use->type != USE_ADDRESS)
> +  if (!address_p (use->type))
>      return false;
>
>    bool can_autoinc = false;
> @@ -6997,6 +7067,27 @@ adjust_iv_update_pos (struct iv_cand *ca
>    cand->incremented_at = use->stmt;
>  }
>
> +/* Return the alias pointer type that should be used for a MEM_REF
> +   associated with USE, which has type USE_PTR_ADDRESS.  */
> +
> +static tree
> +get_alias_ptr_type_for_ptr_address (iv_use *use)
> +{
> +  gcall *call = as_a <gcall *> (use->stmt);
> +  switch (gimple_call_internal_fn (call))
> +    {
> +    case IFN_MASK_LOAD:
> +    case IFN_MASK_STORE:
> +      /* The second argument contains the correct alias type.  */
> +      gcc_assert (use->op_p = gimple_call_arg_ptr (call, 0));
> +      return TREE_TYPE (gimple_call_arg (call, 1));
> +
> +    default:
> +      gcc_unreachable ();
> +    }
> +}
> +
> +
>  /* Rewrites USE (address that is an iv) using candidate CAND.  */
>
>  static void
> @@ -7025,16 +7116,31 @@ rewrite_use_address (struct ivopts_data
>    tree iv = var_at_stmt (data->current_loop, cand, use->stmt);
>    tree base_hint = (cand->iv->base_object) ? iv : NULL_TREE;
>    gimple_stmt_iterator bsi = gsi_for_stmt (use->stmt);
> -  tree type = TREE_TYPE (*use->op_p);
> -  unsigned int align = get_object_alignment (*use->op_p);
> -  if (align != TYPE_ALIGN (type))
> -    type = build_aligned_type (type, align);
> -
> -  tree ref = create_mem_ref (&bsi, type, &aff,
> -                            reference_alias_ptr_type (*use->op_p),
> +  tree type = use->mem_type;
> +  tree alias_ptr_type;
> +  if (use->type == USE_PTR_ADDRESS)
> +    alias_ptr_type = get_alias_ptr_type_for_ptr_address (use);
> +  else
> +    {
> +      gcc_assert (type == TREE_TYPE (*use->op_p));
> +      unsigned int align = get_object_alignment (*use->op_p);
> +      if (align != TYPE_ALIGN (type))
> +       type = build_aligned_type (type, align);
> +      alias_ptr_type = reference_alias_ptr_type (*use->op_p);
> +    }
> +  tree ref = create_mem_ref (&bsi, type, &aff, alias_ptr_type,
>                              iv, base_hint, data->speed);
>
> -  copy_ref_info (ref, *use->op_p);
> +  if (use->type == USE_PTR_ADDRESS)
> +    {
> +      ref = fold_build1 (ADDR_EXPR, build_pointer_type (use->mem_type), ref);
> +      ref = fold_convert (get_use_type (use), ref);
> +      ref = force_gimple_operand_gsi (&bsi, ref, true, NULL_TREE,
> +                                     true, GSI_SAME_STMT);
> +    }
> +  else
> +    copy_ref_info (ref, *use->op_p);
> +
>    *use->op_p = ref;
>  }
>
> @@ -7110,7 +7216,7 @@ rewrite_groups (struct ivopts_data *data
>               update_stmt (group->vuses[j]->stmt);
>             }
>         }
> -      else if (group->type == USE_ADDRESS)
> +      else if (address_p (group->type))
>         {
>           for (j = 0; j < group->vuses.length (); j++)
>             {
> Index: gcc/testsuite/gcc.dg/tree-ssa/scev-9.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/scev-9.c      2016-05-02 10:44:33.000000000 +0100
> +++ gcc/testsuite/gcc.dg/tree-ssa/scev-9.c      2017-11-17 15:02:12.869042409 +0000
> @@ -18,5 +18,5 @@ foo (unsigned char s, unsigned char l)
>  }
>
>  /* Address of array reference is scev.  */
> -/* { dg-final { scan-tree-dump-times "  Type:\\tADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
> +/* { dg-final { scan-tree-dump-times "  Type:\\tREFERENCE ADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
>
> Index: gcc/testsuite/gcc.dg/tree-ssa/scev-10.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/scev-10.c     2016-05-02 10:44:33.000000000 +0100
> +++ gcc/testsuite/gcc.dg/tree-ssa/scev-10.c     2017-11-17 15:02:12.869042409 +0000
> @@ -18,5 +18,5 @@ foo (signed char s, signed char l)
>  }
>
>  /* Address of array reference is scev.  */
> -/* { dg-final { scan-tree-dump-times "  Type:\\tADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
> +/* { dg-final { scan-tree-dump-times "  Type:\\tREFERENCE ADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
>
> Index: gcc/testsuite/gcc.dg/tree-ssa/scev-11.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/scev-11.c     2016-05-02 10:44:33.000000000 +0100
> +++ gcc/testsuite/gcc.dg/tree-ssa/scev-11.c     2017-11-17 15:02:12.869042409 +0000
> @@ -23,4 +23,4 @@ foo (int n)
>  }
>
>  /* Address of array reference to b is scev.  */
> -/* { dg-final { scan-tree-dump-times "  Type:\\tADDRESS\n  Use \[0-9\].\[0-9\]:" 2 "ivopts" } } */
> +/* { dg-final { scan-tree-dump-times "  Type:\\tREFERENCE ADDRESS\n  Use \[0-9\].\[0-9\]:" 2 "ivopts" } } */
> Index: gcc/testsuite/gcc.dg/tree-ssa/scev-12.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/scev-12.c     2016-05-02 10:44:33.000000000 +0100
> +++ gcc/testsuite/gcc.dg/tree-ssa/scev-12.c     2017-11-17 15:02:12.869042409 +0000
> @@ -24,4 +24,4 @@ foo (int x, int n)
>  }
>
>  /* Address of array reference to b is not scev.  */
> -/* { dg-final { scan-tree-dump-times "  Type:\\tADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
> +/* { dg-final { scan-tree-dump-times "  Type:\\tREFERENCE ADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve_index_offset_1.c
> ===================================================================
> --- /dev/null   2017-11-14 14:28:07.424493901 +0000
> +++ gcc/testsuite/gcc.target/aarch64/sve_index_offset_1.c       2017-11-17 15:02:12.869042409 +0000
> @@ -0,0 +1,49 @@
> +/* { dg-do compile } */
> +/* { dg-options "-std=c99 -ftree-vectorize -O2 -fno-inline -march=armv8-a+sve -msve-vector-bits=256" } */
> +
> +#define SIZE 15*8+3
> +
> +#define INDEX_OFFSET_TEST_1(SIGNED, TYPE, ITERTYPE) \
> +void set_##SIGNED##_##TYPE##_##ITERTYPE (SIGNED TYPE *__restrict out, \
> +                                        SIGNED TYPE *__restrict in) \
> +{ \
> +  SIGNED ITERTYPE i; \
> +  for (i = 0; i < SIZE; i++) \
> +  { \
> +    out[i] = in[i]; \
> +  } \
> +} \
> +void set_##SIGNED##_##TYPE##_##ITERTYPE##_var (SIGNED TYPE *__restrict out, \
> +                                              SIGNED TYPE *__restrict in, \
> +                                              SIGNED ITERTYPE n) \
> +{\
> +  SIGNED ITERTYPE i;\
> +  for (i = 0; i < n; i++)\
> +  {\
> +    out[i] = in[i];\
> +  }\
> +}
> +
> +#define INDEX_OFFSET_TEST(SIGNED, TYPE)\
> +  INDEX_OFFSET_TEST_1 (SIGNED, TYPE, char) \
> +  INDEX_OFFSET_TEST_1 (SIGNED, TYPE, short) \
> +  INDEX_OFFSET_TEST_1 (SIGNED, TYPE, int) \
> +  INDEX_OFFSET_TEST_1 (SIGNED, TYPE, long)
> +
> +INDEX_OFFSET_TEST (signed, long)
> +INDEX_OFFSET_TEST (unsigned, long)
> +INDEX_OFFSET_TEST (signed, int)
> +INDEX_OFFSET_TEST (unsigned, int)
> +INDEX_OFFSET_TEST (signed, short)
> +INDEX_OFFSET_TEST (unsigned, short)
> +INDEX_OFFSET_TEST (signed, char)
> +INDEX_OFFSET_TEST (unsigned, char)
> +
> +/* { dg-final { scan-assembler-times "ld1d\\tz\[0-9\]+.d, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+, lsl 3\\\]" 16 } } */
> +/* { dg-final { scan-assembler-times "st1d\\tz\[0-9\]+.d, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+, lsl 3\\\]" 16 } } */
> +/* { dg-final { scan-assembler-times "ld1w\\tz\[0-9\]+.s, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+, lsl 2\\\]" 16 } } */
> +/* { dg-final { scan-assembler-times "st1w\\tz\[0-9\]+.s, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+, lsl 2\\\]" 16 } } */
> +/* { dg-final { scan-assembler-times "ld1h\\tz\[0-9\]+.h, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+, lsl 1\\\]" 16 } } */
> +/* { dg-final { scan-assembler-times "st1h\\tz\[0-9\]+.h, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+, lsl 1\\\]" 16 } } */
> +/* { dg-final { scan-assembler-times "ld1b\\tz\[0-9\]+.b, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+\\\]" 16 } } */
> +/* { dg-final { scan-assembler-times "st1b\\tz\[0-9\]+.b, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+\\\]" 16 } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve_index_offset_1_run.c
> ===================================================================
> --- /dev/null   2017-11-14 14:28:07.424493901 +0000
> +++ gcc/testsuite/gcc.target/aarch64/sve_index_offset_1_run.c   2017-11-17 15:02:12.869042409 +0000
> @@ -0,0 +1,48 @@
> +/* { dg-do run { target aarch64_sve_hw } } */
> +/* { dg-options "-std=c99 -ftree-vectorize -O2 -fno-inline -march=armv8-a+sve" } */
> +/* { dg-options "-std=c99 -ftree-vectorize -O2 -fno-inline -march=armv8-a+sve -msve-vector-bits=256" { target aarch64_sve256_hw } } */
> +

> +#include "sve_index_offset_1.c"

> +

> +#include <string.h>

> +

> +#define CALL_INDEX_OFFSET_TEST_1(SIGNED, TYPE, ITERTYPE)\

> +{\

> +  SIGNED TYPE out[SIZE + 1];\

> +  SIGNED TYPE in1[SIZE + 1];\

> +  SIGNED TYPE in2[SIZE + 1];\

> +  for (int i = 0; i < SIZE + 1; ++i)\

> +    {\

> +      in1[i] = (i * 4) ^ i;\

> +      in2[i] = (i * 2) ^ i;\

> +    }\

> +  out[SIZE] = 42;\

> +  set_##SIGNED##_##TYPE##_##ITERTYPE (out, in1); \

> +  if (0 != memcmp (out, in1, SIZE * sizeof (TYPE)))\

> +    return 1;\

> +  set_##SIGNED##_##TYPE##_##ITERTYPE##_var (out, in2, SIZE); \

> +  if (0 != memcmp (out, in2, SIZE * sizeof (TYPE)))\

> +    return 1;\

> +  if (out[SIZE] != 42)\

> +    return 1;\

> +}

> +

> +#define CALL_INDEX_OFFSET_TEST(SIGNED, TYPE)\

> +  CALL_INDEX_OFFSET_TEST_1 (SIGNED, TYPE, char) \

> +  CALL_INDEX_OFFSET_TEST_1 (SIGNED, TYPE, short) \

> +  CALL_INDEX_OFFSET_TEST_1 (SIGNED, TYPE, int) \

> +  CALL_INDEX_OFFSET_TEST_1 (SIGNED, TYPE, long)

> +

> +int

> +main (void)

> +{

> +  CALL_INDEX_OFFSET_TEST (signed, long)

> +  CALL_INDEX_OFFSET_TEST (unsigned, long)

> +  CALL_INDEX_OFFSET_TEST (signed, int)

> +  CALL_INDEX_OFFSET_TEST (unsigned, int)

> +  CALL_INDEX_OFFSET_TEST (signed, short)

> +  CALL_INDEX_OFFSET_TEST (unsigned, short)

> +  CALL_INDEX_OFFSET_TEST (signed, char)

> +  CALL_INDEX_OFFSET_TEST (unsigned, char)

> +  return 0;

> +}

> Index: gcc/testsuite/gcc.target/aarch64/sve_loop_add_2.c

> ===================================================================

> --- /dev/null   2017-11-14 14:28:07.424493901 +0000

> +++ gcc/testsuite/gcc.target/aarch64/sve_loop_add_2.c   2017-11-17 15:02:12.869952359 +0000

> @@ -0,0 +1,12 @@

> +/* { dg-do compile } */

> +/* { dg-options "-std=c99 -O3 -march=armv8-a+sve" } */

> +

> +void

> +foo (int *__restrict a, int *__restrict b)

> +{

> +  for (int i = 0; i < 512; ++i)

> +    a[i] += b[i];

> +}

> +

> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+.s, p[0-7]+/z, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+.s, p[0-7]+, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 1 } } */

> Index: gcc/testsuite/gcc.target/aarch64/sve_loop_add_3.c

> ===================================================================

> --- /dev/null   2017-11-14 14:28:07.424493901 +0000

> +++ gcc/testsuite/gcc.target/aarch64/sve_loop_add_3.c   2017-11-17 15:02:12.869952359 +0000

> @@ -0,0 +1,20 @@

> +/* { dg-do compile } */

> +/* { dg-options "-std=c99 -O3 -march=armv8-a+sve" } */

> +

> +void

> +f (int *__restrict a,

> +   int *__restrict b,

> +   int *__restrict c,

> +   int *__restrict d,

> +   int *__restrict e,

> +   int *__restrict f,

> +   int *__restrict g,

> +   int *__restrict h,

> +   int count)

> +{

> +  for (int i = 0; i < count; ++i)

> +    a[i] = b[i] + c[i] + d[i] + e[i] + f[i] + g[i] + h[i];

> +}

> +

> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+.s, p[0-7]+/z, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 7 } } */

> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+.s, p[0-7]+, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 1 } } */

> Index: gcc/testsuite/gcc.target/aarch64/sve_while_1.c

> ===================================================================

> --- gcc/testsuite/gcc.target/aarch64/sve_while_1.c      2017-11-17 14:54:06.035305786 +0000

> +++ gcc/testsuite/gcc.target/aarch64/sve_while_1.c      2017-11-17 15:02:12.869952359 +0000

> @@ -34,3 +34,11 @@ TEST_ALL (ADD_LOOP)

>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */

>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */

>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */

> +/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

> +/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

> +/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

> Index: gcc/testsuite/gcc.target/aarch64/sve_while_2.c

> ===================================================================

> --- gcc/testsuite/gcc.target/aarch64/sve_while_2.c      2017-11-17 14:54:06.035305786 +0000

> +++ gcc/testsuite/gcc.target/aarch64/sve_while_2.c      2017-11-17 15:02:12.869952359 +0000

> @@ -34,3 +34,11 @@ TEST_ALL (ADD_LOOP)

>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */

>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */

>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */

> +/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

> +/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

> +/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

> Index: gcc/testsuite/gcc.target/aarch64/sve_while_3.c

> ===================================================================

> --- gcc/testsuite/gcc.target/aarch64/sve_while_3.c      2017-11-17 14:54:06.035305786 +0000

> +++ gcc/testsuite/gcc.target/aarch64/sve_while_3.c      2017-11-17 15:02:12.869952359 +0000

> @@ -34,3 +34,11 @@ TEST_ALL (ADD_LOOP)

>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */

>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */

>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */

> +/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

> +/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

> +/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

> Index: gcc/testsuite/gcc.target/aarch64/sve_while_4.c

> ===================================================================

> --- gcc/testsuite/gcc.target/aarch64/sve_while_4.c      2017-11-17 14:54:06.035305786 +0000

> +++ gcc/testsuite/gcc.target/aarch64/sve_while_4.c      2017-11-17 15:02:12.869952359 +0000

> @@ -35,3 +35,11 @@ TEST_ALL (ADD_LOOP)

>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */

>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */

>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */

> +/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

> +/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

> +/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
Richard Biener Nov. 21, 2017, 2:28 p.m. UTC | #2
On Mon, Nov 20, 2017 at 12:31 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Fri, Nov 17, 2017 at 3:03 PM, Richard Sandiford

> <richard.sandiford@linaro.org> wrote:

>> ivopts previously treated pointer arguments to internal functions

>> like IFN_MASK_LOAD and IFN_MASK_STORE as normal gimple values.

>> This patch makes it treat them as addresses instead.  This makes

>> a significant difference to the code quality for SVE loops,

>> since we can then use loads and stores with scaled indices.

> Thanks for working on this.  This can be extended to other internal

> functions which eventually

> are expanded into memory references.  I believe (at least) both x86

> and AArch64 have such a

> requirement.


In addition to Bin's comments I only have a single one (the rest of the
middle-end
changes look OK).  The alias type of MEM_REFs and TARGET_MEM_REFs
in ADDR_EXPR context is meaningless so you don't need to jump through hoops
to get at it or preserve it in any way, likewise for CLIQUE/BASE if it
were present.

Maybe you can simplify code with this.  As you're introducing &TARGET_MEM_REF
as a valid construct (it wasn't before) you'll run into missing /
misguided foldings
eventually.  So be prepared to fix up fallout.

Thanks,
Richard.

>>

>> The patch also adds support for ADDR_EXPRs of TARGET_MEM_REFs,

>> which are the natural way of representing the result of the

>> ivopts transformation.

>>

>> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu

>> and powerpc64le-linux-gnu.  OK to install?

>>

>> Richard

>>

>>

>> 2017-11-17  Richard Sandiford  <richard.sandiford@linaro.org>

>>             Alan Hayward  <alan.hayward@arm.com>

>>             David Sherwood  <david.sherwood@arm.com>

>>

>> gcc/

>>         * expr.c (expand_expr_addr_expr_1): Handle ADDR_EXPRs of

>>         TARGET_MEM_REFs.

>>         * gimple-expr.h (is_gimple_addressable: Likewise.

>>         * gimple-expr.c (is_gimple_address): Likewise.

>>         * internal-fn.c (expand_call_mem_ref): New function.

>>         (expand_mask_load_optab_fn): Use it.

>>         (expand_mask_store_optab_fn): Likewise.

>>         * tree-ssa-loop-ivopts.c (USE_ADDRESS): Split into...

>>         (USE_REF_ADDRESS, USE_PTR_ADDRESS): ...these new use types.

>>         (dump_groups): Update accordingly.

>>         (iv_use::mem_type): New member variable.

>>         (address_p): New function.

>>         (record_use): Add a mem_type argument and initialize the new

>>         mem_type field.

>>         (record_group_use): Add a mem_type argument.  Use address_p.

>>         Update call to record_use.

>>         (find_interesting_uses_op): Update call to record_group_use.

>>         (find_interesting_uses_cond): Likewise.

>>         (find_interesting_uses_address): Likewise.

>>         (get_mem_type_for_internal_fn): New function.

>>         (find_address_like_use): Likewise.

>>         (find_interesting_uses_stmt): Try find_address_like_use before

>>         calling find_interesting_uses_op.

>>         (addr_offset_valid_p): Use the iv mem_type field as the type

>>         of the addressed memory.

>>         (add_autoinc_candidates): Likewise.

>>         (get_address_cost): Likewise.

>>         (split_small_address_groups_p): Use address_p.

>>         (split_address_groups): Likewise.

>>         (add_iv_candidate_for_use): Likewise.

>>         (autoinc_possible_for_pair): Likewise.

>>         (rewrite_groups): Likewise.

>>         (get_use_type): Check for USE_REF_ADDRESS instead of USE_ADDRESS.

>>         (determine_group_iv_cost): Update after split of USE_ADDRESS.

>>         (get_alias_ptr_type_for_ptr_address): New function.

>>         (rewrite_use_address): Rewrite address uses in calls that were

>>         identified by find_address_like_use.

>>

>> gcc/testsuite/

>>         * gcc.dg/tree-ssa/scev-9.c: Expect REFERENCE ADDRESS

>>         instead of just ADDRESS.

>>         * gcc.dg/tree-ssa/scev-10.c: Likewise.

>>         * gcc.dg/tree-ssa/scev-11.c: Likewise.

>>         * gcc.dg/tree-ssa/scev-12.c: Likewise.

>>         * gcc.target/aarch64/sve_index_offset_1.c: New test.

>>         * gcc.target/aarch64/sve_index_offset_1_run.c: Likewise.

>>         * gcc.target/aarch64/sve_loop_add_2.c: Likewise.

>>         * gcc.target/aarch64/sve_loop_add_3.c: Likewise.

>>         * gcc.target/aarch64/sve_while_1.c: Check for indexed addressing modes.

>>         * gcc.target/aarch64/sve_while_2.c: Likewise.

>>         * gcc.target/aarch64/sve_while_3.c: Likewise.

>>         * gcc.target/aarch64/sve_while_4.c: Likewise.

>>

>> Index: gcc/expr.c

>> ===================================================================

>> --- gcc/expr.c  2017-11-17 09:49:36.191354637 +0000

>> +++ gcc/expr.c  2017-11-17 15:02:12.868132458 +0000

>> @@ -7814,6 +7814,9 @@ expand_expr_addr_expr_1 (tree exp, rtx t

>>         return expand_expr (tem, target, tmode, modifier);

>>        }

>>

>> +    case TARGET_MEM_REF:

>> +      return addr_for_mem_ref (exp, as, true);

>> +

>>      case CONST_DECL:

>>        /* Expand the initializer like constants above.  */

>>        result = XEXP (expand_expr_constant (DECL_INITIAL (exp),

>> Index: gcc/gimple-expr.h

>> ===================================================================

>> --- gcc/gimple-expr.h   2017-11-17 09:40:43.520567009 +0000

>> +++ gcc/gimple-expr.h   2017-11-17 15:02:12.868132458 +0000

>> @@ -119,6 +119,7 @@ virtual_operand_p (tree op)

>>  is_gimple_addressable (tree t)

>>  {

>>    return (is_gimple_id (t) || handled_component_p (t)

>> +         || TREE_CODE (t) == TARGET_MEM_REF

>>           || TREE_CODE (t) == MEM_REF);

>>  }

>>

>> Index: gcc/gimple-expr.c

>> ===================================================================

>> --- gcc/gimple-expr.c   2017-10-13 10:23:39.845432950 +0100

>> +++ gcc/gimple-expr.c   2017-11-17 15:02:12.868132458 +0000

>> @@ -631,7 +631,9 @@ is_gimple_address (const_tree t)

>>        op = TREE_OPERAND (op, 0);

>>      }

>>

>> -  if (CONSTANT_CLASS_P (op) || TREE_CODE (op) == MEM_REF)

>> +  if (CONSTANT_CLASS_P (op)

>> +      || TREE_CODE (op) == TARGET_MEM_REF

>> +      || TREE_CODE (op) == MEM_REF)

>>      return true;

>>

>>    switch (TREE_CODE (op))

>> Index: gcc/internal-fn.c

>> ===================================================================

>> --- gcc/internal-fn.c   2017-11-17 14:57:36.436527536 +0000

>> +++ gcc/internal-fn.c   2017-11-17 15:02:12.869042409 +0000

>> @@ -2367,15 +2367,47 @@ expand_LOOP_DIST_ALIAS (internal_fn, gca

>>    gcc_unreachable ();

>>  }

>>

>> +/* Return a memory reference of type TYPE for argument INDEX of STMT.

>> +   Use argument INDEX + 1 to derive the second (TBAA) operand.  */

>> +

>> +static tree

>> +expand_call_mem_ref (tree type, gcall *stmt, int index)

>> +{

>> +  tree addr = gimple_call_arg (stmt, index);

>> +  tree alias_ptr_type = TREE_TYPE (gimple_call_arg (stmt, index + 1));

>> +  unsigned int align = tree_to_shwi (gimple_call_arg (stmt, index + 1));

>> +  if (TYPE_ALIGN (type) != align)

>> +    type = build_aligned_type (type, align);

>> +

>> +  tree tmp = addr;

>> +  if (TREE_CODE (tmp) == SSA_NAME)

>> +    {

>> +      gimple *def = SSA_NAME_DEF_STMT (tmp);

>> +      if (gimple_assign_single_p (def))

>> +       tmp = gimple_assign_rhs1 (def);

>> +    }

>> +

>> +  if (TREE_CODE (tmp) == ADDR_EXPR)

>> +    {

>> +      tree mem = TREE_OPERAND (tmp, 0);

>> +      if (TREE_CODE (mem) == TARGET_MEM_REF

>> +         && types_compatible_p (TREE_TYPE (mem), type)

>> +         && alias_ptr_type == TREE_TYPE (TMR_OFFSET (mem))

>> +         && integer_zerop (TMR_OFFSET (mem)))

>> +       return mem;

>> +    }

>> +

>> +  return fold_build2 (MEM_REF, type, addr, build_int_cst (alias_ptr_type, 0));

>> +}

>> +

>>  /* Expand MASK_LOAD{,_LANES} call STMT using optab OPTAB.  */

>>

>>  static void

>>  expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)

>>  {

>>    struct expand_operand ops[3];

>> -  tree type, lhs, rhs, maskt, ptr;

>> +  tree type, lhs, rhs, maskt;

>>    rtx mem, target, mask;

>> -  unsigned align;

>>    insn_code icode;

>>

>>    maskt = gimple_call_arg (stmt, 2);

>> @@ -2383,11 +2415,7 @@ expand_mask_load_optab_fn (internal_fn,

>>    if (lhs == NULL_TREE)

>>      return;

>>    type = TREE_TYPE (lhs);

>> -  ptr = build_int_cst (TREE_TYPE (gimple_call_arg (stmt, 1)), 0);

>> -  align = tree_to_shwi (gimple_call_arg (stmt, 1));

>> -  if (TYPE_ALIGN (type) != align)

>> -    type = build_aligned_type (type, align);

>> -  rhs = fold_build2 (MEM_REF, type, gimple_call_arg (stmt, 0), ptr);

>> +  rhs = expand_call_mem_ref (type, stmt, 0);

>>

>>    if (optab == vec_mask_load_lanes_optab)

>>      icode = get_multi_vector_move (type, optab);

>> @@ -2413,19 +2441,14 @@ #define expand_mask_load_lanes_optab_fn

>>  expand_mask_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab)

>>  {

>>    struct expand_operand ops[3];

>> -  tree type, lhs, rhs, maskt, ptr;

>> +  tree type, lhs, rhs, maskt;

>>    rtx mem, reg, mask;

>> -  unsigned align;

>>    insn_code icode;

>>

>>    maskt = gimple_call_arg (stmt, 2);

>>    rhs = gimple_call_arg (stmt, 3);

>>    type = TREE_TYPE (rhs);

>> -  ptr = build_int_cst (TREE_TYPE (gimple_call_arg (stmt, 1)), 0);

>> -  align = tree_to_shwi (gimple_call_arg (stmt, 1));

>> -  if (TYPE_ALIGN (type) != align)

>> -    type = build_aligned_type (type, align);

>> -  lhs = fold_build2 (MEM_REF, type, gimple_call_arg (stmt, 0), ptr);

>> +  lhs = expand_call_mem_ref (type, stmt, 0);

>>

>>    if (optab == vec_mask_store_lanes_optab)

>>      icode = get_multi_vector_move (type, optab);

> Support for TARGET_MEM_REF and refactoring of expand_call_* in above

> code looks independent

> of the IVOPTs change?  If so, could you split it into two patches?  I can

> approve IVOPTs part patch with

> below comments.

>

>> Index: gcc/tree-ssa-loop-ivopts.c

>> ===================================================================

>> --- gcc/tree-ssa-loop-ivopts.c  2017-11-17 09:05:59.900349210 +0000

>> +++ gcc/tree-ssa-loop-ivopts.c  2017-11-17 15:02:12.870862310 +0000

>> @@ -166,7 +166,11 @@ struct version_info

>>  enum use_type

>>  {

>>    USE_NONLINEAR_EXPR,  /* Use in a nonlinear expression.  */

>> -  USE_ADDRESS,         /* Use in an address.  */

>> +  USE_REF_ADDRESS,     /* Use is an address for an explicit memory

>> +                          reference.  */

>> +  USE_PTR_ADDRESS,     /* Use is a pointer argument to a function in

>> +                          cases where the expansion of the function

>> +                          will turn the argument into a normal address.  */

>>    USE_COMPARE          /* Use is a compare.  */

>>  };

>>

>> @@ -362,6 +366,9 @@ struct iv_use

>>    unsigned id;         /* The id of the use.  */

>>    unsigned group_id;   /* The group id the use belongs to.  */

>>    enum use_type type;  /* Type of the use.  */

>> +  tree mem_type;       /* The memory type to use when testing whether an

>> +                          address is legitimate, and what the address's

>> +                          cost is.  */

>>    struct iv *iv;       /* The induction variable it is based on.  */

>>    gimple *stmt;                /* Statement in that it occurs.  */

>>    tree *op_p;          /* The place where it occurs.  */

>> @@ -506,6 +513,14 @@ struct iv_inv_expr_hasher : free_ptr_has

>>    static inline bool equal (const iv_inv_expr_ent *, const iv_inv_expr_ent *);

>>  };

>>

>> +/* Return true if uses of type TYPE represent some form of address.  */

>> +

>> +inline bool

>> +address_p (use_type type)

>> +{

>> +  return type == USE_REF_ADDRESS || type == USE_PTR_ADDRESS;

>> +}

>> +

>>  /* Hash function for loop invariant expressions.  */

>>

>>  inline hashval_t

>> @@ -768,8 +783,10 @@ dump_groups (FILE *file, struct ivopts_d

>>        fprintf (file, "Group %d:\n", group->id);

>>        if (group->type == USE_NONLINEAR_EXPR)

>>         fprintf (file, "  Type:\tGENERIC\n");

>> -      else if (group->type == USE_ADDRESS)

>> -       fprintf (file, "  Type:\tADDRESS\n");

>> +      else if (group->type == USE_REF_ADDRESS)

>> +       fprintf (file, "  Type:\tREFERENCE ADDRESS\n");

>> +      else if (group->type == USE_PTR_ADDRESS)

>> +       fprintf (file, "  Type:\tPOINTER ARGUMENT ADDRESS\n");

>>        else

>>         {

>>           gcc_assert (group->type == USE_COMPARE);

>> @@ -1502,19 +1519,21 @@ find_induction_variables (struct ivopts_

>>

>>  /* Records a use of TYPE at *USE_P in STMT whose value is IV in GROUP.

>>     For address type use, ADDR_BASE is the stripped IV base, ADDR_OFFSET

>> -   is the const offset stripped from IV base; for other types use, both

>> -   are zero by default.  */

>> +   is the const offset stripped from IV base and MEM_TYPE is the type

>> +   of the memory being addressed.  For uses of other types, ADDR_BASE

>> +   and ADDR_OFFSET are zero by default and MEM_TYPE is NULL_TREE.  */

>>

>>  static struct iv_use *

>>  record_use (struct iv_group *group, tree *use_p, struct iv *iv,

>> -           gimple *stmt, enum use_type type, tree addr_base,

>> -           poly_uint64 addr_offset)

>> +           gimple *stmt, enum use_type type, tree mem_type,

>> +           tree addr_base, poly_uint64 addr_offset)

>>  {

>>    struct iv_use *use = XCNEW (struct iv_use);

>>

>>    use->id = group->vuses.length ();

>>    use->group_id = group->id;

>>    use->type = type;

>> +  use->mem_type = mem_type;

>>    use->iv = iv;

>>    use->stmt = stmt;

>>    use->op_p = use_p;

>> @@ -1569,18 +1588,21 @@ record_group (struct ivopts_data *data,

>>  }

>>

>>  /* Record a use of TYPE at *USE_P in STMT whose value is IV in a group.

>> -   New group will be created if there is no existing group for the use.  */

>> +   New group will be created if there is no existing group for the use.

>> +   MEM_TYPE is the type of memory being addressed, or NULL if this

>> +   isn't an address reference.  */

>>

>>  static struct iv_use *

>>  record_group_use (struct ivopts_data *data, tree *use_p,

>> -                 struct iv *iv, gimple *stmt, enum use_type type)

>> +                 struct iv *iv, gimple *stmt, enum use_type type,

>> +                 tree mem_type)

>>  {

>>    tree addr_base = NULL;

>>    struct iv_group *group = NULL;

>>    poly_uint64 addr_offset = 0;

>>

>>    /* Record non address type use in a new group.  */

>> -  if (type == USE_ADDRESS && iv->base_object)

>> +  if (address_p (type) && iv->base_object)

> I forgot to simplify this condition given (address_p (type) &&

> !iv->base_object) is not allowed now.

> You can simplify this after below comment on base_object if you want.

>>      {

>>        unsigned int i;

>>

>> @@ -1591,7 +1613,7 @@ record_group_use (struct ivopts_data *da

>>

>>           group = data->vgroups[i];

>>           use = group->vuses[0];

>> -         if (use->type != USE_ADDRESS || !use->iv->base_object)

> And here.

>

>> +         if (!address_p (use->type) || !use->iv->base_object)

>>             continue;

>>

>>           /* Check if it has the same stripped base and step.  */

>> @@ -1607,7 +1629,8 @@ record_group_use (struct ivopts_data *da

>>    if (!group)

>>      group = record_group (data, type);

>>

>> -  return record_use (group, use_p, iv, stmt, type, addr_base, addr_offset);

>> +  return record_use (group, use_p, iv, stmt, type, mem_type,

>> +                    addr_base, addr_offset);

>>  }

>>

>>  /* Checks whether the use OP is interesting and if so, records it.  */

>> @@ -1641,7 +1664,7 @@ find_interesting_uses_op (struct ivopts_

>>    stmt = SSA_NAME_DEF_STMT (op);

>>    gcc_assert (gimple_code (stmt) == GIMPLE_PHI || is_gimple_assign (stmt));

>>

>> -  use = record_group_use (data, NULL, iv, stmt, USE_NONLINEAR_EXPR);

>> +  use = record_group_use (data, NULL, iv, stmt, USE_NONLINEAR_EXPR, NULL_TREE);

>>    iv->nonlin_use = use;

>>    return use;

>>  }

>> @@ -1757,10 +1780,10 @@ find_interesting_uses_cond (struct ivopt

>>        return;

>>      }

>>

>> -  record_group_use (data, var_p, var_iv, stmt, USE_COMPARE);

>> +  record_group_use (data, var_p, var_iv, stmt, USE_COMPARE, NULL_TREE);

>>    /* Record compare type iv_use for iv on the other side of comparison.  */

>>    if (ret == COMP_IV_EXPR_2)

>> -    record_group_use (data, bound_p, bound_iv, stmt, USE_COMPARE);

>> +    record_group_use (data, bound_p, bound_iv, stmt, USE_COMPARE, NULL_TREE);

>>  }

>>

>>  /* Returns the outermost loop EXPR is obviously invariant in

>> @@ -2375,7 +2398,7 @@ find_interesting_uses_address (struct iv

>>    if (civ->base_object == NULL_TREE)

>>      goto fail;

>>

>> -  record_group_use (data, op_p, civ, stmt, USE_ADDRESS);

>> +  record_group_use (data, op_p, civ, stmt, USE_REF_ADDRESS, TREE_TYPE (*op_p));

>>    return;

>>

>>  fail:

>> @@ -2398,6 +2421,51 @@ find_invariants_stmt (struct ivopts_data

>>      }

>>  }

>>

>> +/* CALL calls an internal function.  If operand *OP_P will become an

>> +   address when the call is expanded, return the type of the memory

>> +   being addressed, otherwise return null.  */

>> +

>> +static tree

>> +get_mem_type_for_internal_fn (gcall *call, tree *op_p)

>> +{

>> +  switch (gimple_call_internal_fn (call))

>> +    {

>> +    case IFN_MASK_LOAD:

>> +      if (op_p == gimple_call_arg_ptr (call, 0))

>> +       return TREE_TYPE (gimple_call_lhs (call));

>> +      return NULL_TREE;

>> +

>> +    case IFN_MASK_STORE:

>> +      if (op_p == gimple_call_arg_ptr (call, 0))

>> +       return TREE_TYPE (gimple_call_arg (call, 3));

>> +      return NULL_TREE;

>> +

>> +    default:

>> +      return NULL_TREE;

>> +    }

>> +}

>> +

>> +/* IV is a (non-address) iv that describes operand *OP_P of STMT.

>> +   Return true if the operand will become an address when STMT

>> +   is expanded and record the associated address use if so.  */

>> +

>> +static bool

>> +find_address_like_use (struct ivopts_data *data, gimple *stmt, tree *op_p,

>> +                      struct iv *iv)

>> +{

>> +  tree mem_type = NULL_TREE;

>> +  if (gcall *call = dyn_cast <gcall *> (stmt))

>> +    if (gimple_call_internal_p (call))

>> +      mem_type = get_mem_type_for_internal_fn (call, op_p);

>> +  if (mem_type)

>> +    {

>> +      iv = alloc_iv (data, iv->base, iv->step);

> We now don't allow address type iv_use without base_object.  So

> checking code like below

> is needed here:

>

>   /* Fail if base object of this memory reference is unknown.  */

>   if (iv->base_object == NULL_TREE)

>     return false;

>

> IVOPTs part is OK with this change.

>

> Thanks,

> bin

>> +      record_group_use (data, op_p, iv, stmt, USE_PTR_ADDRESS, mem_type);

>> +      return true;

>> +    }

>> +  return false;

>> +}

>> +

>>  /* Finds interesting uses of induction variables in the statement STMT.  */

>>

>>  static void

>> @@ -2482,7 +2550,8 @@ find_interesting_uses_stmt (struct ivopt

>>        if (!iv)

>>         continue;

>>

>> -      find_interesting_uses_op (data, op);

>> +      if (!find_address_like_use (data, stmt, use_p->use, iv))

>> +       find_interesting_uses_op (data, op);

>>      }

>>  }

>>

>> @@ -2516,7 +2585,7 @@ addr_offset_valid_p (struct iv_use *use,

>>    rtx reg, addr;

>>    unsigned list_index;

>>    addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (use->iv->base));

>> -  machine_mode addr_mode, mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));

>> +  machine_mode addr_mode, mem_mode = TYPE_MODE (use->mem_type);

>>

>>    list_index = (unsigned) as * MAX_MACHINE_MODE + (unsigned) mem_mode;

>>    if (list_index >= vec_safe_length (addr_list))

>> @@ -2573,7 +2642,7 @@ split_small_address_groups_p (struct ivo

>>        if (group->vuses.length () == 1)

>>         continue;

>>

>> -      gcc_assert (group->type == USE_ADDRESS);

>> +      gcc_assert (address_p (group->type));

>>        if (group->vuses.length () == 2)

>>         {

>>           if (compare_sizes_for_sort (group->vuses[0]->addr_offset,

>> @@ -2625,7 +2694,7 @@ split_address_groups (struct ivopts_data

>>        if (group->vuses.length () == 1)

>>         continue;

>>

>> -      gcc_assert (group->type == USE_ADDRESS);

>> +      gcc_assert (address_p (use->type));

>>

>>        for (j = 1; j < group->vuses.length ();)

>>         {

>> @@ -3145,7 +3214,7 @@ add_autoinc_candidates (struct ivopts_da

>>

>>    cstepi = int_cst_value (step);

>>

>> -  mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));

>> +  mem_mode = TYPE_MODE (use->mem_type);

>>    if (((USE_LOAD_PRE_INCREMENT (mem_mode)

>>         || USE_STORE_PRE_INCREMENT (mem_mode))

>>         && must_eq (GET_MODE_SIZE (mem_mode), cstepi))

>> @@ -3436,7 +3505,7 @@ add_iv_candidate_for_use (struct ivopts_

>>    /* At last, add auto-incremental candidates.  Make such variables

>>       important since other iv uses with same base object may be based

>>       on it.  */

>> -  if (use != NULL && use->type == USE_ADDRESS)

>> +  if (use != NULL && address_p (use->type))

>>      add_autoinc_candidates (data, iv->base, iv->step, true, use);

>>  }

>>

>> @@ -3903,7 +3972,7 @@ get_use_type (struct iv_use *use)

>>    tree base_type = TREE_TYPE (use->iv->base);

>>    tree type;

>>

>> -  if (use->type == USE_ADDRESS)

>> +  if (use->type == USE_REF_ADDRESS)

>>      {

>>        /* The base_type may be a void pointer.  Create a pointer type based on

>>          the mem_ref instead.  */

>> @@ -4331,7 +4400,7 @@ get_address_cost (struct ivopts_data *da

>>    struct mem_address parts = {NULL_TREE, integer_one_node,

>>                               NULL_TREE, NULL_TREE, NULL_TREE};

>>    machine_mode addr_mode = TYPE_MODE (type);

>> -  machine_mode mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));

>> +  machine_mode mem_mode = TYPE_MODE (use->mem_type);

>>    addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (use->iv->base));

>>    /* Only true if ratio != 1.  */

>>    bool ok_with_ratio_p = false;

>> @@ -5220,7 +5289,8 @@ determine_group_iv_cost (struct ivopts_d

>>      case USE_NONLINEAR_EXPR:

>>        return determine_group_iv_cost_generic (data, group, cand);

>>

>> -    case USE_ADDRESS:

>> +    case USE_REF_ADDRESS:

>> +    case USE_PTR_ADDRESS:

>>        return determine_group_iv_cost_address (data, group, cand);

>>

>>      case USE_COMPARE:

>> @@ -5238,7 +5308,7 @@ determine_group_iv_cost (struct ivopts_d

>>  autoinc_possible_for_pair (struct ivopts_data *data, struct iv_use *use,

>>                            struct iv_cand *cand)

>>  {

>> -  if (use->type != USE_ADDRESS)

>> +  if (!address_p (use->type))

>>      return false;

>>

>>    bool can_autoinc = false;

>> @@ -6997,6 +7067,27 @@ adjust_iv_update_pos (struct iv_cand *ca

>>    cand->incremented_at = use->stmt;

>>  }

>>

>> +/* Return the alias pointer type that should be used for a MEM_REF

>> +   associated with USE, which has type USE_PTR_ADDRESS.  */

>> +

>> +static tree

>> +get_alias_ptr_type_for_ptr_address (iv_use *use)

>> +{

>> +  gcall *call = as_a <gcall *> (use->stmt);

>> +  switch (gimple_call_internal_fn (call))

>> +    {

>> +    case IFN_MASK_LOAD:

>> +    case IFN_MASK_STORE:

>> +      /* The second argument contains the correct alias type.  */

>> +      gcc_assert (use->op_p == gimple_call_arg_ptr (call, 0));

>> +      return TREE_TYPE (gimple_call_arg (call, 1));

>> +

>> +    default:

>> +      gcc_unreachable ();

>> +    }

>> +}

>> +

>> +

>>  /* Rewrites USE (address that is an iv) using candidate CAND.  */

>>

>>  static void

>> @@ -7025,16 +7116,31 @@ rewrite_use_address (struct ivopts_data

>>    tree iv = var_at_stmt (data->current_loop, cand, use->stmt);

>>    tree base_hint = (cand->iv->base_object) ? iv : NULL_TREE;

>>    gimple_stmt_iterator bsi = gsi_for_stmt (use->stmt);

>> -  tree type = TREE_TYPE (*use->op_p);

>> -  unsigned int align = get_object_alignment (*use->op_p);

>> -  if (align != TYPE_ALIGN (type))

>> -    type = build_aligned_type (type, align);

>> -

>> -  tree ref = create_mem_ref (&bsi, type, &aff,

>> -                            reference_alias_ptr_type (*use->op_p),

>> +  tree type = use->mem_type;

>> +  tree alias_ptr_type;

>> +  if (use->type == USE_PTR_ADDRESS)

>> +    alias_ptr_type = get_alias_ptr_type_for_ptr_address (use);

>> +  else

>> +    {

>> +      gcc_assert (type == TREE_TYPE (*use->op_p));

>> +      unsigned int align = get_object_alignment (*use->op_p);

>> +      if (align != TYPE_ALIGN (type))

>> +       type = build_aligned_type (type, align);

>> +      alias_ptr_type = reference_alias_ptr_type (*use->op_p);

>> +    }

>> +  tree ref = create_mem_ref (&bsi, type, &aff, alias_ptr_type,

>>                              iv, base_hint, data->speed);

>>

>> -  copy_ref_info (ref, *use->op_p);

>> +  if (use->type == USE_PTR_ADDRESS)

>> +    {

>> +      ref = fold_build1 (ADDR_EXPR, build_pointer_type (use->mem_type), ref);

>> +      ref = fold_convert (get_use_type (use), ref);

>> +      ref = force_gimple_operand_gsi (&bsi, ref, true, NULL_TREE,

>> +                                     true, GSI_SAME_STMT);

>> +    }

>> +  else

>> +    copy_ref_info (ref, *use->op_p);

>> +

>>    *use->op_p = ref;

>>  }

>>

>> @@ -7110,7 +7216,7 @@ rewrite_groups (struct ivopts_data *data

>>               update_stmt (group->vuses[j]->stmt);

>>             }

>>         }

>> -      else if (group->type == USE_ADDRESS)

>> +      else if (address_p (group->type))

>>         {

>>           for (j = 0; j < group->vuses.length (); j++)

>>             {

>> Index: gcc/testsuite/gcc.dg/tree-ssa/scev-9.c

>> ===================================================================

>> --- gcc/testsuite/gcc.dg/tree-ssa/scev-9.c      2016-05-02 10:44:33.000000000 +0100

>> +++ gcc/testsuite/gcc.dg/tree-ssa/scev-9.c      2017-11-17 15:02:12.869042409 +0000

>> @@ -18,5 +18,5 @@ foo (unsigned char s, unsigned char l)

>>  }

>>

>>  /* Address of array reference is scev.  */

>> -/* { dg-final { scan-tree-dump-times "  Type:\\tADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */

>> +/* { dg-final { scan-tree-dump-times "  Type:\\tREFERENCE ADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */

>>

>> Index: gcc/testsuite/gcc.dg/tree-ssa/scev-10.c

>> ===================================================================

>> --- gcc/testsuite/gcc.dg/tree-ssa/scev-10.c     2016-05-02 10:44:33.000000000 +0100

>> +++ gcc/testsuite/gcc.dg/tree-ssa/scev-10.c     2017-11-17 15:02:12.869042409 +0000

>> @@ -18,5 +18,5 @@ foo (signed char s, signed char l)

>>  }

>>

>>  /* Address of array reference is scev.  */

>> -/* { dg-final { scan-tree-dump-times "  Type:\\tADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */

>> +/* { dg-final { scan-tree-dump-times "  Type:\\tREFERENCE ADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */

>>

>> Index: gcc/testsuite/gcc.dg/tree-ssa/scev-11.c

>> ===================================================================

>> --- gcc/testsuite/gcc.dg/tree-ssa/scev-11.c     2016-05-02 10:44:33.000000000 +0100

>> +++ gcc/testsuite/gcc.dg/tree-ssa/scev-11.c     2017-11-17 15:02:12.869042409 +0000

>> @@ -23,4 +23,4 @@ foo (int n)

>>  }

>>

>>  /* Address of array reference to b is scev.  */

>> -/* { dg-final { scan-tree-dump-times "  Type:\\tADDRESS\n  Use \[0-9\].\[0-9\]:" 2 "ivopts" } } */

>> +/* { dg-final { scan-tree-dump-times "  Type:\\tREFERENCE ADDRESS\n  Use \[0-9\].\[0-9\]:" 2 "ivopts" } } */

>> Index: gcc/testsuite/gcc.dg/tree-ssa/scev-12.c

>> ===================================================================

>> --- gcc/testsuite/gcc.dg/tree-ssa/scev-12.c     2016-05-02 10:44:33.000000000 +0100

>> +++ gcc/testsuite/gcc.dg/tree-ssa/scev-12.c     2017-11-17 15:02:12.869042409 +0000

>> @@ -24,4 +24,4 @@ foo (int x, int n)

>>  }

>>

>>  /* Address of array reference to b is not scev.  */

>> -/* { dg-final { scan-tree-dump-times "  Type:\\tADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */

>> +/* { dg-final { scan-tree-dump-times "  Type:\\tREFERENCE ADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */

>> Index: gcc/testsuite/gcc.target/aarch64/sve_index_offset_1.c

>> ===================================================================

>> --- /dev/null   2017-11-14 14:28:07.424493901 +0000

>> +++ gcc/testsuite/gcc.target/aarch64/sve_index_offset_1.c       2017-11-17 15:02:12.869042409 +0000

>> @@ -0,0 +1,49 @@

>> +/* { dg-do compile } */

>> +/* { dg-options "-std=c99 -ftree-vectorize -O2 -fno-inline -march=armv8-a+sve -msve-vector-bits=256" } */

>> +

>> +#define SIZE 15*8+3

>> +

>> +#define INDEX_OFFSET_TEST_1(SIGNED, TYPE, ITERTYPE) \

>> +void set_##SIGNED##_##TYPE##_##ITERTYPE (SIGNED TYPE *__restrict out, \

>> +                                        SIGNED TYPE *__restrict in) \

>> +{ \

>> +  SIGNED ITERTYPE i; \

>> +  for (i = 0; i < SIZE; i++) \

>> +  { \

>> +    out[i] = in[i]; \

>> +  } \

>> +} \

>> +void set_##SIGNED##_##TYPE##_##ITERTYPE##_var (SIGNED TYPE *__restrict out, \

>> +                                              SIGNED TYPE *__restrict in, \

>> +                                              SIGNED ITERTYPE n) \

>> +{\

>> +  SIGNED ITERTYPE i;\

>> +  for (i = 0; i < n; i++)\

>> +  {\

>> +    out[i] = in[i];\

>> +  }\

>> +}

>> +

>> +#define INDEX_OFFSET_TEST(SIGNED, TYPE)\

>> +  INDEX_OFFSET_TEST_1 (SIGNED, TYPE, char) \

>> +  INDEX_OFFSET_TEST_1 (SIGNED, TYPE, short) \

>> +  INDEX_OFFSET_TEST_1 (SIGNED, TYPE, int) \

>> +  INDEX_OFFSET_TEST_1 (SIGNED, TYPE, long)

>> +

>> +INDEX_OFFSET_TEST (signed, long)

>> +INDEX_OFFSET_TEST (unsigned, long)

>> +INDEX_OFFSET_TEST (signed, int)

>> +INDEX_OFFSET_TEST (unsigned, int)

>> +INDEX_OFFSET_TEST (signed, short)

>> +INDEX_OFFSET_TEST (unsigned, short)

>> +INDEX_OFFSET_TEST (signed, char)

>> +INDEX_OFFSET_TEST (unsigned, char)

>> +

>> +/* { dg-final { scan-assembler-times "ld1d\\tz\[0-9\]+.d, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+, lsl 3\\\]" 16 } } */

>> +/* { dg-final { scan-assembler-times "st1d\\tz\[0-9\]+.d, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+, lsl 3\\\]" 16 } } */

>> +/* { dg-final { scan-assembler-times "ld1w\\tz\[0-9\]+.s, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+, lsl 2\\\]" 16 } } */

>> +/* { dg-final { scan-assembler-times "st1w\\tz\[0-9\]+.s, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+, lsl 2\\\]" 16 } } */

>> +/* { dg-final { scan-assembler-times "ld1h\\tz\[0-9\]+.h, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+, lsl 1\\\]" 16 } } */

>> +/* { dg-final { scan-assembler-times "st1h\\tz\[0-9\]+.h, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+, lsl 1\\\]" 16 } } */

>> +/* { dg-final { scan-assembler-times "ld1b\\tz\[0-9\]+.b, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+\\\]" 16 } } */

>> +/* { dg-final { scan-assembler-times "st1b\\tz\[0-9\]+.b, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+\\\]" 16 } } */

>> Index: gcc/testsuite/gcc.target/aarch64/sve_index_offset_1_run.c

>> ===================================================================

>> --- /dev/null   2017-11-14 14:28:07.424493901 +0000

>> +++ gcc/testsuite/gcc.target/aarch64/sve_index_offset_1_run.c   2017-11-17 15:02:12.869042409 +0000

>> @@ -0,0 +1,48 @@

>> +/* { dg-do run { target aarch64_sve_hw } } */

>> +/* { dg-options "-std=c99 -ftree-vectorize -O2 -fno-inline -march=armv8-a+sve" } */

>> +/* { dg-options "-std=c99 -ftree-vectorize -O2 -fno-inline -march=armv8-a+sve -msve-vector-bits=256" { target aarch64_sve256_hw } } */

>> +

>> +#include "sve_index_offset_1.c"

>> +

>> +#include <string.h>

>> +

>> +#define CALL_INDEX_OFFSET_TEST_1(SIGNED, TYPE, ITERTYPE)\

>> +{\

>> +  SIGNED TYPE out[SIZE + 1];\

>> +  SIGNED TYPE in1[SIZE + 1];\

>> +  SIGNED TYPE in2[SIZE + 1];\

>> +  for (int i = 0; i < SIZE + 1; ++i)\

>> +    {\

>> +      in1[i] = (i * 4) ^ i;\

>> +      in2[i] = (i * 2) ^ i;\

>> +    }\

>> +  out[SIZE] = 42;\

>> +  set_##SIGNED##_##TYPE##_##ITERTYPE (out, in1); \

>> +  if (0 != memcmp (out, in1, SIZE * sizeof (TYPE)))\

>> +    return 1;\

>> +  set_##SIGNED##_##TYPE##_##ITERTYPE##_var (out, in2, SIZE); \

>> +  if (0 != memcmp (out, in2, SIZE * sizeof (TYPE)))\

>> +    return 1;\

>> +  if (out[SIZE] != 42)\

>> +    return 1;\

>> +}

>> +

>> +#define CALL_INDEX_OFFSET_TEST(SIGNED, TYPE)\

>> +  CALL_INDEX_OFFSET_TEST_1 (SIGNED, TYPE, char) \

>> +  CALL_INDEX_OFFSET_TEST_1 (SIGNED, TYPE, short) \

>> +  CALL_INDEX_OFFSET_TEST_1 (SIGNED, TYPE, int) \

>> +  CALL_INDEX_OFFSET_TEST_1 (SIGNED, TYPE, long)

>> +

>> +int

>> +main (void)

>> +{

>> +  CALL_INDEX_OFFSET_TEST (signed, long)

>> +  CALL_INDEX_OFFSET_TEST (unsigned, long)

>> +  CALL_INDEX_OFFSET_TEST (signed, int)

>> +  CALL_INDEX_OFFSET_TEST (unsigned, int)

>> +  CALL_INDEX_OFFSET_TEST (signed, short)

>> +  CALL_INDEX_OFFSET_TEST (unsigned, short)

>> +  CALL_INDEX_OFFSET_TEST (signed, char)

>> +  CALL_INDEX_OFFSET_TEST (unsigned, char)

>> +  return 0;

>> +}

>> Index: gcc/testsuite/gcc.target/aarch64/sve_loop_add_2.c

>> ===================================================================

>> --- /dev/null   2017-11-14 14:28:07.424493901 +0000

>> +++ gcc/testsuite/gcc.target/aarch64/sve_loop_add_2.c   2017-11-17 15:02:12.869952359 +0000

>> @@ -0,0 +1,12 @@

>> +/* { dg-do compile } */

>> +/* { dg-options "-std=c99 -O3 -march=armv8-a+sve" } */

>> +

>> +void

>> +foo (int *__restrict a, int *__restrict b)

>> +{

>> +  for (int i = 0; i < 512; ++i)

>> +    a[i] += b[i];

>> +}

>> +

>> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+.s, p[0-7]+/z, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+.s, p[0-7]+, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 1 } } */

>> Index: gcc/testsuite/gcc.target/aarch64/sve_loop_add_3.c

>> ===================================================================

>> --- /dev/null   2017-11-14 14:28:07.424493901 +0000

>> +++ gcc/testsuite/gcc.target/aarch64/sve_loop_add_3.c   2017-11-17 15:02:12.869952359 +0000

>> @@ -0,0 +1,20 @@

>> +/* { dg-do compile } */

>> +/* { dg-options "-std=c99 -O3 -march=armv8-a+sve" } */

>> +

>> +void

>> +f (int *__restrict a,

>> +   int *__restrict b,

>> +   int *__restrict c,

>> +   int *__restrict d,

>> +   int *__restrict e,

>> +   int *__restrict f,

>> +   int *__restrict g,

>> +   int *__restrict h,

>> +   int count)

>> +{

>> +  for (int i = 0; i < count; ++i)

>> +    a[i] = b[i] + c[i] + d[i] + e[i] + f[i] + g[i] + h[i];

>> +}

>> +

>> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+.s, p[0-7]+/z, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 7 } } */

>> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+.s, p[0-7]+, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 1 } } */

>> Index: gcc/testsuite/gcc.target/aarch64/sve_while_1.c

>> ===================================================================

>> --- gcc/testsuite/gcc.target/aarch64/sve_while_1.c      2017-11-17 14:54:06.035305786 +0000

>> +++ gcc/testsuite/gcc.target/aarch64/sve_while_1.c      2017-11-17 15:02:12.869952359 +0000

>> @@ -34,3 +34,11 @@ TEST_ALL (ADD_LOOP)

>>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */

>>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */

>>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

>> Index: gcc/testsuite/gcc.target/aarch64/sve_while_2.c

>> ===================================================================

>> --- gcc/testsuite/gcc.target/aarch64/sve_while_2.c      2017-11-17 14:54:06.035305786 +0000

>> +++ gcc/testsuite/gcc.target/aarch64/sve_while_2.c      2017-11-17 15:02:12.869952359 +0000

>> @@ -34,3 +34,11 @@ TEST_ALL (ADD_LOOP)

>>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */

>>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */

>>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

>> Index: gcc/testsuite/gcc.target/aarch64/sve_while_3.c

>> ===================================================================

>> --- gcc/testsuite/gcc.target/aarch64/sve_while_3.c      2017-11-17 14:54:06.035305786 +0000

>> +++ gcc/testsuite/gcc.target/aarch64/sve_while_3.c      2017-11-17 15:02:12.869952359 +0000

>> @@ -34,3 +34,11 @@ TEST_ALL (ADD_LOOP)

>>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */

>>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */

>>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

>> Index: gcc/testsuite/gcc.target/aarch64/sve_while_4.c

>> ===================================================================

>> --- gcc/testsuite/gcc.target/aarch64/sve_while_4.c      2017-11-17 14:54:06.035305786 +0000

>> +++ gcc/testsuite/gcc.target/aarch64/sve_while_4.c      2017-11-17 15:02:12.869952359 +0000

>> @@ -35,3 +35,11 @@ TEST_ALL (ADD_LOOP)

>>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */

>>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */

>>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */

>> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */

>> +/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
Richard Sandiford Jan. 9, 2018, 3:23 p.m. UTC | #3
Richard Biener <richard.guenther@gmail.com> writes:
> On Mon, Nov 20, 2017 at 12:31 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Fri, Nov 17, 2017 at 3:03 PM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> ivopts previously treated pointer arguments to internal functions
>>> like IFN_MASK_LOAD and IFN_MASK_STORE as normal gimple values.
>>> This patch makes it treat them as addresses instead.  This makes
>>> a significant difference to the code quality for SVE loops,
>>> since we can then use loads and stores with scaled indices.
>> Thanks for working on this.  This can be extended to other internal
>> functions which eventually
>> are expanded into memory references.  I believe (at least) both x86
>> and AArch64 has such
>> requirement.
>
> In addition to Bins comments I only have a single one (the rest of the
> middle-end
> changes look OK).  The alias type of MEM_REFs and TARGET_MEM_REFs
> in ADDR_EXPR context is meaningless so you don't need to jump through hoops
> to get at it or preserve it in any way, likewise for CLIQUE/BASE if it
> were present.

Ah, OK.

> Maybe you can simplify code with this.


In the end it didn't really simplify the code, since internal-fn.c
uses the address to build a (TARGET_)MEM_REF, and the alias information
of that ref needs to be correct, since it gets carried across to the
MEM rtx.  But it does mean that the alias_ptr_type check in the previous:

      if (TREE_CODE (mem) == TARGET_MEM_REF
	  && types_compatible_p (TREE_TYPE (mem), type)
	  && alias_ptr_type == TREE_TYPE (TMR_OFFSET (mem))
	  && integer_zerop (TMR_OFFSET (mem)))
	return mem;

made no sense: we should simply replace the TMR_OFFSET if it has
the wrong type.

> As you're introducing &TARGET_MEM_REF as a valid construct (it weren't
> before) you'll run into missing / misguided foldings eventually.  So
> be prepared to fix up fallout.

OK :-) I haven't hit any new places yet, but like you say, I'll be on
the lookout.
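
For the record, the effect on the IR looks roughly like this (a
hand-written sketch of simplified dumps; the SSA names and the exact
TARGET_MEM_REF print format are illustrative, not taken from a real
dump):

```
;; Before: the pointer argument is a plain gimple value, so a separate
;; pointer induction variable has to be computed each iteration:
  _3 = a_10(D) + ivtmp_1;
  vect_4 = .MASK_LOAD (_3, 4B, loop_mask_5);

;; After: the argument is treated as an address, so ivopts can
;; substitute the ADDR_EXPR of a TARGET_MEM_REF with a scaled index:
  vect_4 = .MASK_LOAD (&MEM[base: a_10(D), index: ivtmp_1, step: 4],
                       4B, loop_mask_5);
```

expand_call_mem_ref below then peels the ADDR_EXPR back off and reuses
the TARGET_MEM_REF directly, which is what lets the backend emit the
scaled-index instructions checked for in the new tests.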

Is the version below OK?  Tested on aarch64-linux-gnu, x86_64-linux-gnu
and powerpc64le-linux-gnu.

Richard


2018-01-09  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* expr.c (expand_expr_addr_expr_1): Handle ADDR_EXPRs of
	TARGET_MEM_REFs.
	* gimple-expr.h (is_gimple_addressable): Likewise.
	* gimple-expr.c (is_gimple_address): Likewise.
	* internal-fn.c (expand_call_mem_ref): New function.
	(expand_mask_load_optab_fn): Use it.
	(expand_mask_store_optab_fn): Likewise.

Index: gcc/expr.c
===================================================================
--- gcc/expr.c	2018-01-09 15:13:32.603106251 +0000
+++ gcc/expr.c	2018-01-09 15:13:32.784098242 +0000
@@ -7885,6 +7885,9 @@ expand_expr_addr_expr_1 (tree exp, rtx t
 	return expand_expr (tem, target, tmode, modifier);
       }
 
+    case TARGET_MEM_REF:
+      return addr_for_mem_ref (exp, as, true);
+
     case CONST_DECL:
       /* Expand the initializer like constants above.  */
       result = XEXP (expand_expr_constant (DECL_INITIAL (exp),
Index: gcc/gimple-expr.h
===================================================================
--- gcc/gimple-expr.h	2018-01-09 15:13:32.603106251 +0000
+++ gcc/gimple-expr.h	2018-01-09 15:13:32.785098198 +0000
@@ -119,6 +119,7 @@ virtual_operand_p (tree op)
 is_gimple_addressable (tree t)
 {
   return (is_gimple_id (t) || handled_component_p (t)
+	  || TREE_CODE (t) == TARGET_MEM_REF
 	  || TREE_CODE (t) == MEM_REF);
 }
 
Index: gcc/gimple-expr.c
===================================================================
--- gcc/gimple-expr.c	2018-01-09 15:13:32.603106251 +0000
+++ gcc/gimple-expr.c	2018-01-09 15:13:32.784098242 +0000
@@ -631,7 +631,9 @@ is_gimple_address (const_tree t)
       op = TREE_OPERAND (op, 0);
     }
 
-  if (CONSTANT_CLASS_P (op) || TREE_CODE (op) == MEM_REF)
+  if (CONSTANT_CLASS_P (op)
+      || TREE_CODE (op) == TARGET_MEM_REF
+      || TREE_CODE (op) == MEM_REF)
     return true;
 
   switch (TREE_CODE (op))
Index: gcc/internal-fn.c
===================================================================
--- gcc/internal-fn.c	2018-01-09 15:13:32.603106251 +0000
+++ gcc/internal-fn.c	2018-01-09 15:13:32.785098198 +0000
@@ -2412,15 +2412,53 @@ expand_LOOP_DIST_ALIAS (internal_fn, gca
   gcc_unreachable ();
 }
 
+/* Return a memory reference of type TYPE for argument INDEX of STMT.
+   Use argument INDEX + 1 to derive the second (TBAA) operand.  */
+
+static tree
+expand_call_mem_ref (tree type, gcall *stmt, int index)
+{
+  tree addr = gimple_call_arg (stmt, index);
+  tree alias_ptr_type = TREE_TYPE (gimple_call_arg (stmt, index + 1));
+  unsigned int align = tree_to_shwi (gimple_call_arg (stmt, index + 1));
+  if (TYPE_ALIGN (type) != align)
+    type = build_aligned_type (type, align);
+
+  tree tmp = addr;
+  if (TREE_CODE (tmp) == SSA_NAME)
+    {
+      gimple *def = SSA_NAME_DEF_STMT (tmp);
+      if (gimple_assign_single_p (def))
+	tmp = gimple_assign_rhs1 (def);
+    }
+
+  if (TREE_CODE (tmp) == ADDR_EXPR)
+    {
+      tree mem = TREE_OPERAND (tmp, 0);
+      if (TREE_CODE (mem) == TARGET_MEM_REF
+	  && types_compatible_p (TREE_TYPE (mem), type)
+	  && integer_zerop (TMR_OFFSET (mem)))
+	{
+	  if (alias_ptr_type != TREE_TYPE (TMR_OFFSET (mem)))
+	    {
+	      mem = copy_node (mem);
+	      TMR_OFFSET (mem) = build_int_cst (alias_ptr_type, 0);
+	    }
+	  return mem;
+	}
+    }
+
+  return fold_build2 (MEM_REF, type, addr, build_int_cst (alias_ptr_type, 0));
+}
+
 /* Expand MASK_LOAD{,_LANES} call STMT using optab OPTAB.  */
 
 static void
 expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 {
   struct expand_operand ops[3];
-  tree type, lhs, rhs, maskt, ptr;
+  tree type, lhs, rhs, maskt;
   rtx mem, target, mask;
-  unsigned align;
   insn_code icode;
 
   maskt = gimple_call_arg (stmt, 2);
@@ -2428,11 +2466,7 @@ expand_mask_load_optab_fn (internal_fn,
   if (lhs == NULL_TREE)
     return;
   type = TREE_TYPE (lhs);
-  ptr = build_int_cst (TREE_TYPE (gimple_call_arg (stmt, 1)), 0);
-  align = tree_to_shwi (gimple_call_arg (stmt, 1));
-  if (TYPE_ALIGN (type) != align)
-    type = build_aligned_type (type, align);
-  rhs = fold_build2 (MEM_REF, type, gimple_call_arg (stmt, 0), ptr);
+  rhs = expand_call_mem_ref (type, stmt, 0);
 
   if (optab == vec_mask_load_lanes_optab)
     icode = get_multi_vector_move (type, optab);
@@ -2458,19 +2492,14 @@ #define expand_mask_load_lanes_optab_fn
 expand_mask_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 {
   struct expand_operand ops[3];
-  tree type, lhs, rhs, maskt, ptr;
+  tree type, lhs, rhs, maskt;
   rtx mem, reg, mask;
-  unsigned align;
   insn_code icode;
 
   maskt = gimple_call_arg (stmt, 2);
   rhs = gimple_call_arg (stmt, 3);
   type = TREE_TYPE (rhs);
-  ptr = build_int_cst (TREE_TYPE (gimple_call_arg (stmt, 1)), 0);
-  align = tree_to_shwi (gimple_call_arg (stmt, 1));
-  if (TYPE_ALIGN (type) != align)
-    type = build_aligned_type (type, align);
-  lhs = fold_build2 (MEM_REF, type, gimple_call_arg (stmt, 0), ptr);
+  lhs = expand_call_mem_ref (type, stmt, 0);
 
   if (optab == vec_mask_store_lanes_optab)
     icode = get_multi_vector_move (type, optab);
Jeff Law Jan. 13, 2018, 3:34 p.m. UTC | #4
On 01/09/2018 08:23 AM, Richard Sandiford wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Mon, Nov 20, 2017 at 12:31 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Fri, Nov 17, 2017 at 3:03 PM, Richard Sandiford
>>> <richard.sandiford@linaro.org> wrote:
>>>> ivopts previously treated pointer arguments to internal functions
>>>> like IFN_MASK_LOAD and IFN_MASK_STORE as normal gimple values.
>>>> This patch makes it treat them as addresses instead.  This makes
>>>> a significant difference to the code quality for SVE loops,
>>>> since we can then use loads and stores with scaled indices.
>>> Thanks for working on this.  This can be extended to other internal
>>> functions which eventually
>>> are expanded into memory references.  I believe (at least) both x86
>>> and AArch64 has such
>>> requirement.
>>
>> In addition to Bins comments I only have a single one (the rest of the
>> middle-end
>> changes look OK).  The alias type of MEM_REFs and TARGET_MEM_REFs
>> in ADDR_EXPR context is meaningless so you don't need to jump through hoops
>> to get at it or preserve it in any way, likewise for CLIQUE/BASE if it
>> were present.
>
> Ah, OK.
>
>> Maybe you can simplify code with this.
>
> In the end it didn't really simplify the code, since internal-fn.c
> uses the address to build a (TARGET_)MEM_REF, and the alias information
> of that ref needs to be correct, since it gets carried across to the
> MEM rtx.  But it does mean that the alias_ptr_type check in the previous:
>
>       if (TREE_CODE (mem) == TARGET_MEM_REF
> 	  && types_compatible_p (TREE_TYPE (mem), type)
> 	  && alias_ptr_type == TREE_TYPE (TMR_OFFSET (mem))
> 	  && integer_zerop (TMR_OFFSET (mem)))
> 	return mem;
>
> made no sense: we should simply replace the TMR_OFFSET if it has
> the wrong type.
>
>> As you're introducing &TARGET_MEM_REF as a valid construct (it weren't
>> before) you'll run into missing / misguided foldings eventually.  So
>> be prepared to fix up fallout.
>
> OK :-) I haven't hit any new places yet, but like you say, I'll be on
> the lookout.
>
> Is the version below OK?  Tested on aarch64-linux-gnu, x86_64-linux-gnu
> and powerpc64le-linux-gnu.
>
> Richard
>
>
> 2018-01-09  Richard Sandiford  <richard.sandiford@linaro.org>
> 	    Alan Hayward  <alan.hayward@arm.com>
> 	    David Sherwood  <david.sherwood@arm.com>
>
> gcc/
> 	* expr.c (expand_expr_addr_expr_1): Handle ADDR_EXPRs of
> 	TARGET_MEM_REFs.
> 	* gimple-expr.h (is_gimple_addressable): Likewise.
> 	* gimple-expr.c (is_gimple_address): Likewise.
> 	* internal-fn.c (expand_call_mem_ref): New function.
> 	(expand_mask_load_optab_fn): Use it.
> 	(expand_mask_store_optab_fn): Likewise.

OK.
jeff
Christophe Lyon Jan. 15, 2018, 10:09 a.m. UTC | #5
Hi,


On 13 January 2018 at 16:34, Jeff Law <law@redhat.com> wrote:
> On 01/09/2018 08:23 AM, Richard Sandiford wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Mon, Nov 20, 2017 at 12:31 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>> On Fri, Nov 17, 2017 at 3:03 PM, Richard Sandiford
>>>> <richard.sandiford@linaro.org> wrote:
>>>>> ivopts previously treated pointer arguments to internal functions
>>>>> like IFN_MASK_LOAD and IFN_MASK_STORE as normal gimple values.
>>>>> This patch makes it treat them as addresses instead.  This makes
>>>>> a significant difference to the code quality for SVE loops,
>>>>> since we can then use loads and stores with scaled indices.
>>>> Thanks for working on this.  This can be extended to other internal
>>>> functions which eventually
>>>> are expanded into memory references.  I believe (at least) both x86
>>>> and AArch64 has such
>>>> requirement.
>>>
>>> In addition to Bins comments I only have a single one (the rest of the
>>> middle-end
>>> changes look OK).  The alias type of MEM_REFs and TARGET_MEM_REFs
>>> in ADDR_EXPR context is meaningless so you don't need to jump through hoops
>>> to get at it or preserve it in any way, likewise for CLIQUE/BASE if it
>>> were present.
>>
>> Ah, OK.
>>
>>> Maybe you can simplify code with this.
>>
>> In the end it didn't really simplify the code, since internal-fn.c
>> uses the address to build a (TARGET_)MEM_REF, and the alias information
>> of that ref needs to be correct, since it gets carried across to the
>> MEM rtx.  But it does mean that the alias_ptr_type check in the previous:
>>
>>       if (TREE_CODE (mem) == TARGET_MEM_REF
>>         && types_compatible_p (TREE_TYPE (mem), type)
>>         && alias_ptr_type == TREE_TYPE (TMR_OFFSET (mem))
>>         && integer_zerop (TMR_OFFSET (mem)))
>>       return mem;
>>
>> made no sense: we should simply replace the TMR_OFFSET if it has
>> the wrong type.
>>
>>> As you're introducing &TARGET_MEM_REF as a valid construct (it weren't
>>> before) you'll run into missing / misguided foldings eventually.  So
>>> be prepared to fix up fallout.
>>
>> OK :-) I haven't hit any new places yet, but like you say, I'll be on
>> the lookout.
>>
>> Is the version below OK?  Tested on aarch64-linux-gnu, x86_64-linux-gnu
>> and powerpc64le-linux-gnu.
>>
>> Richard
>>
>>
>> 2018-01-09  Richard Sandiford  <richard.sandiford@linaro.org>
>>           Alan Hayward  <alan.hayward@arm.com>
>>           David Sherwood  <david.sherwood@arm.com>
>>
>> gcc/
>>       * expr.c (expand_expr_addr_expr_1): Handle ADDR_EXPRs of
>>       TARGET_MEM_REFs.
>>       * gimple-expr.h (is_gimple_addressable): Likewise.
>>       * gimple-expr.c (is_gimple_address): Likewise.
>>       * internal-fn.c (expand_call_mem_ref): New function.
>>       (expand_mask_load_optab_fn): Use it.
>>       (expand_mask_store_optab_fn): Likewise.
> OK.
> jeff

I've reported that the updated tests fail on aarch64-none-elf -mabi=ilp32:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83848

Christophe
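The code-quality point in the cover note (scaled indices versus separately maintained pointers) can be illustrated in plain C.  The two loops below are semantically identical, but the second is written in the base-plus-scaled-index form that ivopts can now also produce for masked accesses.  A standalone sketch; the function names are illustrative, not part of the patch:

```c
#include <stddef.h>

/* Pointer-increment form: the address is a separately maintained
   pointer induction variable, advanced each iteration.  */
long
sum_pointer_form (const int *base, size_t n)
{
  long sum = 0;
  const int *p = base;
  for (size_t i = 0; i < n; ++i, ++p)
    sum += *p;
  return sum;
}

/* Scaled-index form: one integer induction variable, with the address
   computed as base + i * sizeof (int) -- the shape that maps onto
   SVE addressing modes such as [x0, x1, lsl 2].  */
long
sum_index_form (const int *base, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum += base[i];
  return sum;
}
```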
Patch

Index: gcc/expr.c
===================================================================
--- gcc/expr.c	2017-11-17 09:49:36.191354637 +0000
+++ gcc/expr.c	2017-11-17 15:02:12.868132458 +0000
@@ -7814,6 +7814,9 @@  expand_expr_addr_expr_1 (tree exp, rtx t
 	return expand_expr (tem, target, tmode, modifier);
       }
 
+    case TARGET_MEM_REF:
+      return addr_for_mem_ref (exp, as, true);
+
     case CONST_DECL:
       /* Expand the initializer like constants above.  */
       result = XEXP (expand_expr_constant (DECL_INITIAL (exp),
Index: gcc/gimple-expr.h
===================================================================
--- gcc/gimple-expr.h	2017-11-17 09:40:43.520567009 +0000
+++ gcc/gimple-expr.h	2017-11-17 15:02:12.868132458 +0000
@@ -119,6 +119,7 @@  virtual_operand_p (tree op)
 is_gimple_addressable (tree t)
 {
   return (is_gimple_id (t) || handled_component_p (t)
+	  || TREE_CODE (t) == TARGET_MEM_REF
 	  || TREE_CODE (t) == MEM_REF);
 }
 
Index: gcc/gimple-expr.c
===================================================================
--- gcc/gimple-expr.c	2017-10-13 10:23:39.845432950 +0100
+++ gcc/gimple-expr.c	2017-11-17 15:02:12.868132458 +0000
@@ -631,7 +631,9 @@  is_gimple_address (const_tree t)
       op = TREE_OPERAND (op, 0);
     }
 
-  if (CONSTANT_CLASS_P (op) || TREE_CODE (op) == MEM_REF)
+  if (CONSTANT_CLASS_P (op)
+      || TREE_CODE (op) == TARGET_MEM_REF
+      || TREE_CODE (op) == MEM_REF)
     return true;
 
   switch (TREE_CODE (op))
Index: gcc/internal-fn.c
===================================================================
--- gcc/internal-fn.c	2017-11-17 14:57:36.436527536 +0000
+++ gcc/internal-fn.c	2017-11-17 15:02:12.869042409 +0000
@@ -2367,15 +2367,47 @@  expand_LOOP_DIST_ALIAS (internal_fn, gca
   gcc_unreachable ();
 }
 
+/* Return a memory reference of type TYPE for argument INDEX of STMT.
+   Use argument INDEX + 1 to derive the second (TBAA) operand.  */
+
+static tree
+expand_call_mem_ref (tree type, gcall *stmt, int index)
+{
+  tree addr = gimple_call_arg (stmt, index);
+  tree alias_ptr_type = TREE_TYPE (gimple_call_arg (stmt, index + 1));
+  unsigned int align = tree_to_shwi (gimple_call_arg (stmt, index + 1));
+  if (TYPE_ALIGN (type) != align)
+    type = build_aligned_type (type, align);
+
+  tree tmp = addr;
+  if (TREE_CODE (tmp) == SSA_NAME)
+    {
+      gimple *def = SSA_NAME_DEF_STMT (tmp);
+      if (gimple_assign_single_p (def))
+	tmp = gimple_assign_rhs1 (def);
+    }
+
+  if (TREE_CODE (tmp) == ADDR_EXPR)
+    {
+      tree mem = TREE_OPERAND (tmp, 0);
+      if (TREE_CODE (mem) == TARGET_MEM_REF
+	  && types_compatible_p (TREE_TYPE (mem), type)
+	  && alias_ptr_type == TREE_TYPE (TMR_OFFSET (mem))
+	  && integer_zerop (TMR_OFFSET (mem)))
+	return mem;
+    }
+
+  return fold_build2 (MEM_REF, type, addr, build_int_cst (alias_ptr_type, 0));
+}
+
 /* Expand MASK_LOAD{,_LANES} call STMT using optab OPTAB.  */
 
 static void
 expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 {
   struct expand_operand ops[3];
-  tree type, lhs, rhs, maskt, ptr;
+  tree type, lhs, rhs, maskt;
   rtx mem, target, mask;
-  unsigned align;
   insn_code icode;
 
   maskt = gimple_call_arg (stmt, 2);
@@ -2383,11 +2415,7 @@  expand_mask_load_optab_fn (internal_fn,
   if (lhs == NULL_TREE)
     return;
   type = TREE_TYPE (lhs);
-  ptr = build_int_cst (TREE_TYPE (gimple_call_arg (stmt, 1)), 0);
-  align = tree_to_shwi (gimple_call_arg (stmt, 1));
-  if (TYPE_ALIGN (type) != align)
-    type = build_aligned_type (type, align);
-  rhs = fold_build2 (MEM_REF, type, gimple_call_arg (stmt, 0), ptr);
+  rhs = expand_call_mem_ref (type, stmt, 0);
 
   if (optab == vec_mask_load_lanes_optab)
     icode = get_multi_vector_move (type, optab);
@@ -2413,19 +2441,14 @@  #define expand_mask_load_lanes_optab_fn
 expand_mask_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 {
   struct expand_operand ops[3];
-  tree type, lhs, rhs, maskt, ptr;
+  tree type, lhs, rhs, maskt;
   rtx mem, reg, mask;
-  unsigned align;
   insn_code icode;
 
   maskt = gimple_call_arg (stmt, 2);
   rhs = gimple_call_arg (stmt, 3);
   type = TREE_TYPE (rhs);
-  ptr = build_int_cst (TREE_TYPE (gimple_call_arg (stmt, 1)), 0);
-  align = tree_to_shwi (gimple_call_arg (stmt, 1));
-  if (TYPE_ALIGN (type) != align)
-    type = build_aligned_type (type, align);
-  lhs = fold_build2 (MEM_REF, type, gimple_call_arg (stmt, 0), ptr);
+  lhs = expand_call_mem_ref (type, stmt, 0);
 
   if (optab == vec_mask_store_lanes_optab)
     icode = get_multi_vector_move (type, optab);
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	2017-11-17 09:05:59.900349210 +0000
+++ gcc/tree-ssa-loop-ivopts.c	2017-11-17 15:02:12.870862310 +0000
@@ -166,7 +166,11 @@  struct version_info
 enum use_type
 {
   USE_NONLINEAR_EXPR,	/* Use in a nonlinear expression.  */
-  USE_ADDRESS,		/* Use in an address.  */
+  USE_REF_ADDRESS,	/* Use is an address for an explicit memory
+			   reference.  */
+  USE_PTR_ADDRESS,	/* Use is a pointer argument to a function in
+			   cases where the expansion of the function
+			   will turn the argument into a normal address.  */
   USE_COMPARE		/* Use is a compare.  */
 };
 
@@ -362,6 +366,9 @@  struct iv_use
   unsigned id;		/* The id of the use.  */
   unsigned group_id;	/* The group id the use belongs to.  */
   enum use_type type;	/* Type of the use.  */
+  tree mem_type;	/* The memory type to use when testing whether an
+			   address is legitimate, and what the address's
+			   cost is.  */
   struct iv *iv;	/* The induction variable it is based on.  */
   gimple *stmt;		/* Statement in that it occurs.  */
   tree *op_p;		/* The place where it occurs.  */
@@ -506,6 +513,14 @@  struct iv_inv_expr_hasher : free_ptr_has
   static inline bool equal (const iv_inv_expr_ent *, const iv_inv_expr_ent *);
 };
 
+/* Return true if uses of type TYPE represent some form of address.  */
+
+inline bool
+address_p (use_type type)
+{
+  return type == USE_REF_ADDRESS || type == USE_PTR_ADDRESS;
+}
+
 /* Hash function for loop invariant expressions.  */
 
 inline hashval_t
@@ -768,8 +783,10 @@  dump_groups (FILE *file, struct ivopts_d
       fprintf (file, "Group %d:\n", group->id);
       if (group->type == USE_NONLINEAR_EXPR)
 	fprintf (file, "  Type:\tGENERIC\n");
-      else if (group->type == USE_ADDRESS)
-	fprintf (file, "  Type:\tADDRESS\n");
+      else if (group->type == USE_REF_ADDRESS)
+	fprintf (file, "  Type:\tREFERENCE ADDRESS\n");
+      else if (group->type == USE_PTR_ADDRESS)
+	fprintf (file, "  Type:\tPOINTER ARGUMENT ADDRESS\n");
       else
 	{
 	  gcc_assert (group->type == USE_COMPARE);
@@ -1502,19 +1519,21 @@  find_induction_variables (struct ivopts_
 
 /* Records a use of TYPE at *USE_P in STMT whose value is IV in GROUP.
    For address type use, ADDR_BASE is the stripped IV base, ADDR_OFFSET
-   is the const offset stripped from IV base; for other types use, both
-   are zero by default.  */
+   is the const offset stripped from IV base and MEM_TYPE is the type
+   of the memory being addressed.  For uses of other types, ADDR_BASE
+   and ADDR_OFFSET are zero by default and MEM_TYPE is NULL_TREE.  */
 
 static struct iv_use *
 record_use (struct iv_group *group, tree *use_p, struct iv *iv,
-	    gimple *stmt, enum use_type type, tree addr_base,
-	    poly_uint64 addr_offset)
+	    gimple *stmt, enum use_type type, tree mem_type,
+	    tree addr_base, poly_uint64 addr_offset)
 {
   struct iv_use *use = XCNEW (struct iv_use);
 
   use->id = group->vuses.length ();
   use->group_id = group->id;
   use->type = type;
+  use->mem_type = mem_type;
   use->iv = iv;
   use->stmt = stmt;
   use->op_p = use_p;
@@ -1569,18 +1588,21 @@  record_group (struct ivopts_data *data,
 }
 
 /* Record a use of TYPE at *USE_P in STMT whose value is IV in a group.
-   New group will be created if there is no existing group for the use.  */
+   New group will be created if there is no existing group for the use.
+   MEM_TYPE is the type of memory being addressed, or NULL if this
+   isn't an address reference.  */
 
 static struct iv_use *
 record_group_use (struct ivopts_data *data, tree *use_p,
-		  struct iv *iv, gimple *stmt, enum use_type type)
+		  struct iv *iv, gimple *stmt, enum use_type type,
+		  tree mem_type)
 {
   tree addr_base = NULL;
   struct iv_group *group = NULL;
   poly_uint64 addr_offset = 0;
 
   /* Record non address type use in a new group.  */
-  if (type == USE_ADDRESS && iv->base_object)
+  if (address_p (type) && iv->base_object)
     {
       unsigned int i;
 
@@ -1591,7 +1613,7 @@  record_group_use (struct ivopts_data *da
 
 	  group = data->vgroups[i];
 	  use = group->vuses[0];
-	  if (use->type != USE_ADDRESS || !use->iv->base_object)
+	  if (!address_p (use->type) || !use->iv->base_object)
 	    continue;
 
 	  /* Check if it has the same stripped base and step.  */
@@ -1607,7 +1629,8 @@  record_group_use (struct ivopts_data *da
   if (!group)
     group = record_group (data, type);
 
-  return record_use (group, use_p, iv, stmt, type, addr_base, addr_offset);
+  return record_use (group, use_p, iv, stmt, type, mem_type,
+		     addr_base, addr_offset);
 }
 
 /* Checks whether the use OP is interesting and if so, records it.  */
@@ -1641,7 +1664,7 @@  find_interesting_uses_op (struct ivopts_
   stmt = SSA_NAME_DEF_STMT (op);
   gcc_assert (gimple_code (stmt) == GIMPLE_PHI || is_gimple_assign (stmt));
 
-  use = record_group_use (data, NULL, iv, stmt, USE_NONLINEAR_EXPR);
+  use = record_group_use (data, NULL, iv, stmt, USE_NONLINEAR_EXPR, NULL_TREE);
   iv->nonlin_use = use;
   return use;
 }
@@ -1757,10 +1780,10 @@  find_interesting_uses_cond (struct ivopt
       return;
     }
 
-  record_group_use (data, var_p, var_iv, stmt, USE_COMPARE);
+  record_group_use (data, var_p, var_iv, stmt, USE_COMPARE, NULL_TREE);
   /* Record compare type iv_use for iv on the other side of comparison.  */
   if (ret == COMP_IV_EXPR_2)
-    record_group_use (data, bound_p, bound_iv, stmt, USE_COMPARE);
+    record_group_use (data, bound_p, bound_iv, stmt, USE_COMPARE, NULL_TREE);
 }
 
 /* Returns the outermost loop EXPR is obviously invariant in
@@ -2375,7 +2398,7 @@  find_interesting_uses_address (struct iv
   if (civ->base_object == NULL_TREE)
     goto fail;
 
-  record_group_use (data, op_p, civ, stmt, USE_ADDRESS);
+  record_group_use (data, op_p, civ, stmt, USE_REF_ADDRESS, TREE_TYPE (*op_p));
   return;
 
 fail:
@@ -2398,6 +2421,51 @@  find_invariants_stmt (struct ivopts_data
     }
 }
 
+/* CALL calls an internal function.  If operand *OP_P will become an
+   address when the call is expanded, return the type of the memory
+   being addressed, otherwise return null.  */
+
+static tree
+get_mem_type_for_internal_fn (gcall *call, tree *op_p)
+{
+  switch (gimple_call_internal_fn (call))
+    {
+    case IFN_MASK_LOAD:
+      if (op_p == gimple_call_arg_ptr (call, 0))
+	return TREE_TYPE (gimple_call_lhs (call));
+      return NULL_TREE;
+
+    case IFN_MASK_STORE:
+      if (op_p == gimple_call_arg_ptr (call, 0))
+	return TREE_TYPE (gimple_call_arg (call, 3));
+      return NULL_TREE;
+
+    default:
+      return NULL_TREE;
+    }
+}
+
+/* IV is a (non-address) iv that describes operand *OP_P of STMT.
+   Return true if the operand will become an address when STMT
+   is expanded and record the associated address use if so.  */
+
+static bool
+find_address_like_use (struct ivopts_data *data, gimple *stmt, tree *op_p,
+		       struct iv *iv)
+{
+  tree mem_type = NULL_TREE;
+  if (gcall *call = dyn_cast <gcall *> (stmt))
+    if (gimple_call_internal_p (call))
+      mem_type = get_mem_type_for_internal_fn (call, op_p);
+  if (mem_type)
+    {
+      iv = alloc_iv (data, iv->base, iv->step);
+      record_group_use (data, op_p, iv, stmt, USE_PTR_ADDRESS, mem_type);
+      return true;
+    }
+  return false;
+}
+
 /* Finds interesting uses of induction variables in the statement STMT.  */
 
 static void
@@ -2482,7 +2550,8 @@  find_interesting_uses_stmt (struct ivopt
       if (!iv)
 	continue;
 
-      find_interesting_uses_op (data, op);
+      if (!find_address_like_use (data, stmt, use_p->use, iv))
+	find_interesting_uses_op (data, op);
     }
 }
 
@@ -2516,7 +2585,7 @@  addr_offset_valid_p (struct iv_use *use,
   rtx reg, addr;
   unsigned list_index;
   addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (use->iv->base));
-  machine_mode addr_mode, mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));
+  machine_mode addr_mode, mem_mode = TYPE_MODE (use->mem_type);
 
   list_index = (unsigned) as * MAX_MACHINE_MODE + (unsigned) mem_mode;
   if (list_index >= vec_safe_length (addr_list))
@@ -2573,7 +2642,7 @@  split_small_address_groups_p (struct ivo
       if (group->vuses.length () == 1)
 	continue;
 
-      gcc_assert (group->type == USE_ADDRESS);
+      gcc_assert (address_p (group->type));
       if (group->vuses.length () == 2)
 	{
 	  if (compare_sizes_for_sort (group->vuses[0]->addr_offset,
@@ -2625,7 +2694,7 @@  split_address_groups (struct ivopts_data
       if (group->vuses.length () == 1)
 	continue;
 
-      gcc_assert (group->type == USE_ADDRESS);
+      gcc_assert (address_p (use->type));
 
       for (j = 1; j < group->vuses.length ();)
 	{
@@ -3145,7 +3214,7 @@  add_autoinc_candidates (struct ivopts_da
 
   cstepi = int_cst_value (step);
 
-  mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));
+  mem_mode = TYPE_MODE (use->mem_type);
   if (((USE_LOAD_PRE_INCREMENT (mem_mode)
 	|| USE_STORE_PRE_INCREMENT (mem_mode))
        && must_eq (GET_MODE_SIZE (mem_mode), cstepi))
@@ -3436,7 +3505,7 @@  add_iv_candidate_for_use (struct ivopts_
   /* At last, add auto-incremental candidates.  Make such variables
      important since other iv uses with same base object may be based
      on it.  */
-  if (use != NULL && use->type == USE_ADDRESS)
+  if (use != NULL && address_p (use->type))
     add_autoinc_candidates (data, iv->base, iv->step, true, use);
 }
 
@@ -3903,7 +3972,7 @@  get_use_type (struct iv_use *use)
   tree base_type = TREE_TYPE (use->iv->base);
   tree type;
 
-  if (use->type == USE_ADDRESS)
+  if (use->type == USE_REF_ADDRESS)
     {
       /* The base_type may be a void pointer.  Create a pointer type based on
 	 the mem_ref instead.  */
@@ -4331,7 +4400,7 @@  get_address_cost (struct ivopts_data *da
   struct mem_address parts = {NULL_TREE, integer_one_node,
 			      NULL_TREE, NULL_TREE, NULL_TREE};
   machine_mode addr_mode = TYPE_MODE (type);
-  machine_mode mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));
+  machine_mode mem_mode = TYPE_MODE (use->mem_type);
   addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (use->iv->base));
   /* Only true if ratio != 1.  */
   bool ok_with_ratio_p = false;
@@ -5220,7 +5289,8 @@  determine_group_iv_cost (struct ivopts_d
     case USE_NONLINEAR_EXPR:
       return determine_group_iv_cost_generic (data, group, cand);
 
-    case USE_ADDRESS:
+    case USE_REF_ADDRESS:
+    case USE_PTR_ADDRESS:
       return determine_group_iv_cost_address (data, group, cand);
 
     case USE_COMPARE:
@@ -5238,7 +5308,7 @@  determine_group_iv_cost (struct ivopts_d
 autoinc_possible_for_pair (struct ivopts_data *data, struct iv_use *use,
 			   struct iv_cand *cand)
 {
-  if (use->type != USE_ADDRESS)
+  if (!address_p (use->type))
     return false;
 
   bool can_autoinc = false;
@@ -6997,6 +7067,27 @@  adjust_iv_update_pos (struct iv_cand *ca
   cand->incremented_at = use->stmt;
 }
 
+/* Return the alias pointer type that should be used for a MEM_REF
+   associated with USE, which has type USE_PTR_ADDRESS.  */
+
+static tree
+get_alias_ptr_type_for_ptr_address (iv_use *use)
+{
+  gcall *call = as_a <gcall *> (use->stmt);
+  switch (gimple_call_internal_fn (call))
+    {
+    case IFN_MASK_LOAD:
+    case IFN_MASK_STORE:
+      /* The second argument contains the correct alias type.  */
+      gcc_assert (use->op_p == gimple_call_arg_ptr (call, 0));
+      return TREE_TYPE (gimple_call_arg (call, 1));
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+
 /* Rewrites USE (address that is an iv) using candidate CAND.  */
 
 static void
@@ -7025,16 +7116,31 @@  rewrite_use_address (struct ivopts_data
   tree iv = var_at_stmt (data->current_loop, cand, use->stmt);
   tree base_hint = (cand->iv->base_object) ? iv : NULL_TREE;
   gimple_stmt_iterator bsi = gsi_for_stmt (use->stmt);
-  tree type = TREE_TYPE (*use->op_p);
-  unsigned int align = get_object_alignment (*use->op_p);
-  if (align != TYPE_ALIGN (type))
-    type = build_aligned_type (type, align);
-
-  tree ref = create_mem_ref (&bsi, type, &aff,
-			     reference_alias_ptr_type (*use->op_p),
+  tree type = use->mem_type;
+  tree alias_ptr_type;
+  if (use->type == USE_PTR_ADDRESS)
+    alias_ptr_type = get_alias_ptr_type_for_ptr_address (use);
+  else
+    {
+      gcc_assert (type == TREE_TYPE (*use->op_p));
+      unsigned int align = get_object_alignment (*use->op_p);
+      if (align != TYPE_ALIGN (type))
+	type = build_aligned_type (type, align);
+      alias_ptr_type = reference_alias_ptr_type (*use->op_p);
+    }
+  tree ref = create_mem_ref (&bsi, type, &aff, alias_ptr_type,
 			     iv, base_hint, data->speed);
 
-  copy_ref_info (ref, *use->op_p);
+  if (use->type == USE_PTR_ADDRESS)
+    {
+      ref = fold_build1 (ADDR_EXPR, build_pointer_type (use->mem_type), ref);
+      ref = fold_convert (get_use_type (use), ref);
+      ref = force_gimple_operand_gsi (&bsi, ref, true, NULL_TREE,
+				      true, GSI_SAME_STMT);
+    }
+  else
+    copy_ref_info (ref, *use->op_p);
+
   *use->op_p = ref;
 }
 
@@ -7110,7 +7216,7 @@  rewrite_groups (struct ivopts_data *data
 	      update_stmt (group->vuses[j]->stmt);
 	    }
 	}
-      else if (group->type == USE_ADDRESS)
+      else if (address_p (group->type))
 	{
 	  for (j = 0; j < group->vuses.length (); j++)
 	    {
Index: gcc/testsuite/gcc.dg/tree-ssa/scev-9.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/scev-9.c	2016-05-02 10:44:33.000000000 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/scev-9.c	2017-11-17 15:02:12.869042409 +0000
@@ -18,5 +18,5 @@  foo (unsigned char s, unsigned char l)
 }
 
 /* Address of array reference is scev.  */
-/* { dg-final { scan-tree-dump-times "  Type:\\tADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
+/* { dg-final { scan-tree-dump-times "  Type:\\tREFERENCE ADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
 
Index: gcc/testsuite/gcc.dg/tree-ssa/scev-10.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/scev-10.c	2016-05-02 10:44:33.000000000 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/scev-10.c	2017-11-17 15:02:12.869042409 +0000
@@ -18,5 +18,5 @@  foo (signed char s, signed char l)
 }
 
 /* Address of array reference is scev.  */
-/* { dg-final { scan-tree-dump-times "  Type:\\tADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
+/* { dg-final { scan-tree-dump-times "  Type:\\tREFERENCE ADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
 
Index: gcc/testsuite/gcc.dg/tree-ssa/scev-11.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/scev-11.c	2016-05-02 10:44:33.000000000 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/scev-11.c	2017-11-17 15:02:12.869042409 +0000
@@ -23,4 +23,4 @@  foo (int n)
 }
 
 /* Address of array reference to b is scev.  */
-/* { dg-final { scan-tree-dump-times "  Type:\\tADDRESS\n  Use \[0-9\].\[0-9\]:" 2 "ivopts" } } */
+/* { dg-final { scan-tree-dump-times "  Type:\\tREFERENCE ADDRESS\n  Use \[0-9\].\[0-9\]:" 2 "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/scev-12.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/scev-12.c	2016-05-02 10:44:33.000000000 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/scev-12.c	2017-11-17 15:02:12.869042409 +0000
@@ -24,4 +24,4 @@  foo (int x, int n)
 }
 
 /* Address of array reference to b is not scev.  */
-/* { dg-final { scan-tree-dump-times "  Type:\\tADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
+/* { dg-final { scan-tree-dump-times "  Type:\\tREFERENCE ADDRESS\n  Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_index_offset_1.c
===================================================================
--- /dev/null	2017-11-14 14:28:07.424493901 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_index_offset_1.c	2017-11-17 15:02:12.869042409 +0000
@@ -0,0 +1,49 @@ 
+/* { dg-do compile } */
+/* { dg-options "-std=c99 -ftree-vectorize -O2 -fno-inline -march=armv8-a+sve -msve-vector-bits=256" } */
+
+#define SIZE 15*8+3
+
+#define INDEX_OFFSET_TEST_1(SIGNED, TYPE, ITERTYPE) \
+void set_##SIGNED##_##TYPE##_##ITERTYPE (SIGNED TYPE *__restrict out, \
+					 SIGNED TYPE *__restrict in) \
+{ \
+  SIGNED ITERTYPE i; \
+  for (i = 0; i < SIZE; i++) \
+  { \
+    out[i] = in[i]; \
+  } \
+} \
+void set_##SIGNED##_##TYPE##_##ITERTYPE##_var (SIGNED TYPE *__restrict out, \
+					       SIGNED TYPE *__restrict in, \
+					       SIGNED ITERTYPE n) \
+{\
+  SIGNED ITERTYPE i;\
+  for (i = 0; i < n; i++)\
+  {\
+    out[i] = in[i];\
+  }\
+}
+
+#define INDEX_OFFSET_TEST(SIGNED, TYPE)\
+  INDEX_OFFSET_TEST_1 (SIGNED, TYPE, char) \
+  INDEX_OFFSET_TEST_1 (SIGNED, TYPE, short) \
+  INDEX_OFFSET_TEST_1 (SIGNED, TYPE, int) \
+  INDEX_OFFSET_TEST_1 (SIGNED, TYPE, long)
+
+INDEX_OFFSET_TEST (signed, long)
+INDEX_OFFSET_TEST (unsigned, long)
+INDEX_OFFSET_TEST (signed, int)
+INDEX_OFFSET_TEST (unsigned, int)
+INDEX_OFFSET_TEST (signed, short)
+INDEX_OFFSET_TEST (unsigned, short)
+INDEX_OFFSET_TEST (signed, char)
+INDEX_OFFSET_TEST (unsigned, char)
+
+/* { dg-final { scan-assembler-times "ld1d\\tz\[0-9\]+.d, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+, lsl 3\\\]" 16 } } */
+/* { dg-final { scan-assembler-times "st1d\\tz\[0-9\]+.d, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+, lsl 3\\\]" 16 } } */
+/* { dg-final { scan-assembler-times "ld1w\\tz\[0-9\]+.s, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+, lsl 2\\\]" 16 } } */
+/* { dg-final { scan-assembler-times "st1w\\tz\[0-9\]+.s, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+, lsl 2\\\]" 16 } } */
+/* { dg-final { scan-assembler-times "ld1h\\tz\[0-9\]+.h, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+, lsl 1\\\]" 16 } } */
+/* { dg-final { scan-assembler-times "st1h\\tz\[0-9\]+.h, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+, lsl 1\\\]" 16 } } */
+/* { dg-final { scan-assembler-times "ld1b\\tz\[0-9\]+.b, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+\\\]" 16 } } */
+/* { dg-final { scan-assembler-times "st1b\\tz\[0-9\]+.b, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+\\\]" 16 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_index_offset_1_run.c
===================================================================
--- /dev/null	2017-11-14 14:28:07.424493901 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_index_offset_1_run.c	2017-11-17 15:02:12.869042409 +0000
@@ -0,0 +1,48 @@ 
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-std=c99 -ftree-vectorize -O2 -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-std=c99 -ftree-vectorize -O2 -fno-inline -march=armv8-a+sve -msve-vector-bits=256" { target aarch64_sve256_hw } } */
+
+#include "sve_index_offset_1.c"
+
+#include <string.h>
+
+#define CALL_INDEX_OFFSET_TEST_1(SIGNED, TYPE, ITERTYPE)\
+{\
+  SIGNED TYPE out[SIZE + 1];\
+  SIGNED TYPE in1[SIZE + 1];\
+  SIGNED TYPE in2[SIZE + 1];\
+  for (int i = 0; i < SIZE + 1; ++i)\
+    {\
+      in1[i] = (i * 4) ^ i;\
+      in2[i] = (i * 2) ^ i;\
+    }\
+  out[SIZE] = 42;\
+  set_##SIGNED##_##TYPE##_##ITERTYPE (out, in1); \
+  if (0 != memcmp (out, in1, SIZE * sizeof (TYPE)))\
+    return 1;\
+  set_##SIGNED##_##TYPE##_##ITERTYPE##_var (out, in2, SIZE); \
+  if (0 != memcmp (out, in2, SIZE * sizeof (TYPE)))\
+    return 1;\
+  if (out[SIZE] != 42)\
+    return 1;\
+}
+
+#define CALL_INDEX_OFFSET_TEST(SIGNED, TYPE)\
+  CALL_INDEX_OFFSET_TEST_1 (SIGNED, TYPE, char) \
+  CALL_INDEX_OFFSET_TEST_1 (SIGNED, TYPE, short) \
+  CALL_INDEX_OFFSET_TEST_1 (SIGNED, TYPE, int) \
+  CALL_INDEX_OFFSET_TEST_1 (SIGNED, TYPE, long)
+
+int
+main (void)
+{
+  CALL_INDEX_OFFSET_TEST (signed, long)
+  CALL_INDEX_OFFSET_TEST (unsigned, long)
+  CALL_INDEX_OFFSET_TEST (signed, int)
+  CALL_INDEX_OFFSET_TEST (unsigned, int)
+  CALL_INDEX_OFFSET_TEST (signed, short)
+  CALL_INDEX_OFFSET_TEST (unsigned, short)
+  CALL_INDEX_OFFSET_TEST (signed, char)
+  CALL_INDEX_OFFSET_TEST (unsigned, char)
+  return 0;
+}
Index: gcc/testsuite/gcc.target/aarch64/sve_loop_add_2.c
===================================================================
--- /dev/null	2017-11-14 14:28:07.424493901 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_loop_add_2.c	2017-11-17 15:02:12.869952359 +0000
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-std=c99 -O3 -march=armv8-a+sve" } */
+
+void
+foo (int *__restrict a, int *__restrict b)
+{
+  for (int i = 0; i < 512; ++i)
+    a[i] += b[i];
+}
+
+/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+.s, p[0-7]+/z, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+.s, p[0-7]+, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 1 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_loop_add_3.c
===================================================================
--- /dev/null	2017-11-14 14:28:07.424493901 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_loop_add_3.c	2017-11-17 15:02:12.869952359 +0000
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-std=c99 -O3 -march=armv8-a+sve" } */
+
+void
+f (int *__restrict a,
+   int *__restrict b,
+   int *__restrict c,
+   int *__restrict d,
+   int *__restrict e,
+   int *__restrict f,
+   int *__restrict g,
+   int *__restrict h,
+   int count)
+{
+  for (int i = 0; i < count; ++i)
+    a[i] = b[i] + c[i] + d[i] + e[i] + f[i] + g[i] + h[i];
+}
+
+/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+.s, p[0-7]+/z, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 7 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+.s, p[0-7]+, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 1 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_while_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve_while_1.c	2017-11-17 14:54:06.035305786 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_while_1.c	2017-11-17 15:02:12.869952359 +0000
@@ -34,3 +34,11 @@  TEST_ALL (ADD_LOOP)
 /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */
 /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */
 /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_while_2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve_while_2.c	2017-11-17 14:54:06.035305786 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_while_2.c	2017-11-17 15:02:12.869952359 +0000
@@ -34,3 +34,11 @@  TEST_ALL (ADD_LOOP)
 /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */
 /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */
 /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_while_3.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve_while_3.c	2017-11-17 14:54:06.035305786 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_while_3.c	2017-11-17 15:02:12.869952359 +0000
@@ -34,3 +34,11 @@  TEST_ALL (ADD_LOOP)
 /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */
 /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */
 /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_while_4.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve_while_4.c	2017-11-17 14:54:06.035305786 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_while_4.c	2017-11-17 15:02:12.869952359 +0000
@@ -35,3 +35,11 @@  TEST_ALL (ADD_LOOP)
 /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */
 /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */
 /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */