From patchwork Tue Jan 10 08:26:42 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chung-Lin Tang X-Patchwork-Id: 90649 Delivered-To: patch@linaro.org Received: by 10.140.20.99 with SMTP id 90csp503105qgi; Tue, 10 Jan 2017 00:27:29 -0800 (PST) X-Received: by 10.99.54.79 with SMTP id d76mr2523628pga.91.1484036849062; Tue, 10 Jan 2017 00:27:29 -0800 (PST) Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id l36si1365522plg.145.2017.01.10.00.27.28 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 10 Jan 2017 00:27:29 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-return-445744-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org; spf=pass (google.com: domain of gcc-patches-return-445744-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-445744-patch=linaro.org@gcc.gnu.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to:cc :from:subject:message-id:date:mime-version:content-type; q=dns; s=default; b=YNkkevfvJHWPjVTv53fyweBrr1IIU4DNVnA7ICnz+ihrslTVb5 L5vMIf3K1j5tzhGpfDeIm9JPMg0RNKhzUnoeVc0xFHa4E7gnwVwcj11dQlJmA8bc g7t2FFeM/fmSqAFhS8G6gqj+TY0PmiE/3IMnZ0r0pu08uXA/RdXyB7430= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to:cc :from:subject:message-id:date:mime-version:content-type; s= default; bh=EteIvUH429yNBbS8EL+vREHilgA=; b=N/qkotzWGUOlSG6yRxnZ bAjBv4g8/zzO6O92PuTxv5tNAOXX3CdLOBdiu/G7A+t7cvHRQpNR/psQE4SuCfwU mbDntINwwtVOY9cVaAK4eGVMJCrb5ycJZcAlkm59LZI2Y1YePtXbtXZIgezetQkS HqNULy1lqkcTIOVRLoSLTtA= Received: (qmail 12577 invoked by alias); 10 Jan 2017 08:27:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 11775 invoked by uid 89); 10 Jan 2017 08:27:03 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.6 required=5.0 tests=AWL, BAYES_40, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_NONE, SPF_PASS, URIBL_RED autolearn=no version=3.3.2 spammy=H*r:ip*0.0.0.0, descr, H*r:0.0.0, sk:multid X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 10 Jan 2017 08:26:52 +0000 Received: from svr-orw-mbx-06.mgc.mentorg.com ([147.34.90.206]) by relay1.mentorg.com with esmtp id 1cQrla-0004IG-22 from ChungLin_Tang@mentor.com for gcc-patches@gcc.gnu.org; Tue, 10 Jan 2017 00:26:50 -0800 Received: from svr-orw-mbx-01.mgc.mentorg.com (147.34.90.201) by SVR-ORW-MBX-06.mgc.mentorg.com (147.34.90.206) with Microsoft SMTP Server (TLS) id 15.0.1210.3; Tue, 10 Jan 2017 00:26:46 -0800 Received: from [0.0.0.0] (147.34.91.1) by svr-orw-mbx-01.mgc.mentorg.com (147.34.90.201) with Microsoft SMTP Server (TLS) id 15.0.1210.3 via Frontend Transport; Tue, 10 Jan 2017 00:26:46 -0800 To: gcc-patches CC: Thomas Schwinge , Cesar Philippidis From: Chung-Lin Tang Subject: [gomp4] Support multi-dimensional pointer based arrays in OpenACC data clauses Message-ID: <9bd92682-c1d3-5530-4f76-fdc68318d8e9@mentor.com> Date: Tue, 10 Jan 2017 16:26:42 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.6.0 MIME-Version: 1.0 This patch implements support for dynamically allocated multi-dimensional arrays in OpenACC data clauses. To illustrate, these kinds of arrays now work: int **a; float *f[100]; double ***d; #pragma acc parallel copy (a[0:100][x:y], f[10:20][0:30]) copyout(d[x:y][x:y][x:y]) { ... } The pointer-to-array-rows kind of case is supposedly also supported in the OpenACC spec (e.g. int (*x)[50]), though support for that is currently still TBD. I've rejected those cases in omp-low. Instead of using multiple continuous map entries like pset/pointer maps, I've opted to use a different style. The compiler creates a descriptor on stack, and passes the pointer into libgomp. libgomp will then process and exchange it for the actual target dynamic array pointer before kernel launch. Tested and committed to gomp-4_0-branch. This will probably also be sent some time during the next stage-1 for trunk. Chung-Lin 2017-01-10 Chung-Lin Tang gcc/c/ * c-typeck.c (handle_omp_array_sections_1): Add 'bool &non_contiguous' parameter, adjust recursive call site, add cases for allowing pointer based multi-dimensional arrays for OpenACC. (handle_omp_array_sections): Adjust handle_omp_array_sections_1 call, handle non-contiguous case to create dynamic array map. gcc/cp/ * semantics.c (handle_omp_array_sections_1): Add 'bool &non_contiguous' parameter, adjust recursive call site, add cases for allowing pointer based multi-dimensional arrays for OpenACC. (handle_omp_array_sections): Adjust handle_omp_array_sections_1 call, handle non-contiguous case to create dynamic array map. gcc/ * gimplify.c (gimplify_scan_omp_clauses): For dynamic array map kinds, make sure bias in each dimension are put into firstprivate variables. * tree-pretty-print.c (dump_omp_clauses): Add cases for printing GOMP_MAP_DYNAMIC_ARRAY map kinds. * omp-low.c (struct omp_context): Add 'hash_map *dynamic_arrays' field, also added include of "tree-hash-traits.h". (append_field_to_record_type): New function. (create_dynamic_array_descr_type): Likewise. (create_dynamic_array_descr_init_code): Likewise. (new_omp_context): Add initialize of dynamic_arrays field. (delete_omp_context): Add delete of dynamic_arrays field. (scan_sharing_clauses): For dynamic array map kinds, check for supported dimension structure, and install dynamic array variable into current omp_context. (lower_omp_target): Add handling for dynamic array map kinds. (dynamic_array_lookup): New function. (dynamic_array_reference_start): Likewise. (scan_for_op): Likewise. (scan_for_reference): Likewise. (da_create_bias): Likewise. (da_dimension_peel): Likewise. (lower_omp_1): Add case to look for start of dynamic array reference, and handle bias adjustments for the code sequence. include/ * gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define. (enum gomp_map_kind): Add GOMP_MAP_DYNAMIC_ARRAY, GOMP_MAP_DYNAMIC_ARRAY_TO, GOMP_MAP_DYNAMIC_ARRAY_FROM, GOMP_MAP_DYNAMIC_ARRAY_TOFROM, GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO, GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM, GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM, GOMP_MAP_DYNAMIC_ARRAY_ALLOC, GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC, GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT. (GOMP_MAP_DYNAMIC_ARRAY_P): Define. libgomp/ * target.c (struct da_dim): New struct declaration. (struct da_descr_type): Likewise. (struct da_info): Likewise. (gomp_dynamic_array_count_rows): New function. (gomp_dynamic_array_compute_info): Likewise. (gomp_dynamic_array_fill_rows_1): Likewise. (gomp_dynamic_array_fill_rows): Likewise. (gomp_dynamic_array_create_ptrblock): Likewise. (gomp_map_vars): Add code to handle dynamic array map kinds. * testsuite/libgomp.oacc-c-c++-common/da-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/da-2.c: New test. * testsuite/libgomp.oacc-c-c++-common/da-3.c: New test. * testsuite/libgomp.oacc-c-c++-common/da-4.c: New test. * testsuite/libgomp.oacc-c-c++-common/da-utils.h: New test. Index: gcc/c/c-typeck.c =================================================================== --- gcc/c/c-typeck.c (revision 244258) +++ gcc/c/c-typeck.c (revision 244259) @@ -11926,7 +11926,7 @@ static tree handle_omp_array_sections_1 (tree c, tree t, vec &types, bool &maybe_zero_len, unsigned int &first_non_one, - enum c_omp_region_type ort) + bool &non_contiguous, enum c_omp_region_type ort) { tree ret, low_bound, length, type; if (TREE_CODE (t) != TREE_LIST) @@ -11982,7 +11982,8 @@ } ret = handle_omp_array_sections_1 (c, TREE_CHAIN (t), types, - maybe_zero_len, first_non_one, ort); + maybe_zero_len, first_non_one, + non_contiguous, ort); if (ret == error_mark_node || ret == NULL_TREE) return ret; @@ -12142,6 +12143,21 @@ } } } + + /* For OpenACC, if the low_bound/length suggest this is a subarray, + and is referenced through by a pointer, then mark this as + non-contiguous. */ + if (ort == C_ORT_ACC + && types.length () > 0 + && (TREE_CODE (low_bound) != INTEGER_CST + || integer_nonzerop (low_bound) + || (length && (TREE_CODE (length) != INTEGER_CST + || !tree_int_cst_equal (size, length))))) + { + tree x = types.last (); + if (TREE_CODE (x) == POINTER_TYPE) + non_contiguous = true; + } } else if (length == NULL_TREE) { @@ -12183,13 +12199,16 @@ /* If there is a pointer type anywhere but in the very first array-section-subscript, the array section can't be contiguous. */ if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND - && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST) + && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST + && ort != C_ORT_ACC) { error_at (OMP_CLAUSE_LOCATION (c), "array section is not contiguous in %qs clause", omp_clause_code_name[OMP_CLAUSE_CODE (c)]); return error_mark_node; } + else if (TREE_CODE (TREE_CHAIN (t)) == TREE_LIST) + non_contiguous = true; } else { @@ -12217,10 +12236,11 @@ { bool maybe_zero_len = false; unsigned int first_non_one = 0; + bool non_contiguous = false; auto_vec types; tree first = handle_omp_array_sections_1 (c, OMP_CLAUSE_DECL (c), types, maybe_zero_len, first_non_one, - ort); + non_contiguous, ort); if (first == error_mark_node) return true; if (first == NULL_TREE) @@ -12253,6 +12273,7 @@ unsigned int num = types.length (), i; tree t, side_effects = NULL_TREE, size = NULL_TREE; tree condition = NULL_TREE; + tree da_dims = NULL_TREE; if (int_size_in_bytes (TREE_TYPE (first)) <= 0) maybe_zero_len = true; @@ -12276,6 +12297,13 @@ length = fold_convert (sizetype, length); if (low_bound == NULL_TREE) low_bound = integer_zero_node; + + if (non_contiguous) + { + da_dims = tree_cons (low_bound, length, da_dims); + continue; + } + if (!maybe_zero_len && i > first_non_one) { if (integer_nonzerop (low_bound)) @@ -12368,6 +12396,14 @@ size = size_binop (MULT_EXPR, size, l); } } + if (non_contiguous) + { + int kind = OMP_CLAUSE_MAP_KIND (c); + OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_DYNAMIC_ARRAY); + OMP_CLAUSE_DECL (c) = t; + OMP_CLAUSE_SIZE (c) = da_dims; + return false; + } if (side_effects) size = build2 (COMPOUND_EXPR, sizetype, side_effects, size); if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION) Index: gcc/cp/semantics.c =================================================================== --- gcc/cp/semantics.c (revision 244258) +++ gcc/cp/semantics.c (revision 244259) @@ -4482,7 +4482,7 @@ static tree handle_omp_array_sections_1 (tree c, tree t, vec &types, bool &maybe_zero_len, unsigned int &first_non_one, - enum c_omp_region_type ort) + bool &non_contiguous, enum c_omp_region_type ort) { tree ret, low_bound, length, type; if (TREE_CODE (t) != TREE_LIST) @@ -4565,7 +4565,8 @@ && TREE_CODE (TREE_CHAIN (t)) == FIELD_DECL) TREE_CHAIN (t) = omp_privatize_field (TREE_CHAIN (t), false); ret = handle_omp_array_sections_1 (c, TREE_CHAIN (t), types, - maybe_zero_len, first_non_one, ort); + maybe_zero_len, first_non_one, + non_contiguous, ort); if (ret == error_mark_node || ret == NULL_TREE) return ret; @@ -4737,6 +4738,21 @@ } } } + + /* For OpenACC, if the low_bound/length suggest this is a subarray, + and is referenced through by a pointer, then mark this as + non-contiguous. */ + if (ort == C_ORT_ACC + && types.length () > 0 + && (TREE_CODE (low_bound) != INTEGER_CST + || integer_nonzerop (low_bound) + || (length && (TREE_CODE (length) != INTEGER_CST + || !tree_int_cst_equal (size, length))))) + { + tree x = types.last (); + if (TREE_CODE (x) == POINTER_TYPE) + non_contiguous = true; + } } else if (length == NULL_TREE) { @@ -4778,13 +4794,16 @@ /* If there is a pointer type anywhere but in the very first array-section-subscript, the array section can't be contiguous. */ if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND - && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST) + && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST + && ort != C_ORT_ACC) { error_at (OMP_CLAUSE_LOCATION (c), "array section is not contiguous in %qs clause", omp_clause_code_name[OMP_CLAUSE_CODE (c)]); return error_mark_node; } + else if (TREE_CODE (TREE_CHAIN (t)) == TREE_LIST) + non_contiguous = true; } else { @@ -4812,10 +4831,11 @@ { bool maybe_zero_len = false; unsigned int first_non_one = 0; + bool non_contiguous = false; auto_vec types; tree first = handle_omp_array_sections_1 (c, OMP_CLAUSE_DECL (c), types, maybe_zero_len, first_non_one, - ort); + non_contiguous, ort); if (first == error_mark_node) return true; if (first == NULL_TREE) @@ -4849,6 +4869,7 @@ unsigned int num = types.length (), i; tree t, side_effects = NULL_TREE, size = NULL_TREE; tree condition = NULL_TREE; + tree da_dims = NULL_TREE; if (int_size_in_bytes (TREE_TYPE (first)) <= 0) maybe_zero_len = true; @@ -4874,6 +4895,13 @@ length = fold_convert (sizetype, length); if (low_bound == NULL_TREE) low_bound = integer_zero_node; + + if (non_contiguous) + { + da_dims = tree_cons (low_bound, length, da_dims); + continue; + } + if (!maybe_zero_len && i > first_non_one) { if (integer_nonzerop (low_bound)) @@ -4961,6 +4989,14 @@ } if (!processing_template_decl) { + if (non_contiguous) + { + int kind = OMP_CLAUSE_MAP_KIND (c); + OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_DYNAMIC_ARRAY); + OMP_CLAUSE_DECL (c) = t; + OMP_CLAUSE_SIZE (c) = da_dims; + return false; + } if (side_effects) size = build2 (COMPOUND_EXPR, sizetype, side_effects, size); if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION) Index: gcc/gimplify.c =================================================================== --- gcc/gimplify.c (revision 244258) +++ gcc/gimplify.c (revision 244259) @@ -6928,9 +6928,29 @@ if (OMP_CLAUSE_SIZE (c) == NULL_TREE) OMP_CLAUSE_SIZE (c) = DECL_P (decl) ? DECL_SIZE_UNIT (decl) : TYPE_SIZE_UNIT (TREE_TYPE (decl)); - if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p, - NULL, is_gimple_val, fb_rvalue) == GS_ERROR) + if (OMP_CLAUSE_SIZE (c) + && TREE_CODE (OMP_CLAUSE_SIZE (c)) == TREE_LIST + && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))) { + tree dims = OMP_CLAUSE_SIZE (c); + for (tree t = dims; t; t = TREE_CHAIN (t)) + { + /* If a dimension bias isn't a constant, we have to ensure + that the value gets transferred to the offload target. */ + tree low_bound = TREE_PURPOSE (t); + if (TREE_CODE (low_bound) != INTEGER_CST) + { + low_bound = get_initialized_tmp_var (low_bound, pre_p, + NULL); + omp_add_variable (ctx, low_bound, + GOVD_FIRSTPRIVATE | GOVD_SEEN); + TREE_PURPOSE (t) = low_bound; + } + } + } + else if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p, + NULL, is_gimple_val, fb_rvalue) == GS_ERROR) + { remove = true; break; } Index: gcc/tree-pretty-print.c =================================================================== --- gcc/tree-pretty-print.c (revision 244258) +++ gcc/tree-pretty-print.c (revision 244259) @@ -737,6 +737,33 @@ case GOMP_MAP_LINK: pp_string (pp, "link"); break; + case GOMP_MAP_DYNAMIC_ARRAY_TO: + pp_string (pp, "to,dynamic_array"); + break; + case GOMP_MAP_DYNAMIC_ARRAY_FROM: + pp_string (pp, "from,dynamic_array"); + break; + case GOMP_MAP_DYNAMIC_ARRAY_TOFROM: + pp_string (pp, "tofrom,dynamic_array"); + break; + case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO: + pp_string (pp, "force_to,dynamic_array"); + break; + case GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM: + pp_string (pp, "force_from,dynamic_array"); + break; + case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM: + pp_string (pp, "force_tofrom,dynamic_array"); + break; + case GOMP_MAP_DYNAMIC_ARRAY_ALLOC: + pp_string (pp, "alloc,dynamic_array"); + break; + case GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC: + pp_string (pp, "force_alloc,dynamic_array"); + break; + case GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT: + pp_string (pp, "force_present,dynamic_array"); + break; default: gcc_unreachable (); } @@ -758,6 +785,10 @@ case GOMP_MAP_TO_PSET: pp_string (pp, " [pointer set, len: "); break; + case GOMP_MAP_DYNAMIC_ARRAY: + gcc_assert (TREE_CODE (OMP_CLAUSE_SIZE (clause)) == TREE_LIST); + pp_string (pp, " [dimensions: "); + break; default: pp_string (pp, " [len: "); break; Index: gcc/omp-low.c =================================================================== --- gcc/omp-low.c (revision 244258) +++ gcc/omp-low.c (revision 244259) @@ -84,6 +84,7 @@ #include "hsa.h" #include "params.h" #include "tree-ssa-propagate.h" +#include "tree-hash-traits.h" /* Lowering of OMP parallel and workshare constructs proceeds in two phases. The first phase scans the function looking for OMP statements @@ -203,6 +204,9 @@ /* True if this construct can be cancelled. */ bool cancellable; + + /* Hash map of dynamic arrays in this context. */ + hash_map *dynamic_arrays; }; /* A structure holding the elements of: @@ -1619,7 +1623,136 @@ return error_mark_node; } +/* Helper function for create_dynamic_array_descr_type(), to append a new field + to a record type. */ +static void +append_field_to_record_type (tree record_type, tree fld_ident, tree fld_type) +{ + tree *p, fld = build_decl (UNKNOWN_LOCATION, FIELD_DECL, fld_ident, fld_type); + DECL_CONTEXT (fld) = record_type; + + for (p = &TYPE_FIELDS (record_type); *p; p = &DECL_CHAIN (*p)) + ; + *p = fld; +} + +/* Create type for dynamic array descriptor. Returns created type, and + returns the number of dimensions in *DIM_NUM. */ + +static tree +create_dynamic_array_descr_type (tree decl, tree dims, int *dim_num) +{ + int n = 0; + tree da_descr_type, name, x; + gcc_assert (TREE_CODE (dims) == TREE_LIST); + + da_descr_type = lang_hooks.types.make_type (RECORD_TYPE); + name = create_tmp_var_name (".omp_dynamic_array_descr_type"); + name = build_decl (UNKNOWN_LOCATION, TYPE_DECL, name, da_descr_type); + DECL_ARTIFICIAL (name) = 1; + DECL_NAMELESS (name) = 1; + TYPE_NAME (da_descr_type) = name; + TYPE_ARTIFICIAL (da_descr_type) = 1; + + /* Main starting pointer/array. */ + tree main_var_type = TREE_TYPE (decl); + if (TREE_CODE (main_var_type) == REFERENCE_TYPE) + main_var_type = TREE_TYPE (main_var_type); + append_field_to_record_type (da_descr_type, DECL_NAME (decl), + (TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE + ? main_var_type + : build_pointer_type (main_var_type))); + /* Number of dimensions. */ + append_field_to_record_type (da_descr_type, get_identifier ("$dim_num"), + sizetype); + + for (x = dims; x; x = TREE_CHAIN (x), n++) + { + char *fldname; + /* One for the start index. */ + ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_base", n); + append_field_to_record_type (da_descr_type, get_identifier (fldname), + sizetype); + /* One for the length. */ + ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_length", n); + append_field_to_record_type (da_descr_type, get_identifier (fldname), + sizetype); + /* One for the element size. */ + ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_elem_size", n); + append_field_to_record_type (da_descr_type, get_identifier (fldname), + sizetype); + /* One for is_array flag. */ + ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_is_array", n); + append_field_to_record_type (da_descr_type, get_identifier (fldname), + sizetype); + } + + layout_type (da_descr_type); + *dim_num = n; + return da_descr_type; +} + +/* Generate code sequence for initializing dynamic array descriptor. */ + +static void +create_dynamic_array_descr_init_code (tree da_descr, tree da_var, + tree dimensions, int da_dim_num, + gimple_seq *ilist) +{ + tree fld, fldref; + tree da_descr_type = TREE_TYPE (da_descr); + tree dim_type = TREE_TYPE (da_var); + + fld = TYPE_FIELDS (da_descr_type); + fldref = omp_build_component_ref (da_descr, fld); + gimplify_assign (fldref, (TREE_CODE (dim_type) == ARRAY_TYPE + ? build_fold_addr_expr (da_var) : da_var), ilist); + + if (TREE_CODE (dim_type) == REFERENCE_TYPE) + dim_type = TREE_TYPE (dim_type); + + fld = TREE_CHAIN (fld); + fldref = omp_build_component_ref (da_descr, fld); + gimplify_assign (fldref, build_int_cst (sizetype, da_dim_num), ilist); + + while (dimensions) + { + tree dim_base = fold_convert (sizetype, TREE_PURPOSE (dimensions)); + tree dim_length = fold_convert (sizetype, TREE_VALUE (dimensions)); + tree dim_elem_size = TYPE_SIZE_UNIT (TREE_TYPE (dim_type)); + tree dim_is_array = (TREE_CODE (dim_type) == ARRAY_TYPE + ? integer_one_node : integer_zero_node); + /* Set base. */ + fld = TREE_CHAIN (fld); + fldref = omp_build_component_ref (da_descr, fld); + dim_base = fold_build2 (MULT_EXPR, sizetype, dim_base, dim_elem_size); + gimplify_assign (fldref, dim_base, ilist); + + /* Set length. */ + fld = TREE_CHAIN (fld); + fldref = omp_build_component_ref (da_descr, fld); + dim_length = fold_build2 (MULT_EXPR, sizetype, dim_length, dim_elem_size); + gimplify_assign (fldref, dim_length, ilist); + + /* Set elem_size. */ + fld = TREE_CHAIN (fld); + fldref = omp_build_component_ref (da_descr, fld); + dim_elem_size = fold_convert (sizetype, dim_elem_size); + gimplify_assign (fldref, dim_elem_size, ilist); + + /* Set is_array flag. */ + fld = TREE_CHAIN (fld); + fldref = omp_build_component_ref (da_descr, fld); + dim_is_array = fold_convert (sizetype, dim_is_array); + gimplify_assign (fldref, dim_is_array, ilist); + + dimensions = TREE_CHAIN (dimensions); + dim_type = TREE_TYPE (dim_type); + } + gcc_assert (TREE_CHAIN (fld) == NULL_TREE); +} + /* Debugging dumps for parallel regions. */ void dump_omp_region (FILE *, struct omp_region *, int); void debug_omp_region (struct omp_region *); @@ -1760,6 +1893,8 @@ ctx->cb.decl_map = new hash_map; + ctx->dynamic_arrays = new hash_map; + return ctx; } @@ -1834,6 +1969,8 @@ if (is_task_ctx (ctx)) finalize_task_copyfn (as_a (ctx->stmt)); + delete ctx->dynamic_arrays; + XDELETE (ctx); } @@ -2144,6 +2281,42 @@ install_var_local (decl, ctx); break; } + + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))) + { + tree da_decl = OMP_CLAUSE_DECL (c); + tree da_dimensions = OMP_CLAUSE_SIZE (c); + tree da_type = TREE_TYPE (da_decl); + bool by_ref = (TREE_CODE (da_type) == ARRAY_TYPE + ? true : false); + + /* Checking code to ensure we only have arrays at top dimension. + This limitation might be lifted in the future. */ + if (TREE_CODE (da_type) == REFERENCE_TYPE) + da_type = TREE_TYPE (da_type); + tree t = da_type, prev_t = NULL_TREE; + while (t) + { + if (TREE_CODE (t) == ARRAY_TYPE && prev_t) + { + error_at (gimple_location (ctx->stmt), "array types are" + " only allowed at outermost dimension of" + " dynamic array"); + break; + } + prev_t = t; + t = TREE_TYPE (t); + } + + install_var_field (da_decl, by_ref, 3, ctx); + tree new_var = install_var_local (da_decl, ctx); + + bool existed = ctx->dynamic_arrays->put (new_var, da_dimensions); + gcc_assert (!existed); + break; + } + if (DECL_P (decl)) { if (DECL_SIZE (decl) @@ -16359,6 +16532,15 @@ case GOMP_MAP_FORCE_PRESENT: case GOMP_MAP_FORCE_DEVICEPTR: case GOMP_MAP_DEVICE_RESIDENT: + case GOMP_MAP_DYNAMIC_ARRAY_TO: + case GOMP_MAP_DYNAMIC_ARRAY_FROM: + case GOMP_MAP_DYNAMIC_ARRAY_TOFROM: + case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO: + case GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM: + case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM: + case GOMP_MAP_DYNAMIC_ARRAY_ALLOC: + case GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC: + case GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT: case GOMP_MAP_LINK: gcc_assert (is_gimple_omp_oacc (stmt)); break; @@ -16421,7 +16603,14 @@ if (offloaded && !(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP && OMP_CLAUSE_MAP_IN_REDUCTION (c))) { - x = build_receiver_ref (var, true, ctx); + tree var_type = TREE_TYPE (var); + bool rcv_by_ref = + (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)) + && TREE_CODE (var_type) != ARRAY_TYPE + ? false : true); + + x = build_receiver_ref (var, rcv_by_ref, ctx); tree new_var = lookup_decl (var, ctx); if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP @@ -16665,6 +16854,25 @@ avar = build_fold_addr_expr (avar); gimplify_assign (x, avar, &ilist); } + else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && (OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_DYNAMIC_ARRAY)) + { + int da_dim_num; + tree dimensions = OMP_CLAUSE_SIZE (c); + + tree da_descr_type = + create_dynamic_array_descr_type (OMP_CLAUSE_DECL (c), + dimensions, &da_dim_num); + tree da_descr = + create_tmp_var_raw (da_descr_type, ".$omp_da_descr"); + gimple_add_tmp_var (da_descr); + + create_dynamic_array_descr_init_code + (da_descr, ovar, dimensions, da_dim_num, &ilist); + + gimplify_assign (x, build_fold_addr_expr (da_descr), + &ilist); + } else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE) { gcc_checking_assert (is_gimple_omp_oacc (ctx->stmt)); @@ -16725,6 +16933,9 @@ s = TREE_TYPE (s); s = TYPE_SIZE_UNIT (s); } + else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && (OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_DYNAMIC_ARRAY)) + s = NULL_TREE; else s = OMP_CLAUSE_SIZE (c); if (s == NULL_TREE) @@ -17406,7 +17617,202 @@ gimple_build_omp_return (false)); } +/* Helper to lookup dynamic array through nested omp contexts. Returns + TREE_LIST of dimensions, and the CTX where it was found in *CTX_P. */ +static tree +dynamic_array_lookup (tree t, omp_context **ctx_p) +{ + omp_context *c = *ctx_p; + while (c) + { + tree *dims = c->dynamic_arrays->get (t); + if (dims) + { + *ctx_p = c; + return *dims; + } + c = c->outer; + } + return NULL_TREE; +} + +/* Tests if this gimple STMT is the start of a dynamic array access sequence. + Returns true if found, and also returns the gimple operand ptr and + dimensions tree list through *OUT_REF and *OUT_DIMS respectively. */ + +static bool +dynamic_array_reference_start (gimple *stmt, omp_context **ctx_p, + tree **out_ref, tree *out_dims) +{ + if (gimple_code (stmt) == GIMPLE_ASSIGN) + for (unsigned i = 1; i < gimple_num_ops (stmt); i++) + { + tree *op = gimple_op_ptr (stmt, i), dims; + if (TREE_CODE (*op) == ARRAY_REF) + op = &TREE_OPERAND (*op, 0); + if (TREE_CODE (*op) == MEM_REF) + op = &TREE_OPERAND (*op, 0); + if ((dims = dynamic_array_lookup (*op, ctx_p)) != NULL_TREE) + { + *out_ref = op; + *out_dims = dims; + return true; + } + } + return false; +} + +static tree +scan_for_op (tree *tp, int *walk_subtrees, void *data) +{ + struct walk_stmt_info *wi = (struct walk_stmt_info *) data; + tree t = *tp; + tree op = (tree) wi->info; + *walk_subtrees = 1; + if (operand_equal_p (t, op, 0)) + { + wi->info = tp; + return t; + } + return NULL_TREE; +} + +static tree * +scan_for_reference (gimple *stmt, tree op) +{ + struct walk_stmt_info wi; + memset (&wi, 0, sizeof (wi)); + wi.info = op; + if (walk_gimple_op (stmt, scan_for_op, &wi)) + return (tree *) wi.info; + return NULL; +} + +static tree +da_create_bias (tree orig_bias, tree unit_type) +{ + return build2 (MULT_EXPR, sizetype, fold_convert (sizetype, orig_bias), + TYPE_SIZE_UNIT (unit_type)); +} + +/* Main worker for adjusting dynamic array accesses, handles the adjustment + of many cases of statement forms, and called multiple times to 'peel' away + each dimension. */ + +static gimple_stmt_iterator +da_dimension_peel (omp_context *da_ctx, + gimple_stmt_iterator da_gsi, tree orig_da, + tree *da_op_p, tree *da_type_p, tree *da_dims_p) +{ + gimple *stmt = gsi_stmt (da_gsi); + tree lhs = gimple_assign_lhs (stmt); + tree rhs = gimple_assign_rhs1 (stmt); + + if (gimple_num_ops (stmt) == 2 + && TREE_CODE (rhs) == MEM_REF + && operand_equal_p (*da_op_p, TREE_OPERAND (rhs, 0), 0) + && !operand_equal_p (orig_da, TREE_OPERAND (rhs, 0), 0) + && (TREE_OPERAND (rhs, 1) == NULL_TREE + || integer_zerop (TREE_OPERAND (rhs, 1)))) + { + gcc_assert (TREE_CODE (TREE_TYPE (*da_type_p)) == POINTER_TYPE); + *da_type_p = TREE_TYPE (*da_type_p); + } + else + { + gimple *g; + gimple_seq ilist = NULL; + tree bias, t; + tree op = *da_op_p; + tree orig_type = *da_type_p; + tree orig_bias = TREE_PURPOSE (*da_dims_p); + bool by_ref = false; + + if (TREE_CODE (orig_bias) != INTEGER_CST) + orig_bias = lookup_decl (orig_bias, da_ctx); + + if (gimple_num_ops (stmt) == 2) + { + if (TREE_CODE (rhs) == ADDR_EXPR) + { + rhs = TREE_OPERAND (rhs, 0); + *da_dims_p = NULL_TREE; + } + + if (TREE_CODE (rhs) == ARRAY_REF + && TREE_CODE (TREE_OPERAND (rhs, 0)) == MEM_REF + && operand_equal_p (TREE_OPERAND (TREE_OPERAND (rhs, 0), 0), + *da_op_p, 0)) + { + bias = da_create_bias (orig_bias, + TREE_TYPE (TREE_TYPE (orig_type))); + *da_type_p = TREE_TYPE (TREE_TYPE (orig_type)); + } + else if (TREE_CODE (rhs) == ARRAY_REF + && TREE_CODE (TREE_OPERAND (rhs, 0)) == VAR_DECL + && operand_equal_p (TREE_OPERAND (rhs, 0), *da_op_p, 0)) + { + tree ptr_type = build_pointer_type (orig_type); + op = create_tmp_var (ptr_type); + gimplify_assign (op, build_fold_addr_expr (TREE_OPERAND (rhs, 0)), + &ilist); + bias = da_create_bias (orig_bias, TREE_TYPE (orig_type)); + *da_type_p = TREE_TYPE (orig_type); + orig_type = ptr_type; + by_ref = true; + } + else if (TREE_CODE (rhs) == MEM_REF + && operand_equal_p (*da_op_p, TREE_OPERAND (rhs, 0), 0) + && TREE_OPERAND (rhs, 1) != NULL_TREE) + { + bias = da_create_bias (orig_bias, TREE_TYPE (orig_type)); + *da_type_p = TREE_TYPE (orig_type); + } + else if (TREE_CODE (lhs) == MEM_REF + && operand_equal_p (*da_op_p, TREE_OPERAND (lhs, 0), 0)) + { + if (*da_dims_p != NULL_TREE) + { + gcc_assert (TREE_CHAIN (*da_dims_p) == NULL_TREE); + bias = da_create_bias (orig_bias, TREE_TYPE (orig_type)); + *da_type_p = TREE_TYPE (orig_type); + } + else + /* This should be the end of the dynamic array access + sequence. */ + return da_gsi; + } + else + gcc_unreachable (); + } + else if (gimple_num_ops (stmt) == 3 + && gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR + && operand_equal_p (*da_op_p, rhs, 0)) + { + bias = da_create_bias (orig_bias, TREE_TYPE (orig_type)); + } + else + gcc_unreachable (); + + bias = fold_build1 (NEGATE_EXPR, sizetype, bias); + bias = fold_build2 (POINTER_PLUS_EXPR, orig_type, op, bias); + + t = create_tmp_var (by_ref ? build_pointer_type (orig_type) : orig_type); + + g = gimplify_assign (t, bias, &ilist); + gsi_insert_seq_before (&da_gsi, ilist, GSI_NEW_STMT); + *da_op_p = gimple_assign_lhs (g); + + if (by_ref) + *da_op_p = build2 (MEM_REF, TREE_TYPE (orig_type), *da_op_p, + build_int_cst (orig_type, 0)); + *da_dims_p = TREE_CHAIN (*da_dims_p); + } + + return da_gsi; +} + /* Callback for lower_omp_1. Return non-NULL if *tp needs to be regimplified. If DATA is non-NULL, lower_omp_1 is outside of OMP context, but with task_shared_vars set. */ @@ -17681,6 +18087,51 @@ } /* FALLTHRU */ default: + + /* If we detect the start of a dynamic array reference sequence, scan + and do the needed adjustments. */ + tree da_dims, *da_op_p; + omp_context *da_ctx = ctx; + if (da_ctx && dynamic_array_reference_start (stmt, &da_ctx, + &da_op_p, &da_dims)) + { + bool started = false; + tree orig_da = *da_op_p; + tree da_type = TREE_TYPE (orig_da); + tree next_da_op; + + gimple_stmt_iterator da_gsi = *gsi_p, new_gsi; + while (da_op_p) + { + if (!is_gimple_assign (gsi_stmt (da_gsi)) + || ((gimple_assign_single_p (gsi_stmt (da_gsi)) + || gimple_assign_cast_p (gsi_stmt (da_gsi))) + && *da_op_p == gimple_assign_rhs1 (gsi_stmt (da_gsi)))) + break; + + new_gsi = da_dimension_peel (da_ctx, da_gsi, orig_da, + da_op_p, &da_type, &da_dims); + if (!started) + { + /* Point 'stmt' to the start of the newly added + sequence. */ + started = true; + *gsi_p = new_gsi; + stmt = gsi_stmt (*gsi_p); + } + if (!da_dims) + break; + + next_da_op = gimple_assign_lhs (gsi_stmt (da_gsi)); + + do { + gsi_next (&da_gsi); + da_op_p = scan_for_reference (gsi_stmt (da_gsi), next_da_op); + } + while (!da_op_p); + } + } + if ((ctx || task_shared_vars) && walk_gimple_op (stmt, lower_omp_regimplify_p, ctx ? NULL : &wi)) Index: include/gomp-constants.h =================================================================== --- include/gomp-constants.h (revision 244258) +++ include/gomp-constants.h (revision 244259) @@ -40,6 +40,7 @@ #define GOMP_MAP_FLAG_SPECIAL_0 (1 << 2) #define GOMP_MAP_FLAG_SPECIAL_1 (1 << 3) #define GOMP_MAP_FLAG_SPECIAL_2 (1 << 4) +#define GOMP_MAP_FLAG_SPECIAL_3 (1 << 5) #define GOMP_MAP_FLAG_SPECIAL (GOMP_MAP_FLAG_SPECIAL_1 \ | GOMP_MAP_FLAG_SPECIAL_0) /* Flag to force a specific behavior (or else, trigger a run-time error). */ @@ -128,7 +129,26 @@ /* Decrement usage count and deallocate if zero. */ GOMP_MAP_RELEASE = (GOMP_MAP_FLAG_SPECIAL_2 | GOMP_MAP_DELETE), - + /* Mapping kinds for dynamic arrays. */ + GOMP_MAP_DYNAMIC_ARRAY = (GOMP_MAP_FLAG_SPECIAL_3), + GOMP_MAP_DYNAMIC_ARRAY_TO = (GOMP_MAP_DYNAMIC_ARRAY + | GOMP_MAP_TO), + GOMP_MAP_DYNAMIC_ARRAY_FROM = (GOMP_MAP_DYNAMIC_ARRAY + | GOMP_MAP_FROM), + GOMP_MAP_DYNAMIC_ARRAY_TOFROM = (GOMP_MAP_DYNAMIC_ARRAY + | GOMP_MAP_TOFROM), + GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO = (GOMP_MAP_DYNAMIC_ARRAY_TO + | GOMP_MAP_FLAG_FORCE), + GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM = (GOMP_MAP_DYNAMIC_ARRAY_FROM + | GOMP_MAP_FLAG_FORCE), + GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM = (GOMP_MAP_DYNAMIC_ARRAY_TOFROM + | GOMP_MAP_FLAG_FORCE), + GOMP_MAP_DYNAMIC_ARRAY_ALLOC = (GOMP_MAP_DYNAMIC_ARRAY + | GOMP_MAP_ALLOC), + GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC = (GOMP_MAP_DYNAMIC_ARRAY + | GOMP_MAP_FORCE_ALLOC), + GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT = (GOMP_MAP_DYNAMIC_ARRAY + | GOMP_MAP_FORCE_PRESENT), /* Internal to GCC, not used in libgomp. */ /* Do not map, but pointer assign a pointer instead. */ GOMP_MAP_FIRSTPRIVATE_POINTER = (GOMP_MAP_LAST | 1), @@ -156,6 +176,8 @@ #define GOMP_MAP_ALWAYS_P(X) \ (GOMP_MAP_ALWAYS_TO_P (X) || ((X) == GOMP_MAP_ALWAYS_FROM)) +#define GOMP_MAP_DYNAMIC_ARRAY_P(X) \ + ((X) & GOMP_MAP_DYNAMIC_ARRAY) /* Asynchronous behavior. Keep in sync with libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_async_t. */ Index: libgomp/target.c =================================================================== --- libgomp/target.c (revision 244258) +++ libgomp/target.c (revision 244259) @@ -375,6 +375,140 @@ return tgt->tgt_start + tgt->list[i].offset; } +/* Dynamic array related data structures, interfaces with the compiler. */ + +struct da_dim { + size_t base; + size_t length; + size_t elem_size; + size_t is_array; +}; + +struct da_descr_type { + void *ptr; + size_t ndims; + struct da_dim dims[]; +}; + +/* Internal dynamic array info struct, used only here inside the runtime. */ + +struct da_info +{ + struct da_descr_type *descr; + size_t map_index; + size_t ptrblock_size; + size_t data_row_num; + size_t data_row_size; +}; + +static size_t +gomp_dynamic_array_count_rows (struct da_descr_type *descr) +{ + size_t nrows = 1; + for (size_t d = 0; d < descr->ndims - 1; d++) + nrows *= descr->dims[d].length / sizeof (void *); + return nrows; +} + +static void +gomp_dynamic_array_compute_info (struct da_info *da) +{ + size_t d, n = 1; + struct da_descr_type *descr = da->descr; + + da->ptrblock_size = 0; + for (d = 0; d < descr->ndims - 1; d++) + { + size_t dim_count = descr->dims[d].length / descr->dims[d].elem_size; + size_t dim_ptrblock_size = (descr->dims[d + 1].is_array + ? 0 : descr->dims[d].length * n); + da->ptrblock_size += dim_ptrblock_size; + n *= dim_count; + } + da->data_row_num = n; + da->data_row_size = descr->dims[d].length; +} + +static void +gomp_dynamic_array_fill_rows_1 (struct da_descr_type *descr, void *da, + size_t d, void ***row_ptr, size_t *count) +{ + if (d < descr->ndims - 1) + { + size_t elsize = descr->dims[d].elem_size; + size_t n = descr->dims[d].length / elsize; + void *p = da + descr->dims[d].base; + for (size_t i = 0; i < n; i++) + { + void *ptr = p + i * elsize; + /* Deref if next dimension is not array. */ + if (!descr->dims[d + 1].is_array) + ptr = *((void **) ptr); + gomp_dynamic_array_fill_rows_1 (descr, ptr, d + 1, row_ptr, count); + } + } + else + { + **row_ptr = da + descr->dims[d].base; + *row_ptr += 1; + *count += 1; + } +} + +static size_t +gomp_dynamic_array_fill_rows (struct da_descr_type *descr, void *rows[]) +{ + size_t count = 0; + void **p = rows; + gomp_dynamic_array_fill_rows_1 (descr, descr->ptr, 0, &p, &count); + return count; +} + +static void * +gomp_dynamic_array_create_ptrblock (struct da_info *da, + void *tgt_addr, void *tgt_data_rows[]) +{ + struct da_descr_type *descr = da->descr; + void *ptrblock = gomp_malloc (da->ptrblock_size); + void **curr_dim_ptrblock = (void **) ptrblock; + size_t n = 1; + + for (size_t d = 0; d < descr->ndims - 1; d++) + { + int curr_dim_len = descr->dims[d].length; + int next_dim_len = descr->dims[d + 1].length; + int curr_dim_num = curr_dim_len / sizeof (void *); + + void *next_dim_ptrblock + = (void *)(curr_dim_ptrblock + n * curr_dim_num); + + for (int b = 0; b < n; b++) + for (int i = 0; i < curr_dim_num; i++) + { + if (d < descr->ndims - 2) + { + void *ptr = (next_dim_ptrblock + + b * curr_dim_num * next_dim_len + + i * next_dim_len); + void *tgt_ptr = tgt_addr + (ptr - ptrblock); + curr_dim_ptrblock[b * curr_dim_num + i] = tgt_ptr; + } + else + { + curr_dim_ptrblock[b * curr_dim_num + i] + = tgt_data_rows[b * curr_dim_num + i]; + } + void *addr = &curr_dim_ptrblock[b * curr_dim_num + i]; + assert (ptrblock <= addr && addr < ptrblock + da->ptrblock_size); + } + + n *= curr_dim_num; + curr_dim_ptrblock = next_dim_ptrblock; + } + assert (n == da->data_row_num); + return ptrblock; +} + attribute_hidden struct target_mem_desc * gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum, void **hostaddrs, void **devaddrs, size_t *sizes, void *kinds, @@ -386,9 +520,29 @@ const int typemask = short_mapkind ? 0xff : 0x7; struct splay_tree_s *mem_map = &devicep->mem_map; struct splay_tree_key_s cur_node; - struct target_mem_desc *tgt - = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum); - tgt->list_count = mapnum; + struct target_mem_desc *tgt; + + size_t da_data_row_num = 0, row_start = 0; + size_t da_info_num = 0, da_index; + struct da_info *da_info = NULL; + struct target_var_desc *row_desc; + uintptr_t target_row_addr; + void **host_data_rows = NULL, **target_data_rows = NULL; + void *row; + + for (i = 0; i < mapnum; i++) + { + int kind = get_kind (short_mapkind, kinds, i); + if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask)) + { + da_data_row_num += gomp_dynamic_array_count_rows (hostaddrs[i]); + da_info_num += 1; + } + } + + tgt = gomp_malloc (sizeof (*tgt) + + sizeof (tgt->list[0]) * (mapnum + da_data_row_num)); + tgt->list_count = mapnum + da_data_row_num; tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1; tgt->device_descr = devicep; @@ -399,6 +553,14 @@ return tgt; } + if (da_info_num) + da_info = gomp_alloca (sizeof (struct da_info) * da_info_num); + if (da_data_row_num) + { + host_data_rows = gomp_malloc (sizeof (void *) * da_data_row_num); + target_data_rows = gomp_malloc (sizeof (void *) * da_data_row_num); + } + tgt_align = sizeof (void *); tgt_size = 0; if (pragma_kind == GOMP_MAP_VARS_TARGET) @@ -416,7 +578,7 @@ return NULL; } - for (i = 0; i < mapnum; i++) + for (i = 0, da_index = 0; i < mapnum; i++) { int kind = get_kind (short_mapkind, kinds, i); if (hostaddrs[i] == NULL @@ -482,6 +644,20 @@ has_firstprivate = true; continue; } + else if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask)) + { + /* Ignore dynamic arrays for now, we process them together + later. */ + tgt->list[i].key = NULL; + tgt->list[i].offset = 0; + not_found_cnt++; + + struct da_info *da = &da_info[da_index++]; + da->descr = (struct da_descr_type *) hostaddrs[i]; + da->map_index = i; + continue; + } + cur_node.host_start = (uintptr_t) hostaddrs[i]; if (!GOMP_MAP_POINTER_P (kind & typemask)) cur_node.host_end = cur_node.host_start + sizes[i]; @@ -545,6 +721,55 @@ } } + /* For dynamic arrays. Each data row is one target item, separated from + the normal map clause items, hence we order them after mapnum. */ + for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++) + { + int kind = get_kind (short_mapkind, kinds, i); + if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask)) + continue; + + struct da_info *da = &da_info[da_index++]; + struct da_descr_type *descr = da->descr; + size_t nr; + + gomp_dynamic_array_compute_info (da); + + /* We have allocated space in host/target_data_rows to place all the + row data block pointers, now we can start filling them in. */ + nr = gomp_dynamic_array_fill_rows (descr, &host_data_rows[row_start]); + assert (nr == da->data_row_num); + + size_t align = (size_t) 1 << (kind >> rshift); + if (tgt_align < align) + tgt_align = align; + tgt_size = (tgt_size + align - 1) & ~(align - 1); + tgt_size += da->ptrblock_size; + + for (size_t j = 0; j < da->data_row_num; j++) + { + row = host_data_rows[row_start + j]; + row_desc = &tgt->list[mapnum + row_start + j]; + + cur_node.host_start = (uintptr_t) row; + cur_node.host_end = cur_node.host_start + da->data_row_size; + splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); + if (n) + { + assert (n->refcount != REFCOUNT_LINK); + gomp_map_vars_existing (devicep, n, &cur_node, row_desc, + kind & typemask); + } + else + { + tgt_size = (tgt_size + align - 1) & ~(align - 1); + tgt_size += da->data_row_size; + not_found_cnt++; + } + } + row_start += da->data_row_num; + } + if (devaddrs) { if (mapnum != 1) @@ -675,6 +900,15 @@ default: break; } + + if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask)) + { + tgt->list[i].key = &array->key; + tgt->list[i].key->tgt = tgt; + array++; + continue; + } + splay_tree_key k = &array->key; k->host_start = (uintptr_t) hostaddrs[i]; if (!GOMP_MAP_POINTER_P (kind & typemask)) @@ -825,8 +1059,110 @@ array++; } } + + /* Processing of dynamic array rows. */ + for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++) + { + int kind = get_kind (short_mapkind, kinds, i); + if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask)) + continue; + + struct da_info *da = &da_info[da_index++]; + assert (da->descr == hostaddrs[i]); + + /* The map for the dynamic array itself is never copied from during + unmapping, its the data rows that count. Set copy from flags are + set to false here. */ + tgt->list[i].copy_from = false; + tgt->list[i].always_copy_from = false; + + size_t align = (size_t) 1 << (kind >> rshift); + tgt_size = (tgt_size + align - 1) & ~(align - 1); + + /* For the map of the dynamic array itself, adjust so that the passed + device address points to the beginning of the ptrblock. */ + tgt->list[i].key->tgt_offset = tgt_size; + + void *target_ptrblock = (void*) tgt->tgt_start + tgt_size; + tgt_size += da->ptrblock_size; + + /* Add splay key for each data row in current DA. */ + for (size_t j = 0; j < da->data_row_num; j++) + { + row = host_data_rows[row_start + j]; + row_desc = &tgt->list[mapnum + row_start + j]; + + cur_node.host_start = (uintptr_t) row; + cur_node.host_end = cur_node.host_start + da->data_row_size; + splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); + if (n) + { + assert (n->refcount != REFCOUNT_LINK); + gomp_map_vars_existing (devicep, n, &cur_node, row_desc, + kind & typemask); + target_row_addr = n->tgt->tgt_start + n->tgt_offset; + } + else + { + tgt->refcount++; + + splay_tree_key k = &array->key; + k->host_start = (uintptr_t) row; + k->host_end = k->host_start + da->data_row_size; + + k->tgt = tgt; + k->refcount = 1; + k->link_key = NULL; + tgt_size = (tgt_size + align - 1) & ~(align - 1); + target_row_addr = tgt->tgt_start + tgt_size; + k->tgt_offset = tgt_size; + tgt_size += da->data_row_size; + + row_desc->key = k; + row_desc->copy_from + = GOMP_MAP_COPY_FROM_P (kind & typemask); + row_desc->always_copy_from + = GOMP_MAP_COPY_FROM_P (kind & typemask); + row_desc->offset = 0; + row_desc->length = da->data_row_size; + + array->left = NULL; + array->right = NULL; + splay_tree_insert (mem_map, array); + + if (GOMP_MAP_COPY_TO_P (kind & typemask)) + gomp_copy_host2dev (devicep, + (void *) tgt->tgt_start + k->tgt_offset, + (void *) k->host_start, + da->data_row_size); + array++; + } + target_data_rows[row_start + j] = (void *) target_row_addr; + } + + /* Now we have the target memory allocated, and target offsets of all + row blocks assigned and calculated, we can construct the + accelerator side ptrblock and copy it in. */ + if (da->ptrblock_size) + { + void *ptrblock = gomp_dynamic_array_create_ptrblock + (da, target_ptrblock, target_data_rows + row_start); + gomp_copy_host2dev (devicep, target_ptrblock, ptrblock, + da->ptrblock_size); + free (ptrblock); + } + + row_start += da->data_row_num; + } + assert (row_start == da_data_row_num && da_index == da_info_num); } + if (da_data_row_num) + { + free (host_data_rows); + free (target_data_rows); + } + if (pragma_kind == GOMP_MAP_VARS_TARGET) { for (i = 0; i < mapnum; i++) Index: libgomp/testsuite/libgomp.oacc-c-c++-common/da-3.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/da-3.c (revision 0) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/da-3.c (revision 244259) @@ -0,0 +1,45 @@ +/* { dg-do run { target { ! openacc_host_selected } } } */ + +#include +#include "da-utils.h" + +int main (void) +{ + int n = 20, x = 5, y = 12; + int *****a = (int *****) create_da (sizeof (int), n, 5); + + int sum1 = 0, sum2 = 0, sum3 = 0; + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + for (int l = 0; l < n; l++) + for (int m = 0; m < n; m++) + { + a[i][j][k][l][m] = 1; + sum1++; + } + + #pragma acc parallel copy (a[x:y][x:y][x:y][x:y][x:y]) copy(sum2) + { + for (int i = x; i < x + y; i++) + for (int j = x; j < x + y; j++) + for (int k = x; k < x + y; k++) + for (int l = x; l < x + y; l++) + for (int m = x; m < x + y; m++) + { + a[i][j][k][l][m] = 0; + sum2++; + } + } + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + for (int l = 0; l < n; l++) + for (int m = 0; m < n; m++) + sum3 += a[i][j][k][l][m]; + + assert (sum1 == sum2 + sum3); + return 0; +} Index: libgomp/testsuite/libgomp.oacc-c-c++-common/da-4.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/da-4.c (revision 0) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/da-4.c (revision 244259) @@ -0,0 +1,36 @@ +/* { dg-do run { target { ! openacc_host_selected } } } */ + +#include +#include "da-utils.h" + +int main (void) +{ + int n = 128; + double ***a = (double ***) create_da (sizeof (double), n, 3); + double ***b = (double ***) create_da (sizeof (double), n, 3); + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + a[i][j][k] = i + j + k + i * j * k; + + /* This test exercises async copyout of dynamic array rows. */ + #pragma acc parallel copyin(a[0:n][0:n][0:n]) copyout(b[0:n][0:n][0:n]) async(5) + { + #pragma acc loop gang + for (int i = 0; i < n; i++) + #pragma acc loop vector + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + b[i][j][k] = a[i][j][k] * 2.0; + } + + #pragma acc wait (5) + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + assert (b[i][j][k] == a[i][j][k] * 2.0); + + return 0; +} Index: libgomp/testsuite/libgomp.oacc-c-c++-common/da-utils.h =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/da-utils.h (revision 0) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/da-utils.h (revision 244259) @@ -0,0 +1,44 @@ +#include +#include +#include +#include + +/* Allocate and create a pointer based NDIMS-dimensional array, + each dimension DIMLEN long, with ELSIZE sized data elements. */ +void * +create_da (size_t elsize, int dimlen, int ndims) +{ + size_t blk_size = 0; + size_t n = 1; + + for (int i = 0; i < ndims - 1; i++) + { + n *= dimlen; + blk_size += sizeof (void *) * n; + } + size_t data_rows_num = n; + size_t data_rows_offset = blk_size; + blk_size += elsize * n * dimlen; + + void *blk = (void *) malloc (blk_size); + memset (blk, 0, blk_size); + void **curr_dim = (void **) blk; + n = 1; + + for (int d = 0; d < ndims - 1; d++) + { + uintptr_t next_dim = (uintptr_t) (curr_dim + n * dimlen); + size_t next_dimlen = dimlen * (d < ndims - 2 ? sizeof (void *) : elsize); + + for (int b = 0; b < n; b++) + for (int i = 0; i < dimlen; i++) + if (d < ndims - 1) + curr_dim[b * dimlen + i] + = (void*) (next_dim + b * dimlen * next_dimlen + i * next_dimlen); + + n *= dimlen; + curr_dim = (void**) next_dim; + } + assert (n == data_rows_num); + return blk; +} Index: libgomp/testsuite/libgomp.oacc-c-c++-common/da-1.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/da-1.c (revision 0) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/da-1.c (revision 244259) @@ -0,0 +1,103 @@ +/* { dg-do run { target { ! openacc_host_selected } } } */ + +#include +#include + +#define n 100 +#define m 100 + +int b[n][m]; + +void +test1 (void) +{ + int i, j, *a[100]; + + /* Array of pointers form test. */ + for (i = 0; i < n; i++) + { + a[i] = (int *)malloc (sizeof (int) * m); + for (j = 0; j < m; j++) + b[i][j] = j - i; + } + + #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b) + for (i = 0; i < n; i++) + #pragma acc loop + for (j = 0; j < m; j++) + a[i][j] = b[i][j]; + + for (i = 0; i < n; i++) + { + for (j = 0; j < m; j++) + assert (a[i][j] == b[i][j]); + /* Clean up. */ + free (a[i]); + } +} + +void +test2 (void) +{ + int i, j, **a = (int **) malloc (sizeof (int *) * n); + + /* Separately allocated blocks. */ + for (i = 0; i < n; i++) + { + a[i] = (int *)malloc (sizeof (int) * m); + for (j = 0; j < m; j++) + b[i][j] = j - i; + } + + #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b) + for (i = 0; i < n; i++) + #pragma acc loop + for (j = 0; j < m; j++) + a[i][j] = b[i][j]; + + for (i = 0; i < n; i++) + { + for (j = 0; j < m; j++) + assert (a[i][j] == b[i][j]); + /* Clean up. */ + free (a[i]); + } + free (a); +} + +void +test3 (void) +{ + int i, j, **a = (int **) malloc (sizeof (int *) * n); + a[0] = (int *) malloc (sizeof (int) * n * m); + + /* Rows allocated in one contiguous block. */ + for (i = 0; i < n; i++) + { + a[i] = *a + i * m; + for (j = 0; j < m; j++) + b[i][j] = j - i; + } + + #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b) + for (i = 0; i < n; i++) + #pragma acc loop + for (j = 0; j < m; j++) + a[i][j] = b[i][j]; + + for (i = 0; i < n; i++) + for (j = 0; j < m; j++) + assert (a[i][j] == b[i][j]); + + free (a[0]); + free (a); +} + +int +main (void) +{ + test1 (); + test2 (); + test3 (); + return 0; +} Index: libgomp/testsuite/libgomp.oacc-c-c++-common/da-2.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/da-2.c (revision 0) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/da-2.c (revision 244259) @@ -0,0 +1,37 @@ +/* { dg-do run { target { ! openacc_host_selected } } } */ + +#include +#include "da-utils.h" + +int +main (void) +{ + int n = 10; + int ***a = (int ***) create_da (sizeof (int), n, 3); + int ***b = (int ***) create_da (sizeof (int), n, 3); + int ***c = (int ***) create_da (sizeof (int), n, 3); + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + { + a[i][j][k] = i + j * k + k; + b[i][j][k] = j + k * i + i * j; + c[i][j][k] = a[i][j][k]; + } + + #pragma acc parallel copy (a[0:n][0:n][0:n]) copyin (b[0:n][0:n][0:n]) + { + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + a[i][j][k] += b[k][j][i] + i + j + k; + } + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + assert (a[i][j][k] == c[i][j][k] + b[k][j][i] + i + j + k); + + return 0; +}