Message ID | 877eskygfu.fsf@linaro.org |
---|---|
State | Accepted |
Commit | 01b9bf0615e178e6c31f8d010cb8df9382b3c2d4 |
Headers | show |
Series | inline_small_functions speedup | expand |
> After inlining A into B, inline_small_functions updates the information > for (most) callees and callers of the new B: > > update_callee_keys (&edge_heap, where, updated_nodes); > [...] > /* Our profitability metric can depend on local properties > such as number of inlinable calls and size of the function body. > After inlining these properties might change for the function we > inlined into (since it's body size changed) and for the functions > called by function we inlined (since number of it inlinable callers > might change). */ > update_caller_keys (&edge_heap, where, updated_nodes, NULL); > > These functions in turn call can_inline_edge_p for most of the associated > edges: > > if (can_inline_edge_p (edge, false) > && want_inline_small_function_p (edge, false)) > update_edge_key (heap, edge); > > can_inline_edge_p indirectly calls estimate_calls_size_and_time > on the caller node, which seems to recursively process all callee > edges rooted at the node. It looks from this like algorithm can be > at least quadratic in the worst case. Yep, I have patch to add overall size into summary that makes it constant time. Will need to clean it up for mainline. > > Maybe there's something we can do to make can_inline_edge_p cheaper, but > since neither of these two calls is responsible for reporting an inline > failure reason, it seems cheaper to test want_inline_small_function_p > first, so that we don't calculate an estimate for something that we > already know isn't a "small function". I think the only change > needed to make that work is to check for CIF_FINAL_ERROR in > want_inline_small_function_p; at the moment we rely on can_inline_edge_p > to make that check. > > This cuts the time to build optabs.ii by over 4% with an > --enable-checking=release compiler on x86_64-linux-gnu. I've seen more > dramatic wins on aarch64-linux-gnu due to the NUM_POLY_INT_COEFFS==2 > thing. The patch doesn't affect the output code. > > Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu. > OK to install? Anyway this seems to make sense to do. > > Richard > > > 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> > > gcc/ > * ipa-inline.c (want_inline_small_function_p): Return false if > inlining has already failed with CIF_FINAL_ERROR. > (update_caller_keys): Call want_inline_small_function_p before > can_inline_edge_p. > (update_callee_keys): Likewise. OK, thanks! Honza
Index: gcc/ipa-inline.c =================================================================== --- gcc/ipa-inline.c 2018-01-09 14:29:35.151550415 +0000 +++ gcc/ipa-inline.c 2018-01-14 08:43:35.653122186 +0000 @@ -706,7 +706,11 @@ want_inline_small_function_p (struct cgr bool want_inline = true; struct cgraph_node *callee = e->callee->ultimate_alias_target (); - if (DECL_DISREGARD_INLINE_LIMITS (callee->decl)) + /* Allow this function to be called before can_inline_edge_p, + since it's usually cheaper. */ + if (cgraph_inline_failed_type (e->inline_failed) == CIF_FINAL_ERROR) + want_inline = false; + else if (DECL_DISREGARD_INLINE_LIMITS (callee->decl)) ; else if (!DECL_DECLARED_INLINE_P (callee->decl) && !opt_for_fn (e->caller->decl, flag_inline_small_functions)) @@ -1312,8 +1316,8 @@ update_caller_keys (edge_heap_t *heap, s if (!check_inlinablity_for || check_inlinablity_for == edge) { - if (can_inline_edge_p (edge, false) - && want_inline_small_function_p (edge, false)) + if (want_inline_small_function_p (edge, false) + && can_inline_edge_p (edge, false)) update_edge_key (heap, edge); else if (edge->aux) { @@ -1356,8 +1360,8 @@ update_callee_keys (edge_heap_t *heap, s && avail >= AVAIL_AVAILABLE && !bitmap_bit_p (updated_nodes, callee->uid)) { - if (can_inline_edge_p (e, false) - && want_inline_small_function_p (e, false)) + if (want_inline_small_function_p (e, false) + && can_inline_edge_p (e, false)) update_edge_key (heap, e); else if (e->aux) {