diff mbox series

[v2] fs/epoll: Enable non-blocking busypoll when epoll timeout is 0

Message ID 1593027056-43779-1-git-send-email-sridhar.samudrala@intel.com
State New
Headers show
Series [v2] fs/epoll: Enable non-blocking busypoll when epoll timeout is 0 | expand

Commit Message

Samudrala, Sridhar June 24, 2020, 7:30 p.m. UTC
This patch triggers non-blocking busy poll when busy_poll is enabled,
epoll is called with a timeout of 0 and is associated with a napi_id.
This enables an app thread to go through napi poll routine once by
calling epoll with a 0 timeout.

poll/select with a 0 timeout behave in a similar manner.

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>

v2:
Added net_busy_loop_on() check (Eric)

---
 fs/eventpoll.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Samudrala, Sridhar July 6, 2020, 7:50 p.m. UTC | #1
Resending.

Dave, Eric,

Can we get this in via net-next as this is targeted for networking use 
case using epoll/busypoll.

Thanks
Sridhar

On 6/24/2020 12:30 PM, Sridhar Samudrala wrote:
> This patch triggers non-blocking busy poll when busy_poll is enabled,

> epoll is called with a timeout of 0 and is associated with a napi_id.

> This enables an app thread to go through napi poll routine once by

> calling epoll with a 0 timeout.

> 

> poll/select with a 0 timeout behave in a similar manner.

> 

> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>

> 

> v2:

> Added net_busy_loop_on() check (Eric)

> 

> ---

>   fs/eventpoll.c | 13 +++++++++++++

>   1 file changed, 13 insertions(+)

> 

> diff --git a/fs/eventpoll.c b/fs/eventpoll.c

> index 12eebcdea9c8..c33cc98d3848 100644

> --- a/fs/eventpoll.c

> +++ b/fs/eventpoll.c

> @@ -1847,6 +1847,19 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,

>   		eavail = ep_events_available(ep);

>   		write_unlock_irq(&ep->lock);

>   

> +		/*

> +		 * Trigger non-blocking busy poll if timeout is 0 and there are

> +		 * no events available. Passing timed_out(1) to ep_busy_loop

> +		 * will make sure that busy polling is triggered only once.

> +		 */

> +		if (!eavail && net_busy_loop_on()) {

> +			ep_busy_loop(ep, timed_out);

> +			write_lock_irq(&ep->lock);

> +			eavail = ep_events_available(ep);

> +			write_unlock_irq(&ep->lock);

> +		}

> +

>   		goto send_events;

>   	}

>   

>
Alexander Duyck July 6, 2020, 8:36 p.m. UTC | #2
On Wed, Jun 24, 2020 at 4:03 PM Sridhar Samudrala
<sridhar.samudrala@intel.com> wrote:
>

> This patch triggers non-blocking busy poll when busy_poll is enabled,

> epoll is called with a timeout of 0 and is associated with a napi_id.

> This enables an app thread to go through napi poll routine once by

> calling epoll with a 0 timeout.

>

> poll/select with a 0 timeout behave in a similar manner.

>

> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>

>

> v2:

> Added net_busy_loop_on() check (Eric)

>

> ---

>  fs/eventpoll.c | 13 +++++++++++++

>  1 file changed, 13 insertions(+)

>

> diff --git a/fs/eventpoll.c b/fs/eventpoll.c

> index 12eebcdea9c8..c33cc98d3848 100644

> --- a/fs/eventpoll.c

> +++ b/fs/eventpoll.c

> @@ -1847,6 +1847,19 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,

>                 eavail = ep_events_available(ep);

>                 write_unlock_irq(&ep->lock);

>

> +               /*

> +                * Trigger non-blocking busy poll if timeout is 0 and there are

> +                * no events available. Passing timed_out(1) to ep_busy_loop

> +                * will make sure that busy polling is triggered only once.

> +                */

> +               if (!eavail && net_busy_loop_on()) {

> +                       ep_busy_loop(ep, timed_out);

> +                       write_lock_irq(&ep->lock);

> +                       eavail = ep_events_available(ep);

> +                       write_unlock_irq(&ep->lock);

> +               }

> +

>                 goto send_events;

>         }


Doesn't this create a scenario where the NAPI ID will not be
disassociated if the polling fails?

It seems like in order to keep parity with existing busy poll code you
should need to check for !eavail after you release the lock and if
that is true you should be calling ep_reset_busy_poll_napi_id so that
you disassociate the NAPI ID from the eventpoll.
Samudrala, Sridhar July 6, 2020, 10:33 p.m. UTC | #3
On 7/6/2020 1:36 PM, Alexander Duyck wrote:
> On Wed, Jun 24, 2020 at 4:03 PM Sridhar Samudrala

> <sridhar.samudrala@intel.com> wrote:

>>

>> This patch triggers non-blocking busy poll when busy_poll is enabled,

>> epoll is called with a timeout of 0 and is associated with a napi_id.

>> This enables an app thread to go through napi poll routine once by

>> calling epoll with a 0 timeout.

>>

>> poll/select with a 0 timeout behave in a similar manner.

>>

>> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>

>>

>> v2:

>> Added net_busy_loop_on() check (Eric)

>>

>> ---

>>   fs/eventpoll.c | 13 +++++++++++++

>>   1 file changed, 13 insertions(+)

>>

>> diff --git a/fs/eventpoll.c b/fs/eventpoll.c

>> index 12eebcdea9c8..c33cc98d3848 100644

>> --- a/fs/eventpoll.c

>> +++ b/fs/eventpoll.c

>> @@ -1847,6 +1847,19 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,

>>                  eavail = ep_events_available(ep);

>>                  write_unlock_irq(&ep->lock);

>>

>> +               /*

>> +                * Trigger non-blocking busy poll if timeout is 0 and there are

>> +                * no events available. Passing timed_out(1) to ep_busy_loop

>> +                * will make sure that busy polling is triggered only once.

>> +                */

>> +               if (!eavail && net_busy_loop_on()) {

>> +                       ep_busy_loop(ep, timed_out);

>> +                       write_lock_irq(&ep->lock);

>> +                       eavail = ep_events_available(ep);

>> +                       write_unlock_irq(&ep->lock);

>> +               }

>> +

>>                  goto send_events;

>>          }

> 

> Doesn't this create a scenario where the NAPI ID will not be

> disassociated if the polling fails?

> 

> It seems like in order to keep parity with existing busy poll code you

> should need to check for !eavail after you release the lock and if

> that is true you should be calling ep_reset_busy_poll_napi_id so that

> you disassociate the NAPI ID from the eventpoll.


We are not going to sleep in this code path. I think napi id needs to be 
reset only if we are going to sleep and a wakeup is expected to set the 
nap_id again.
Alexander Duyck July 6, 2020, 10:57 p.m. UTC | #4
On Mon, Jul 6, 2020 at 3:33 PM Samudrala, Sridhar
<sridhar.samudrala@intel.com> wrote:
>

>

>

> On 7/6/2020 1:36 PM, Alexander Duyck wrote:

> > On Wed, Jun 24, 2020 at 4:03 PM Sridhar Samudrala

> > <sridhar.samudrala@intel.com> wrote:

> >>

> >> This patch triggers non-blocking busy poll when busy_poll is enabled,

> >> epoll is called with a timeout of 0 and is associated with a napi_id.

> >> This enables an app thread to go through napi poll routine once by

> >> calling epoll with a 0 timeout.

> >>

> >> poll/select with a 0 timeout behave in a similar manner.

> >>

> >> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>

> >>

> >> v2:

> >> Added net_busy_loop_on() check (Eric)

> >>

> >> ---

> >>   fs/eventpoll.c | 13 +++++++++++++

> >>   1 file changed, 13 insertions(+)

> >>

> >> diff --git a/fs/eventpoll.c b/fs/eventpoll.c

> >> index 12eebcdea9c8..c33cc98d3848 100644

> >> --- a/fs/eventpoll.c

> >> +++ b/fs/eventpoll.c

> >> @@ -1847,6 +1847,19 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,

> >>                  eavail = ep_events_available(ep);

> >>                  write_unlock_irq(&ep->lock);

> >>

> >> +               /*

> >> +                * Trigger non-blocking busy poll if timeout is 0 and there are

> >> +                * no events available. Passing timed_out(1) to ep_busy_loop

> >> +                * will make sure that busy polling is triggered only once.

> >> +                */

> >> +               if (!eavail && net_busy_loop_on()) {

> >> +                       ep_busy_loop(ep, timed_out);

> >> +                       write_lock_irq(&ep->lock);

> >> +                       eavail = ep_events_available(ep);

> >> +                       write_unlock_irq(&ep->lock);

> >> +               }

> >> +

> >>                  goto send_events;

> >>          }

> >

> > Doesn't this create a scenario where the NAPI ID will not be

> > disassociated if the polling fails?

> >

> > It seems like in order to keep parity with existing busy poll code you

> > should need to check for !eavail after you release the lock and if

> > that is true you should be calling ep_reset_busy_poll_napi_id so that

> > you disassociate the NAPI ID from the eventpoll.

>

> We are not going to sleep in this code path. I think napi id needs to be

> reset only if we are going to sleep and a wakeup is expected to set the

> nap_id again.


That wasn't my understanding of how that worked. Basically the whole
point of clearing the napi ID from the eventpoll instance is to handle
the fact that the NAPI instance may no longer be routing packets to
us. So when the NAPI instance failed to provide us with a packet when
we made a call to busy poll we would clear the napi_id which would
then allow a new napi_id to be assigned. Otherwise what you end up
with is an eventpoll with a static napi_id value since it will never
be cleared or updated once it is set, assuming all the calls to this
function have a timeout of 0 in your test case.

The problem is there isn't a 1:1 mapping between eventpolls and
sockets or NAPI IDs. It is a many:1 and as a result we have queues go
idle while others become busy so we need a way to shift from an idle
one to a busy one in the event of the traffic shifting from one port
to another.
diff mbox series

Patch

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 12eebcdea9c8..c33cc98d3848 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1847,6 +1847,19 @@  static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 		eavail = ep_events_available(ep);
 		write_unlock_irq(&ep->lock);
 
+		/*
+		 * Trigger non-blocking busy poll if timeout is 0 and there are
+		 * no events available. Passing timed_out(1) to ep_busy_loop
+		 * will make sure that busy polling is triggered only once.
+		 */
+		if (!eavail && net_busy_loop_on()) {
+			ep_busy_loop(ep, timed_out);
+			write_lock_irq(&ep->lock);
+			eavail = ep_events_available(ep);
+			write_unlock_irq(&ep->lock);
+		}
+
 		goto send_events;
 	}
 
-- 
2.25.4