Message ID | 20200909150818.313699-3-nitesh@redhat.com
---|---
State | New
Series | isolation: limit msix vectors based on housekeeping CPUs
Nitesh Narayan Lal wrote:

> In a realtime environment, it is essential to isolate unwanted IRQs from
> isolated CPUs to prevent latency overheads. Creating MSI-X vectors based
> only on the online CPUs can lead to a problem on an RT setup that has
> several isolated CPUs but very few housekeeping CPUs: an attempt to move
> the IRQs from the isolated CPUs to the limited housekeeping CPUs may fail
> because of the per-CPU vector limit. This can eventually result in latency
> spikes caused by the IRQ threads that we fail to move off the isolated
> CPUs.
>
> This patch limits the number of vectors that i40e creates to the number of
> available housekeeping CPUs by using num_housekeeping_cpus() instead of
> num_online_cpus().
>
> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>

The driver changes are straightforward, but this isn't the only driver
with this issue, right? I'm sure ixgbe and ice both have this problem
too; you should fix them as well, at a minimum, and probably other
vendors' drivers:

$ rg -c --stats num_online_cpus drivers/net/ethernet
...
50 files contained matches

For this patch (i40e):

Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
On 9/17/20 2:23 PM, Jesse Brandeburg wrote:
> Nitesh Narayan Lal wrote:
>
>> [...]
>
> The driver changes are straightforward, but this isn't the only driver
> with this issue, right?

Indeed, I was hoping to modify them over time, with some testing.

> I'm sure ixgbe and ice both have this problem too; you should fix them as
> well, at a minimum, and probably other vendors' drivers:

Sure, I can at least include ixgbe and ice in the next posting if that makes
sense. The reason I skipped them is that I was not very sure about the right
way to test these changes.

> $ rg -c --stats num_online_cpus drivers/net/ethernet
> ...
> 50 files contained matches
>
> For this patch (i40e):
>
> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
On Thu, Sep 17, 2020 at 11:23:59AM -0700, Jesse Brandeburg wrote:
> Nitesh Narayan Lal wrote:
>
> > [...]
>
> The driver changes are straightforward, but this isn't the only driver
> with this issue, right? I'm sure ixgbe and ice both have this problem
> too; you should fix them as well, at a minimum, and probably other
> vendors' drivers:
>
> $ rg -c --stats num_online_cpus drivers/net/ethernet
> ...
> 50 files contained matches

Ouch, I was indeed surprised that these MSI vector allocations were done
at the driver level and not at some $SUBSYSTEM level.

The logic is already there in the driver, so I wouldn't oppose this very
patch, but would a shared infrastructure make sense for this? Something that
would also handle hotplug operations?

Does it possibly go even beyond networking drivers?

Thanks.

> For this patch (i40e):
>
> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
On 9/21/20 6:58 PM, Frederic Weisbecker wrote:
> On Thu, Sep 17, 2020 at 11:23:59AM -0700, Jesse Brandeburg wrote:
>> Nitesh Narayan Lal wrote:
>>
>>> [...]
>>
>> The driver changes are straightforward, but this isn't the only driver
>> with this issue, right? I'm sure ixgbe and ice both have this problem
>> too; you should fix them as well, at a minimum, and probably other
>> vendors' drivers:
>>
>> $ rg -c --stats num_online_cpus drivers/net/ethernet
>> ...
>> 50 files contained matches
>
> Ouch, I was indeed surprised that these MSI vector allocations were done
> at the driver level and not at some $SUBSYSTEM level.
>
> The logic is already there in the driver, so I wouldn't oppose this very
> patch, but would a shared infrastructure make sense for this? Something
> that would also handle hotplug operations?
>
> Does it possibly go even beyond networking drivers?

From a generic solution perspective, I think it makes sense to come up with a
shared infrastructure: something that can be consumed by all the drivers and
maybe by hotplug operations as well (I will have to further explore the
hotplug part).

However, there are RT workloads that are getting affected because of this
issue, so does it make sense to go ahead with the per-driver approach for
now? A generic solution will require a fair amount of testing and an
understanding of different drivers. Having said that, I can definitely start
looking in that direction.

Thanks for reviewing.

> Thanks.
>
>> For this patch (i40e):
>>
>> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

--
Nitesh
On Mon, Sep 21, 2020 at 11:08:20PM -0400, Nitesh Narayan Lal wrote:
> On 9/21/20 6:58 PM, Frederic Weisbecker wrote:
> > On Thu, Sep 17, 2020 at 11:23:59AM -0700, Jesse Brandeburg wrote:
> >> [...]
> >
> > Ouch, I was indeed surprised that these MSI vector allocations were done
> > at the driver level and not at some $SUBSYSTEM level.
> >
> > The logic is already there in the driver, so I wouldn't oppose this very
> > patch, but would a shared infrastructure make sense for this? Something
> > that would also handle hotplug operations?
> >
> > Does it possibly go even beyond networking drivers?
>
> From a generic solution perspective, I think it makes sense to come up
> with a shared infrastructure: something that can be consumed by all the
> drivers and maybe by hotplug operations as well (I will have to further
> explore the hotplug part).

That would be great. I'm completely clueless about those MSI things and the
actual needs of those drivers. Now it seems to me that if several CPUs become
offline, or, as is planned in the future, CPU isolation gets enabled/disabled
through cpuset, then the vectors may need some reorganization.

But I also don't want to push toward a complicated solution to handle CPU
hotplug if there is no actual problem to solve there. So I'll let you guys
judge.

> However, there are RT workloads that are getting affected because of this
> issue, so does it make sense to go ahead with the per-driver approach for
> now?

Yep, that sounds good.

> A generic solution will require a fair amount of testing and an
> understanding of different drivers. Having said that, I can definitely
> start looking in that direction.

Thanks a lot!
On 9/22/20 5:54 AM, Frederic Weisbecker wrote:
> On Mon, Sep 21, 2020 at 11:08:20PM -0400, Nitesh Narayan Lal wrote:
>> [...]
>
> That would be great. I'm completely clueless about those MSI things and
> the actual needs of those drivers. Now it seems to me that if several CPUs
> become offline, or, as is planned in the future, CPU isolation gets
> enabled/disabled through cpuset, then the vectors may need some
> reorganization.

+1

> But I also don't want to push toward a complicated solution to handle CPU
> hotplug if there is no actual problem to solve there.

Sure, even I am not particularly sure about the hotplug scenarios.

> So I'll let you guys judge.
>
>> However, there are RT workloads that are getting affected because of this
>> issue, so does it make sense to go ahead with the per-driver approach for
>> now?
>
> Yep, that sounds good.

Thank you for confirming.

>> A generic solution will require a fair amount of testing and an
>> understanding of different drivers. Having said that, I can definitely
>> start looking in that direction.
>
> Thanks a lot!
On Tue, Sep 22, 2020 at 09:34:02AM -0400, Nitesh Narayan Lal wrote:
> On 9/22/20 5:54 AM, Frederic Weisbecker wrote:
> > But I also don't want to push toward a complicated solution to handle
> > CPU hotplug if there is no actual problem to solve there.
>
> Sure, even I am not particularly sure about the hotplug scenarios.

Surely, once isolation becomes configurable through cpuset, this will get
interesting. For now it's probably not worth adding hundreds of lines of code
if nobody has stepped into such an issue yet.

What is probably more worthwhile for now is some piece of code to consolidate
the allocation of those MSI vectors on top of the number of housekeeping
CPUs.

Thanks.
On 9/22/20 4:44 PM, Frederic Weisbecker wrote:
> On Tue, Sep 22, 2020 at 09:34:02AM -0400, Nitesh Narayan Lal wrote:
>> On 9/22/20 5:54 AM, Frederic Weisbecker wrote:
>>> But I also don't want to push toward a complicated solution to handle
>>> CPU hotplug if there is no actual problem to solve there.
>>
>> Sure, even I am not particularly sure about the hotplug scenarios.
>
> Surely, once isolation becomes configurable through cpuset, this will get
> interesting. For now it's probably not worth adding hundreds of lines of
> code if nobody has stepped into such an issue yet.
>
> What is probably more worthwhile for now is some piece of code to
> consolidate the allocation of those MSI vectors on top of the number of
> housekeeping CPUs.

+1

> Thanks.

--
Thanks
Nitesh
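As a rough illustration of the consolidation being discussed above: a shared
helper could clamp a driver's desired vector count to the housekeeping CPU
count so that individual drivers stop open-coding num_online_cpus(). This is
purely a sketch, not an existing kernel interface; the helper name
hk_limit_msix_vectors() is hypothetical, and it assumes the
num_housekeeping_cpus() helper introduced earlier in this series.

/*
 * Hypothetical shared helper -- not existing kernel API. Caps a driver's
 * desired interrupt vector count at both the number of housekeeping CPUs
 * and whatever vectors the device still has available.
 */
#include <linux/kernel.h>
#include <linux/sched/isolation.h>

static unsigned int hk_limit_msix_vectors(unsigned int wanted,
					   unsigned int vectors_left)
{
	/* num_housekeeping_cpus() is assumed from this series */
	return min3(wanted, num_housekeeping_cpus(), vectors_left);
}

A driver such as i40e could then derive pf->num_lan_msix from this helper
instead of calling num_online_cpus() directly.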
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 2e433fdbf2c3..3b4cd4b3de85 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5,6 +5,7 @@
 #include <linux/of_net.h>
 #include <linux/pci.h>
 #include <linux/bpf.h>
+#include <linux/sched/isolation.h>
 #include <generated/utsrelease.h>

 /* Local includes */
@@ -11002,7 +11003,7 @@ static int i40e_init_msix(struct i40e_pf *pf)
 	 * will use any remaining vectors to reach as close as we can to the
 	 * number of online CPUs.
 	 */
-	cpus = num_online_cpus();
+	cpus = num_housekeeping_cpus();
 	pf->num_lan_msix = min_t(int, cpus, vectors_left / 2);
 	vectors_left -= pf->num_lan_msix;

In a realtime environment, it is essential to isolate unwanted IRQs from
isolated CPUs to prevent latency overheads. Creating MSI-X vectors based
only on the online CPUs can lead to a problem on an RT setup that has
several isolated CPUs but very few housekeeping CPUs: an attempt to move
the IRQs from the isolated CPUs to the limited housekeeping CPUs may fail
because of the per-CPU vector limit. This can eventually result in latency
spikes caused by the IRQ threads that we fail to move off the isolated CPUs.

This patch limits the number of vectors that i40e creates to the number of
available housekeeping CPUs by using num_housekeeping_cpus() instead of
num_online_cpus().

Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
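The num_housekeeping_cpus() helper used by this patch is introduced earlier
in the series. A minimal sketch of what such a helper might look like on top
of the existing housekeeping_enabled()/housekeeping_cpumask() API is shown
below; the choice of HK_FLAG_MANAGED_IRQ and the online-CPU fallback are
assumptions here, not necessarily the series' exact implementation.

/*
 * Sketch only: count the CPUs left for housekeeping work such as IRQ
 * handling. If CPU isolation is not configured at all, every online CPU
 * is a housekeeping CPU. HK_FLAG_MANAGED_IRQ is an assumed flag choice.
 */
#include <linux/cpumask.h>
#include <linux/sched/isolation.h>

static unsigned int num_housekeeping_cpus(void)
{
	if (housekeeping_enabled(HK_FLAG_MANAGED_IRQ))
		return cpumask_weight(housekeeping_cpumask(HK_FLAG_MANAGED_IRQ));

	return num_online_cpus();
}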