Message ID | 20210112040146.1.I9aa2b9dd39a6ac9235ec55a8c56020538aa212fb@changeid |
---|---|
State | New |
Headers | show |
Series | [1/2] PCI/ASPM: Disable ASPM until its LTR and L1ss state is restored | expand |
Hi Victor, Thanks for the patch. Improving suspend/resume performance is always a good thing! On Tue, Jan 12, 2021 at 04:02:04AM +0000, Victor Ding wrote: > Right after powering up, the device may have ASPM enabled; however, > its LTR and/or L1ss controls may not be in the desired states; hence, > the device may enter L1.2 undesirably and cause resume performance > penalty. This is especially problematic if ASPM related control > registers are modified before a suspension. s/L1ss/L1SS/ (several occurrences) to match existing usage. > Therefore, ASPM should disabled until its LTR and L1ss states are > fully restored. > > Signed-off-by: Victor Ding <victording@google.com> > --- > > drivers/pci/pci.c | 11 +++++++++++ > drivers/pci/pci.h | 2 ++ > drivers/pci/pcie/aspm.c | 2 +- > 3 files changed, 14 insertions(+), 1 deletion(-) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index eb323af34f1e..428de433f2e6 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -1660,6 +1660,17 @@ void pci_restore_state(struct pci_dev *dev) > if (!dev->state_saved) > return; > > + /* > + * Right after powering up, the device may have ASPM enabled; I think this could be stated more precisely. "Right after powering up," ASPM should be *disabled* because ASPM Control in the Link Control register powers up as zero (disabled). > + * however, its LTR and/or L1ss controls may not be in the desired > + * states; as a result, the device may enter L1.2 undesirably and > + * cause resume performance penalty. > + * Therefore, ASPM is disabled until its LTR and L1ss states are > + * fully restored. > + * (enabling ASPM is part of pci_restore_pcie_state) If we're enabling L1.2 before LTR has been configured, that's definitely a bug. But I don't see how that happens. The current code looks like: pci_restore_state pci_restore_ltr_state pci_write_config_word(dev, ltr + PCI_LTR_MAX_SNOOP_LAT, *cap++) pci_write_config_word(dev, ltr + PCI_LTR_MAX_NOSNOOP_LAT, *cap++) pci_restore_aspm_l1ss_state pci_write_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL1, *cap++) pci_restore_pcie_state pcie_capability_write_word(dev, PCI_EXP_LNKCTL, cap[i++]) We *do* restore PCI_L1SS_CTL1, which contains "ASPM L1.2 Enable", before we restore PCI_EXP_LNKCTL, which contains "ASPM Control", but I don't think "ASPM L1.2 Enable" by itself should be enough to allow hardware to enter L1.2. Reading PCIe r5.0, sec 5.5.1, it seems like hardware should ignore the PCI_L1SS_CTL1 bits unless ASPM L1 entry is enabled in PCI_EXP_LNKCTL. What am I missing? > + */ > + pcie_config_aspm_dev(dev, 0); > + > /* > * Restore max latencies (in the LTR capability) before enabling > * LTR itself (in the PCIe capability). > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > index e9ea5dfaa3e0..f774bd6d2555 100644 > --- a/drivers/pci/pci.h > +++ b/drivers/pci/pci.h > @@ -564,6 +564,7 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev); > void pcie_aspm_exit_link_state(struct pci_dev *pdev); > void pcie_aspm_pm_state_change(struct pci_dev *pdev); > void pcie_aspm_powersave_config_link(struct pci_dev *pdev); > +void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val); > void pci_save_aspm_l1ss_state(struct pci_dev *dev); > void pci_restore_aspm_l1ss_state(struct pci_dev *dev); > #else > @@ -571,6 +572,7 @@ static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { } > static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { } > static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { } > static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { } > +static inline void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val) { } > static inline void pci_save_aspm_l1ss_state(struct pci_dev *dev) { } > static inline void pci_restore_aspm_l1ss_state(struct pci_dev *dev) { } > #endif > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > index a08e7d6dc248..45535b4e1595 100644 > --- a/drivers/pci/pcie/aspm.c > +++ b/drivers/pci/pcie/aspm.c > @@ -778,7 +778,7 @@ void pci_restore_aspm_l1ss_state(struct pci_dev *dev) > pci_write_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL2, *cap++); > } > > -static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val) > +void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val) > { > pcie_capability_clear_and_set_word(pdev, PCI_EXP_LNKCTL, > PCI_EXP_LNKCTL_ASPMC, val); > -- > 2.30.0.284.gd98b1dd5eaa7-goog >
Hi Bjorn, On Wed, Jan 13, 2021 at 9:32 AM Bjorn Helgaas <helgaas@kernel.org> wrote: > > Hi Victor, > > Thanks for the patch. Improving suspend/resume performance is always > a good thing! > > On Tue, Jan 12, 2021 at 04:02:04AM +0000, Victor Ding wrote: > > Right after powering up, the device may have ASPM enabled; however, > > its LTR and/or L1ss controls may not be in the desired states; hence, > > the device may enter L1.2 undesirably and cause resume performance > > penalty. This is especially problematic if ASPM related control > > registers are modified before a suspension. > > s/L1ss/L1SS/ (several occurrences) to match existing usage. > I'll update it in the next version. > > > Therefore, ASPM should disabled until its LTR and L1ss states are > > fully restored. > > > > Signed-off-by: Victor Ding <victording@google.com> > > --- > > > > drivers/pci/pci.c | 11 +++++++++++ > > drivers/pci/pci.h | 2 ++ > > drivers/pci/pcie/aspm.c | 2 +- > > 3 files changed, 14 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > index eb323af34f1e..428de433f2e6 100644 > > --- a/drivers/pci/pci.c > > +++ b/drivers/pci/pci.c > > @@ -1660,6 +1660,17 @@ void pci_restore_state(struct pci_dev *dev) > > if (!dev->state_saved) > > return; > > > > + /* > > + * Right after powering up, the device may have ASPM enabled; > > I think this could be stated more precisely. "Right after powering > up," ASPM should be *disabled* because ASPM Control in the Link > Control register powers up as zero (disabled). > Good suggestion; I'll reword in the next version. However, ASPM should be *disabled* for the opposite reason - ASPM Control in the Link Control register *may* power as non-zero (enabled). (More answered in the next section) > > > + * however, its LTR and/or L1ss controls may not be in the desired > > + * states; as a result, the device may enter L1.2 undesirably and > > + * cause resume performance penalty. > > + * Therefore, ASPM is disabled until its LTR and L1ss states are > > + * fully restored. > > + * (enabling ASPM is part of pci_restore_pcie_state) > > If we're enabling L1.2 before LTR has been configured, that's > definitely a bug. But I don't see how that happens. The current code > looks like: > > pci_restore_state > pci_restore_ltr_state > pci_write_config_word(dev, ltr + PCI_LTR_MAX_SNOOP_LAT, *cap++) > pci_write_config_word(dev, ltr + PCI_LTR_MAX_NOSNOOP_LAT, *cap++) > pci_restore_aspm_l1ss_state > pci_write_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL1, *cap++) > pci_restore_pcie_state > pcie_capability_write_word(dev, PCI_EXP_LNKCTL, cap[i++]) > > We *do* restore PCI_L1SS_CTL1, which contains "ASPM L1.2 Enable", > before we restore PCI_EXP_LNKCTL, which contains "ASPM Control", but I > don't think "ASPM L1.2 Enable" by itself should be enough to allow > hardware to enter L1.2. > > Reading PCIe r5.0, sec 5.5.1, it seems like hardware should ignore the > PCI_L1SS_CTL1 bits unless ASPM L1 entry is enabled in PCI_EXP_LNKCTL. > > What am I missing? > Correct; however, PCI_EXP_LNKCTL may power up as non-zero, i.e. has ASPM Control enabled. BIOS may set PCI_EXP_LNKCTL (and PCI_L1SS_CTL1) before Kernel takes control. When BIOS does so, there is a brief period between powering up and Kernel restoring LTR state that L1.2 is enabled but LTR isn't configured. > > > + */ > > + pcie_config_aspm_dev(dev, 0); > > + > > /* > > * Restore max latencies (in the LTR capability) before enabling > > * LTR itself (in the PCIe capability). > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > > index e9ea5dfaa3e0..f774bd6d2555 100644 > > --- a/drivers/pci/pci.h > > +++ b/drivers/pci/pci.h > > @@ -564,6 +564,7 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev); > > void pcie_aspm_exit_link_state(struct pci_dev *pdev); > > void pcie_aspm_pm_state_change(struct pci_dev *pdev); > > void pcie_aspm_powersave_config_link(struct pci_dev *pdev); > > +void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val); > > void pci_save_aspm_l1ss_state(struct pci_dev *dev); > > void pci_restore_aspm_l1ss_state(struct pci_dev *dev); > > #else > > @@ -571,6 +572,7 @@ static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { } > > static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { } > > static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { } > > static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { } > > +static inline void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val) { } > > static inline void pci_save_aspm_l1ss_state(struct pci_dev *dev) { } > > static inline void pci_restore_aspm_l1ss_state(struct pci_dev *dev) { } > > #endif > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > index a08e7d6dc248..45535b4e1595 100644 > > --- a/drivers/pci/pcie/aspm.c > > +++ b/drivers/pci/pcie/aspm.c > > @@ -778,7 +778,7 @@ void pci_restore_aspm_l1ss_state(struct pci_dev *dev) > > pci_write_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL2, *cap++); > > } > > > > -static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val) > > +void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val) > > { > > pcie_capability_clear_and_set_word(pdev, PCI_EXP_LNKCTL, > > PCI_EXP_LNKCTL_ASPMC, val); > > -- > > 2.30.0.284.gd98b1dd5eaa7-goog > >
On Wed, Jan 13, 2021 at 01:16:05PM +1100, Victor Ding wrote: > On Wed, Jan 13, 2021 at 9:32 AM Bjorn Helgaas <helgaas@kernel.org> wrote: > > On Tue, Jan 12, 2021 at 04:02:04AM +0000, Victor Ding wrote: > > > Right after powering up, the device may have ASPM enabled; however, > > > its LTR and/or L1ss controls may not be in the desired states; hence, > > > the device may enter L1.2 undesirably and cause resume performance > > > penalty. This is especially problematic if ASPM related control > > > registers are modified before a suspension. > > > > s/L1ss/L1SS/ (several occurrences) to match existing usage. > > > I'll update it in the next version. > > > > > Therefore, ASPM should disabled until its LTR and L1ss states are > > > fully restored. > > > > > > Signed-off-by: Victor Ding <victording@google.com> > > > --- > > > > > > drivers/pci/pci.c | 11 +++++++++++ > > > drivers/pci/pci.h | 2 ++ > > > drivers/pci/pcie/aspm.c | 2 +- > > > 3 files changed, 14 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > > index eb323af34f1e..428de433f2e6 100644 > > > --- a/drivers/pci/pci.c > > > +++ b/drivers/pci/pci.c > > > @@ -1660,6 +1660,17 @@ void pci_restore_state(struct pci_dev *dev) > > > if (!dev->state_saved) > > > return; > > > > > > + /* > > > + * Right after powering up, the device may have ASPM enabled; > > > > I think this could be stated more precisely. "Right after powering > > up," ASPM should be *disabled* because ASPM Control in the Link > > Control register powers up as zero (disabled). > > > Good suggestion; I'll reword in the next version. > However, ASPM should be *disabled* for the opposite reason - ASPM Control > in the Link Control register *may* power as non-zero (enabled). > (More answered in the next section) > > > > > + * however, its LTR and/or L1ss controls may not be in the desired > > > + * states; as a result, the device may enter L1.2 undesirably and > > > + * cause resume performance penalty. > > > + * Therefore, ASPM is disabled until its LTR and L1ss states are > > > + * fully restored. > > > + * (enabling ASPM is part of pci_restore_pcie_state) > > > > If we're enabling L1.2 before LTR has been configured, that's > > definitely a bug. But I don't see how that happens. The current code > > looks like: > > > > pci_restore_state > > pci_restore_ltr_state > > pci_write_config_word(dev, ltr + PCI_LTR_MAX_SNOOP_LAT, *cap++) > > pci_write_config_word(dev, ltr + PCI_LTR_MAX_NOSNOOP_LAT, *cap++) > > pci_restore_aspm_l1ss_state > > pci_write_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL1, *cap++) > > pci_restore_pcie_state > > pcie_capability_write_word(dev, PCI_EXP_LNKCTL, cap[i++]) > > > > We *do* restore PCI_L1SS_CTL1, which contains "ASPM L1.2 Enable", > > before we restore PCI_EXP_LNKCTL, which contains "ASPM Control", but I > > don't think "ASPM L1.2 Enable" by itself should be enough to allow > > hardware to enter L1.2. > > > > Reading PCIe r5.0, sec 5.5.1, it seems like hardware should ignore the > > PCI_L1SS_CTL1 bits unless ASPM L1 entry is enabled in PCI_EXP_LNKCTL. > > > > What am I missing? > > > Correct; however, PCI_EXP_LNKCTL may power up as non-zero, i.e. has > ASPM Control enabled. BIOS may set PCI_EXP_LNKCTL (and > PCI_L1SS_CTL1) before Kernel takes control. When BIOS does so, there > is a brief period between powering up and Kernel restoring LTR state > that L1.2 is enabled but LTR isn't configured. IIUC you're saying that ASPM Control powers up as zero, but BIOS enables ASPM before transferring control to the OS. That makes sense, but I wouldn't describe that as "the device powering up with ASPM enabled." If BIOS enables L1SS (specifically, if it enables L1.2) when LTR hasn't been configured, that sounds like a BIOS defect.
On Thu, Jan 14, 2021 at 7:54 AM Bjorn Helgaas <helgaas@kernel.org> wrote: > > On Wed, Jan 13, 2021 at 01:16:05PM +1100, Victor Ding wrote: > > On Wed, Jan 13, 2021 at 9:32 AM Bjorn Helgaas <helgaas@kernel.org> wrote: > > > On Tue, Jan 12, 2021 at 04:02:04AM +0000, Victor Ding wrote: > > > > Right after powering up, the device may have ASPM enabled; however, > > > > its LTR and/or L1ss controls may not be in the desired states; hence, > > > > the device may enter L1.2 undesirably and cause resume performance > > > > penalty. This is especially problematic if ASPM related control > > > > registers are modified before a suspension. > > > > > > s/L1ss/L1SS/ (several occurrences) to match existing usage. > > > > > I'll update it in the next version. > > > > > > > Therefore, ASPM should disabled until its LTR and L1ss states are > > > > fully restored. > > > > > > > > Signed-off-by: Victor Ding <victording@google.com> > > > > --- > > > > > > > > drivers/pci/pci.c | 11 +++++++++++ > > > > drivers/pci/pci.h | 2 ++ > > > > drivers/pci/pcie/aspm.c | 2 +- > > > > 3 files changed, 14 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > > > index eb323af34f1e..428de433f2e6 100644 > > > > --- a/drivers/pci/pci.c > > > > +++ b/drivers/pci/pci.c > > > > @@ -1660,6 +1660,17 @@ void pci_restore_state(struct pci_dev *dev) > > > > if (!dev->state_saved) > > > > return; > > > > > > > > + /* > > > > + * Right after powering up, the device may have ASPM enabled; > > > > > > I think this could be stated more precisely. "Right after powering > > > up," ASPM should be *disabled* because ASPM Control in the Link > > > Control register powers up as zero (disabled). > > > > > Good suggestion; I'll reword in the next version. > > However, ASPM should be *disabled* for the opposite reason - ASPM Control > > in the Link Control register *may* power as non-zero (enabled). > > (More answered in the next section) > > > > > > > + * however, its LTR and/or L1ss controls may not be in the desired > > > > + * states; as a result, the device may enter L1.2 undesirably and > > > > + * cause resume performance penalty. > > > > + * Therefore, ASPM is disabled until its LTR and L1ss states are > > > > + * fully restored. > > > > + * (enabling ASPM is part of pci_restore_pcie_state) > > > > > > If we're enabling L1.2 before LTR has been configured, that's > > > definitely a bug. But I don't see how that happens. The current code > > > looks like: > > > > > > pci_restore_state > > > pci_restore_ltr_state > > > pci_write_config_word(dev, ltr + PCI_LTR_MAX_SNOOP_LAT, *cap++) > > > pci_write_config_word(dev, ltr + PCI_LTR_MAX_NOSNOOP_LAT, *cap++) > > > pci_restore_aspm_l1ss_state > > > pci_write_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL1, *cap++) > > > pci_restore_pcie_state > > > pcie_capability_write_word(dev, PCI_EXP_LNKCTL, cap[i++]) > > > > > > We *do* restore PCI_L1SS_CTL1, which contains "ASPM L1.2 Enable", > > > before we restore PCI_EXP_LNKCTL, which contains "ASPM Control", but I > > > don't think "ASPM L1.2 Enable" by itself should be enough to allow > > > hardware to enter L1.2. > > > > > > Reading PCIe r5.0, sec 5.5.1, it seems like hardware should ignore the > > > PCI_L1SS_CTL1 bits unless ASPM L1 entry is enabled in PCI_EXP_LNKCTL. > > > > > > What am I missing? > > > > > Correct; however, PCI_EXP_LNKCTL may power up as non-zero, i.e. has > > ASPM Control enabled. BIOS may set PCI_EXP_LNKCTL (and > > PCI_L1SS_CTL1) before Kernel takes control. When BIOS does so, there > > is a brief period between powering up and Kernel restoring LTR state > > that L1.2 is enabled but LTR isn't configured. > > IIUC you're saying that ASPM Control powers up as zero, but BIOS > enables ASPM before transferring control to the OS. That makes sense, > but I wouldn't describe that as "the device powering up with ASPM > enabled." > Correct. > > If BIOS enables L1SS (specifically, if it enables L1.2) when LTR > hasn't been configured, that sounds like a BIOS defect. > Very good point. I'll withdraw this patch then.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index eb323af34f1e..428de433f2e6 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1660,6 +1660,17 @@ void pci_restore_state(struct pci_dev *dev) if (!dev->state_saved) return; + /* + * Right after powering up, the device may have ASPM enabled; + * however, its LTR and/or L1ss controls may not be in the desired + * states; as a result, the device may enter L1.2 undesirably and + * cause resume performance penalty. + * Therefore, ASPM is disabled until its LTR and L1ss states are + * fully restored. + * (enabling ASPM is part of pci_restore_pcie_state) + */ + pcie_config_aspm_dev(dev, 0); + /* * Restore max latencies (in the LTR capability) before enabling * LTR itself (in the PCIe capability). diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index e9ea5dfaa3e0..f774bd6d2555 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -564,6 +564,7 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev); void pcie_aspm_exit_link_state(struct pci_dev *pdev); void pcie_aspm_pm_state_change(struct pci_dev *pdev); void pcie_aspm_powersave_config_link(struct pci_dev *pdev); +void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val); void pci_save_aspm_l1ss_state(struct pci_dev *dev); void pci_restore_aspm_l1ss_state(struct pci_dev *dev); #else @@ -571,6 +572,7 @@ static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { } static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { } static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { } static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { } +static inline void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val) { } static inline void pci_save_aspm_l1ss_state(struct pci_dev *dev) { } static inline void pci_restore_aspm_l1ss_state(struct pci_dev *dev) { } #endif diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index a08e7d6dc248..45535b4e1595 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -778,7 +778,7 @@ void pci_restore_aspm_l1ss_state(struct pci_dev *dev) pci_write_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL2, *cap++); } -static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val) +void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val) { pcie_capability_clear_and_set_word(pdev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_ASPMC, val);
Right after powering up, the device may have ASPM enabled; however, its LTR and/or L1ss controls may not be in the desired states; hence, the device may enter L1.2 undesirably and cause resume performance penalty. This is especially problematic if ASPM related control registers are modified before a suspension. Therefore, ASPM should disabled until its LTR and L1ss states are fully restored. Signed-off-by: Victor Ding <victording@google.com> --- drivers/pci/pci.c | 11 +++++++++++ drivers/pci/pci.h | 2 ++ drivers/pci/pcie/aspm.c | 2 +- 3 files changed, 14 insertions(+), 1 deletion(-)