Message ID | 20240327213108.2384666-2-yuanchu@google.com |
---|---|
State | Superseded |
Series | mm: workingset reporting |
Yuanchu Xie <yuanchu@google.com> writes:

> When non-leaf pmd accessed bits are available, MGLRU page table walks
> can clear the accessed bit and promptly ignore the accessed bit on the
> pte because it's on a different node, so the walk does not update the
> generation of said page. When the next scan comes around on the right
> node, the non-leaf pmd accessed bit might remain cleared and the pte
> accessed bits won't be checked. While this is sufficient for
> reclaim-driven aging, where the goal is to select a reasonably cold
> page, the access can be missed when aging proactively for measuring the
> working set size of a node/memcg.
>
> Since force_scan disables various other optimizations, we check
> force_scan to ignore the non-leaf pmd accessed bit.
>
> Signed-off-by: Yuanchu Xie <yuanchu@google.com>
> ---
>  mm/vmscan.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 4f9c854ce6cc..1a7c7d537db6 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3522,7 +3522,7 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end,
>
>  		walk->mm_stats[MM_NONLEAF_TOTAL]++;
>
> -		if (should_clear_pmd_young()) {
> +		if (!walk->force_scan && should_clear_pmd_young()) {
>  			if (!pmd_young(val))
>  				continue;

Sorry, I don't understand why we need this. If !pmd_young(val), we
don't need to update the generation. If pmd_young(val), the bloom
filter will be ignored if force_scan == true. Or do I miss something?

--
Best Regards,
Huang, Ying
On Mon, Apr 8, 2024 at 11:52 PM Huang, Ying <ying.huang@intel.com> wrote:
>
> Yuanchu Xie <yuanchu@google.com> writes:
>
> > When non-leaf pmd accessed bits are available, MGLRU page table walks
> > can clear the accessed bit and promptly ignore the accessed bit on the
> > pte because it's on a different node, so the walk does not update the
> > generation of said page. When the next scan comes around on the right
> > node, the non-leaf pmd accessed bit might remain cleared and the pte
> > accessed bits won't be checked. While this is sufficient for
> > reclaim-driven aging, where the goal is to select a reasonably cold
> > page, the access can be missed when aging proactively for measuring the
> > working set size of a node/memcg.
> >
> > Since force_scan disables various other optimizations, we check
> > force_scan to ignore the non-leaf pmd accessed bit.
> >
> > Signed-off-by: Yuanchu Xie <yuanchu@google.com>
> > ---
> >  mm/vmscan.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 4f9c854ce6cc..1a7c7d537db6 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3522,7 +3522,7 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end,
> >
> >  		walk->mm_stats[MM_NONLEAF_TOTAL]++;
> >
> > -		if (should_clear_pmd_young()) {
> > +		if (!walk->force_scan && should_clear_pmd_young()) {
> >  			if (!pmd_young(val))
> >  				continue;
>
> Sorry, I don't understand why we need this. If !pmd_young(val), we
> don't need to update the generation. If pmd_young(val), the bloom
> filter will be ignored if force_scan == true. Or do I miss something?

If !pmd_young(val), we still might need to update the generation.

The get_pfn_folio function returns NULL if the folio's nid != node under
scanning, so the pte accessed bit does not get cleared and the generation
is not updated. Now the pmd_young flag of this pmd is cleared, and if none
of the pte's are accessed before another round of scanning occurs on the
folio's node, the pmd_young check fails and the pte accessed bit is
skipped.

This is fine for kswapd but can introduce inaccuracies when scanning
proactively for workingset estimation.

Thanks,
Yuanchu
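The two-pass scenario described in the reply above can be traced with a small user-space toy model. This is only a sketch, not kernel code: `struct toy_pmd`, `toy_walk()` and `toy_accesses_seen_on_node1()` are hypothetical names invented for illustration, the four-entry PMD is arbitrary, and the model assumes that the non-leaf accessed bit is cleared inside the block guarded by `should_clear_pmd_young()`, so that bypassing the block under `force_scan` also leaves the bit untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PTES_PER_PMD 4  /* arbitrary toy size, not the real fan-out */

/* Toy stand-in for one non-leaf PMD entry and the PTEs below it. */
struct toy_pmd {
	bool young;                   /* non-leaf accessed bit */
	bool pte_young[PTES_PER_PMD]; /* per-PTE accessed bits */
	int pte_nid[PTES_PER_PMD];    /* node of the page each PTE maps */
};

/*
 * One aging pass over the PMD on behalf of node `nid`.
 * Returns how many accessed pages on that node the pass observed.
 */
int toy_walk(struct toy_pmd *pmd, int nid, bool force_scan)
{
	int seen = 0;

	assert(pmd != NULL);

	if (!force_scan) {
		/* Models the guarded non-leaf check (assumed behaviour). */
		if (!pmd->young)
			return 0;       /* PTEs below are never examined */
		pmd->young = false;     /* the walk clears the non-leaf bit */
	}

	for (size_t i = 0; i < PTES_PER_PMD; i++) {
		/* Models get_pfn_folio() returning NULL for another node:
		 * the PTE accessed bit is left set and nothing is aged. */
		if (pmd->pte_nid[i] != nid)
			continue;
		if (pmd->pte_young[i]) {
			pmd->pte_young[i] = false;
			seen++;
		}
	}
	return seen;
}

/* One accessed page on node 1; node 0 is scanned first, then node 1. */
int toy_accesses_seen_on_node1(bool force_scan)
{
	struct toy_pmd pmd = {
		.young = true,
		.pte_young = { false, true, false, false },
		.pte_nid = { 0, 1, 0, 0 },
	};

	toy_walk(&pmd, 0, force_scan);        /* wrong node: PTE bit survives */
	return toy_walk(&pmd, 1, force_scan); /* right node */
}
```

In this model, `toy_accesses_seen_on_node1(false)` returns 0 because the node 0 pass clears the non-leaf bit and the node 1 pass then stops at the PMD, while `toy_accesses_seen_on_node1(true)` returns 1 because the PTE is always examined.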
When non-leaf pmd accessed bits are available, MGLRU page table walks
can clear the accessed bit and promptly ignore the accessed bit on the
pte because it's on a different node, so the walk does not update the
generation of said page. When the next scan comes around on the right
node, the non-leaf pmd accessed bit might remain cleared and the pte
accessed bits won't be checked. While this is sufficient for
reclaim-driven aging, where the goal is to select a reasonably cold
page, the access can be missed when aging proactively for measuring the
working set size of a node/memcg.

Since force_scan disables various other optimizations, we check
force_scan to ignore the non-leaf pmd accessed bit.

Signed-off-by: Yuanchu Xie <yuanchu@google.com>
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4f9c854ce6cc..1a7c7d537db6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3522,7 +3522,7 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end,

 		walk->mm_stats[MM_NONLEAF_TOTAL]++;

-		if (should_clear_pmd_young()) {
+		if (!walk->force_scan && should_clear_pmd_young()) {
 			if (!pmd_young(val))
 				continue;