[master] gen-lockedsig-cache: Replace glob lookup with hash to filename lookup

Message ID 20190927185642.29251-1-konrad.scherer@windriver.com
State New
Series
  • [master] gen-lockedsig-cache: Replace glob lookup with hash to filename lookup

Commit Message

Konrad Scherer Sept. 27, 2019, 6:56 p.m.
From: Konrad Scherer <Konrad.Scherer@windriver.com>


Using the glob function to map signatures to sstate files is very slow
when the sstate is large and accessed over NFS. The lookup now only
loads the necessary prefixes and doesn't use glob at all.

Unfortunately I don't have access to the systems where the performance
issue was noticed, and on my test system the glob is fast enough that
the performance numbers aren't useful. I could verify that the file list
returned by the new code is the same.

[YOCTO #13539]

Signed-off-by: Konrad Scherer <Konrad.Scherer@windriver.com>

---
 meta/lib/oe/copy_buildsystem.py |  3 ++-
 scripts/gen-lockedsig-cache     | 44 +++++++++++++++++++++++++++++----
 2 files changed, 41 insertions(+), 6 deletions(-)
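[Editorial note: a minimal sketch of the two lookup strategies being compared. Function names and the sample sstate filename are illustrative, not the script's actual code.]

```python
import glob
import os

def glob_lookup(sstate_dir, sig):
    # Old approach: one wildcard glob per signature. Over NFS this forces
    # a scan of the prefix directory for every single signature.
    pattern = sstate_dir + "/" + sig[:2] + "/*" + sig + "*"
    return set(glob.glob(pattern))

def build_sha_map(prefix_dir):
    # New approach: list the two-character prefix directory once and
    # bucket every filename by the hash embedded in its name, so all
    # signatures sharing a prefix are resolved from a single listing.
    sha_map = {}
    for f in os.listdir(prefix_dir):
        sha = f.split(':')[7].split('_')[0]
        sha_map.setdefault(sha, []).append(os.path.join(prefix_dir, f))
    return sha_map
```

Both strategies yield the same file sets; the win is that `os.listdir()` runs once per prefix instead of one `glob()` per signature.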

-- 
2.23.0

-- 
_______________________________________________
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core

Comments

Richard Purdie Oct. 2, 2019, 10:17 a.m. | #1
On Fri, 2019-09-27 at 14:56 -0400, Konrad Scherer wrote:
> From: Konrad Scherer <Konrad.Scherer@windriver.com>
> 
> Using the glob function to map signatures to sstate files is very slow
> when the sstate is large and accessed over nfs. The lookup now only
> loads the necessary prefixes and doesn't use glob at all.
> 
> Unfortunately I don't have access to the systems where the performance
> issue was noticed and on my test system the glob is fast enough that
> the performance numbers aren't useful. I could verify that the file list
> returned by the new code is the same.
> 
> [YOCTO #13539]
> 
> Signed-off-by: Konrad Scherer <Konrad.Scherer@windriver.com>
> ---
>  meta/lib/oe/copy_buildsystem.py |  3 ++-
>  scripts/gen-lockedsig-cache     | 44 +++++++++++++++++++++++++++++----
>  2 files changed, 41 insertions(+), 6 deletions(-)


Thanks for this, it's hugely appreciated!

I aimed to do some profile measurements to show the difference in speed
but when testing it failed with:

Filtering out gdb-cross-x86_64:do_patch
Filtering out gdb-cross-x86_64:do_prepare_recipe_sysroot
Filtering out gdb-cross-x86_64:do_unpack
Gathering file list
Traceback (most recent call last):
  File "/home/pokybuild/yocto-worker/qa-extras/build/scripts/gen-lockedsig-cache", line 77, in <module>
    sstate_content_cache[prefix] = build_sha_cache(prefix)
  File "/home/pokybuild/yocto-worker/qa-extras/build/scripts/gen-lockedsig-cache", line 39, in build_sha_cache
    map_sha_to_files(sstate_dir, prefix, sha_map)
  File "/home/pokybuild/yocto-worker/qa-extras/build/scripts/gen-lockedsig-cache", line 29, in map_sha_to_files
    sha = extract_sha(f)
  File "/home/pokybuild/yocto-worker/qa-extras/build/scripts/gen-lockedsig-cache", line 21, in extract_sha
    return filename.split(':')[7].split('_')[0]
IndexError: list index out of range
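[Editorial note: the failing line assumes every entry in the prefix directory follows the colon-separated sstate naming scheme; any entry with fewer fields raises IndexError. A small sketch of the assumption and a guarded variant — the second function name is illustrative, not from the script.]

```python
def extract_sha(filename):
    # From the patch: field 7 of the colon-separated name, up to the
    # first underscore, is the signature hash.
    return filename.split(':')[7].split('_')[0]

def extract_sha_or_none(filename):
    # Guarded variant: skip entries whose names don't have enough
    # fields instead of crashing the whole script.
    try:
        return extract_sha(filename)
    except IndexError:
        return None
```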

and then when I fixed that by ignoring files which don't match the pattern:

Gathering file list
Traceback (most recent call last):
  File "/home/pokybuild/yocto-worker/qa-extras/build/scripts/gen-lockedsig-cache", line 80, in <module>
    sstate_content_cache[prefix] = build_sha_cache(prefix)
  File "/home/pokybuild/yocto-worker/qa-extras/build/scripts/gen-lockedsig-cache", line 45, in build_sha_cache
    map_sha_to_files(native_sstate_dir, prefix, sha_map)
  File "/home/pokybuild/yocto-worker/qa-extras/build/scripts/gen-lockedsig-cache", line 27, in map_sha_to_files
    sstate_files = os.listdir(sstate_prefix_path)
FileNotFoundError: [Errno 2] No such file or directory: '/srv/autobuilder/autobuilder.yoctoproject.org/pub/sstateuniversal/b6/'


I therefore added:

diff --git a/scripts/gen-lockedsig-cache b/scripts/gen-lockedsig-cache
index ae5e09d89f..48cb67112f 100755
--- a/scripts/gen-lockedsig-cache
+++ b/scripts/gen-lockedsig-cache
@@ -26,10 +26,13 @@ def map_sha_to_files(dir_, prefix, sha_map):
     sstate_prefix_path = dir_ + '/' + prefix + '/'
     sstate_files = os.listdir(sstate_prefix_path)
     for f in sstate_files:
-        sha = extract_sha(f)
-        if sha not in sha_map:
-            sha_map[sha] = []
-        sha_map[sha].append(sstate_prefix_path + f)
+        try:
+            sha = extract_sha(f)
+            if sha not in sha_map:
+                sha_map[sha] = []
+            sha_map[sha].append(sstate_prefix_path + f)
+        except IndexError:
+            continue
 
 # given a prefix build a map of hash to list of files
 def build_sha_cache(prefix):
@@ -38,7 +41,7 @@ def build_sha_cache(prefix):
     sstate_dir = sys.argv[2]
     map_sha_to_files(sstate_dir, prefix, sha_map)
 
-    native_sstate_dir = sys.argv[2] + sys.argv[4]
+    native_sstate_dir = sys.argv[2] + '/' + sys.argv[4]
     map_sha_to_files(native_sstate_dir, prefix, sha_map)
 
     return sha_map
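[Editorial note: the second hunk matters because plain string concatenation fuses the two arguments, which is exactly how the 'sstateuniversal' path in the traceback arose. A tiny sketch with illustrative stand-in values:]

```python
import os

sstate_dir = "/pub/sstate"      # stand-in for sys.argv[2]
nativelsbstring = "universal"   # stand-in for sys.argv[4]

# Without a separator the components fuse into one bogus path:
broken = sstate_dir + nativelsbstring        # '/pub/sstateuniversal'
fixed = sstate_dir + '/' + nativelsbstring   # '/pub/sstate/universal'
# os.path.join expresses the same intent more idiomatically:
assert os.path.join(sstate_dir, nativelsbstring) == fixed
```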


My benchmark before was seeing it spend over 30 minutes in bitbake
core-image-minimal:do_populate_sdk_ext on an otherwise idle autobuilder
cluster/NAS (35 minutes from a clean tmpdir).

With the patch applied and my above tweak, I saw:

real	6m58.120s

and I'd note this was with a full build running on the other workers so
the NAS was under load. I could try and get an exact time for the above
but didn't really see the point in spending another 30 minutes on it.

This is the time for the whole SDK image, not just the time this script
takes, but it's enough for me to say it's a vast improvement! :)

Konrad: Mind if I squash in the above tweaks?

Also, the new code is a bit too chatty and leads to a lot of log
output, we'll need to tweak that too.

Cheers,

Richard


Konrad Scherer Oct. 2, 2019, 2:56 p.m. | #2
On 10/2/19 6:17 AM, Richard Purdie wrote:
> On Fri, 2019-09-27 at 14:56 -0400, Konrad Scherer wrote:
>> [...]
> 
> Thanks for this, it's hugely appreciated!
> 
> [...]
> 
> My benchmark before was seeing it spend over 30 minutes in bitbake
> core-image-minimal:do_populate_sdk_ext on an otherwise idle autobuilder
> cluster/NAS (35 minutes from a clean tmpdir).
> 
> With the patch applied and my above tweak, I saw:
> 
> real	6m58.120s
> 
> and I'd note this was with a full build running on the other workers so
> the NAS was under load. I could try and get an exact time for the above
> but didn't really see the point in spending another 30 minutes on it.
> 
> This is the time for the whole SDK image, not just the time this script
> takes but it's enough for me to say it's a vast improvement! :)

Great! It was frustrating to develop the code without a good reproducer.

> Konrad: Mind if I squash in the above tweaks?

Not at all. Sorry that you had to spend time debugging my code. That sha
extraction code should have been more robust.

> Also, the new code is a bit too chatty and leads to a lot of log
> output, we'll need to tweak that too.

Please do.

-- 
Konrad Scherer, MTS, Linux Products Group, Wind River
Richard Purdie Oct. 2, 2019, 3:25 p.m. | #3
On Wed, 2019-10-02 at 10:56 -0400, Konrad Scherer wrote:
> On 10/2/19 6:17 AM, Richard Purdie wrote:
> > My benchmark before was seeing it spend over 30 minutes in bitbake
> > core-image-minimal:do_populate_sdk_ext on an otherwise idle
> > autobuilder cluster/NAS (35 minutes from a clean tmpdir).
> > 
> > With the patch applied and my above tweak, I saw:
> > 
> > real	6m58.120s
> > 
> > and I'd note this was with a full build running on the other workers
> > so the NAS was under load. I could try and get an exact time for the
> > above but didn't really see the point in spending another 30 minutes
> > on it.
> > 
> > This is the time for the whole SDK image, not just the time this
> > script takes but it's enough for me to say it's a vast improvement! :)
> 
> Great! It was frustrating to develop the code without a good
> reproducer.


Unfortunately I think we may be celebrating prematurely. On an
otherwise unloaded system this looked OK; however, in real-world builds
the SDK creation is still excruciatingly slow.

We don't even need to use the script to understand why:

pokybuild@debian9-ty-2:~$ time ls /srv/autobuilder/autobuilder.yoctoproject.org/pub/sstate/c6

real	9m3.140s
user	0m4.432s
sys	0m5.164s

So we'd expect the script to take 2*9*255 minutes :(

(214054 files in there incidentally, I got 1m49 the second try)

I've cc'd our sysadmin.

> > Konrad: Mind if I squash in the above tweaks?
> 
> Not at all. Sorry that you had to spend time debugging my code. That
> sha extraction code should have been more robust.
> 
> > Also, the new code is a bit too chatty and leads to a lot of log
> > output, we'll need to tweak that too.
> 
> Please do.


I've queued such a patch in master-next and am testing; it's an
improvement, but I'm worried it's not resolving the problem.

Options are to prune the sstate cache, or to teach the code to generate
the full filenames (would mean refactoring)...

We could give you access to the autobuilder to help reproduce the
problem but I think it's clear where the delays are...

Cheers,

Richard



Richard Purdie Oct. 3, 2019, 2:25 p.m. | #4
On Wed, 2019-10-02 at 16:25 +0100, richard.purdie@linuxfoundation.org
wrote:
> I've queued such a patch in master-next and am testing; it's an
> improvement, but I'm worried it's not resolving the problem.
> 
> Options are to prune the sstate cache, or to teach the code to
> generate the full filenames (would mean refactoring)...
> 
> We could give you access to the autobuilder to help reproduce the
> problem but I think it's clear where the delays are...


We've cleaned out the sstate on the autobuilder to try and reduce the
impact of this issue as this patch, whilst helping, didn't solve the
problem.

I'm wondering about a different approach.


Just to elaborate on what teaching the code would entail: as well as
using the 'hashfn' data as part of the hash validate function from
runqueue.py:

sq_data['hashfn'][tid] = self.rqdata.dataCaches[mc].hashfn[taskfn]

we could inject that data into BB_TASKDEPDATA. If we did that, the
oecore code could probably create the full sstate tasknames for the
tasks it needs using logic extracted out of sstate_checkhashes() from
sstate.bbclass.

This would mean the script wouldn't have to guess at filenames, it
could access them directly.

Worth some further thought...
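[Editorial note: a very rough illustration of the "generate the full filenames" idea. This is not the sstate.bbclass logic — the field layout and function name are hypothetical — but it shows how knowing every name component would let the script compose each path directly instead of listing directories:]

```python
import os

def full_sstate_path(sstate_dir, name_fields, sig, task):
    # name_fields: the seven leading colon-separated components of the
    # sstate object name (hypothetical -- the real composition lives in
    # sstate_checkhashes() in sstate.bbclass).
    filename = ":".join(name_fields + [sig + "_" + task + ".tgz"])
    return os.path.join(sstate_dir, sig[:2], filename)
```

A per-task `os.path.exists()` check on the composed path would then replace the per-prefix `os.listdir()` entirely.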

Cheers,

Richard


Patch

diff --git a/meta/lib/oe/copy_buildsystem.py b/meta/lib/oe/copy_buildsystem.py
index 31a84f5b06..c515683608 100644
--- a/meta/lib/oe/copy_buildsystem.py
+++ b/meta/lib/oe/copy_buildsystem.py
@@ -259,7 +259,8 @@ def create_locked_sstate_cache(lockedsigs, input_sstate_cache, output_sstate_cac
     bb.note('Generating sstate-cache...')
 
     nativelsbstring = d.getVar('NATIVELSBSTRING')
-    bb.process.run("gen-lockedsig-cache %s %s %s %s %s" % (lockedsigs, input_sstate_cache, output_sstate_cache, nativelsbstring, filterfile or ''))
+    stdout, _ = bb.process.run("gen-lockedsig-cache %s %s %s %s %s" % (lockedsigs, input_sstate_cache, output_sstate_cache, nativelsbstring, filterfile or ''))
+    bb.debug(stdout)
     if fixedlsbstring and nativelsbstring != fixedlsbstring:
         nativedir = output_sstate_cache + '/' + nativelsbstring
         if os.path.isdir(nativedir):
diff --git a/scripts/gen-lockedsig-cache b/scripts/gen-lockedsig-cache
index e3076e11a5..ae5e09d89f 100755
--- a/scripts/gen-lockedsig-cache
+++ b/scripts/gen-lockedsig-cache
@@ -5,9 +5,9 @@ 
 
 import os
 import sys
-import glob
 import shutil
 import errno
+import time
 
 def mkdir(d):
     try:
@@ -16,6 +16,33 @@ def mkdir(d):
         if e.errno != errno.EEXIST:
             raise e
 
+# extract the hash from past the last colon to last underscore
+def extract_sha(filename):
+    return filename.split(':')[7].split('_')[0]
+
+# get all files in a directory, extract hash and make
+# a map from hash to list of file with that hash
+def map_sha_to_files(dir_, prefix, sha_map):
+    sstate_prefix_path = dir_ + '/' + prefix + '/'
+    sstate_files = os.listdir(sstate_prefix_path)
+    for f in sstate_files:
+        sha = extract_sha(f)
+        if sha not in sha_map:
+            sha_map[sha] = []
+        sha_map[sha].append(sstate_prefix_path + f)
+
+# given a prefix build a map of hash to list of files
+def build_sha_cache(prefix):
+    sha_map = {}
+
+    sstate_dir = sys.argv[2]
+    map_sha_to_files(sstate_dir, prefix, sha_map)
+
+    native_sstate_dir = sys.argv[2] + sys.argv[4]
+    map_sha_to_files(native_sstate_dir, prefix, sha_map)
+
+    return sha_map
+
 if len(sys.argv) < 5:
     print("Incorrect number of arguments specified")
     print("syntax: gen-lockedsig-cache <locked-sigs.inc> <input-cachedir> <output-cachedir> <nativelsbstring> [filterfile]")
@@ -41,12 +68,19 @@ with open(sys.argv[1]) as f:
                 sigs.append(sig)
 
 print('Gathering file list')
+start_time = time.perf_counter()
 files = set()
+sstate_content_cache = {}
 for s in sigs:
-    p = sys.argv[2] + "/" + s[:2] + "/*" + s + "*"
-    files |= set(glob.glob(p))
-    p = sys.argv[2] + "/%s/" % sys.argv[4] + s[:2] + "/*" + s + "*"
-    files |= set(glob.glob(p))
+    prefix = s[:2]
+    if prefix not in sstate_content_cache:
+        sstate_content_cache[prefix] = build_sha_cache(prefix)
+
+    for f in sstate_content_cache[prefix][s]:
+        files.add(f)
+
+elapsed = time.perf_counter() - start_time
+print("Gathering file list took %.1fs" % elapsed)
 
 print('Processing files')
 for f in files: