glibc: use the host locale archive in nativesdk builds

Message ID 1468514260-5560-1-git-send-email-ross.burton@intel.com
State Accepted
Commit 75321b6b0f2c0ac667b9350b387b01a188e195c8

Commit Message

Ross Burton July 14, 2016, 4:37 p.m.
The nativesdk libc, when used by buildtools, has a hard requirement on supporting
a UTF-8 locale because Python 3 needs a UTF-8 locale.  However, we currently only
ship the C locale, which means that when Python attempts to look up the user's
locale (for example, en_NZ.UTF-8) in the locale archive under its prefix, the
lookup fails and it falls back to C.  This then results in Python using ASCII
instead of UTF-8 for file encoding, and bitbake breaks.
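
The fallback described above can be sketched in a few lines of Python. This is only an illustration, assuming a Unix host where `locale.nl_langinfo` is available; the helper name `effective_ctype` is hypothetical, and the explicit fallback to "C" imitates the behaviour described here rather than reproducing Python's actual startup code.

```python
import locale


def effective_ctype(requested):
    """Try to activate `requested` for LC_CTYPE; fall back to "C" on failure.

    Returns (locale_name_in_effect, codeset_reported_by_libc).
    """
    try:
        active = locale.setlocale(locale.LC_CTYPE, requested)
    except locale.Error:
        # The libc could not find the locale (for example, it is missing
        # from the locale archive), so use the always-available C locale.
        active = locale.setlocale(locale.LC_CTYPE, "C")
    return active, locale.nl_langinfo(locale.CODESET)


if __name__ == "__main__":
    for name in ("en_NZ.UTF-8", "C"):
        print(name, "->", effective_ctype(name))
```

On a libc that cannot find en_NZ.UTF-8, both lines would be expected to report the C locale and a non-UTF-8 codeset, which is the situation that breaks bitbake.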

The obvious solution would be to ship all locales, but this would add
approximately 250MB to the size of the buildtools tarball (which is currently
around 30MB).  Generating a binary locale archive reduces this down to 100MB,
but this is still a drastic increase in footprint.  If we ship a subset of
locales in the tarball then there will be users whose locale isn't in the
tarball, and they'll have to change their locale to an "approved" one, which
isn't the best of messages to send to new users.

The alternative is to tell the nativesdk libc that the locale archive isn't
under its own prefix but is in fact at /usr/lib/locale/locale-archive, so the
buildtools libc uses the host locale archive. The locale archive format appears
to be at least fairly stable: our glibc 2.24 can read the locale archive
generated by glibc 2.17 (CentOS 7).
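
A quick way to check whether a given host actually provides an archive at that path is sketched below. It is a minimal illustration, not part of the patch: the helper name `describe_archive` is hypothetical, and the path is the one named in this commit message (some hosts may lay out their locales differently).

```python
from pathlib import Path

# Path named in the commit message; other layouts are possible on some hosts.
HOST_ARCHIVE = Path("/usr/lib/locale/locale-archive")


def describe_archive(path=HOST_ARCHIVE):
    """Return (exists, size_in_bytes) for a locale archive path."""
    path = Path(path)
    if not path.is_file():
        return False, 0
    return True, path.stat().st_size


if __name__ == "__main__":
    exists, size = describe_archive()
    if exists:
        print(f"{HOST_ARCHIVE}: {size / 1e6:.1f} MB")
    else:
        print(f"{HOST_ARCHIVE}: not found")
```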

[ YOCTO #9775 ]

Signed-off-by: Ross Burton <ross.burton@intel.com>

---
 meta/recipes-core/glibc/glibc_2.24.bb | 6 ++++++
 1 file changed, 6 insertions(+)

-- 
2.8.1

-- 
_______________________________________________
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core

Comments

Khem Raj July 14, 2016, 4:40 p.m. | #1
> On Jul 14, 2016, at 9:37 AM, Ross Burton <ross.burton@intel.com> wrote:

I think this patch is good, although there might be issues with SDKs when they
are tried on different distros. As long as we keep the tested distros in shape,
we are OK.

Ross Burton July 14, 2016, 7:11 p.m. | #2
On 14 July 2016 at 19:58, Mark Hatle <mark.hatle@windriver.com> wrote:

> Patching glibc to use the host system as a backup for the locales makes sense to
> me as well.  However it is definitely more work and may not be worth it...



Agreed and agreed, which is why I went with this patch.


> In the buildtools case, do any of the other components we ship actually have
> localized content?  If we're shipping message catalogs for non-english systems
> -- then we definitely need non-english locale to work.  But if we're only
> shipping english -- C.utf-8 might be enough.


The problem for buildtools shipping just C.UTF-8 is that the moment you
have a system that needs buildtools and use them, you're forced into using
C.UTF-8 for everything in that session unless you keep that terminal for
just bitbake invocations.  If you're like me and often run
git/emacs/meld/etc in the same terminal as bitbake and don't speak English
then your experience is going to suffer.

Ross
Ross Burton July 14, 2016, 7:13 p.m. | #3
On 14 July 2016 at 20:11, Burton, Ross <ross.burton@intel.com> wrote:

> The problem for buildtools shipping just C.UTF-8 is that the moment you
> have a system that needs buildtools and use them, you're forced into using
> C.UTF-8 for everything in that session unless you keep that terminal for
> just bitbake invocations.  If you're like me and often run
> git/emacs/meld/etc in the same terminal as bitbake and don't speak English
> then your experience is going to suffer.



Clarification: whilst bitbake currently enforces en_US.UTF-8 internally to
ensure that it has a UTF-8 locale, this is internal only.  The moment you
use buildtools, a number of binaries - such as git - link to our libc, and
if that doesn't understand your locale then you don't get translations.

Ross

Patch

diff --git a/meta/recipes-core/glibc/glibc_2.24.bb b/meta/recipes-core/glibc/glibc_2.24.bb
index 456f206..4bc6443 100644
--- a/meta/recipes-core/glibc/glibc_2.24.bb
+++ b/meta/recipes-core/glibc/glibc_2.24.bb
@@ -129,6 +129,12 @@  do_compile () {
 
 }
 
+# Use the host locale archive when built for nativesdk so that we don't need to
+# ship a complete (100MB) locale set.
+do_compile_prepend_class-nativesdk() {
+    echo "complocaledir=/usr/lib/locale" >> ${S}/configparms
+}
+
 require glibc-package.inc
 
 BBCLASSEXTEND = "nativesdk"