scripts/decodecode: Fix decoding for AArch64 (arm64) instructions

Message ID 1506596147-23630-1-git-send-email-will.deacon@arm.com
State New

Commit Message

Will Deacon Sept. 28, 2017, 10:55 a.m. UTC
There are a couple of problems with the decodecode script and arm64:

1. AArch64 objdump refuses to disassemble .4byte directives as instructions,
   insisting that they are data values and displaying them as:

	a94153f3	.word	0xa94153f3		<-- trapping instruction

   This is resolved by using the .inst directive instead.

2. Disassembly of branch instructions attempts to provide the target as
   an offset from a symbol, e.g.:

   0:	34000082	cbz	w2, 10 <.text+0x10>

   However, this falls foul of the grep -v, which matches lines containing
   ".text" and ends up removing all branch instructions from the dump.

This patch resolves both issues by using the .inst directive for 4-byte
quantities on arm64 and stripping the resulting binaries (as is done on
arm already) to remove the mapping symbols.

Signed-off-by: Will Deacon <will.deacon@arm.com>

---
 scripts/decodecode | 8 ++++++++
 1 file changed, 8 insertions(+)

-- 
2.1.4
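
The second problem in the commit message can be reproduced without a cross
toolchain. The sketch below feeds two sample disassembly lines through the
same grep -v pattern that decodecode uses. The cbz line is the one quoted in
the commit message; the ldp line is an illustrative non-branch line added
here for contrast, not taken from a real dump, and filter_dump is a
hypothetical helper name.

```shell
#!/bin/sh
# Sketch: show how the decodecode filter treats branch lines.
# filter_dump is a hypothetical helper; the pattern is the one in the script.
filter_dump() {
	grep -v "/tmp\|Disassembly\|\.text\|^$"
}

# Sample objdump-style lines. The cbz line comes from the commit message;
# the ldp line is an assumed non-branch line for contrast.
sample='   0:	34000082 	cbz	w2, 10 <.text+0x10>
   4:	a94153f3 	ldp	x19, x20, [sp, #16]'

# The cbz line is dropped because its target is printed relative to the
# .text section symbol, which the pattern matches.
printf '%s\n' "$sample" | filter_dump
```

Stripping the object file, as the patch does, is meant to stop objdump from
printing the target relative to a symbol in the first place, so the filter
itself is left unchanged.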

Comments

Dave Martin Sept. 28, 2017, 12:42 p.m. UTC | #1
On Thu, Sep 28, 2017 at 11:55:47AM +0100, Will Deacon wrote:
> There are a couple of problems with the decodecode script and arm64:
> 
> 1. AArch64 objdump refuses to disassemble .4byte directives as instructions,
>    insisting that they are data values and displaying them as:
> 
> 	a94153f3	.word	0xa94153f3		<-- trapping instruction
> 
>    This is resolved by using the .inst directive instead.
> 
> 2. Disassembly of branch instructions attempts to provide the target as
>    an offset from a symbol, e.g.:
> 
>    0:	34000082	cbz	w2, 10 <.text+0x10>
> 
>   however this falls foul of the grep -v, which matches lines containing
>   ".text" and ends up removing all branch instructions from the dump.

Any idea why this doesn't affect other arches too ... or does it?

> This patch resolves both issues by using the .inst directive for 4-byte
> quantities on arm64 and stripping the resulting binaries (as is done on
> arm already) to remove the mapping symbols.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> 
> ---
>  scripts/decodecode | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/scripts/decodecode b/scripts/decodecode
> index d8824f37acce..67214ec5b2cb 100755
> --- a/scripts/decodecode
> +++ b/scripts/decodecode
> @@ -58,6 +58,14 @@ disas() {
>  		${CROSS_COMPILE}strip $1.o
>  	fi
>  
> +	if [ "$ARCH" = "arm64" ]; then
> +		if [ $width -eq 4 ]; then
> +			type=inst

Can we merge with arm here, or does arm still support toolchains that
don't have .inst?  Anyway, no big deal.

> +		fi
> +
> +		${CROSS_COMPILE}strip $1.o
> +	fi
> +
>  	${CROSS_COMPILE}objdump $OBJDUMPFLAGS -S $1.o | \
>  		grep -v "/tmp\|Disassembly\|\.text\|^$" > $1.dis 2>&1

FWIW,

Reviewed-by: Dave Martin <Dave.Martin@arm.com>


Here's hoping someone runs this as a CGI script somewhere ;) 

Cheers
---Dave
Will Deacon Sept. 28, 2017, 2:14 p.m. UTC | #2
On Thu, Sep 28, 2017 at 01:42:31PM +0100, Dave Martin wrote:
> On Thu, Sep 28, 2017 at 11:55:47AM +0100, Will Deacon wrote:
> > There are a couple of problems with the decodecode script and arm64:
> > 
> > 1. AArch64 objdump refuses to disassemble .4byte directives as instructions,
> >    insisting that they are data values and displaying them as:
> > 
> > 	a94153f3	.word	0xa94153f3		<-- trapping instruction
> > 
> >    This is resolved by using the .inst directive instead.
> > 
> > 2. Disassembly of branch instructions attempts to provide the target as
> >    an offset from a symbol, e.g.:
> > 
> >    0:	34000082	cbz	w2, 10 <.text+0x10>
> > 
> >   however this falls foul of the grep -v, which matches lines containing
> >   ".text" and ends up removing all branch instructions from the dump.
> 
> Any idea why this doesn't affect other arches too ... or does it?

I'm not sure, although I don't know how .inst works for architectures
with variable-length instructions and I *guess* the disassembly is less
fussy about data vs text for those targets.

> > This patch resolves both issues by using the .inst directive for 4-byte
> > quantities on arm64 and stripping the resulting binaries (as is done on
> > arm already) to remove the mapping symbols.
> > 
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > 
> > ---
> >  scripts/decodecode | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/scripts/decodecode b/scripts/decodecode
> > index d8824f37acce..67214ec5b2cb 100755
> > --- a/scripts/decodecode
> > +++ b/scripts/decodecode
> > @@ -58,6 +58,14 @@ disas() {
> >  		${CROSS_COMPILE}strip $1.o
> >  	fi
> >  
> > +	if [ "$ARCH" = "arm64" ]; then
> > +		if [ $width -eq 4 ]; then
> > +			type=inst
> 
> Can we merge with arm here, or does arm still support toolchains that
> don't have .inst?  Anyway, no big deal.

I thought we still supported those, so I'd be reluctant to merge the
clauses unless it's broken (the script works as-is for me with arm). I'm
also not sure what we should do for 16-bit Thumb-2 encodings, where we
have inst.n and inst.w to contend with.
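
One way to avoid guessing whether a given toolchain accepts .inst would be
to probe the assembler before choosing the directive. The sketch below is
not part of the patch: as_supports_inst and pick_directive are hypothetical
names, and it assumes the assembler reads its source from standard input
when no input file is named, as GNU as does.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the assembler named in $1 accepts .inst.
# 0xe1a00000 is the ARM encoding of "mov r0, r0"; any 32-bit value would do.
as_supports_inst() {
	tmp=$(mktemp) || return 1
	# Assemble a one-line source from stdin into a scratch object file.
	printf '\t.inst 0xe1a00000\n' | "$1" -o "$tmp" >/dev/null 2>&1
	rc=$?
	rm -f "$tmp"
	return $rc
}

# Hypothetical helper: print the directive to use for 4-byte quantities,
# falling back to .4byte when the probe fails.
pick_directive() {
	if as_supports_inst "$1"; then
		echo inst
	else
		echo 4byte
	fi
}

# Example: pick_directive "${CROSS_COMPILE}as"
```

This only covers the 4-byte case. It does not answer the 16-bit Thumb-2
question, where a choice between .inst.n and .inst.w would still be needed.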

> > +		fi
> > +
> > +		${CROSS_COMPILE}strip $1.o
> > +	fi
> > +
> >  	${CROSS_COMPILE}objdump $OBJDUMPFLAGS -S $1.o | \
> >  		grep -v "/tmp\|Disassembly\|\.text\|^$" > $1.dis 2>&1
> 
> FWIW,
> 
> Reviewed-by: Dave Martin <Dave.Martin@arm.com>

Thanks!

Will
Dave Martin Sept. 28, 2017, 2:37 p.m. UTC | #3
On Thu, Sep 28, 2017 at 03:14:47PM +0100, Will Deacon wrote:
> On Thu, Sep 28, 2017 at 01:42:31PM +0100, Dave Martin wrote:
> > On Thu, Sep 28, 2017 at 11:55:47AM +0100, Will Deacon wrote:
> > > There are a couple of problems with the decodecode script and arm64:
> > > 
> > > 1. AArch64 objdump refuses to disassemble .4byte directives as instructions,
> > >    insisting that they are data values and displaying them as:
> > > 
> > > 	a94153f3	.word	0xa94153f3		<-- trapping instruction
> > > 
> > >    This is resolved by using the .inst directive instead.
> > > 
> > > 2. Disassembly of branch instructions attempts to provide the target as
> > >    an offset from a symbol, e.g.:
> > > 
> > >    0:	34000082	cbz	w2, 10 <.text+0x10>
> > > 
> > >   however this falls foul of the grep -v, which matches lines containing
> > >   ".text" and ends up removing all branch instructions from the dump.
> > 
> > Any idea why this doesn't affect other arches too ... or does it?
> 
> I'm not sure, although I don't know how .inst works for architectures
> with variable-length instructions and I *guess* the disassembly is less
> fussy about data vs text for those targets.

I rather meant the target disassembly for relative branches in the
absence of labels.

Anyway, I think this is at least harmless to other arches, and possibly
helpful to them (if they disassemble those branch targets in the same
sort of way).

> > > This patch resolves both issues by using the .inst directive for 4-byte
> > > quantities on arm64 and stripping the resulting binaries (as is done on
> > > arm already) to remove the mapping symbols.
> > > 
> > > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > > 
> > > ---
> > >  scripts/decodecode | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/scripts/decodecode b/scripts/decodecode
> > > index d8824f37acce..67214ec5b2cb 100755
> > > --- a/scripts/decodecode
> > > +++ b/scripts/decodecode
> > > @@ -58,6 +58,14 @@ disas() {
> > >  		${CROSS_COMPILE}strip $1.o
> > >  	fi
> > >  
> > > +	if [ "$ARCH" = "arm64" ]; then
> > > +		if [ $width -eq 4 ]; then
> > > +			type=inst
> > 
> > Can we merge with arm here, or does arm still support toolchains that
> > don't have .inst?  Anyway, no big deal.
> 
> I thought we still supported those, so I'd be reluctant to merge the
> clauses unless it's broken (the script works as-is for me with arm). I'm
> also not sure what we should do for 16-bit Thumb-2 encodings, where we
> have inst.n and inst.w to contend with.

Fair enough.  I suspected as much, too.

Cheers
---Dave

> > > +		fi
> > > +
> > > +		${CROSS_COMPILE}strip $1.o
> > > +	fi
> > > +
> > >  	${CROSS_COMPILE}objdump $OBJDUMPFLAGS -S $1.o | \
> > >  		grep -v "/tmp\|Disassembly\|\.text\|^$" > $1.dis 2>&1
> > 
> > FWIW,
> > 
> > Reviewed-by: Dave Martin <Dave.Martin@arm.com>
> 
> Thanks!
> 
> Will
Will Deacon Sept. 28, 2017, 6:01 p.m. UTC | #4
On Thu, Sep 28, 2017 at 03:37:04PM +0100, Dave Martin wrote:
> On Thu, Sep 28, 2017 at 03:14:47PM +0100, Will Deacon wrote:
> > On Thu, Sep 28, 2017 at 01:42:31PM +0100, Dave Martin wrote:
> > > On Thu, Sep 28, 2017 at 11:55:47AM +0100, Will Deacon wrote:
> > > > There are a couple of problems with the decodecode script and arm64:
> > > > 
> > > > 1. AArch64 objdump refuses to disassemble .4byte directives as instructions,
> > > >    insisting that they are data values and displaying them as:
> > > > 
> > > > 	a94153f3	.word	0xa94153f3		<-- trapping instruction
> > > > 
> > > >    This is resolved by using the .inst directive instead.
> > > > 
> > > > 2. Disassembly of branch instructions attempts to provide the target as
> > > >    an offset from a symbol, e.g.:
> > > > 
> > > >    0:	34000082	cbz	w2, 10 <.text+0x10>
> > > > 
> > > >   however this falls foul of the grep -v, which matches lines containing
> > > >   ".text" and ends up removing all branch instructions from the dump.
> > > 
> > > Any idea why this doesn't affect other arches too ... or does it?
> > 
> > I'm not sure, although I don't know how .inst works for architectures
> > with variable-length instructions and I *guess* the disassembly is less
> > fussy about data vs text for those targets.
> 
> I rather meant the target disassembly for relative branches in the
> absence of labels.
> 
> Anyway, I think this is at least harmless to other arches, and possibly
> helpful to them (if they disassemble those branch targets in the same
> sort of way).

Ah, I see what you mean. Something like the fixup below on top.

Will

--->8

diff --git a/scripts/decodecode b/scripts/decodecode
index 67214ec5b2cb..f1ec57c3cbf7 100755
--- a/scripts/decodecode
+++ b/scripts/decodecode
@@ -49,21 +49,14 @@ esac
 
 disas() {
 	${CROSS_COMPILE}as $AFLAGS -o $1.o $1.s > /dev/null 2>&1
+	${CROSS_COMPILE}strip $1.o
 
-	if [ "$ARCH" = "arm" ]; then
-		if [ $width -eq 2 ]; then
-			OBJDUMPFLAGS="-M force-thumb"
-		fi
-
-		${CROSS_COMPILE}strip $1.o
+	if [ "$ARCH" = "arm" -a $width -eq 2 ]; then
+		OBJDUMPFLAGS="-M force-thumb"
 	fi
 
-	if [ "$ARCH" = "arm64" ]; then
-		if [ $width -eq 4 ]; then
-			type=inst
-		fi
-
-		${CROSS_COMPILE}strip $1.o
+	if [ "$ARCH" = "arm64" -a $width -eq 4 ]; then
+		type=inst
 	fi
 
 	${CROSS_COMPILE}objdump $OBJDUMPFLAGS -S $1.o | \
Dave Martin Sept. 29, 2017, 10:07 a.m. UTC | #5
On Thu, Sep 28, 2017 at 07:01:35PM +0100, Will Deacon wrote:
> On Thu, Sep 28, 2017 at 03:37:04PM +0100, Dave Martin wrote:
> > On Thu, Sep 28, 2017 at 03:14:47PM +0100, Will Deacon wrote:
> > > On Thu, Sep 28, 2017 at 01:42:31PM +0100, Dave Martin wrote:
> > > > On Thu, Sep 28, 2017 at 11:55:47AM +0100, Will Deacon wrote:
> > > > > There are a couple of problems with the decodecode script and arm64:
> > > > > 
> > > > > 1. AArch64 objdump refuses to disassemble .4byte directives as instructions,
> > > > >    insisting that they are data values and displaying them as:
> > > > > 
> > > > > 	a94153f3	.word	0xa94153f3		<-- trapping instruction
> > > > > 
> > > > >    This is resolved by using the .inst directive instead.
> > > > > 
> > > > > 2. Disassembly of branch instructions attempts to provide the target as
> > > > >    an offset from a symbol, e.g.:
> > > > > 
> > > > >    0:	34000082	cbz	w2, 10 <.text+0x10>
> > > > > 
> > > > >   however this falls foul of the grep -v, which matches lines containing
> > > > >   ".text" and ends up removing all branch instructions from the dump.
> > > > 
> > > > Any idea why this doesn't affect other arches too ... or does it?
> > > 
> > > I'm not sure, although I don't know how .inst works for architectures
> > > with variable-length instructions and I *guess* the disassembly is less
> > > fussy about data vs text for those targets.
> > 
> > I rather meant the target disassembly for relative branches in the
> > absence of labels.
> > 
> > Anyway, I think this is at least harmless to other arches, and possibly
> > helpful to them (if they disassemble those branch targets in the same
> > sort of way).
> 
> Ah, I see what you mean. Something like the fixup below on top.
> 
> Will
> 
> --->8
> 
> diff --git a/scripts/decodecode b/scripts/decodecode
> index 67214ec5b2cb..f1ec57c3cbf7 100755
> --- a/scripts/decodecode
> +++ b/scripts/decodecode
> @@ -49,21 +49,14 @@ esac
>  
>  disas() {
>  	${CROSS_COMPILE}as $AFLAGS -o $1.o $1.s > /dev/null 2>&1
> +	${CROSS_COMPILE}strip $1.o
>  
> -	if [ "$ARCH" = "arm" ]; then
> -		if [ $width -eq 2 ]; then
> -			OBJDUMPFLAGS="-M force-thumb"
> -		fi
> -
> -		${CROSS_COMPILE}strip $1.o
> +	if [ "$ARCH" = "arm" -a $width -eq 2 ]; then
> +		OBJDUMPFLAGS="-M force-thumb"
>  	fi
>  
> -	if [ "$ARCH" = "arm64" ]; then
> -		if [ $width -eq 4 ]; then
> -			type=inst
> -		fi
> -
> -		${CROSS_COMPILE}strip $1.o
> +	if [ "$ARCH" = "arm64" -a $width -eq 4 ]; then
> +		type=inst
>  	fi

Reasonable, though I guess it doesn't matter unless another arch really
cares -- in which case someone will eventually spot the issue and
probably write the same patch.

Cheers
---Dave
Patch

diff --git a/scripts/decodecode b/scripts/decodecode
index d8824f37acce..67214ec5b2cb 100755
--- a/scripts/decodecode
+++ b/scripts/decodecode
@@ -58,6 +58,14 @@  disas() {
 		${CROSS_COMPILE}strip $1.o
 	fi
 
+	if [ "$ARCH" = "arm64" ]; then
+		if [ $width -eq 4 ]; then
+			type=inst
+		fi
+
+		${CROSS_COMPILE}strip $1.o
+	fi
+
 	${CROSS_COMPILE}objdump $OBJDUMPFLAGS -S $1.o | \
 		grep -v "/tmp\|Disassembly\|\.text\|^$" > $1.dis 2>&1
 }
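
For context, a hedged usage sketch: decodecode reads an oops from standard
input and picks up ARCH and CROSS_COMPILE from the environment. The Code:
line below is made up for illustration, reusing the two instruction words
from the commit message; the toolchain prefix aarch64-linux-gnu- is an
assumption about the local setup, and the script path assumes the command is
run from the top of a kernel tree.

```shell
#!/bin/sh
# Build a sample "Code:" line. The words are illustrative; the trapping
# instruction is the one marked with parentheses.
code_line='Code: 34000082 d503201f (a94153f3)'

# Run the script only if we are in a kernel tree. The toolchain prefix
# is an assumption and may differ on your system.
if [ -x scripts/decodecode ]; then
	echo "$code_line" | ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- scripts/decodecode
fi
```

With the patch applied, the expectation is that the cbz word is shown as an
instruction rather than being filtered out, and that the marked word is
disassembled instead of printed as a .word value.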