[v3,6/9] target-arm: support QMP dump-guest-memory

Message ID 1450219878-5293-7-git-send-email-drjones@redhat.com
State Superseded

Commit Message

Andrew Jones Dec. 15, 2015, 10:51 p.m. UTC
Add the support needed for creating prstatus elf notes. This
allows us to use QMP dump-guest-memory.

Signed-off-by: Andrew Jones <drjones@redhat.com>

---
 target-arm/Makefile.objs |   3 +-
 target-arm/arch_dump.c   | 230 +++++++++++++++++++++++++++++++++++++++++++++++
 target-arm/cpu-qom.h     |   5 ++
 target-arm/cpu.c         |   3 +
 4 files changed, 239 insertions(+), 2 deletions(-)
 create mode 100644 target-arm/arch_dump.c

-- 
2.4.3
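
As a reader's aid for the note layout this patch hard-codes, here is a minimal sketch of how the size of one PRSTATUS note can be derived. It assumes the 4-byte padding of the name and descriptor that the patch's `name[8]` comment implies; `struct nhdr`, `note_align4()` and `note_total_size()` are hypothetical names for illustration, not QEMU code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Elf64_Nhdr/Elf32_Nhdr (both are three 32-bit
 * words), so the sketch does not depend on <elf.h>. */
struct nhdr {
    uint32_t n_namesz;   /* length of the name, including its NUL */
    uint32_t n_descsz;   /* length of the descriptor payload */
    uint32_t n_type;     /* note type, e.g. NT_PRSTATUS */
};

/* Round n up to the next multiple of 4 (assumed note padding). */
static size_t note_align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Bytes occupied by one note: header + padded name + padded descriptor. */
static size_t note_total_size(size_t namesz, size_t descsz)
{
    return sizeof(struct nhdr) + note_align4(namesz) + note_align4(descsz);
}
```

With `namesz` 5 ("CORE" plus its NUL) this gives 12 + 8 + 392 = 412 bytes for the AArch64 note and 12 + 8 + 148 = 168 bytes for the 32-bit ARM note, matching the QEMU_BUILD_BUG_ON checks in the patch.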

Comments

Peter Maydell Dec. 18, 2015, 11:59 a.m. UTC | #1
On 15 December 2015 at 22:51, Andrew Jones <drjones@redhat.com> wrote:
> Add the support needed for creating prstatus elf notes. This
> allows us to use QMP dump-guest-memory.
>
> Signed-off-by: Andrew Jones <drjones@redhat.com>

> +int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
> +                             int cpuid, void *opaque)
> +{
> +    struct aarch64_note note;
> +    CPUARMState *env = &ARM_CPU(cs)->env;
> +    DumpState *s = opaque;
> +    uint64_t pstate, sp;
> +    int ret, i;
> +
> +    aarch64_note_init(&note, s, "CORE", 5, NT_PRSTATUS, sizeof(note.prstatus));
> +
> +    note.prstatus.pr_pid = cpu_to_dump32(s, cpuid);
> +
> +    if (!is_a64(env)) {
> +        aarch64_sync_32_to_64(env);
> +        pstate = cpsr_read(env);
> +        sp = aarch64_compat_sp(env);

I don't understand why we need to do this. If this is an
AArch64 dump then we should just treat it as an AArch64
dump, and presumably the consumer of the dump knows enough
to know what the "hypervisor view" of a CPU that's currently
in 32-bit mode is. It has to anyway to be able to figure
out where all the other registers are, so why can't it
also figure out what mode the CPU is currently in and thus
where r13 is in the xregs array?

> +    } else {
> +        pstate = pstate_read(env);
> +        sp = env->xregs[31];
> +    }

thanks
-- PMM
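
To make the point above concrete, here is a minimal sketch of how a dump consumer could work out where the live r13 sits in the 64-bit register array from pstate alone. It assumes regs[] follows the architectural AArch32-to-AArch64 register mapping; `live_sp_index()` is a hypothetical helper for illustration, not code from QEMU or crash.

```c
#include <stdint.h>

#define PSR_MODE32_BIT 0x10u   /* set in pstate for every AArch32 mode */
#define PSR_MODE_MASK  0x1fu   /* CPSR.M[4:0] */

/*
 * Hypothetical helper: return the index into regs[] holding the live
 * AArch32 r13 for the mode encoded in pstate, 31 when the CPU is in
 * AArch64 state (meaning "use the sp field"), or -1 when the mode has
 * no mapping onto the X registers.
 */
static int live_sp_index(uint64_t pstate)
{
    if (!(pstate & PSR_MODE32_BIT)) {
        return 31;
    }
    switch (pstate & PSR_MODE_MASK) {
    case 0x10:            /* usr */
    case 0x1f:            /* sys shares the usr bank */
        return 13;
    case 0x1a: return 15; /* hyp */
    case 0x12: return 17; /* irq */
    case 0x13: return 19; /* svc */
    case 0x17: return 21; /* abt */
    case 0x1b: return 23; /* und */
    case 0x11: return 29; /* fiq */
    default:   return -1; /* e.g. mon, or a reserved encoding */
    }
}
```

A tool would call this with the pstate value read from the note and then read regs[index] rather than the sp field whenever the result is not 31.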
Andrew Jones Dec. 18, 2015, 4:05 p.m. UTC | #2
On Fri, Dec 18, 2015 at 11:59:39AM +0000, Peter Maydell wrote:
> On 15 December 2015 at 22:51, Andrew Jones <drjones@redhat.com> wrote:
> > Add the support needed for creating prstatus elf notes. This
> > allows us to use QMP dump-guest-memory.
> >
> > Signed-off-by: Andrew Jones <drjones@redhat.com>
>
> > +int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
> > +                             int cpuid, void *opaque)
> > +{
> > +    struct aarch64_note note;
> > +    CPUARMState *env = &ARM_CPU(cs)->env;
> > +    DumpState *s = opaque;
> > +    uint64_t pstate, sp;
> > +    int ret, i;
> > +
> > +    aarch64_note_init(&note, s, "CORE", 5, NT_PRSTATUS, sizeof(note.prstatus));
> > +
> > +    note.prstatus.pr_pid = cpu_to_dump32(s, cpuid);
> > +
> > +    if (!is_a64(env)) {
> > +        aarch64_sync_32_to_64(env);
> > +        pstate = cpsr_read(env);
> > +        sp = aarch64_compat_sp(env);
>
> I don't understand why we need to do this. If this is an
> AArch64 dump then we should just treat it as an AArch64
> dump, and presumably the consumer of the dump knows enough
> to know what the "hypervisor view" of a CPU that's currently
> in 32-bit mode is. It has to anyway to be able to figure
> out where all the other registers are, so why can't it
> also figure out what mode the CPU is currently in and thus
> where r13 is in the xregs array?

You're probably right that this shouldn't be necessary. But, in order for
it not to be necessary, I'll need to write another crash patch. Currently,
if you do a dump-guest-memory on a running guest, i.e. one where the kernel
has not called panic(), and thus the cpus are actually in 32-bit usermode,
rather than in the 64-bit cpu-stop IPI handler, then the crash utility
segfaults if sp == xregs[31]. crash does properly decode the registers
it digs out of the stack frame on a panic'ed cpu though, and setting sp
to aarch64_compat_sp here also allows crash to work properly in the non-
panic'd case.

So, I could teach crash to do what I'm doing here in qemu instead, but
there's still one more reason why it may make sense to do it here. That
reason is that I don't know what else to put in prstatus.pr_reg.sp. Does
xregs[31] make the most sense? or just zero? prstatus.pr_reg.pc is the
correct 32-bit userspace pc, prstatus.pr_reg.pstate is the correct cpsr,
so why not set sp to the correct userspace sp?

Thanks,
drew

>
> > +    } else {
> > +        pstate = pstate_read(env);
> > +        sp = env->xregs[31];
> > +    }
>
> thanks
> -- PMM
>
Peter Maydell Dec. 18, 2015, 4:31 p.m. UTC | #3
On 18 December 2015 at 16:05, Andrew Jones <drjones@redhat.com> wrote:
> On Fri, Dec 18, 2015 at 11:59:39AM +0000, Peter Maydell wrote:
>> On 15 December 2015 at 22:51, Andrew Jones <drjones@redhat.com> wrote:
>> > Add the support needed for creating prstatus elf notes. This
>> > allows us to use QMP dump-guest-memory.
>> >
>> > Signed-off-by: Andrew Jones <drjones@redhat.com>
>>
>> > +int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
>> > +                             int cpuid, void *opaque)
>> > +{
>> > +    struct aarch64_note note;
>> > +    CPUARMState *env = &ARM_CPU(cs)->env;
>> > +    DumpState *s = opaque;
>> > +    uint64_t pstate, sp;
>> > +    int ret, i;
>> > +
>> > +    aarch64_note_init(&note, s, "CORE", 5, NT_PRSTATUS, sizeof(note.prstatus));
>> > +
>> > +    note.prstatus.pr_pid = cpu_to_dump32(s, cpuid);
>> > +
>> > +    if (!is_a64(env)) {
>> > +        aarch64_sync_32_to_64(env);
>> > +        pstate = cpsr_read(env);
>> > +        sp = aarch64_compat_sp(env);
>>
>> I don't understand why we need to do this. If this is an
>> AArch64 dump then we should just treat it as an AArch64
>> dump, and presumably the consumer of the dump knows enough
>> to know what the "hypervisor view" of a CPU that's currently
>> in 32-bit mode is. It has to anyway to be able to figure
>> out where all the other registers are, so why can't it
>> also figure out what mode the CPU is currently in and thus
>> where r13 is in the xregs array?
>
> You're probably right that this shouldn't be necessary. But, in order for
> it not to be necessary, I'll need to write another crash patch. Currently,
> if you do a dump-guest-memory on a running guest, i.e. one where the kernel
> has not called panic(), and thus the cpus are actually in 32-bit usermode,
> rather than in the 64-bit cpu-stop IPI handler, then the crash utility
> segfaults if sp == xregs[31]. crash does properly decode the registers
> it digs out of the stack frame on a panic'ed cpu though, and setting sp
> to aarch64_compat_sp here also allows crash to work properly in the non-
> panic'd case.

If crash segfaults then that's clearly a bug in crash...
What is it expecting to see in the SP field?

> So, I could teach crash to do what I'm doing here in qemu instead, but
> there's still one more reason why it may make sense to do it here. That
> reason is that I don't know what else to put in prstatus.pr_reg.sp. Does
> xregs[31] make the most sense? or just zero? prstatus.pr_reg.pc is the
> correct 32-bit userspace pc, prstatus.pr_reg.pstate is the correct cpsr,
> so why not set sp to the correct userspace sp?

Well, what spec are we following here? The most logical view
to me seems to be to say "you get the view of the system
that you get from a JTAG debugger or a hypervisor", which
is to say you see a 64-bit set of registers and it's the
debugger's job to decide which bits of those might be
interesting to view as 32-bit and what is actually "live"
32 bit state.

From that point of view there is no valid AArch64 SP register
at this point in execution (xregs[31] for QEMU would be stale
state, so not a good choice I think).

thanks
-- PMM
Andrew Jones Dec. 18, 2015, 6:05 p.m. UTC | #4
On Fri, Dec 18, 2015 at 04:31:13PM +0000, Peter Maydell wrote:
> On 18 December 2015 at 16:05, Andrew Jones <drjones@redhat.com> wrote:
> > On Fri, Dec 18, 2015 at 11:59:39AM +0000, Peter Maydell wrote:
> >> On 15 December 2015 at 22:51, Andrew Jones <drjones@redhat.com> wrote:
> >> > Add the support needed for creating prstatus elf notes. This
> >> > allows us to use QMP dump-guest-memory.
> >> >
> >> > Signed-off-by: Andrew Jones <drjones@redhat.com>
> >>
> >> > +int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
> >> > +                             int cpuid, void *opaque)
> >> > +{
> >> > +    struct aarch64_note note;
> >> > +    CPUARMState *env = &ARM_CPU(cs)->env;
> >> > +    DumpState *s = opaque;
> >> > +    uint64_t pstate, sp;
> >> > +    int ret, i;
> >> > +
> >> > +    aarch64_note_init(&note, s, "CORE", 5, NT_PRSTATUS, sizeof(note.prstatus));
> >> > +
> >> > +    note.prstatus.pr_pid = cpu_to_dump32(s, cpuid);
> >> > +
> >> > +    if (!is_a64(env)) {
> >> > +        aarch64_sync_32_to_64(env);
> >> > +        pstate = cpsr_read(env);
> >> > +        sp = aarch64_compat_sp(env);
> >>
> >> I don't understand why we need to do this. If this is an
> >> AArch64 dump then we should just treat it as an AArch64
> >> dump, and presumably the consumer of the dump knows enough
> >> to know what the "hypervisor view" of a CPU that's currently
> >> in 32-bit mode is. It has to anyway to be able to figure
> >> out where all the other registers are, so why can't it
> >> also figure out what mode the CPU is currently in and thus
> >> where r13 is in the xregs array?
> >
> > You're probably right that this shouldn't be necessary. But, in order for
> > it not to be necessary, I'll need to write another crash patch. Currently,
> > if you do a dump-guest-memory on a running guest, i.e. one where the kernel
> > has not called panic(), and thus the cpus are actually in 32-bit usermode,
> > rather than in the 64-bit cpu-stop IPI handler, then the crash utility
> > segfaults if sp == xregs[31]. crash does properly decode the registers
> > it digs out of the stack frame on a panic'ed cpu though, and setting sp
> > to aarch64_compat_sp here also allows crash to work properly in the non-
> > panic'd case.
>
> If crash segfaults then that's clearly a bug in crash...
> What is it expecting to see in the SP field?

A valid stack pointer

>
> > So, I could teach crash to do what I'm doing here in qemu instead, but
> > there's still one more reason why it may make sense to do it here. That
> > reason is that I don't know what else to put in prstatus.pr_reg.sp. Does
> > xregs[31] make the most sense? or just zero? prstatus.pr_reg.pc is the
> > correct 32-bit userspace pc, prstatus.pr_reg.pstate is the correct cpsr,
> > so why not set sp to the correct userspace sp?
>
> Well, what spec are we following here? The most logical view

With the shoehorning approach we're following the aarch64 ptrace spec,
but putting arm32 registers in it. We then rely on the analysis tools
to do the right thing when they see pstate has PSR_MODE32_BIT set, which
all cpsr modes do.

The ptrace code in the kernel would return a real aarch32 view, i.e.
only registers up to r15 and a cpsr. We can't do that here because we've
committed to an EM_AARCH64 formatted core, and we only have one type of
PRSTATUS note for that core type. Furthermore, we want to be able to
return all the registers to handle dumps of 64-bit EL2 with 32-bit EL1s.

> to me seems to be to say "you get the view of the system
> that you get from a JTAG debugger or a hypervisor", which
> is to say you see a 64-bit set of registers and it's the
> debugger's job to decide which bits of those might be
> interesting to view as 32-bit and what is actually "live"
> 32 bit state.

It appears that PRSTATUS aware tools don't currently work this way.

>
> From that point of view there is no valid AArch64 SP register
> at this point in execution (xregs[31] for QEMU would be stale
> state, so not a good choice I think).

I suppose zero or all 1's are the safest choices for "undefined", but
being undefined actually gives us freedom to use aarch64_compat_sp as
well, which, to me, looks like a safer and more useful value.

Thanks,
drew
Peter Maydell Dec. 18, 2015, 6:46 p.m. UTC | #5
On 18 December 2015 at 18:05, Andrew Jones <drjones@redhat.com> wrote:
> On Fri, Dec 18, 2015 at 04:31:13PM +0000, Peter Maydell wrote:
>> On 18 December 2015 at 16:05, Andrew Jones <drjones@redhat.com> wrote:
>> > On Fri, Dec 18, 2015 at 11:59:39AM +0000, Peter Maydell wrote:
>> >> I don't understand why we need to do this. If this is an
>> >> AArch64 dump then we should just treat it as an AArch64
>> >> dump, and presumably the consumer of the dump knows enough
>> >> to know what the "hypervisor view" of a CPU that's currently
>> >> in 32-bit mode is. It has to anyway to be able to figure
>> >> out where all the other registers are, so why can't it
>> >> also figure out what mode the CPU is currently in and thus
>> >> where r13 is in the xregs array?
>> >
>> > You're probably right that this shouldn't be necessary. But, in order for
>> > it not to be necessary, I'll need to write another crash patch. Currently,
>> > if you do a dump-guest-memory on a running guest, i.e. one where the kernel
>> > has not called panic(), and thus the cpus are actually in 32-bit usermode,
>> > rather than in the 64-bit cpu-stop IPI handler, then the crash utility
>> > segfaults if sp == xregs[31]. crash does properly decode the registers
>> > it digs out of the stack frame on a panic'ed cpu though, and setting sp
>> > to aarch64_compat_sp here also allows crash to work properly in the non-
>> > panic'd case.
>>
>> If crash segfaults then that's clearly a bug in crash...
>> What is it expecting to see in the SP field?
>
> A valid stack pointer

But why? What does it do with it? (In any case a dump of a crashed
system could have any random rubbish in SP so the tool has to handle
it being something weird.)

>> > So, I could teach crash to do what I'm doing here in qemu instead, but
>> > there's still one more reason why it may make sense to do it here. That
>> > reason is that I don't know what else to put in prstatus.pr_reg.sp. Does
>> > xregs[31] make the most sense? or just zero? prstatus.pr_reg.pc is the
>> > correct 32-bit userspace pc, prstatus.pr_reg.pstate is the correct cpsr,
>> > so why not set sp to the correct userspace sp?
>>
>> Well, what spec are we following here? The most logical view
>
> With the shoehorning approach we're following the aarch64 ptrace spec,
> but putting arm32 registers in it. We then rely on the analysis tools
> to do the right thing when they see pstate has PSR_MODE32_BIT set, which
> all cpsr modes do.

Right, so the analysis tool already has to cope with the MODE32 bit
being set, and it can also figure out where the SP is (and which
other registers are currently live for 32-bit).

> The ptrace code in the kernel would return a real aarch32 view, i.e.
> only registers up to r15 and a cpsr. We can't do that here because we've
> committed to an EM_AARCH64 formatted core, and we only have one type of
> PRSTATUS note for that core type. Furthermore, we want to be able to
> return all the registers to handle dumps of 64-bit EL2 with 32-bit EL1s.
>
>> to me seems to be to say "you get the view of the system
>> that you get from a JTAG debugger or a hypervisor", which
>> is to say you see a 64-bit set of registers and it's the
>> debugger's job to decide which bits of those might be
>> interesting to view as 32-bit and what is actually "live"
>> 32 bit state.
>
> It appears that PRSTATUS aware tools don't currently work this way.
>
>>
>> From that point of view there is no valid AArch64 SP register
>> at this point in execution (xregs[31] for QEMU would be stale
>> state, so not a good choice I think).
>
> I suppose zero or all 1's are the safest choices for "undefined", but
> being undefined actually gives us freedom to use aarch64_compat_sp as
> well, which, to me, looks like a safer and more useful value.

It just doesn't really make sense to me to do this one bit of
work for the debug analysis tools when they already have to know
that 32-bit modes are special. We seem to end up with a function
in QEMU whose only purpose is working around a bug in the thing
consuming the coredump.

thanks
-- PMM
Andrew Jones Dec. 18, 2015, 7:57 p.m. UTC | #6
On Fri, Dec 18, 2015 at 06:46:14PM +0000, Peter Maydell wrote:
> On 18 December 2015 at 18:05, Andrew Jones <drjones@redhat.com> wrote:
> > On Fri, Dec 18, 2015 at 04:31:13PM +0000, Peter Maydell wrote:
> >> On 18 December 2015 at 16:05, Andrew Jones <drjones@redhat.com> wrote:
> >> > On Fri, Dec 18, 2015 at 11:59:39AM +0000, Peter Maydell wrote:
> >> >> I don't understand why we need to do this. If this is an
> >> >> AArch64 dump then we should just treat it as an AArch64
> >> >> dump, and presumably the consumer of the dump knows enough
> >> >> to know what the "hypervisor view" of a CPU that's currently
> >> >> in 32-bit mode is. It has to anyway to be able to figure
> >> >> out where all the other registers are, so why can't it
> >> >> also figure out what mode the CPU is currently in and thus
> >> >> where r13 is in the xregs array?
> >> >
> >> > You're probably right that this shouldn't be necessary. But, in order for
> >> > it not to be necessary, I'll need to write another crash patch. Currently,
> >> > if you do a dump-guest-memory on a running guest, i.e. one where the kernel
> >> > has not called panic(), and thus the cpus are actually in 32-bit usermode,
> >> > rather than in the 64-bit cpu-stop IPI handler, then the crash utility
> >> > segfaults if sp == xregs[31]. crash does properly decode the registers
> >> > it digs out of the stack frame on a panic'ed cpu though, and setting sp
> >> > to aarch64_compat_sp here also allows crash to work properly in the non-
> >> > panic'd case.
> >>
> >> If crash segfaults then that's clearly a bug in crash...
> >> What is it expecting to see in the SP field?
> >
> > A valid stack pointer
>
> But why? What does it do with it? (In any case a dump of a crashed
> system could have any random rubbish in SP so the tool has to handle
> it being something weird.)

crash segfaults because sp (xregs[31]) wasn't a user stack address, so
crash expected the stack to include at least two frames, which would mean
a non-zero fp. But fp is zero, which leads to the calculation of a bad
stack pointer, which is then used as an offset into the stack buffer and
overflows it. This is definitely a crash bug that should be fixed. I'll
report it.

Also, crash's arm64_get_dumpfile_stackframe() simply assumes sp will be
the stack pointer, and it doesn't check pstate for the MODE32 bit first.
I think it should be easy to fix this, as I only found two places that
should be changed. I'll send a patch for that.

>
> >> > So, I could teach crash to do what I'm doing here in qemu instead, but
> >> > there's still one more reason why it may make sense to do it here. That
> >> > reason is that I don't know what else to put in prstatus.pr_reg.sp. Does
> >> > xregs[31] make the most sense? or just zero? prstatus.pr_reg.pc is the
> >> > correct 32-bit userspace pc, prstatus.pr_reg.pstate is the correct cpsr,
> >> > so why not set sp to the correct userspace sp?
> >>
> >> Well, what spec are we following here? The most logical view
> >
> > With the shoehorning approach we're following the aarch64 ptrace spec,
> > but putting arm32 registers in it. We then rely on the analysis tools
> > to do the right thing when they see pstate has PSR_MODE32_BIT set, which
> > all cpsr modes do.
>
> Right, so the analysis tool already has to cope with the MODE32 bit
> being set, and it can also figure out where the SP is (and which
> other registers are currently live for 32-bit).
>
> > The ptrace code in the kernel would return a real aarch32 view, i.e.
> > only registers up to r15 and a cpsr. We can't do that here because we've
> > committed to an EM_AARCH64 formatted core, and we only have one type of
> > PRSTATUS note for that core type. Furthermore, we want to be able to
> > return all the registers to handle dumps of 64-bit EL2 with 32-bit EL1s.
> >
> >> to me seems to be to say "you get the view of the system
> >> that you get from a JTAG debugger or a hypervisor", which
> >> is to say you see a 64-bit set of registers and it's the
> >> debugger's job to decide which bits of those might be
> >> interesting to view as 32-bit and what is actually "live"
> >> 32 bit state.
> >
> > It appears that PRSTATUS aware tools don't currently work this way.
> >
> >>
> >> From that point of view there is no valid AArch64 SP register
> >> at this point in execution (xregs[31] for QEMU would be stale
> >> state, so not a good choice I think).
> >
> > I suppose zero or all 1's are the safest choices for "undefined", but
> > being undefined actually gives us freedom to use aarch64_compat_sp as
> > well, which, to me, looks like a safer and more useful value.
>
> It just doesn't really make sense to me to do this one bit of
> work for the debug analysis tools when they already have to know
> that 32-bit modes are special. We seem to end up with a function
> in QEMU whose only purpose is working around a bug in the thing
> consuming the coredump.

OK, I've come around to your point of view. What value would you like
me to put in sp? 0, 1's, or ??

Thanks,
drew
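
For illustration only, the kind of check that would avoid the overflow described above might look like the following. This is a sketch under assumptions, not code from crash: `stack_offset()` and its parameters are hypothetical, and it assumes the tool holds a copy of the stack covering the address range [stack_base, stack_base + stack_size).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a stack address taken from a dump into an
 * offset into a buffer holding a copy of the stack. Returns false, leaving
 * *offset untouched, unless the whole 8-byte slot at addr lies inside
 * [stack_base, stack_base + stack_size).
 */
static bool stack_offset(uint64_t addr, uint64_t stack_base,
                         size_t stack_size, size_t *offset)
{
    uint64_t off;

    if (stack_size < sizeof(uint64_t) || addr < stack_base) {
        return false;
    }
    off = addr - stack_base;
    if (off > stack_size - sizeof(uint64_t)) {
        return false;
    }
    *offset = (size_t)off;
    return true;
}
```

A caller would only index the stack buffer when this returns true, so a zero fp or a stale sp is rejected instead of being turned into an out-of-range offset.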

Patch

diff --git a/target-arm/Makefile.objs b/target-arm/Makefile.objs
index 9460b409a5a1c..a80eb39743a78 100644
--- a/target-arm/Makefile.objs
+++ b/target-arm/Makefile.objs
@@ -1,5 +1,5 @@ 
 obj-y += arm-semi.o
-obj-$(CONFIG_SOFTMMU) += machine.o
+obj-$(CONFIG_SOFTMMU) += machine.o psci.o arch_dump.o
 obj-$(CONFIG_KVM) += kvm.o
 obj-$(call land,$(CONFIG_KVM),$(call lnot,$(TARGET_AARCH64))) += kvm32.o
 obj-$(call land,$(CONFIG_KVM),$(TARGET_AARCH64)) += kvm64.o
@@ -7,6 +7,5 @@  obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
 obj-y += translate.o op_helper.o helper.o cpu.o
 obj-y += neon_helper.o iwmmxt_helper.o
 obj-y += gdbstub.o
-obj-$(CONFIG_SOFTMMU) += psci.o
 obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o helper-a64.o gdbstub64.o
 obj-y += crypto_helper.o
diff --git a/target-arm/arch_dump.c b/target-arm/arch_dump.c
new file mode 100644
index 0000000000000..dc32d98101004
--- /dev/null
+++ b/target-arm/arch_dump.c
@@ -0,0 +1,230 @@ 
+/* Support for writing ELF notes for ARM architectures
+ *
+ * Copyright (C) 2015 Red Hat Inc.
+ *
+ * Author: Andrew Jones <drjones@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "cpu.h"
+#include "elf.h"
+#include "sysemu/dump.h"
+
+/* struct user_pt_regs from arch/arm64/include/uapi/asm/ptrace.h */
+struct aarch64_user_regs {
+    uint64_t regs[31];
+    uint64_t sp;
+    uint64_t pc;
+    uint64_t pstate;
+} QEMU_PACKED;
+
+QEMU_BUILD_BUG_ON(sizeof(struct aarch64_user_regs) != 272);
+
+/* struct elf_prstatus from include/uapi/linux/elfcore.h */
+struct aarch64_elf_prstatus {
+    char pad1[32]; /* 32 == offsetof(struct elf_prstatus, pr_pid) */
+    uint32_t pr_pid;
+    char pad2[76]; /* 76 == offsetof(struct elf_prstatus, pr_reg) -
+                            offsetof(struct elf_prstatus, pr_ppid) */
+    struct aarch64_user_regs pr_reg;
+    uint32_t pr_fpvalid;
+    char pad3[4];
+} QEMU_PACKED;
+
+QEMU_BUILD_BUG_ON(sizeof(struct aarch64_elf_prstatus) != 392);
+
+struct aarch64_note {
+    Elf64_Nhdr hdr;
+    char name[8]; /* align_up(sizeof("CORE"), 4) */
+    struct aarch64_elf_prstatus prstatus;
+} QEMU_PACKED;
+
+QEMU_BUILD_BUG_ON(sizeof(struct aarch64_note) != 412);
+
+static void aarch64_note_init(struct aarch64_note *note, DumpState *s,
+                              const char *name, Elf64_Word namesz,
+                              Elf64_Word type, Elf64_Word descsz)
+{
+    memset(note, 0, sizeof(*note));
+
+    note->hdr.n_namesz = cpu_to_dump32(s, namesz);
+    note->hdr.n_descsz = cpu_to_dump32(s, descsz);
+    note->hdr.n_type = cpu_to_dump32(s, type);
+
+    memcpy(note->name, name, namesz);
+}
+
+int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
+                             int cpuid, void *opaque)
+{
+    struct aarch64_note note;
+    CPUARMState *env = &ARM_CPU(cs)->env;
+    DumpState *s = opaque;
+    uint64_t pstate, sp;
+    int ret, i;
+
+    aarch64_note_init(&note, s, "CORE", 5, NT_PRSTATUS, sizeof(note.prstatus));
+
+    note.prstatus.pr_pid = cpu_to_dump32(s, cpuid);
+
+    if (!is_a64(env)) {
+        aarch64_sync_32_to_64(env);
+        pstate = cpsr_read(env);
+        sp = aarch64_compat_sp(env);
+    } else {
+        pstate = pstate_read(env);
+        sp = env->xregs[31];
+    }
+
+    for (i = 0; i < 31; ++i) {
+        note.prstatus.pr_reg.regs[i] = cpu_to_dump64(s, env->xregs[i]);
+    }
+    note.prstatus.pr_reg.sp = cpu_to_dump64(s, sp);
+    note.prstatus.pr_reg.pc = cpu_to_dump64(s, env->pc);
+    note.prstatus.pr_reg.pstate = cpu_to_dump64(s, pstate);
+
+    ret = f(&note, sizeof(note), s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+/* struct pt_regs from arch/arm/include/asm/ptrace.h */
+struct arm_user_regs {
+    uint32_t regs[17];
+    char pad[4];
+} QEMU_PACKED;
+
+QEMU_BUILD_BUG_ON(sizeof(struct arm_user_regs) != 72);
+
+/* struct elf_prstatus from include/uapi/linux/elfcore.h */
+struct arm_elf_prstatus {
+    char pad1[24]; /* 24 == offsetof(struct elf_prstatus, pr_pid) */
+    uint32_t pr_pid;
+    char pad2[44]; /* 44 == offsetof(struct elf_prstatus, pr_reg) -
+                            offsetof(struct elf_prstatus, pr_ppid) */
+    struct arm_user_regs pr_reg;
+    uint32_t pr_fpvalid;
+} QEMU_PACKED;
+
+QEMU_BUILD_BUG_ON(sizeof(struct arm_elf_prstatus) != 148);
+
+struct arm_note {
+    Elf32_Nhdr hdr;
+    char name[8]; /* align_up(sizeof("CORE"), 4) */
+    struct arm_elf_prstatus prstatus;
+} QEMU_PACKED;
+
+QEMU_BUILD_BUG_ON(sizeof(struct arm_note) != 168);
+
+static void arm_note_init(struct arm_note *note, DumpState *s,
+                          const char *name, Elf32_Word namesz,
+                          Elf32_Word type, Elf32_Word descsz)
+{
+    memset(note, 0, sizeof(*note));
+
+    note->hdr.n_namesz = cpu_to_dump32(s, namesz);
+    note->hdr.n_descsz = cpu_to_dump32(s, descsz);
+    note->hdr.n_type = cpu_to_dump32(s, type);
+
+    memcpy(note->name, name, namesz);
+}
+
+int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
+                             int cpuid, void *opaque)
+{
+    struct arm_note note;
+    CPUARMState *env = &ARM_CPU(cs)->env;
+    DumpState *s = opaque;
+    int ret, i;
+
+    arm_note_init(&note, s, "CORE", 5, NT_PRSTATUS, sizeof(note.prstatus));
+
+    note.prstatus.pr_pid = cpu_to_dump32(s, cpuid);
+
+    for (i = 0; i < 16; ++i) {
+        note.prstatus.pr_reg.regs[i] = cpu_to_dump32(s, env->regs[i]);
+    }
+    note.prstatus.pr_reg.regs[16] = cpu_to_dump32(s, cpsr_read(env));
+
+    ret = f(&note, sizeof(note), s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+int cpu_get_dump_info(ArchDumpInfo *info,
+                      const GuestPhysBlockList *guest_phys_blocks)
+{
+    ARMCPU *cpu = ARM_CPU(first_cpu);
+    CPUARMState *env = &cpu->env;
+    GuestPhysBlock *block;
+    hwaddr lowest_addr = ULLONG_MAX;
+
+    /* Take a best guess at the phys_base. If we get it wrong then crash
+     * will need '--machdep phys_offset=<phys-offset>' added to its command
+     * line, which isn't any worse than assuming we can use zero, but being
+     * wrong. This is the same algorithm the crash utility uses when
+     * attempting to guess as it loads non-dumpfile formatted files.
+     */
+    QTAILQ_FOREACH(block, &guest_phys_blocks->head, next) {
+        if (block->target_start < lowest_addr) {
+            lowest_addr = block->target_start;
+        }
+    }
+
+    if (arm_feature(env, ARM_FEATURE_AARCH64)) {
+        info->d_machine = EM_AARCH64;
+        info->d_class = ELFCLASS64;
+        info->page_size = (1 << 16); /* aarch64 max pagesize */
+        if (lowest_addr != ULLONG_MAX) {
+            info->phys_base = lowest_addr;
+        }
+    } else {
+        info->d_machine = EM_ARM;
+        info->d_class = ELFCLASS32;
+        info->page_size = (1 << 12);
+        if (lowest_addr < UINT_MAX) {
+            info->phys_base = lowest_addr;
+        }
+    }
+
+    /* We assume the relevant endianness is that of EL1; this is right
+     * for kernels, but might give the wrong answer if you're trying to
+     * dump a hypervisor that happens to be running an opposite-endian
+     * kernel.
+     */
+    info->d_endian = (env->cp15.sctlr_el[1] & SCTLR_EE) != 0
+                     ? ELFDATA2MSB : ELFDATA2LSB;
+
+    return 0;
+}
+
+ssize_t cpu_get_note_size(int class, int machine, int nr_cpus)
+{
+    size_t note_size;
+
+    if (class == ELFCLASS64) {
+        note_size = sizeof(struct aarch64_note);
+    } else {
+        note_size = sizeof(struct arm_note);
+    }
+
+    return note_size * nr_cpus;
+}
diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h
index 25fb1ce0f3f3d..5bd9b7bb9fa7e 100644
--- a/target-arm/cpu-qom.h
+++ b/target-arm/cpu-qom.h
@@ -221,6 +221,11 @@  hwaddr arm_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
 int arm_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
 int arm_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 
+int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
+                             int cpuid, void *opaque);
+int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
+                             int cpuid, void *opaque);
+
 /* Callback functions for the generic timer's timers. */
 void arm_gt_ptimer_cb(void *opaque);
 void arm_gt_vtimer_cb(void *opaque);
diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 30739fc0dfa74..db91a3f9eb467 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -1428,6 +1428,9 @@  static void arm_cpu_class_init(ObjectClass *oc, void *data)
 
     cc->disas_set_info = arm_disas_set_info;
 
+    cc->write_elf64_note = arm_cpu_write_elf64_note;
+    cc->write_elf32_note = arm_cpu_write_elf32_note;
+
     /*
      * Reason: arm_cpu_initfn() calls cpu_exec_init(), which saves
      * the object in cpus -> dangling pointer after final