mbox series

[0/6] qla2xxx driver features

Message ID 20220715060227.23923-1-njavali@marvell.com
Headers show
Series qla2xxx driver features | expand

Message

Nilesh Javali July 15, 2022, 6:02 a.m. UTC
Martin,

Please apply the qla2xxx driver features to the scsi tree
at your earliest convenience.

Thanks,
Nilesh

Anil Gurumurthy (1):
  qla2xxx: Add NVMe parameters support in Auxiliary Image Status

Arun Easi (4):
  qla2xxx: Add debugfs create/delete helpers
  qla2xxx: Add a generic tracing framework
  qla2xxx: Add driver console messages tracing
  qla2xxx: Add srb tracing

Nilesh Javali (1):
  qla2xxx: Update version to 10.02.07.900-k

 drivers/scsi/qla2xxx/qla_bsg.c     |   8 +-
 drivers/scsi/qla2xxx/qla_bsg.h     |   3 +-
 drivers/scsi/qla2xxx/qla_dbg.c     |  26 ++-
 drivers/scsi/qla2xxx/qla_dbg.h     | 286 +++++++++++++++++++++++++++++
 drivers/scsi/qla2xxx/qla_def.h     |  55 +++++-
 drivers/scsi/qla2xxx/qla_dfs.c     | 239 ++++++++++++++++++++++++
 drivers/scsi/qla2xxx/qla_fw.h      |   3 +
 drivers/scsi/qla2xxx/qla_gbl.h     |   1 +
 drivers/scsi/qla2xxx/qla_init.c    |   8 +-
 drivers/scsi/qla2xxx/qla_inline.h  |  12 ++
 drivers/scsi/qla2xxx/qla_iocb.c    |   4 +
 drivers/scsi/qla2xxx/qla_isr.c     |   5 +
 drivers/scsi/qla2xxx/qla_os.c      |  35 ++++
 drivers/scsi/qla2xxx/qla_version.h |   4 +-
 14 files changed, 675 insertions(+), 14 deletions(-)


base-commit: f095c3cd1b694a73a5de276dae919f05a8dd1811

Comments

Steven Rostedt July 18, 2022, 7:22 p.m. UTC | #1
On Mon, 18 Jul 2022 12:02:26 -0700
Arun Easi <aeasi@marvell.com> wrote:

> Many times when a problem was reported on the driver, we had to request 
> for a repro with extended error logging (via driver module parameter) 
> turned on. With this internal tracing in place, log messages that appear 
> only with extended error logging are captured by default in the internal 
> trace buffer.
> 
> AFAIK, kernel tracing requires a user initiated action to be turned on, 
> like enabling individual tracepoints. Though a script (either startup or 
> udev) can do this job, enabling tracepoints by default for a single 
> driver, I think, may not be a preferred choice -- particularly when the 
> trace buffer is shared across the kernel. That also brings the extra 
> overhead of finding how this could be done across various distros.
> 
> For cases where the memory/driver size matters, there is an option to 
> compile out this feature, plus choosing a lower default trace buffer size.

You can enable an ftrace instance from inside the kernel, and make it a
compile time option.

	#include <linux/trace_events.h>
	#include <linux/trace.h>

	struct trace_array *tr;

	tr = trace_array_get_by_name("qla2xxx");
	trace_array_set_clr_event(tr, "qla", NULL, true);

And now you have trace events running:

 # cat /sys/kernel/tracing/instances/qla/trace

-- Steve
Steven Rostedt July 19, 2022, 1:05 p.m. UTC | #2
On Tue, 19 Jul 2022 10:25:14 +0200
Daniel Wagner <dwagner@suse.de> wrote:

> > You can enable an ftrace instance from inside the kernel, and make it a
> > compile time option.
> >
> > 	#include <linux/trace_events.h>
> > 	#include <linux/trace.h>
> > 
> > 	struct trace_array *tr;
> > 
> > 	tr = trace_array_get_by_name("qla2xxx");
> > 	trace_array_set_clr_event(tr, "qla", NULL, true);
> > 
> > And now you have trace events running:
> > 
> >  # cat /sys/kernel/tracing/instances/qla/trace  

The above should be:

     # cat /sys/kernel/tracing/instances/qla2xxx/trace

as the instance name is the string sent to trace_array_get_by_name().

-- Steve
Steven Rostedt July 19, 2022, 9:40 p.m. UTC | #3
On Tue, 19 Jul 2022 14:06:11 -0700
Arun Easi <aeasi@marvell.com> wrote:

> It appears by calling the above two interfaces, I get a separate instance 
> of "qla" only traces, but with only the "qla"-only instance being enabled, 
> leaving the main (/sys/kernel/tracing/events/qla/enable) one disabled. No 
> issues there, but when I enable both of them, I get garbage values on one.
> 
> /sys/kernel/tracing/instances/qla2xxx/trace:
>              cat-56106   [012] ..... 2419873.470098: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14:  Entered (null).
>              cat-56106   [012] ..... 2419873.470101: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14:  Entered ×+<96>²Ü<98>^H.
>              cat-56106   [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14:  Prepare to issue mbox cmd=0xde589000.
> 
> /sys/kernel/tracing/trace:
>              cat-56106   [012] ..... 2419873.470097: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14:  Entered qla2x00_get_firmware_state.
>              cat-56106   [012] ..... 2419873.470100: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14:  Entered qla2x00_mailbox_command.
>              cat-56106   [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14:  Prepare to issue mbox cmd=0x69.
> 
> It appears that only one should be enabled at a time. Per my read of 
> Documentation/trace/ftrace.rst, the main directory and instances have 
> separate trace buffers, so I am a bit confused with the above output.

That's because it uses va_list, and va_list can only be used once.

I just added helpers for va_list use cases:

   https://lore.kernel.org/all/20220705224453.120955146@goodmis.org/

But it will likely suffer the same issue, but I can easily fix that with
this on top:

Not even compiled tested:

-- Steve

diff --git a/include/trace/stages/stage6_event_callback.h b/include/trace/stages/stage6_event_callback.h
index 0f51f6b3ab70..3c554a585320 100644
--- a/include/trace/stages/stage6_event_callback.h
+++ b/include/trace/stages/stage6_event_callback.h
@@ -40,7 +40,12 @@
 
 #undef __assign_vstr
 #define __assign_vstr(dst, fmt, va)					\
-	vsnprintf(__get_str(dst), TRACE_EVENT_STR_MAX, fmt, *(va))
+	do {								\
+		va_list __cp_va;					\
+		va_copy(__cp_va, *(va));				\
+		vsnprintf(__get_str(dst), TRACE_EVENT_STR_MAX, fmt, __cp_va); \
+		va_end(__cp_va);					\
+	} while (0)
 
 #undef __bitmask
 #define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
Arun Easi July 19, 2022, 10:09 p.m. UTC | #4
On Tue, 19 Jul 2022, 2:40pm, Steven Rostedt wrote:

> On Tue, 19 Jul 2022 14:06:11 -0700
> Arun Easi <aeasi@marvell.com> wrote:
> 
> > It appears by calling the above two interfaces, I get a separate instance 
> > of "qla" only traces, but with only the "qla"-only instance being enabled, 
> > leaving the main (/sys/kernel/tracing/events/qla/enable) one disabled. No 
> > issues there, but when I enable both of them, I get garbage values on one.
> > 
> > /sys/kernel/tracing/instances/qla2xxx/trace:
> >              cat-56106   [012] ..... 2419873.470098: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14:  Entered (null).
> >              cat-56106   [012] ..... 2419873.470101: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14:  Entered ×+<96>²Ü<98>^H.
> >              cat-56106   [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14:  Prepare to issue mbox cmd=0xde589000.
> > 
> > /sys/kernel/tracing/trace:
> >              cat-56106   [012] ..... 2419873.470097: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14:  Entered qla2x00_get_firmware_state.
> >              cat-56106   [012] ..... 2419873.470100: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14:  Entered qla2x00_mailbox_command.
> >              cat-56106   [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14:  Prepare to issue mbox cmd=0x69.
> > 
> > It appears that only one should be enabled at a time. Per my read of 
> > Documentation/trace/ftrace.rst, the main directory and instances have 
> > separate trace buffers, so I am a bit confused with the above output.
> 
> That's because it uses va_list, and va_list can only be used once.

Ah, that makes sense.

> 
> I just added helpers for va_list use cases:
> 
>    https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_all_20220705224453.120955146-40goodmis.org_&d=DwIFaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=P-q_Qkt75qFy33SvdD2nAxAyN87eO1d-mFO-lqNOomw&m=cDBXba_lNDGk7c1Qm4elDe1Co-RV8zR-c1A9xV7vc4nMcQO7iRle72qNxljHsDNA&s=74PzgMj6nmfRCLXd3oMIqt-_DZbQeukUl0nK7MzfUjs&e= 
> 
> But it will likely suffer the same issue, but I can easily fix that with
> this on top:
> 
> Not even compiled tested:
> 
> -- Steve
> 
> diff --git a/include/trace/stages/stage6_event_callback.h b/include/trace/stages/stage6_event_callback.h
> index 0f51f6b3ab70..3c554a585320 100644
> --- a/include/trace/stages/stage6_event_callback.h
> +++ b/include/trace/stages/stage6_event_callback.h
> @@ -40,7 +40,12 @@
>  
>  #undef __assign_vstr
>  #define __assign_vstr(dst, fmt, va)					\
> -	vsnprintf(__get_str(dst), TRACE_EVENT_STR_MAX, fmt, *(va))
> +	do {								\
> +		va_list __cp_va;					\
> +		va_copy(__cp_va, *(va));				\
> +		vsnprintf(__get_str(dst), TRACE_EVENT_STR_MAX, fmt, __cp_va); \
> +		va_end(__cp_va);					\
> +	} while (0)
>  
>  #undef __bitmask
>  #define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
> 

This will be nice.

Thanks for your help, Steve.

Regards,
-Arun
Arun Easi July 19, 2022, 10:09 p.m. UTC | #5
Hi Daniel,

On Tue, 19 Jul 2022, 1:25am, Daniel Wagner wrote:

> External Email
> 
> ----------------------------------------------------------------------
> Hi Arun,
> 
> On Mon, Jul 18, 2022 at 03:22:43PM -0400, Steven Rostedt wrote:
> > On Mon, 18 Jul 2022 12:02:26 -0700
> > Arun Easi <aeasi@marvell.com> wrote:
> > 
> > > Many times when a problem was reported on the driver, we had to request 
> > > for a repro with extended error logging (via driver module parameter) 
> > > turned on. With this internal tracing in place, log messages that appear 
> > > only with extended error logging are captured by default in the internal 
> > > trace buffer.
> > > 
> > > AFAIK, kernel tracing requires a user initiated action to be turned on, 
> > > like enabling individual tracepoints. Though a script (either startup or 
> > > udev) can do this job, enabling tracepoints by default for a single 
> > > driver, I think, may not be a preferred choice -- particularly when the 
> > > trace buffer is shared across the kernel. That also brings the extra 
> > > overhead of finding how this could be done across various distros.
> > > 
> > > For cases where the memory/driver size matters, there is an option to 
> > > compile out this feature, plus choosing a lower default trace buffer size.
> 
> I am really asking the question why do we need to add special debugging
> code to every single driver? Can't we try to make more use of generic
> code and extend it if necessary?
> 
> Both FC drivers qla2xxx and lpfc are doing their own thing for
> debugging/logging and I really fail to see why we can't not move towards
> a more generic way. Dumping logs to the kernel log was the simplest way
> but as this series shows, this is not something you can do in production
> systems.

No contention here on a generic way. The per instance trace creation and 
enabling from within the kernel looks like such a one. Let me revisit the 
trace patches with this new info.

Regards,
-Arun

> 
> > You can enable an ftrace instance from inside the kernel, and make it a
> > compile time option.
> >
> > 	#include <linux/trace_events.h>
> > 	#include <linux/trace.h>
> > 
> > 	struct trace_array *tr;
> > 
> > 	tr = trace_array_get_by_name("qla2xxx");
> > 	trace_array_set_clr_event(tr, "qla", NULL, true);
> > 
> > And now you have trace events running:
> > 
> >  # cat /sys/kernel/tracing/instances/qla/trace
> 
> Thanks Steve for the tip!
> 
> Daniel
>
Steven Rostedt July 19, 2022, 10:17 p.m. UTC | #6
On Tue, 19 Jul 2022 15:09:30 -0700
Arun Easi <aeasi@marvell.com> wrote:

> This will be nice.
> 
> Thanks for your help, Steve.

No problem.

What a coincidence that I just finished this work earlier this month.

-- Steve
Steffen Maier July 23, 2022, 7:47 a.m. UTC | #7
On 7/20/22 00:09, Arun Easi wrote:
> On Tue, 19 Jul 2022, 1:25am, Daniel Wagner wrote:
>> On Mon, Jul 18, 2022 at 03:22:43PM -0400, Steven Rostedt wrote:
>>> On Mon, 18 Jul 2022 12:02:26 -0700
>>> Arun Easi <aeasi@marvell.com> wrote:
>>>
>>>> Many times when a problem was reported on the driver, we had to request
>>>> for a repro with extended error logging (via driver module parameter)
>>>> turned on. With this internal tracing in place, log messages that appear
>>>> only with extended error logging are captured by default in the internal
>>>> trace buffer.
>>>>
>>>> AFAIK, kernel tracing requires a user initiated action to be turned on,
>>>> like enabling individual tracepoints. Though a script (either startup or
>>>> udev) can do this job, enabling tracepoints by default for a single
>>>> driver, I think, may not be a preferred choice -- particularly when the
>>>> trace buffer is shared across the kernel. That also brings the extra
>>>> overhead of finding how this could be done across various distros.
>>>>
>>>> For cases where the memory/driver size matters, there is an option to
>>>> compile out this feature, plus choosing a lower default trace buffer size.
>>
>> I am really asking the question why do we need to add special debugging
>> code to every single driver? Can't we try to make more use of generic
>> code and extend it if necessary?
>>
>> Both FC drivers qla2xxx and lpfc are doing their own thing for

All three FC drivers:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/s390/scsi/zfcp_dbf.c

We have this flight recorder (multiple topical ring buffers per vHBA to avoid 
some higher frequency topics lose history of others) since a long time [maybe 
pre-dating ftrace], including crash tool support to extract it from post mortem 
kernel dumps. We use binary trace records, like tracepoints, with offline 
decoding/formatting (zfcpdbf in s390-tools package).
Other s390 kernel components share the underlying s390dbf infrastructure.

The crash extension "ftrace" is probably able to do an export from dump for the 
approach Steven suggested. I had used it with kernel function tracer. Very useful.

>> debugging/logging and I really fail to see why we can't not move towards
>> a more generic way. Dumping logs to the kernel log was the simplest way
>> but as this series shows, this is not something you can do in production
>> systems.
> 
> No contention here on a generic way. The per instance trace creation and
> enabling from within the kernel looks like such a one. Let me revisit the
> trace patches with this new info.
> 
> Regards,
> -Arun
> 
>>
>>> You can enable an ftrace instance from inside the kernel, and make it a
>>> compile time option.
>>>
>>> 	#include <linux/trace_events.h>
>>> 	#include <linux/trace.h>
>>>
>>> 	struct trace_array *tr;
>>>
>>> 	tr = trace_array_get_by_name("qla2xxx");
>>> 	trace_array_set_clr_event(tr, "qla", NULL, true);
>>>
>>> And now you have trace events running:
>>>
>>>   # cat /sys/kernel/tracing/instances/qla/trace
>>
>> Thanks Steve for the tip!
>>
>> Daniel
>>
Steven Rostedt July 25, 2022, 1:56 p.m. UTC | #8
On Sat, 23 Jul 2022 09:47:21 +0200
Steffen Maier <maier@linux.ibm.com> wrote:

> The crash extension "ftrace" is probably able to do an export from dump for the 
> approach Steven suggested. I had used it with kernel function tracer. Very useful.

Note, the crash extension is just called "trace" and you can get it here:

  https://github.com/fujitsu/crash-trace

-- Steve