Message ID | 1599117498-30145-1-git-send-email-sundeep.lkml@gmail.com |
---|---|
Series | Introduce mbox tracepoints for Octeontx2 |
> -----Original Message-----
> From: Jiri Pirko <jiri@resnulli.us>
> Sent: Friday, September 4, 2020 5:41 PM
> To: Sunil Kovvuri Goutham <sgoutham@marvell.com>
> Cc: Jakub Kicinski <kuba@kernel.org>; sundeep.lkml@gmail.com;
> davem@davemloft.net; netdev@vger.kernel.org; Subbaraya Sundeep Bhatta
> <sbhatta@marvell.com>
> Subject: Re: [EXT] Re: [net-next PATCH 0/2] Introduce mbox tracepoints for
> Octeontx2
>
> Fri, Sep 04, 2020 at 10:49:45AM CEST, sgoutham@marvell.com wrote:
> >
> >> -----Original Message-----
> >> From: Jiri Pirko <jiri@resnulli.us>
> >> Sent: Friday, September 4, 2020 2:07 PM
> >> To: Sunil Kovvuri Goutham <sgoutham@marvell.com>
> >> Cc: Jakub Kicinski <kuba@kernel.org>; sundeep.lkml@gmail.com;
> >> davem@davemloft.net; netdev@vger.kernel.org; Subbaraya Sundeep Bhatta
> >> <sbhatta@marvell.com>
> >> Subject: Re: [EXT] Re: [net-next PATCH 0/2] Introduce mbox tracepoints for
> >> Octeontx2
> >>
> >> Fri, Sep 04, 2020 at 07:39:54AM CEST, sgoutham@marvell.com wrote:
> >> >
> >> >> -----Original Message-----
> >> >> From: Jakub Kicinski <kuba@kernel.org>
> >> >> Sent: Friday, September 4, 2020 12:48 AM
> >> >> To: sundeep.lkml@gmail.com
> >> >> Cc: davem@davemloft.net; netdev@vger.kernel.org; Sunil Kovvuri
> >> >> Goutham <sgoutham@marvell.com>; Subbaraya Sundeep Bhatta
> >> >> <sbhatta@marvell.com>
> >> >> Subject: [EXT] Re: [net-next PATCH 0/2] Introduce mbox tracepoints for
> >> >> Octeontx2
> >> >>
> >> >> External Email
> >> >>
> >> >> ----------------------------------------------------------------------
> >> >> On Thu, 3 Sep 2020 12:48:16 +0530 sundeep.lkml@gmail.com wrote:
> >> >> > From: Subbaraya Sundeep <sbhatta@marvell.com>
> >> >> >
> >> >> > This patchset adds tracepoints support for mailbox.
> >> >> > In Octeontx2, PFs and VFs need to communicate with AF for
> >> >> > allocating and freeing resources. Once all the configuration is
> >> >> > done by AF for a PF/VF then packet I/O can happen on PF/VF queues.
> >> >> > When an interface is brought up many mailbox messages are sent
> >> >> > to AF for initializing queues. Say a VF is brought up then each
> >> >> > message is sent to PF and PF forwards to AF and response also
> >> >> > traverses from AF to PF and then VF.
> >> >> > To aid debugging, tracepoints are added at places where messages
> >> >> > are allocated, sent and message interrupts.
> >> >> > Below is the trace of one of the messages from VF to AF and AF
> >> >> > response back to VF:
> >> >>
> >> >> Could you use the devlink tracepoint? trace_devlink_hwmsg() ?
> >> >
> >> >Thanks for the suggestion.
> >> >In our case the mailbox is central to 3 different drivers and there
> >> >would be a 4th one once crypto driver is accepted. We cannot add
> >> >devlink to all of them in order to use the devlink trace points.
> >>
> >> I guess you have 1 pci device, right? Devlink instance is created per
> >> pci device.
> >>
> >
> >No, there are 3 drivers registering to 3 PCI device IDs and there can
> >be many instances of the same devices. So there can be 10's of instances of
> >AF, PF and VFs.
>
> So you can still have per-pci device devlink instance and use the tracepoint
> Jakub suggested.
>

Two things
- As I mentioned above, there is a Crypto driver which uses the same mbox APIs
  which is in the process of upstreaming. There also we would need trace points.
  Not sure registering to devlink just for the sake of tracepoint is proper.

- The devlink trace message is like this

   TRACE_EVENT(devlink_hwmsg,
     . . .
        TP_printk("bus_name=%s dev_name=%s driver_name=%s incoming=%d type=%lu buf=0x[%*phD] len=%zu",
                  __get_str(bus_name), __get_str(dev_name),
                  __get_str(driver_name), __entry->incoming, __entry->type,
                  (int) __entry->len, __get_dynamic_array(buf), __entry->len)
   );

   Whatever debug message we want as output doesn't fit into this.

Thanks,
Sunil.
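To make the shape of that event concrete, here is a minimal userspace sketch (not kernel code) that renders a line in the same layout as the TP_printk format quoted above. The function name `format_hwmsg` and the 64-byte payload cap are assumptions made for illustration, and the kernel-only `%*phD` specifier is emulated by printing the bytes as hyphen-separated hex. What it shows is that the event has a direction flag, a numeric type and a raw buffer, and no dedicated field for, say, a mailbox message id or an interrupt status.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define HWMSG_MAX_BYTES 64 /* assumed cap for this sketch only */

/*
 * Hypothetical userspace helper: formats one line in the layout of the
 * devlink_hwmsg TP_printk string. Returns the snprintf() result, or -1
 * if the payload is longer than the sketch's cap.
 */
int format_hwmsg(char *out, size_t outlen,
                 const char *bus, const char *dev, const char *drv,
                 int incoming, unsigned long type,
                 const unsigned char *buf, size_t len)
{
    char hex[3 * HWMSG_MAX_BYTES + 1] = "";
    size_t i, pos = 0;

    if (len > HWMSG_MAX_BYTES)
        return -1;

    /* Emulate %*phD: bytes as two hex digits joined by '-'. */
    for (i = 0; i < len; i++) {
        if (i)
            hex[pos++] = '-';
        pos += (size_t)snprintf(hex + pos, sizeof(hex) - pos,
                                "%02x", (unsigned int)buf[i]);
    }

    return snprintf(out, outlen,
                    "bus_name=%s dev_name=%s driver_name=%s incoming=%d "
                    "type=%lu buf=0x[%s] len=%zu",
                    bus, dev, drv, incoming, type, hex, len);
}
```

Calling it with a three-byte payload yields a single line whose only message-specific parts are `incoming`, `type` and the hex dump.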
Hi Jakub,

On Sat, Sep 5, 2020 at 2:07 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 4 Sep 2020 12:29:04 +0000 Sunil Kovvuri Goutham wrote:
> > > >No, there are 3 drivers registering to 3 PCI device IDs and there can
> > > >be many instances of the same devices. So there can be 10's of instances of
> > > AF, PF and VFs.
> > >
> > > So you can still have per-pci device devlink instance and use the tracepoint
> > > Jakub suggested.
> > >
> >
> > Two things
> > - As I mentioned above, there is a Crypto driver which uses the same mbox APIs
> >   which is in the process of upstreaming. There also we would need trace points.
> >   Not sure registering to devlink just for the sake of tracepoint is proper.
> >
> > - The devlink trace message is like this
> >
> >    TRACE_EVENT(devlink_hwmsg,
> >      . . .
> >         TP_printk("bus_name=%s dev_name=%s driver_name=%s incoming=%d type=%lu buf=0x[%*phD] len=%zu",
> >                   __get_str(bus_name), __get_str(dev_name),
> >                   __get_str(driver_name), __entry->incoming, __entry->type,
> >                   (int) __entry->len, __get_dynamic_array(buf), __entry->len)
> >    );
> >
> >    Whatever debug message we want as output doesn't fit into this.
>
> Make use of the standard devlink tracepoint wherever applicable, and you
> can keep your extra ones if you want (as long as Jiri don't object).

Sure and noted. I have tried to use devlink tracepoints and since it
could not fit our purpose I used these.

Thanks,
Sundeep
Hi Jiri,

On Mon, Sep 7, 2020 at 4:29 PM sundeep subbaraya <sundeep.lkml@gmail.com> wrote:
>
> Hi Jakub,
>
> On Sat, Sep 5, 2020 at 2:07 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Fri, 4 Sep 2020 12:29:04 +0000 Sunil Kovvuri Goutham wrote:
> > > > >No, there are 3 drivers registering to 3 PCI device IDs and there can
> > > > >be many instances of the same devices. So there can be 10's of instances of
> > > > AF, PF and VFs.
> > > >
> > > > So you can still have per-pci device devlink instance and use the tracepoint
> > > > Jakub suggested.
> > > >
> > >
> > > Two things
> > > - As I mentioned above, there is a Crypto driver which uses the same mbox APIs
> > >   which is in the process of upstreaming. There also we would need trace points.
> > >   Not sure registering to devlink just for the sake of tracepoint is proper.
> > >
> > > - The devlink trace message is like this
> > >
> > >    TRACE_EVENT(devlink_hwmsg,
> > >      . . .
> > >         TP_printk("bus_name=%s dev_name=%s driver_name=%s incoming=%d type=%lu buf=0x[%*phD] len=%zu",
> > >                   __get_str(bus_name), __get_str(dev_name),
> > >                   __get_str(driver_name), __entry->incoming, __entry->type,
> > >                   (int) __entry->len, __get_dynamic_array(buf), __entry->len)
> > >    );
> > >
> > >    Whatever debug message we want as output doesn't fit into this.
> >
> > Make use of the standard devlink tracepoint wherever applicable, and you
> > can keep your extra ones if you want (as long as Jiri don't object).
>
> Sure and noted. I have tried to use devlink tracepoints and since it
> could not fit our purpose I used these.
>
Can you please comment.

> Thanks,
> Sundeep
On Tue, 15 Sep 2020 21:22:21 +0530 sundeep subbaraya wrote:
> > > Make use of the standard devlink tracepoint wherever applicable, and you
> > > can keep your extra ones if you want (as long as Jiri don't object).
> >
> > Sure and noted. I have tried to use devlink tracepoints and since it
> > could not fit our purpose I used these.
>
> Can you please comment.

Comment on what? Restate what I already said? Add the standard
tracepoint, you can add extra ones where needed.
On Tue, Sep 15, 2020 at 9:42 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 15 Sep 2020 21:22:21 +0530 sundeep subbaraya wrote:
> > > > Make use of the standard devlink tracepoint wherever applicable, and you
> > > > can keep your extra ones if you want (as long as Jiri don't object).
> > >
> > > Sure and noted. I have tried to use devlink tracepoints and since it
> > > could not fit our purpose I used these.
> >
> > Can you please comment.
>
> Comment on what? Restate what I already said? Add the standard
> tracepoint, you can add extra ones where needed.

We did look at using the devlink tracepoint for our purpose and found
it not suitable for our current requirement.
As and when we want to add new tracepoints we will keep this in mind
to see if we can use the devlink one.

So was just checking if Jiri is okay with this.
On Tue, 15 Sep 2020 22:06:45 +0530 sundeep subbaraya wrote:
> On Tue, Sep 15, 2020 at 9:42 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Tue, 15 Sep 2020 21:22:21 +0530 sundeep subbaraya wrote:
> > > > > Make use of the standard devlink tracepoint wherever applicable, and you
> > > > > can keep your extra ones if you want (as long as Jiri don't object).
> > > >
> > > > Sure and noted. I have tried to use devlink tracepoints and since it
> > > > could not fit our purpose I used these.
> > >
> > > Can you please comment.
> >
> > Comment on what? Restate what I already said? Add the standard
> > tracepoint, you can add extra ones where needed.
>
> We did look at using the devlink tracepoint for our purpose and found
> it not suitable for our current requirement.
> As and when we want to add new tracepoints we will keep this in mind
> to see if we can use the devlink one.
>
> So was just checking if Jiri is okay with this.

Please make sure you adjust the To: field of the email to the person
you're asking your question.
Mon, Sep 07, 2020 at 12:59:45PM CEST, sundeep.lkml@gmail.com wrote:
>Hi Jakub,
>
>On Sat, Sep 5, 2020 at 2:07 AM Jakub Kicinski <kuba@kernel.org> wrote:
>>
>> On Fri, 4 Sep 2020 12:29:04 +0000 Sunil Kovvuri Goutham wrote:
>> > > >No, there are 3 drivers registering to 3 PCI device IDs and there can
>> > > >be many instances of the same devices. So there can be 10's of instances of
>> > > AF, PF and VFs.
>> > >
>> > > So you can still have per-pci device devlink instance and use the tracepoint
>> > > Jakub suggested.
>> > >
>> >
>> > Two things
>> > - As I mentioned above, there is a Crypto driver which uses the same mbox APIs
>> >   which is in the process of upstreaming. There also we would need trace points.
>> >   Not sure registering to devlink just for the sake of tracepoint is proper.
>> >
>> > - The devlink trace message is like this
>> >
>> >    TRACE_EVENT(devlink_hwmsg,
>> >      . . .
>> >         TP_printk("bus_name=%s dev_name=%s driver_name=%s incoming=%d type=%lu buf=0x[%*phD] len=%zu",
>> >                   __get_str(bus_name), __get_str(dev_name),
>> >                   __get_str(driver_name), __entry->incoming, __entry->type,
>> >                   (int) __entry->len, __get_dynamic_array(buf), __entry->len)
>> >    );
>> >
>> >    Whatever debug message we want as output doesn't fit into this.
>>
>> Make use of the standard devlink tracepoint wherever applicable, and you
>> can keep your extra ones if you want (as long as Jiri don't object).
>
>Sure and noted. I have tried to use devlink tracepoints and since it
>could not fit our purpose I used these.

Why exactly the existing TP didn't fit your need?

>
>Thanks,
>Sundeep
On Wed, Sep 16, 2020 at 4:04 PM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Mon, Sep 07, 2020 at 12:59:45PM CEST, sundeep.lkml@gmail.com wrote:
> >Hi Jakub,
> >
> >On Sat, Sep 5, 2020 at 2:07 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >>
> >> On Fri, 4 Sep 2020 12:29:04 +0000 Sunil Kovvuri Goutham wrote:
> >> > > >No, there are 3 drivers registering to 3 PCI device IDs and there can
> >> > > >be many instances of the same devices. So there can be 10's of instances of
> >> > > AF, PF and VFs.
> >> > >
> >> > > So you can still have per-pci device devlink instance and use the tracepoint
> >> > > Jakub suggested.
> >> > >
> >> >
> >> > Two things
> >> > - As I mentioned above, there is a Crypto driver which uses the same mbox APIs
> >> >   which is in the process of upstreaming. There also we would need trace points.
> >> >   Not sure registering to devlink just for the sake of tracepoint is proper.
> >> >
> >> > - The devlink trace message is like this
> >> >
> >> >    TRACE_EVENT(devlink_hwmsg,
> >> >      . . .
> >> >         TP_printk("bus_name=%s dev_name=%s driver_name=%s incoming=%d type=%lu buf=0x[%*phD] len=%zu",
> >> >                   __get_str(bus_name), __get_str(dev_name),
> >> >                   __get_str(driver_name), __entry->incoming, __entry->type,
> >> >                   (int) __entry->len, __get_dynamic_array(buf), __entry->len)
> >> >    );
> >> >
> >> >    Whatever debug message we want as output doesn't fit into this.
> >>
> >> Make use of the standard devlink tracepoint wherever applicable, and you
> >> can keep your extra ones if you want (as long as Jiri don't object).
> >
> >Sure and noted. I have tried to use devlink tracepoints and since it
> >could not fit our purpose I used these.
>
> Why exactly the existing TP didn't fit your need?
>

Existing TP has provision to dump skb and trace error strings with
error code but we are trying to trace the entire mailbox flow of the
AF/PF and VF drivers. In particular we trace the below:
    message allocation with message id and size at initiator.
    number of messages sent and total size.
    check message requester id, response id and response code after
    reply is received.
    interrupts happened on behalf of mailboxes in the entire process
    with source and receiver of interrupt along with isr status.
    error like initiator timeout waiting for response.
All the above are relevant and are required for Octeontx2 only hence
used own tracepoints.

Thanks,
Sundeep

> >
> >Thanks,
> >Sundeep
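As a rough illustration of the first and third items in that list, the sketch below formats two lines in the layout visible in the cover letter's trace output (`otx2_msg_alloc` and `otx2_msg_process`). It is plain userspace C, not the TRACE_EVENT definitions from the patch; the helper names `format_msg_alloc` and `format_msg_process` and their parameters are hypothetical and chosen only to mirror the printed fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: message allocation at the initiator, carrying
 * the PCI device name, the message id and the message size.
 */
int format_msg_alloc(char *out, size_t outlen, const char *pci_dev,
                     unsigned int msg_id, unsigned int size)
{
    assert(out && pci_dev);
    return snprintf(out, outlen, "otx2_msg_alloc: [%s] msg:(0x%x) size:%u",
                    pci_dev, msg_id, size);
}

/*
 * Hypothetical helper: message processing result, carrying the PCI
 * device name, the message id and the error/response code.
 */
int format_msg_process(char *out, size_t outlen, const char *pci_dev,
                       unsigned int msg_id, int err)
{
    assert(out && pci_dev);
    return snprintf(out, outlen, "otx2_msg_process: [%s] msg:(0x%x) error:%d",
                    pci_dev, msg_id, err);
}
```

Each event here carries a handful of small, typed fields rather than a payload dump, which is the contrast with a buffer-oriented event that the message above is drawing.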
Wed, Sep 16, 2020 at 07:19:36PM CEST, sundeep.lkml@gmail.com wrote:
>On Wed, Sep 16, 2020 at 4:04 PM Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Mon, Sep 07, 2020 at 12:59:45PM CEST, sundeep.lkml@gmail.com wrote:
>> >Hi Jakub,
>> >
>> >On Sat, Sep 5, 2020 at 2:07 AM Jakub Kicinski <kuba@kernel.org> wrote:
>> >>
>> >> On Fri, 4 Sep 2020 12:29:04 +0000 Sunil Kovvuri Goutham wrote:
>> >> > > >No, there are 3 drivers registering to 3 PCI device IDs and there can
>> >> > > >be many instances of the same devices. So there can be 10's of instances of
>> >> > > AF, PF and VFs.
>> >> > >
>> >> > > So you can still have per-pci device devlink instance and use the tracepoint
>> >> > > Jakub suggested.
>> >> > >
>> >> >
>> >> > Two things
>> >> > - As I mentioned above, there is a Crypto driver which uses the same mbox APIs
>> >> >   which is in the process of upstreaming. There also we would need trace points.
>> >> >   Not sure registering to devlink just for the sake of tracepoint is proper.
>> >> >
>> >> > - The devlink trace message is like this
>> >> >
>> >> >    TRACE_EVENT(devlink_hwmsg,
>> >> >      . . .
>> >> >         TP_printk("bus_name=%s dev_name=%s driver_name=%s incoming=%d type=%lu buf=0x[%*phD] len=%zu",
>> >> >                   __get_str(bus_name), __get_str(dev_name),
>> >> >                   __get_str(driver_name), __entry->incoming, __entry->type,
>> >> >                   (int) __entry->len, __get_dynamic_array(buf), __entry->len)
>> >> >    );
>> >> >
>> >> >    Whatever debug message we want as output doesn't fit into this.
>> >>
>> >> Make use of the standard devlink tracepoint wherever applicable, and you
>> >> can keep your extra ones if you want (as long as Jiri don't object).
>> >
>> >Sure and noted. I have tried to use devlink tracepoints and since it
>> >could not fit our purpose I used these.
>>
>> Why exactly the existing TP didn't fit your need?
>>
>Existing TP has provision to dump skb and trace error strings with
>error code but we are trying to trace the entire mailbox flow of the
>AF/PF and VF drivers. In particular we trace the below:
>    message allocation with message id and size at initiator.
>    number of messages sent and total size.
>    check message requester id, response id and response code after
>    reply is received.
>    interrupts happened on behalf of mailboxes in the entire process
>    with source and receiver of interrupt along with isr status.
>    error like initiator timeout waiting for response.
>All the above are relevant and are required for Octeontx2 only hence
>used own tracepoints.

You can still use devlink_hwmsg for the actual data exchanged between
the driver and hw. For the rest, you can have driver-specific TPs.

>
>Thanks,
>Sundeep
>
>> >
>> >Thanks,
>> >Sundeep
On Thu, Sep 17, 2020 at 11:34 AM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Wed, Sep 16, 2020 at 07:19:36PM CEST, sundeep.lkml@gmail.com wrote:
> >On Wed, Sep 16, 2020 at 4:04 PM Jiri Pirko <jiri@resnulli.us> wrote:
> >>
> >> Mon, Sep 07, 2020 at 12:59:45PM CEST, sundeep.lkml@gmail.com wrote:
> >> >Hi Jakub,
> >> >
> >> >On Sat, Sep 5, 2020 at 2:07 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >> >>
> >> >> On Fri, 4 Sep 2020 12:29:04 +0000 Sunil Kovvuri Goutham wrote:
> >> >> > > >No, there are 3 drivers registering to 3 PCI device IDs and there can
> >> >> > > >be many instances of the same devices. So there can be 10's of instances of
> >> >> > > AF, PF and VFs.
> >> >> > >
> >> >> > > So you can still have per-pci device devlink instance and use the tracepoint
> >> >> > > Jakub suggested.
> >> >> > >
> >> >> >
> >> >> > Two things
> >> >> > - As I mentioned above, there is a Crypto driver which uses the same mbox APIs
> >> >> >   which is in the process of upstreaming. There also we would need trace points.
> >> >> >   Not sure registering to devlink just for the sake of tracepoint is proper.
> >> >> >
> >> >> > - The devlink trace message is like this
> >> >> >
> >> >> >    TRACE_EVENT(devlink_hwmsg,
> >> >> >      . . .
> >> >> >         TP_printk("bus_name=%s dev_name=%s driver_name=%s incoming=%d type=%lu buf=0x[%*phD] len=%zu",
> >> >> >                   __get_str(bus_name), __get_str(dev_name),
> >> >> >                   __get_str(driver_name), __entry->incoming, __entry->type,
> >> >> >                   (int) __entry->len, __get_dynamic_array(buf), __entry->len)
> >> >> >    );
> >> >> >
> >> >> >    Whatever debug message we want as output doesn't fit into this.
> >> >>
> >> >> Make use of the standard devlink tracepoint wherever applicable, and you
> >> >> can keep your extra ones if you want (as long as Jiri don't object).
> >> >
> >> >Sure and noted. I have tried to use devlink tracepoints and since it
> >> >could not fit our purpose I used these.
> >>
> >> Why exactly the existing TP didn't fit your need?
> >>
> >Existing TP has provision to dump skb and trace error strings with
> >error code but we are trying to trace the entire mailbox flow of the
> >AF/PF and VF drivers. In particular we trace the below:
> >    message allocation with message id and size at initiator.
> >    number of messages sent and total size.
> >    check message requester id, response id and response code after
> >    reply is received.
> >    interrupts happened on behalf of mailboxes in the entire process
> >    with source and receiver of interrupt along with isr status.
> >    error like initiator timeout waiting for response.
> >All the above are relevant and are required for Octeontx2 only hence
> >used own tracepoints.
>
> You can still use devlink_hwmsg for the actual data exchanged between
> the driver and hw. For the rest, you can have driver-specific TPs.
>

I totally got your point and adding devlink to our drivers is work in
progress since we got a similar comment from Jakub for a patch previously:
https://www.mail-archive.com/netdev@vger.kernel.org/msg341414.html
All the errors in the drivers will be turned to devlink TP in future.
This patchset is a bit different since it traces mailbox messages
state machine at low level and does not even trace message data
exchanged between driver and hw.

Thanks,
Sundeep

> >
> >Thanks,
> >Sundeep
> >
> >> >
> >> >Thanks,
> >> >Sundeep
From: Subbaraya Sundeep <sbhatta@marvell.com>

This patchset adds tracepoints support for mailbox.
In Octeontx2, PFs and VFs need to communicate with AF for
allocating and freeing resources. Once all the configuration is
done by AF for a PF/VF then packet I/O can happen on PF/VF queues.
When an interface is brought up many mailbox messages are sent
to AF for initializing queues. Say a VF is brought up then each
message is sent to PF and PF forwards to AF and response also
traverses from AF to PF and then VF.
To aid debugging, tracepoints are added at places where messages
are allocated, sent and message interrupts.
Below is the trace of one of the messages from VF to AF and AF
response back to VF:

~ # echo 1 > /sys/kernel/tracing/events/rvu/enable
~ # ifconfig eth20 up
[ 279.379559] eth20 NIC Link is UP 10000 Mbps Full duplex
~ # cat /sys/kernel/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 880/880   #P:4
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
        ifconfig-171   [000] ....   275.753345: otx2_msg_alloc: [0002:02:00.1] msg:(0x400) size:40
        ifconfig-171   [000] ...1   275.753347: otx2_msg_send: [0002:02:00.1] sent 1 msg(s) of size:48
          <idle>-0     [001] dNh1   275.753356: otx2_msg_interrupt: [0002:02:00.0] mbox interrupt VF(s) to PF (0x1)
     kworker/u9:1-90   [001] ...1   275.753364: otx2_msg_send: [0002:02:00.0] sent 1 msg(s) of size:48
     kworker/u9:1-90   [001] d.h.   275.753367: otx2_msg_interrupt: [0002:01:00.0] mbox interrupt PF(s) to AF (0x2)
     kworker/u9:2-167  [002] ....   275.753535: otx2_msg_process: [0002:01:00.0] msg:(0x400) error:0
     kworker/u9:2-167  [002] ...1   275.753537: otx2_msg_send: [0002:01:00.0] sent 1 msg(s) of size:32
          <idle>-0     [003] d.h1   275.753543: otx2_msg_interrupt: [0002:02:00.0] mbox interrupt AF to PF (0x1)
          <idle>-0     [001] d.h2   275.754376: otx2_msg_interrupt: [0002:02:00.1] mbox interrupt PF to VF (0x1)

Subbaraya Sundeep (2):
  octeontx2-af: Introduce tracepoints for mailbox
  octeontx2-pf: Add tracepoints for PF/VF mailbox

 drivers/net/ethernet/marvell/octeontx2/af/Makefile |   3 +-
 drivers/net/ethernet/marvell/octeontx2/af/mbox.c   |  14 ++-
 drivers/net/ethernet/marvell/octeontx2/af/rvu.c    |   7 ++
 .../net/ethernet/marvell/octeontx2/af/rvu_cgx.c    |   2 +
 .../net/ethernet/marvell/octeontx2/af/rvu_trace.c  |  15 +++
 .../net/ethernet/marvell/octeontx2/af/rvu_trace.h  | 115 +++++++++++++++++++++
 .../ethernet/marvell/octeontx2/nic/otx2_common.h   |   2 +
 .../net/ethernet/marvell/octeontx2/nic/otx2_pf.c   |   6 ++
 .../net/ethernet/marvell/octeontx2/nic/otx2_vf.c   |   2 +
 9 files changed, 162 insertions(+), 4 deletions(-)
 create mode 100644 drivers/net/ethernet/marvell/octeontx2/af/rvu_trace.c
 create mode 100644 drivers/net/ethernet/marvell/octeontx2/af/rvu_trace.h
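When following one message across the VF, PF and AF hops in a trace like the one above, it can help to pull the device and size out of each `otx2_msg_send` line. The following is a small userspace sketch of that, assuming the line layout is exactly as printed in this cover letter; the struct and function names (`mbox_send_event`, `parse_msg_send`) are made up for the example and are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields recovered from one otx2_msg_send trace line (names are illustrative). */
struct mbox_send_event {
    char pci_dev[16];       /* e.g. "0002:02:00.1" */
    unsigned int num_msgs;  /* "sent N msg(s)" */
    unsigned int size;      /* "of size:N" */
};

/*
 * Parses a trace line containing an otx2_msg_send event.
 * Returns 0 on success, -1 if the line is some other event or malformed.
 */
int parse_msg_send(const char *line, struct mbox_send_event *ev)
{
    const char *p;

    assert(line && ev);

    /* Skip the task/CPU/flags/timestamp columns by locating the event name. */
    p = strstr(line, "otx2_msg_send: ");
    if (!p)
        return -1;

    /* %15[^]] reads up to 15 characters that are not ']' (the PCI name). */
    if (sscanf(p, "otx2_msg_send: [%15[^]]] sent %u msg(s) of size:%u",
               ev->pci_dev, &ev->num_msgs, &ev->size) != 3)
        return -1;

    return 0;
}
```

Fed the second line of the trace, this would report device 0002:02:00.1, one message and a size of 48; lines for other events such as `otx2_msg_alloc` are rejected.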