diff mbox series

[RFC,net-next,7/9] net: dsa: microchip: ksz9477: add hardware time stamping support

Message ID 20201019172435.4416-8-ceggers@arri.de
State New
Headers show
Series net: dsa: microchip: PTP support for KSZ956x | expand

Commit Message

Christian Eggers Oct. 19, 2020, 5:24 p.m. UTC
Add routines required for TX hardware time stamping.

The KSZ9563 only supports one step time stamping
(HWTSTAMP_TX_ONESTEP_P2P), which requires linuxptp-2.0 or later. PTP
mode is permanently enabled (changes tail tag; depends on
CONFIG_NET_DSA_MICROCHIP_KSZ9477_PTP).TX time stamps are reported via an
interrupt / device registers whilst RX time stamps are reported via an
additional tail tag.

One step TX time stamping of PDelay_Resp requires the RX time stamp from
the associated PDelay_Req message. linuxptp assumes that the RX time
stamp has already been subtracted from the PDelay_Req correction field
(as done by the TI PHYTER). linuxptp will echo back the value of the
correction field in the PDelay_Resp message.

In order to be compatible to this already established interface, the
KSZ9563 code emulates this behavior. When processing the PDelay_Resp
message, the time stamp is moved back from the correction field to the
tail tag, as the hardware doesn't support negative values on this field.
Of course, the UDP checksums (if any) have to be corrected after this
(for both directions).

The PTP hardware performs internal detection of PTP frames (likely
similar as ptp_classify_raw() and ptp_parse_header()). As these filters
cannot be disabled, the current delay mode (E2E/P2P) and the clock mode
(master/slave) must be configured via sysfs attributes. Time stamping
will only be performed on PTP packets matching the current mode
settings.

Everything has been tested on a Microchip KSZ9563 switch.

Signed-off-by: Christian Eggers <ceggers@arri.de>
---
 drivers/net/dsa/microchip/ksz9477_main.c |   9 +-
 drivers/net/dsa/microchip/ksz9477_ptp.c  | 629 +++++++++++++++++++++++
 drivers/net/dsa/microchip/ksz9477_ptp.h  |  27 +
 include/linux/dsa/ksz_common.h           |  30 ++
 net/dsa/tag_ksz.c                        | 206 +++++++-
 5 files changed, 893 insertions(+), 8 deletions(-)

Comments

Vladimir Oltean Oct. 20, 2020, 12:10 a.m. UTC | #1
On Mon, Oct 19, 2020 at 07:24:33PM +0200, Christian Eggers wrote:
> Add routines required for TX hardware time stamping.
> 
> The KSZ9563 only supports one step time stamping
> (HWTSTAMP_TX_ONESTEP_P2P), which requires linuxptp-2.0 or later. PTP
> mode is permanently enabled (changes tail tag; depends on
> CONFIG_NET_DSA_MICROCHIP_KSZ9477_PTP).TX time stamps are reported via an
> interrupt / device registers whilst RX time stamps are reported via an
> additional tail tag.
> 
> One step TX time stamping of PDelay_Resp requires the RX time stamp from
> the associated PDelay_Req message. linuxptp assumes that the RX time
> stamp has already been subtracted from the PDelay_Req correction field
> (as done by the TI PHYTER). linuxptp will echo back the value of the
> correction field in the PDelay_Resp message.
> 
> In order to be compatible to this already established interface, the
> KSZ9563 code emulates this behavior. When processing the PDelay_Resp
> message, the time stamp is moved back from the correction field to the
> tail tag, as the hardware doesn't support negative values on this field.
> Of course, the UDP checksums (if any) have to be corrected after this
> (for both directions).
> 
> The PTP hardware performs internal detection of PTP frames (likely
> similar as ptp_classify_raw() and ptp_parse_header()). As these filters
> cannot be disabled, the current delay mode (E2E/P2P) and the clock mode
> (master/slave) must be configured via sysfs attributes. Time stamping
> will only be performed on PTP packets matching the current mode
> settings.
> 
> Everything has been tested on a Microchip KSZ9563 switch.

I looked a little bit at the KSZ9563 datasheet and I'm more confused
than I was before opening it.

-----------------------------[cut here]-----------------------------
The device supports V2 (2008) of the IEEE 1588 PTP specification and can
be programmed as either an end-to-end (E2E) or peer-to-peer (P2P)
transparent clock (TC) between ports. In addition, the host port can be
programmed as either a slave or master ordinary clock (OC) port.
Ingress timestamp capture, egress timestamp recording, correction field
update with residence time and link delay, delay turn-around time
insertion, egress timestamp insertion, and checksum update are
supported.
-----------------------------[cut here]-----------------------------

So it's a 1-step transparent clock, fair enough. That works autonomously
without any sort of involvement from the operating system, you know
that, right? This is stateless functionality.

BUT, if that is the case, what do you need PTP support in the kernel
for? What profiles are you using with linuxptp? What benefit does it
bring you if you report timestamps to the operating system, for
terminated 1588 traffic? Why would you even terminate 1588 traffic on
the host CPU? I fail to understand many of the use cases that this
switch is tailored for.

Also, I know that Microchip support does a pretty bad job at giving
useful answers, and the datasheet isn't quite clear either (looks like
there's info that has been copied from other switches, like for 2-step
timestamping, then removed, and too much was removed because now nothing
is clear) so you'll have to give your best shot at explaining some
things.


Global PTP Message Config 1 Register
------------------------------------

Bit 2: Selection of P2P or E2E
1 = Peer-to-peer (P2P) transparent clock mode
0 = End-to-end (E2E) transparent clock mode

What does this bit do exactly?
Does it change the switch's behavior as an autonomous 1-step transparent
clock? Or does it have anything to do with how/which timestamps are
delivered to the CPU? The point is, why do you care to configure this?
Sysfs is not going to fly without a solid explanation, which you did not
provide here.

My understanding of E2E vs P2P TC is that an E2E TC will correct the
timestamps of Pdelay messages, while a P2P TC won't. The P2P TC must
speak proper PDelay and not forward those packets sheepishly. Which
starts to answer my question, I believe... So my comment above, that the
1-step TC functionality doesn't require any involvement from the CPU, is
only correct for E2E TC, am I right? For P2P TC, you would need the host
CPU to speak peer delay. But you wouldn't need it for anything else (the
SYNC messages would have no reason to go to the CPU, would they?). So,
again, what profile are you using with linuxptp for this one?

If my understanding is right, maybe you want to just leave the switch
operate in E2E TC mode by default, and put it into P2P TC as soon as
your .port_hwtstamp_set() method is called?


Ok, on to my next question....

Bit 1: Selection of Master or Slave
1 = Host port is PTP master ordinary clock
0 = Host port is PTP slave ordinary clock

What does this _actually_ do? Here I really have no idea. I can only
imagine that this has again to do with the 1-step TC operation, and that
it's treating the host port as a switched endpoint, and this has to do
with the port states of the P2P TC. I'm so confused by this one that I
don't even know what to ask...
Ok, let's put it differently. You bothered to add a sysfs for it, so you
must be using it for something. What are you using it for?
Christian Eggers Oct. 20, 2020, 8:39 a.m. UTC | #2
On Tuesday, 20 October 2020, 02:10:40 CEST, Vladimir Oltean wrote:
> I looked a little bit at the KSZ9563 datasheet and I'm more confused
> than I was before opening it.
> 
> -----------------------------[cut here]-----------------------------
> The device supports V2 (2008) of the IEEE 1588 PTP specification and can
> be programmed as either an end-to-end (E2E) or peer-to-peer (P2P)
> transparent clock (TC) between ports. In addition, the host port can be
> programmed as either a slave or master ordinary clock (OC) port.
> Ingress timestamp capture, egress timestamp recording, correction field
> update with residence time and link delay, delay turn-around time
> insertion, egress timestamp insertion, and checksum update are
> supported.
> -----------------------------[cut here]-----------------------------
> 
> So it's a 1-step transparent clock, fair enough. That works autonomously
> without any sort of involvement from the operating system, you know
> that, right? This is stateless functionality.
The device performs "one-step time stamping". This means that the correction 
field in Sync and PDelay_Resp messages is updated automatically. Not more. For 
operation as a switch, this would be probably enough. But I need also to set 
the switch' local PTP clock in order to generate a synchronized output signal 
for local use.

> BUT, if that is the case, what do you need PTP support in the kernel
> for?
The device cannot autonomously set it's PTP clock. This requires calculating 
the link delay between the own port and the master/peer. It looks like the 
KSZ9563 cannot perform this calculation itself.

> What profiles are you using with linuxptp?
My config is at the bottom.

> What benefit does it
> bring you if you report timestamps to the operating system, for
> terminated 1588 traffic? Why would you even terminate 1588 traffic on
> the host CPU? I fail to understand many of the use cases that this
> switch is tailored for.
In my opinion, the switch doesn't terminate 1588 traffic. It only generates 
hardware time stamps and performs update of the correction field on selected 
messages.

> Also, I know that Microchip support does a pretty bad job at giving
> useful answers, and the datasheet isn't quite clear either (looks like
> there's info that has been copied from other switches, like for 2-step
> timestamping, then removed, and too much was removed because now nothing
> is clear) so you'll have to give your best shot at explaining some
> things.
It looks like it was planned to support also 2-step time stamping. You can see 
an errata in revision B of the errata sheet [1]. Revision C dropped this 
again.

> Global PTP Message Config 1 Register
> ------------------------------------
> 
> Bit 2: Selection of P2P or E2E
> 1 = Peer-to-peer (P2P) transparent clock mode
> 0 = End-to-end (E2E) transparent clock mode
> 
> What does this bit do exactly?
> Does it change the switch's behavior as an autonomous 1-step transparent
> clock? Or does it have anything to do with how/which timestamps are
> delivered to the CPU? The point is, why do you care to configure this?
> Sysfs is not going to fly without a solid explanation, which you did not
> provide here.
In short: Clock synchronization (using linuxptp) doesn't work if this is not 
configured in sync with ptp4l.conf. Either no time stamping is performed or 
PTP messages are filtered out. If required, I can investigate this further. 

> My understanding of E2E vs P2P TC is that an E2E TC will correct the
> timestamps of Pdelay messages, while a P2P TC won't
Yes, I think so. But I am no expert for PTP.

> The P2P TC must speak proper PDelay and not forward those packets 
> sheepishly. Which starts to answer my question, I believe... So my comment 
> above, that the 1-step TC functionality doesn't require any involvement from 
> the CPU, is only correct for E2E TC, am I right? For P2P TC, you would need 
> the host CPU to speak peer delay.
The host CPU is not required for updating forwarded (switched) PTP messages. 
But it is required RX/TX time stamping of own PTP messages. All local PTP 
operations can be done on the switch, there's no work left for the MAC 
connected to the host port.

> But you wouldn't need it for anything else (the
> SYNC messages would have no reason to go to the CPU, would they?). 
The (time stamped) SYNC messages are required in order to calculate the offset 
from the master clock.

> So, again, what profile are you using with linuxptp for this one?
only a few lines left...

> If my understanding is right, maybe you want to just leave the switch
> operate in E2E TC mode by default, and put it into P2P TC as soon as
> your .port_hwtstamp_set() method is called?
Currently the TC mode has to be set manually. So the automatic mode of ptp4l 
cannot be used. Probably this could be automated depending on received PTP 
messages. But probably we cannot see these messages because these are filtered 
out.

> Ok, on to my next question....
> 
> Bit 1: Selection of Master or Slave
> 1 = Host port is PTP master ordinary clock
> 0 = Host port is PTP slave ordinary clock
> 
> What does this _actually_ do? Here I really have no idea. I can only
> imagine that this has again to do with the 1-step TC operation, and that
> it's treating the host port as a switched endpoint, and this has to do
> with the port states of the P2P TC. I'm so confused by this one that I
> don't even know what to ask...
As for the TC mode questions, also this one could be best answered by 
Microchip. If not set correctly, clock synchronization will not work. I need 
to investigate what happens here concretely. Again I suppose that time 
stamping of the associated messages will not be performed or some messages may 
be filtered out.

> Ok, let's put it differently. You bothered to add a sysfs for it, so you
> must be using it for something. What are you using it for?
If not set correctly, nothing will work. This has the major drawback, that 
ptp4l cannot determine the clock mode nor the delay measurement mode 
automatically. I will try find some answers for this.

Best regards
Christian

[global]
twoStepFlag 0
time_stamping p2p1step
#slaveOnly 1
delay_mechanism P2P
#delay_mechanism E2E
delay_filter_length 100
tx_timestamp_timeout 100
#network_transport UDPv6
#network_transport L2

[1] 
http://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9563R-Errata-80000786B.pdf
Vladimir Oltean Oct. 21, 2020, 11:39 p.m. UTC | #3
On Mon, Oct 19, 2020 at 07:24:33PM +0200, Christian Eggers wrote:
> Add routines required for TX hardware time stamping.
> 
> The KSZ9563 only supports one step time stamping
> (HWTSTAMP_TX_ONESTEP_P2P), which requires linuxptp-2.0 or later. PTP
> mode is permanently enabled (changes tail tag; depends on
> CONFIG_NET_DSA_MICROCHIP_KSZ9477_PTP).TX time stamps are reported via an
> interrupt / device registers whilst RX time stamps are reported via an
> additional tail tag.
> 
> One step TX time stamping of PDelay_Resp requires the RX time stamp from
> the associated PDelay_Req message. linuxptp assumes that the RX time
> stamp has already been subtracted from the PDelay_Req correction field
> (as done by the TI PHYTER). linuxptp will echo back the value of the
> correction field in the PDelay_Resp message.
> 
> In order to be compatible to this already established interface, the
> KSZ9563 code emulates this behavior. When processing the PDelay_Resp
> message, the time stamp is moved back from the correction field to the
> tail tag, as the hardware doesn't support negative values on this field.
> Of course, the UDP checksums (if any) have to be corrected after this
> (for both directions).
> 
> The PTP hardware performs internal detection of PTP frames (likely
> similar as ptp_classify_raw() and ptp_parse_header()). As these filters
> cannot be disabled, the current delay mode (E2E/P2P) and the clock mode
> (master/slave) must be configured via sysfs attributes. Time stamping
> will only be performed on PTP packets matching the current mode
> settings.
> 
> Everything has been tested on a Microchip KSZ9563 switch.
> 
> Signed-off-by: Christian Eggers <ceggers@arri.de>
> ---

Hi Richard!

Looks like we've been discussing without you here. I had an off-list discussion
with Christian and we just realized that. The topic of this email is that we've
got new P2P one-step hardware coming in, and it's not quite API-compatible with
what's there already (you might not actually be surprised by that).

Since you weren't copied to Christian's patches, here's a patchwork link for you:
https://patchwork.ozlabs.org/project/netdev/cover/20201019172435.4416-1-ceggers@arri.de/

Short introduction by quoting from the IEEE 1588 standard, just to get everybody
on the same page.

----------------------------------[cut here]-----------------------------------

11.4.3 Peer delay mechanism operational specifications
------------------------------------------------------

The actual value of the meanPathDelay shall be measured and computed as follows
for each instance of a peer delay request-response measurement:

(a) The delay requestor, Node-A:
    (1) Prepares a Pdelay_Req message. The correctionField (see 13.3.2.7) shall
        be set to 0.
    (2) If asymmetry corrections are required, shall modify the correctionField
        per 11.6.4.
    (3) Shall set the originTimestamp to 0 or an estimate no worse than +/-1s of
        the egress timestamp, t1, of the Pdelay_Req message.
    (4) Shall send the Pdelay_Req message and generate and save timestamp t1.
(b) If the delay responder, Node-B, is a one-step clock, it shall:
    (1) Generate timestamp t2 upon receipt of the Pdelay_Req message
    (2) Prepare a Pdelay_Resp message
    (3) Copy the sequenceId field from the Pdelay_Req message to the sequenceId
        field of the Pdelay_Resp message
    (4) Copy the sourcePortIdentity field from the Pdelay_Req message to the
        requestingPortIdentity field of the Pdelay_Resp message
    (5) Copy the domainNumber field from the Pdelay_Req message to the
        domainNumber field of the Pdelay_Resp message
    (6) Copy the correctionField from the Pdelay_Req message to the
        correctionField of the Pdelay_Resp message
    (7) Then:
        (i) Set to 0 the requestReceiptTimestamp field of the Pdelay_Resp message
        (ii) Issue the Pdelay_Resp message and generate timestamp t3 upon sending
        (iii) After t3 is generated but while the Pdelay_Resp message is
              leaving the responder, shall add the turnaround time t3 - t2 to
              the correctionField of the Pdelay_Resp message and make any
              needed corrections to checksums or other content-dependent fields
              of the Pdelay_Resp message
(c) If the delay responder is a two-step clock, it shall:
    [skipping because in our case, the delay responder is a one-step clock]
(d) The delay requestor, Node-A, shall:
    (1) Generate timestamp t 4 upon receipt of the Pdelay_Resp message
    (2) If asymmetry corrections are required, modify the correctionField of
        the Pdelay_Resp message per 11.6.5
    (3) If the twoStepFlag of the received Pdelay_Resp message is FALSE,
        indicating that no Pdelay_Resp_Follow_Up message will be received,
        compute the <meanPathDelay> as:
        <meanPathDelay> = [(t4 - t1) - correctionField of Pdelay_Resp] / 2

----------------------------------[cut here]-----------------------------------

Simply put:

          |                    |
       t1 |------\ Pdelay_Req  |
          |       ------\      |
          |              ----->| t2
Clock A   |                    |     Clock B
          | Pdelay_Resp /------| t3
          |       ------       |
       t4 |<-----/             |
          |                    |

                (t4 - t1) - (t3 - t2)
meanPathDelay = ---------------------
                          2

where t3 - t2 (the "turnaround" time) is contained inside the correctionField
of the Pdelay_Resp. To reiterate, t3 is the RX timestamp of the Pdelay_Req, and
t4 is the TX timestamp of the Pdelay_Resp, both taken at the delay responder
(Clock B).

That was about it for the standard.


The hardware that Christian is configuring (consider operation as Clock B)
works this way:
(a) the ingress port takes the t2 RX timestamp of the Pdelay_Req normally
(b) on transmission of the Pdelay_Resp, the kernel must provide t2 as metadata
    together with the Pdelay_Resp frame itself (it is put in the "TX timestamp"
    field of the DSA tag)
(c) the egress port takes the t3 TX timestamp and rewrites the correctionField
    of the PTP header with the (t3 - t2) value

Straightforward, one might say?
Except for the fact that this is not how the API for peer-to-peer one-step
timestamping currently works.
We were expecting that for this switch to have a streamlined implementation for
P2P 1-step, the kernel would have an API to inject a Pdelay_Resp packet into
the stack _along_ with t2. But it doesn't.

So how _does_ that work for TI PHYTER?

As far as we understand, the PHYTER appears to autonomously mangle PTP packets
in the following way:
- subtracting t2 on RX from the correctionField of the Pdelay_Req
- adding t3 on TX to the correctionField of the Pdelay_Resp

All that linuxptp has to do is connect the wires. That's why in process_pdelay_req()
from port.c, there is:

	if (p->timestamping == TS_P2P1STEP) {
		rsp->header.correction = m->header.correction;

Otherwise said, the t2 is not provided to the kernel explicitly, but simply by
copying the correctionField from the Pdelay_Req into the Pdelay_Resp. This
correctionField is the play ground of the PHYTER, so by having linuxptp copy
that field, it simply has a path to finish off what it started. We believe that
by the time the Pdelay_Req message arrives at the host, the correctionField
will be negative, and it will only become back positive after transmitting the
Pdelay_Resp. This is also something that might be problematic for other devices.

Nevertheless, I'm not here to say that it's wrong to do what we are currently
doing for PHYTER. I mean, paragraph 11.4.3.(b).(6) clearly states that the
correctionField should be transferred anyway from the Pdelay_Req to the
Pdelay_Resp. I am not entirely sure of the reasons why it stipulates this, if
it's just to allow hardware like PHYTER to operate within spec, or if it's for
the delay requestor to be aware of the correction that was applied to his frame.
Regardless of which way it is, we expect that any 1588 stack will have to do
that, so this is not implementation-defined behavior.

What I'm here to say is that this is not how Christian's hardware works, and he
is simply emulating the kernel interface used by the PHYTER. He needs to
subtract t2 on RX from the correctionField manually in the tagger code, then
have extra code on TX to copy t2 from the correctionField into the DSA tag's TX
timestamp field. This complicates the code by quite a significant margin, and
my concern is that nobody is going to follow what's going on while
maintaining it, and why it is done in this backwards way.
We would like to know your opinion on whether we couldn't, in fact, have a more
straightforward UAPI for hardware like his. For example, expand the sendmsg()
API with a new cmsg that contains t2 specifically for P2P one-step
timestamping. Or anything else. But "straightforward" should be the key.

Thanks,
-Vladimir
Richard Cochran Oct. 22, 2020, 2:42 a.m. UTC | #4
I'm just catching up with this.

Really. Truly. Please -- Include the maintainer on CC for such patches!

In case you don't know who that is, you can always consult the MAINTAINERS file.

There you will find the following entry.

    PTP HARDWARE CLOCK SUPPORT
    M:      Richard Cochran <richardcochran@gmail.com>
    L:      netdev@vger.kernel.org
    S:      Maintained
    W:      http://linuxptp.sourceforge.net/

On Thu, Oct 22, 2020 at 02:39:35AM +0300, Vladimir Oltean wrote:
> On Mon, Oct 19, 2020 at 07:24:33PM +0200, Christian Eggers wrote:
> > The PTP hardware performs internal detection of PTP frames (likely
> > similar as ptp_classify_raw() and ptp_parse_header()). As these filters
> > cannot be disabled, the current delay mode (E2E/P2P) and the clock mode
> > (master/slave) must be configured via sysfs attributes.

This is a complete no-go.  NAK.

Richard
diff mbox series

Patch

diff --git a/drivers/net/dsa/microchip/ksz9477_main.c b/drivers/net/dsa/microchip/ksz9477_main.c
index 7d623400139f..42cd17c8c25d 100644
--- a/drivers/net/dsa/microchip/ksz9477_main.c
+++ b/drivers/net/dsa/microchip/ksz9477_main.c
@@ -1388,6 +1388,7 @@  static const struct dsa_switch_ops ksz9477_switch_ops = {
 	.phy_read		= ksz9477_phy_read16,
 	.phy_write		= ksz9477_phy_write16,
 	.phylink_mac_link_down	= ksz_mac_link_down,
+	.get_ts_info		= ksz9477_ptp_get_ts_info,
 	.port_enable		= ksz_enable_port,
 	.get_strings		= ksz9477_get_strings,
 	.get_ethtool_stats	= ksz_get_ethtool_stats,
@@ -1408,6 +1409,11 @@  static const struct dsa_switch_ops ksz9477_switch_ops = {
 	.port_mdb_del           = ksz9477_port_mdb_del,
 	.port_mirror_add	= ksz9477_port_mirror_add,
 	.port_mirror_del	= ksz9477_port_mirror_del,
+	.port_hwtstamp_get      = ksz9477_ptp_port_hwtstamp_get,
+	.port_hwtstamp_set      = ksz9477_ptp_port_hwtstamp_set,
+	.port_txtstamp          = ksz9477_ptp_port_txtstamp,
+	/* never defer rx delivery, tstamping is done via tail tagging */
+	.port_rxtstamp          = NULL,
 };
 
 static u32 ksz9477_get_port_addr(int port, int offset)
@@ -1554,7 +1560,8 @@  static irqreturn_t ksz9477_switch_irq_thread(int irq, void *dev_id)
 			if (ret)
 				return result;
 
-			/* ToDo: Add specific handling of port interrupts */
+			if (data8 & PORT_PTP_INT)
+				result |= ksz9477_ptp_port_interrupt(dev, port);
 		}
 	}
 
diff --git a/drivers/net/dsa/microchip/ksz9477_ptp.c b/drivers/net/dsa/microchip/ksz9477_ptp.c
index 44d7bbdea518..d4bcfff0577e 100644
--- a/drivers/net/dsa/microchip/ksz9477_ptp.c
+++ b/drivers/net/dsa/microchip/ksz9477_ptp.c
@@ -6,7 +6,10 @@ 
  * Copyright (c) 2020 ARRI Lighting
  */
 
+#include <linux/net_tstamp.h>
+#include <linux/ptp_classify.h>
 #include <linux/ptp_clock_kernel.h>
+#include <linux/sysfs.h>
 
 #include "ksz_common.h"
 #include "ksz9477_reg.h"
@@ -71,8 +74,10 @@  static int ksz9477_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
 static int ksz9477_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
 {
 	struct ksz_device *dev = container_of(ptp, struct ksz_device, ptp_caps);
+	struct timespec64 delta64 = ns_to_timespec64(delta);
 	s32 sec, nsec;
 	u16 data16;
+	unsigned long flags;
 	int ret;
 
 	mutex_lock(&dev->ptp_mutex);
@@ -103,6 +108,10 @@  static int ksz9477_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
 	if (ret)
 		goto error_return;
 
+	spin_lock_irqsave(&dev->ptp_clock_lock, flags);
+	dev->ptp_clock_time = timespec64_add(dev->ptp_clock_time, delta64);
+	spin_unlock_irqrestore(&dev->ptp_clock_lock, flags);
+
 error_return:
 	mutex_unlock(&dev->ptp_mutex);
 	return ret;
@@ -160,6 +169,7 @@  static int ksz9477_ptp_settime(struct ptp_clock_info *ptp, struct timespec64 con
 {
 	struct ksz_device *dev = container_of(ptp, struct ksz_device, ptp_caps);
 	u16 data16;
+	unsigned long flags;
 	int ret;
 
 	mutex_lock(&dev->ptp_mutex);
@@ -198,6 +208,10 @@  static int ksz9477_ptp_settime(struct ptp_clock_info *ptp, struct timespec64 con
 	if (ret)
 		goto error_return;
 
+	spin_lock_irqsave(&dev->ptp_clock_lock, flags);
+	dev->ptp_clock_time = *ts;
+	spin_unlock_irqrestore(&dev->ptp_clock_lock, flags);
+
 error_return:
 	mutex_unlock(&dev->ptp_mutex);
 	return ret;
@@ -208,9 +222,27 @@  static int ksz9477_ptp_enable(struct ptp_clock_info *ptp, struct ptp_clock_reque
 	return -ENOTTY;
 }
 
+static long ksz9477_ptp_do_aux_work(struct ptp_clock_info *ptp)
+{
+	struct ksz_device *dev = container_of(ptp, struct ksz_device, ptp_caps);
+	struct timespec64 ts;
+	unsigned long flags;
+
+	mutex_lock(&dev->ptp_mutex);
+	_ksz9477_ptp_gettime(dev, &ts);
+	mutex_unlock(&dev->ptp_mutex);
+
+	spin_lock_irqsave(&dev->ptp_clock_lock, flags);
+	dev->ptp_clock_time = ts;
+	spin_unlock_irqrestore(&dev->ptp_clock_lock, flags);
+
+	return HZ;  /* reschedule in 1 second */
+}
+
 static int ksz9477_ptp_start_clock(struct ksz_device *dev)
 {
 	u16 data;
+	unsigned long flags;
 	int ret;
 
 	ret = ksz_read16(dev, REG_PTP_CLK_CTRL, &data);
@@ -230,6 +262,11 @@  static int ksz9477_ptp_start_clock(struct ksz_device *dev)
 	if (ret)
 		return ret;
 
+	spin_lock_irqsave(&dev->ptp_clock_lock, flags);
+	dev->ptp_clock_time.tv_sec = 0;
+	dev->ptp_clock_time.tv_nsec = 0;
+	spin_unlock_irqrestore(&dev->ptp_clock_lock, flags);
+
 	return 0;
 }
 
@@ -251,11 +288,349 @@  static int ksz9477_ptp_stop_clock(struct ksz_device *dev)
 	return 0;
 }
 
+/* Time stamping support */
+
+static int ksz9477_ptp_enable_mode(struct ksz_device *dev)
+{
+	u16 data;
+	int ret;
+
+	ret = ksz_read16(dev, REG_PTP_MSG_CONF1, &data);
+	if (ret)
+		return ret;
+
+	/* Enable PTP mode */
+	data |= PTP_ENABLE;
+	ret = ksz_write16(dev, REG_PTP_MSG_CONF1, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int ksz9477_ptp_disable_mode(struct ksz_device *dev)
+{
+	u16 data;
+	int ret;
+
+	ret = ksz_read16(dev, REG_PTP_MSG_CONF1, &data);
+	if (ret)
+		return ret;
+
+	/* Disable PTP mode */
+	data &= ~PTP_ENABLE;
+	ret = ksz_write16(dev, REG_PTP_MSG_CONF1, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int ksz9477_ptp_enable_port_ptp_interrupts(struct ksz_device *dev, int port)
+{
+	u32 addr = PORT_CTRL_ADDR(port, REG_PORT_INT_MASK);
+	u8 data;
+	int ret;
+
+	ret = ksz_read8(dev, addr, &data);
+	if (ret)
+		return ret;
+
+	/* Enable port PTP interrupt (0 means enabled) */
+	data &= ~PORT_PTP_INT;
+	ret = ksz_write8(dev, addr, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int ksz9477_ptp_disable_port_ptp_interrupts(struct ksz_device *dev, int port)
+{
+	u32 addr = PORT_CTRL_ADDR(port, REG_PORT_INT_MASK);
+	u8 data;
+	int ret;
+
+	ret = ksz_read8(dev, addr, &data);
+	if (ret)
+		return ret;
+
+	/* Enable port PTP interrupt (1 means disabled) */
+	data |= PORT_PTP_INT;
+	ret = ksz_write8(dev, addr, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int ksz9477_ptp_enable_port_egress_interrupts(struct ksz_device *dev, int port)
+{
+	u32 addr = PORT_CTRL_ADDR(port, REG_PTP_PORT_TX_INT_ENABLE__2);
+	u16 data;
+	int ret;
+
+	ret = ksz_read16(dev, addr, &data);
+	if (ret)
+		return ret;
+
+	/* Enable port xdelay egress timestamp interrupt (1 means enabled) */
+	data |= PTP_PORT_XDELAY_REQ_INT;
+	ret = ksz_write16(dev, addr, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int ksz9477_ptp_disable_port_egress_interrupts(struct ksz_device *dev, int port)
+{
+	u32 addr = PORT_CTRL_ADDR(port, REG_PTP_PORT_TX_INT_ENABLE__2);
+	u16 data;
+	int ret;
+
+	ret = ksz_read16(dev, addr, &data);
+	if (ret)
+		return ret;
+
+	/* Disable port xdelay egress timestamp interrupts (0 means disabled) */
+	data &= PTP_PORT_XDELAY_REQ_INT;
+	ret = ksz_write16(dev, addr, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int ksz9477_ptp_port_init(struct ksz_device *dev, int port)
+{
+	struct ksz_port *prt = &dev->ports[port];
+	int ret;
+
+	/* Read rx and tx delay from port registers */
+	ret = ksz_read16(dev, PORT_CTRL_ADDR(port, REG_PTP_PORT_RX_DELAY__2),
+			 &prt->tstamp_rx_latency_ns);
+	if (ret)
+		return ret;
+
+	ret = ksz_read16(dev, PORT_CTRL_ADDR(port, REG_PTP_PORT_TX_DELAY__2),
+			 &prt->tstamp_tx_latency_ns);
+	if (ret)
+		return ret;
+
+	if (port != dev->cpu_port) {
+		ret = ksz9477_ptp_enable_port_ptp_interrupts(dev, port);
+		if (ret)
+			return ret;
+
+		ret = ksz9477_ptp_enable_port_egress_interrupts(dev, port);
+		if (ret)
+			goto error_disable_port_ptp_interrupts;
+	}
+
+	return 0;
+
+error_disable_port_ptp_interrupts:
+	if (port != dev->cpu_port)
+		ksz9477_ptp_disable_port_ptp_interrupts(dev, port);
+	return ret;
+}
+
+static void ksz9477_ptp_port_deinit(struct ksz_device *dev, int port)
+{
+	if (port != dev->cpu_port) {
+		ksz9477_ptp_disable_port_egress_interrupts(dev, port);
+		ksz9477_ptp_disable_port_ptp_interrupts(dev, port);
+	}
+}
+
+static int ksz9477_ptp_ports_init(struct ksz_device *dev)
+{
+	int port;
+	int ret;
+
+	for (port = 0; port < dev->port_cnt; port++) {
+		ret = ksz9477_ptp_port_init(dev, port);
+		if (ret)
+			goto error_deinit;
+	}
+
+	return 0;
+
+error_deinit:
+	for (--port; port >= 0; --port)
+		ksz9477_ptp_port_deinit(dev, port);
+	return ret;
+}
+
+static void ksz9477_ptp_ports_deinit(struct ksz_device *dev)
+{
+	int port;
+
+	for (port = dev->port_cnt - 1; port >= 0; --port)
+		ksz9477_ptp_port_deinit(dev, port);
+}
+
+/* device attributes */
+
+enum ksz9477_ptp_tcmode {
+	KSZ9477_PTP_TCMODE_E2E,
+	KSZ9477_PTP_TCMODE_P2P,
+};
+
+static int ksz9477_ptp_tcmode_get(struct ksz_device *dev, enum ksz9477_ptp_tcmode *tcmode)
+{
+	u16 data;
+	int ret;
+
+	ret = ksz_read16(dev, REG_PTP_MSG_CONF1, &data);
+	if (ret)
+		return ret;
+
+	*tcmode = (data & PTP_TC_P2P) ? KSZ9477_PTP_TCMODE_P2P : KSZ9477_PTP_TCMODE_E2E;
+
+	return 0;
+}
+
+static int ksz9477_ptp_tcmode_set(struct ksz_device *dev, enum ksz9477_ptp_tcmode tcmode)
+{
+	u16 data;
+	int ret;
+
+	ret = ksz_read16(dev, REG_PTP_MSG_CONF1, &data);
+	if (ret)
+		return ret;
+
+	if (tcmode == KSZ9477_PTP_TCMODE_P2P)
+		data |= PTP_TC_P2P;
+	else
+		data &= ~PTP_TC_P2P;
+
+	ret = ksz_write16(dev, REG_PTP_MSG_CONF1, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static ssize_t tcmode_show(struct device *dev, struct device_attribute *attr __always_unused,
+			   char *buf)
+{
+	struct ksz_device *ksz = dev_get_drvdata(dev);
+	enum ksz9477_ptp_tcmode tcmode;
+	int ret = ksz9477_ptp_tcmode_get(ksz, &tcmode);
+
+	if (ret)
+		return ret;
+
+	return sprintf(buf, "%s\n", tcmode == KSZ9477_PTP_TCMODE_P2P ? "P2P" : "E2E");
+}
+
+static ssize_t tcmode_store(struct device *dev, struct device_attribute *attr __always_unused,
+			    char const *buf, size_t count)
+{
+	struct ksz_device *ksz = dev_get_drvdata(dev);
+	int ret;
+
+	if (strcasecmp(buf, "E2E") == 0)
+		ret = ksz9477_ptp_tcmode_set(ksz, KSZ9477_PTP_TCMODE_E2E);
+	else if (strcasecmp(buf, "P2P") == 0)
+		ret = ksz9477_ptp_tcmode_set(ksz, KSZ9477_PTP_TCMODE_P2P);
+	else
+		return -EINVAL;
+
+	return ret ? ret : (ssize_t)count;
+}
+
+static DEVICE_ATTR_RW(tcmode);
+
+enum ksz9477_ptp_ocmode {
+	KSZ9477_PTP_OCMODE_SLAVE,
+	KSZ9477_PTP_OCMODE_MASTER,
+};
+
+static int ksz9477_ptp_ocmode_get(struct ksz_device *dev, enum ksz9477_ptp_ocmode *ocmode)
+{
+	u16 data;
+	int ret;
+
+	ret = ksz_read16(dev, REG_PTP_MSG_CONF1, &data);
+	if (ret)
+		return ret;
+
+	*ocmode = (data & PTP_MASTER) ? KSZ9477_PTP_OCMODE_MASTER : KSZ9477_PTP_OCMODE_SLAVE;
+
+	return 0;
+}
+
+static int ksz9477_ptp_ocmode_set(struct ksz_device *dev, enum ksz9477_ptp_ocmode ocmode)
+{
+	u16 data;
+	int ret;
+
+	ret = ksz_read16(dev, REG_PTP_MSG_CONF1, &data);
+	if (ret)
+		return ret;
+
+	if (ocmode == KSZ9477_PTP_OCMODE_MASTER)
+		data |= PTP_MASTER;
+	else
+		data &= ~PTP_MASTER;
+
+	ret = ksz_write16(dev, REG_PTP_MSG_CONF1, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static ssize_t ocmode_show(struct device *dev, struct device_attribute *attr __always_unused,
+			   char *buf)
+{
+	struct ksz_device *ksz = dev_get_drvdata(dev);
+	enum ksz9477_ptp_ocmode ocmode;
+	int ret = ksz9477_ptp_ocmode_get(ksz, &ocmode);
+
+	if (ret)
+		return ret;
+
+	return sprintf(buf, "%s\n", ocmode == KSZ9477_PTP_OCMODE_MASTER ? "master" : "slave");
+}
+
+static ssize_t ocmode_store(struct device *dev, struct device_attribute *attr __always_unused,
+			    char const *buf, size_t count)
+{
+	struct ksz_device *ksz = dev_get_drvdata(dev);
+	int ret;
+
+	if (strcasecmp(buf, "master") == 0)
+		ret = ksz9477_ptp_ocmode_set(ksz, KSZ9477_PTP_OCMODE_MASTER);
+	else if (strcasecmp(buf, "slave") == 0)
+		ret = ksz9477_ptp_ocmode_set(ksz, KSZ9477_PTP_OCMODE_SLAVE);
+	else
+		return -EINVAL;
+
+	return ret ? ret : (ssize_t)count;
+}
+
+static DEVICE_ATTR_RW(ocmode);
+
+static struct attribute *ksz9477_ptp_attrs[] = {
+	&dev_attr_tcmode.attr,
+	&dev_attr_ocmode.attr,
+	NULL,
+};
+
+static struct attribute_group ksz9477_ptp_attrgrp = {
+	.attrs = ksz9477_ptp_attrs,
+};
+
 int ksz9477_ptp_init(struct ksz_device *dev)
 {
 	int ret;
 
 	mutex_init(&dev->ptp_mutex);
+	spin_lock_init(&dev->ptp_clock_lock);
 
 	/* PTP clock properties */
 
@@ -275,6 +650,7 @@  int ksz9477_ptp_init(struct ksz_device *dev)
 	dev->ptp_caps.gettime64   = ksz9477_ptp_gettime;
 	dev->ptp_caps.settime64   = ksz9477_ptp_settime;
 	dev->ptp_caps.enable      = ksz9477_ptp_enable;
+	dev->ptp_caps.do_aux_work = ksz9477_ptp_do_aux_work;
 
 	/* Start hardware counter (will overflow after 136 years) */
 	ret = ksz9477_ptp_start_clock(dev);
@@ -287,8 +663,36 @@  int ksz9477_ptp_init(struct ksz_device *dev)
 		goto error_stop_clock;
 	}
 
+	/* Enable PTP mode (will affect tail tagging format) */
+	ret = ksz9477_ptp_enable_mode(dev);
+	if (ret)
+		goto error_unregister_clock;
+
+	/* Init switch ports */
+	ret = ksz9477_ptp_ports_init(dev);
+	if (ret)
+		goto error_disable_mode;
+
+	/* Init attributes */
+	ret = sysfs_create_group(&dev->dev->kobj, &ksz9477_ptp_attrgrp);
+	if (ret)
+		goto error_ports_deinit;
+
+	/* Schedule cyclic call of ksz_ptp_do_aux_work() */
+	ret = ptp_schedule_worker(dev->ptp_clock, 0);
+	if (ret)
+		goto error_device_remove_attrgrp;
+
 	return 0;
 
+error_device_remove_attrgrp:
+	sysfs_remove_group(&dev->dev->kobj, &ksz9477_ptp_attrgrp);
+error_ports_deinit:
+	ksz9477_ptp_ports_deinit(dev);
+error_disable_mode:
+	ksz9477_ptp_disable_mode(dev);
+error_unregister_clock:
+	ptp_clock_unregister(dev->ptp_clock);
 error_stop_clock:
 	ksz9477_ptp_stop_clock(dev);
 	return ret;
@@ -296,6 +700,231 @@  int ksz9477_ptp_init(struct ksz_device *dev)
 
 void ksz9477_ptp_deinit(struct ksz_device *dev)
 {
+	sysfs_remove_group(&dev->dev->kobj, &ksz9477_ptp_attrgrp);
+	ksz9477_ptp_ports_deinit(dev);
+	ksz9477_ptp_disable_mode(dev);
 	ptp_clock_unregister(dev->ptp_clock);
 	ksz9477_ptp_stop_clock(dev);
 }
+
+irqreturn_t ksz9477_ptp_port_interrupt(struct ksz_device *dev, int port)
+{
+	u32 addr = PORT_CTRL_ADDR(port, REG_PTP_PORT_TX_INT_STATUS__2);
+	struct ksz_port *prt = &dev->ports[port];
+	irqreturn_t result = IRQ_NONE;
+	u16 data;
+	int ret;
+
+	ret = ksz_read16(dev, addr, &data);
+	if (ret)
+		return IRQ_NONE;
+
+	if ((data & PTP_PORT_XDELAY_REQ_INT) && prt->tstamp_tx_xdelay_skb) {
+		/* Timestamp for Pdelay_Req / Delay_Req */
+		u32 tstamp_raw;
+		struct skb_shared_hwtstamps shhwtstamps;
+		struct sk_buff *tmp_skb;
+
+		/* In contrast to the KSZ9563R data sheet, the format of the
+		 * port time stamp registers is 2 bit seconds + 30 bit nano
+		 * seconds (same as in the tail tags).
+		 */
+		ret = ksz_read32(dev, PORT_CTRL_ADDR(port, REG_PTP_PORT_XDELAY_TS), &tstamp_raw);
+		if (ret)
+			return result;
+
+		memset(&shhwtstamps, 0, sizeof(shhwtstamps));
+		shhwtstamps.hwtstamp = ksz9477_tstamp_to_clock(dev, tstamp_raw,
+							       prt->tstamp_tx_latency_ns);
+
+		/* skb_complete_tx_timestamp() will free up the client to make
+		 * another timestamp-able transmit. We have to be ready for it
+		 * -- by clearing the ps->tx_skb "flag" -- beforehand.
+		 */
+
+		tmp_skb = prt->tstamp_tx_xdelay_skb;
+		prt->tstamp_tx_xdelay_skb = NULL;
+		clear_bit_unlock(KSZ_HWTSTAMP_TX_XDELAY_IN_PROGRESS, &prt->tstamp_state);
+		skb_complete_tx_timestamp(tmp_skb, &shhwtstamps);
+	}
+
+	/* Clear interrupt(s) (W1C) */
+	ret = ksz_write16(dev, addr, data);
+	if (ret)
+		return IRQ_NONE;
+
+	return IRQ_HANDLED;
+}
+
+/* DSA PTP operations */
+
+int ksz9477_ptp_get_ts_info(struct dsa_switch *ds, int port __always_unused,
+			    struct ethtool_ts_info *ts)
+{
+	struct ksz_device *dev = ds->priv;
+
+	ts->so_timestamping =
+			SOF_TIMESTAMPING_TX_HARDWARE |
+			SOF_TIMESTAMPING_RX_HARDWARE |
+			SOF_TIMESTAMPING_RAW_HARDWARE;
+
+	ts->phc_index = ptp_clock_index(dev->ptp_clock);
+
+	ts->tx_types =
+			BIT(HWTSTAMP_TX_OFF) |
+			BIT(HWTSTAMP_TX_ONESTEP_P2P);
+
+	ts->rx_filters =
+			BIT(HWTSTAMP_FILTER_NONE) |
+			BIT(HWTSTAMP_FILTER_PTP_V2_L4_EVENT) |
+			BIT(HWTSTAMP_FILTER_PTP_V2_L2_EVENT) |
+			BIT(HWTSTAMP_FILTER_PTP_V2_EVENT);
+
+	return 0;
+}
+
+static int ksz9477_set_hwtstamp_config(struct ksz_device *dev, int port,
+				       struct hwtstamp_config *config)
+{
+	struct ksz_port *prt = &dev->ports[port];
+	bool tstamp_enable = false;
+
+	/* Prevent the TX/RX paths from trying to interact with the
+	 * timestamp hardware while we reconfigure it.
+	 */
+	clear_bit_unlock(KSZ_HWTSTAMP_ENABLED, &prt->tstamp_state);
+
+	/* reserved for future extensions */
+	if (config->flags)
+		return -EINVAL;
+
+	switch (config->tx_type) {
+	case HWTSTAMP_TX_OFF:
+		tstamp_enable = false;
+		break;
+	case HWTSTAMP_TX_ONESTEP_P2P:
+		tstamp_enable = true;
+		break;
+	default:
+		return -ERANGE;
+	}
+
+	switch (config->rx_filter) {
+	case HWTSTAMP_FILTER_NONE:
+		tstamp_enable = false;
+		break;
+	case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
+		config->rx_filter = HWTSTAMP_FILTER_PTP_V2_L4_EVENT;  /* UDPv4/UDPv6 */
+		break;
+	case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
+		config->rx_filter = HWTSTAMP_FILTER_PTP_V2_L2_EVENT;  /* 802.3 ether */
+		break;
+	case HWTSTAMP_FILTER_PTP_V2_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+		config->rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;     /* UDP / 802.3 */
+		break;
+	case HWTSTAMP_FILTER_ALL:
+	default:
+		config->rx_filter = HWTSTAMP_FILTER_NONE;
+		return -ERANGE;
+	}
+
+	/* Once hardware has been configured, enable timestamp checks
+	 * in the RX/TX paths.
+	 */
+	if (tstamp_enable)
+		set_bit(KSZ_HWTSTAMP_ENABLED, &prt->tstamp_state);
+
+	return 0;
+}
+
+int ksz9477_ptp_port_hwtstamp_get(struct dsa_switch *ds, int port, struct ifreq *ifr)
+{
+	struct ksz_device *dev = ds->priv;
+	struct hwtstamp_config *port_tstamp_config = &dev->ports[port].tstamp_config;
+
+	return copy_to_user(ifr->ifr_data,
+			    port_tstamp_config, sizeof(*port_tstamp_config)) ? -EFAULT : 0;
+}
+
+int ksz9477_ptp_port_hwtstamp_set(struct dsa_switch *ds, int port, struct ifreq *ifr)
+{
+	struct ksz_device *dev = ds->priv;
+	struct hwtstamp_config *port_tstamp_config = &dev->ports[port].tstamp_config;
+	struct hwtstamp_config config;
+	int err;
+
+	if (copy_from_user(&config, ifr->ifr_data, sizeof(config)))
+		return -EFAULT;
+
+	err = ksz9477_set_hwtstamp_config(dev, port, &config);
+	if (err)
+		return err;
+
+	/* Save the chosen configuration to be returned later. */
+	memcpy(port_tstamp_config, &config, sizeof(config));
+
+	return copy_to_user(ifr->ifr_data, &config, sizeof(config)) ? -EFAULT : 0;
+}
+
+/* Returns a pointer to the PTP header if the caller should time stamp,
+ * or NULL if the caller should not.
+ */
+static struct ptp_header *ksz9477_ptp_should_tstamp(struct ksz_port *port, struct sk_buff *skb,
+						    unsigned int type)
+{
+	if (!test_bit(KSZ_HWTSTAMP_ENABLED, &port->tstamp_state))
+		return NULL;
+
+	return ptp_parse_header(skb, type);
+}
+
+bool ksz9477_ptp_port_txtstamp(struct dsa_switch *ds, int port, struct sk_buff *clone,
+			       unsigned int type)
+{
+	struct ksz_device *dev = ds->priv;
+	struct ksz_port *prt = &dev->ports[port];
+	enum ksz9477_ptp_event_messages msg_type;
+	struct ptp_header *hdr;
+
+	/* KSZ9563 supports PTPv2 only */
+	if (!(type & PTP_CLASS_V2))
+		return false;
+
+	/* Should already been tested in dsa_skb_tx_timestamp()? */
+	if (!(skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP))
+		return false;
+
+	hdr = ksz9477_ptp_should_tstamp(prt, clone, type);
+	if (!hdr)
+		return false;
+
+	msg_type = ptp_get_msgtype(hdr, type);
+
+	switch (msg_type) {
+	/* As the KSZ9563 always performs one step time stamping, only the time
+	 * stamp for Delay_Req and Pdelay_Req are reported to the application
+	 * via socket error queue. Time stamps for Sync and Pdelay_resp will be
+	 * applied directly to the outgoing message (e.g. correction field), but
+	 * will NOT be reported to the socket.
+	 */
+	case PTP_Event_Message_Delay_Req:
+	case PTP_Event_Message_Pdelay_Req:
+		if (test_and_set_bit_lock(KSZ_HWTSTAMP_TX_XDELAY_IN_PROGRESS,
+					  &prt->tstamp_state))
+			return false;  /* free cloned skb */
+
+		prt->tstamp_tx_xdelay_skb = clone;
+		break;
+
+	default:
+		return false;  /* free cloned skb */
+	}
+
+	return true;  /* keep cloned skb */
+}
diff --git a/drivers/net/dsa/microchip/ksz9477_ptp.h b/drivers/net/dsa/microchip/ksz9477_ptp.h
index 0076538419fa..e8d50a086311 100644
--- a/drivers/net/dsa/microchip/ksz9477_ptp.h
+++ b/drivers/net/dsa/microchip/ksz9477_ptp.h
@@ -10,6 +10,9 @@ 
 #ifndef DRIVERS_NET_DSA_MICROCHIP_KSZ9477_PTP_H_
 #define DRIVERS_NET_DSA_MICROCHIP_KSZ9477_PTP_H_
 
+#include <linux/irqreturn.h>
+#include <linux/types.h>
+
 #include "ksz_common.h"
 
 #if IS_ENABLED(CONFIG_NET_DSA_MICROCHIP_KSZ9477_PTP)
@@ -17,11 +20,35 @@ 
 int ksz9477_ptp_init(struct ksz_device *dev);
 void ksz9477_ptp_deinit(struct ksz_device *dev);
 
+irqreturn_t ksz9477_ptp_port_interrupt(struct ksz_device *dev, int port);
+
+int ksz9477_ptp_get_ts_info(struct dsa_switch *ds, int port, struct ethtool_ts_info *ts);
+int ksz9477_ptp_port_hwtstamp_get(struct dsa_switch *ds, int port, struct ifreq *ifr);
+int ksz9477_ptp_port_hwtstamp_set(struct dsa_switch *ds, int port, struct ifreq *ifr);
+bool ksz9477_ptp_port_txtstamp(struct dsa_switch *ds, int port, struct sk_buff *clone,
+			       unsigned int type);
+
 #else
 
 static inline int ksz9477_ptp_init(struct ksz_device *dev) { return 0; }
 static inline void ksz9477_ptp_deinit(struct ksz_device *dev) {}
 
+static inline irqreturn_t ksz9477_ptp_port_interrupt(struct ksz_device *dev, int port)
+	{ return IRQ_NONE; }
+
+static inline int ksz9477_ptp_get_ts_info(struct dsa_switch *ds, int port,
+					  struct ethtool_ts_info *ts) { return -EOPNOTSUPP; }
+
+static inline int ksz9477_ptp_port_hwtstamp_get(struct dsa_switch *ds, int port,
+						struct ifreq *ifr) { return -EOPNOTSUPP; }
+
+static inline int ksz9477_ptp_port_hwtstamp_set(struct dsa_switch *ds, int port, struct ifreq *ifr)
+	{ return -EOPNOTSUPP; }
+
+static inline bool ksz9477_ptp_port_txtstamp(struct dsa_switch *ds, int port,
+					     struct sk_buff *clone, unsigned int type)
+	{ return false; }
+
 #endif
 
 #endif /* DRIVERS_NET_DSA_MICROCHIP_KSZ9477_PTP_H_ */
diff --git a/include/linux/dsa/ksz_common.h b/include/linux/dsa/ksz_common.h
index 4d5b6cc9429a..10b42154ba5a 100644
--- a/include/linux/dsa/ksz_common.h
+++ b/include/linux/dsa/ksz_common.h
@@ -8,8 +8,10 @@ 
 #include <linux/gpio/consumer.h>
 #include <linux/kernel.h>
 #include <linux/mutex.h>
+#include <linux/net_tstamp.h>
 #include <linux/phy.h>
 #include <linux/ptp_clock_kernel.h>
+#include <linux/spinlock.h>
 #include <linux/regmap.h>
 #include <linux/timer.h>
 #include <linux/workqueue.h>
@@ -25,6 +27,12 @@  struct ksz_port_mib {
 	u64 *counters;
 };
 
+/* state flags for ksz_port::tstamp_state */
+enum {
+	KSZ_HWTSTAMP_ENABLED,
+	KSZ_HWTSTAMP_TX_XDELAY_IN_PROGRESS,
+};
+
 struct ksz_port {
 	u16 member;
 	u16 vid_member;
@@ -41,6 +49,13 @@  struct ksz_port {
 
 	struct ksz_port_mib mib;
 	phy_interface_t interface;
+#if IS_ENABLED(CONFIG_NET_DSA_MICROCHIP_KSZ9477_PTP)
+	u16 tstamp_rx_latency_ns;   /* rx delay from wire to tstamp unit */
+	u16 tstamp_tx_latency_ns;   /* tx delay from tstamp unit to wire */
+	struct hwtstamp_config tstamp_config;
+	struct sk_buff *tstamp_tx_xdelay_skb;
+	unsigned long tstamp_state;
+#endif
 };
 
 struct ksz_device {
@@ -98,7 +113,22 @@  struct ksz_device {
 	struct ptp_clock *ptp_clock;
 	struct ptp_clock_info ptp_caps;
 	struct mutex ptp_mutex;
+	spinlock_t ptp_clock_lock; /* for ptp_clock_time */
+	/* approximated current time, read once per second from hardware */
+	struct timespec64 ptp_clock_time;
 #endif
 };
 
+/* KSZ9563 will only timestamp event messages */
+enum ksz9477_ptp_event_messages {
+	PTP_Event_Message_Sync        = 0x0,
+	PTP_Event_Message_Delay_Req   = 0x1,
+	PTP_Event_Message_Pdelay_Req  = 0x2,
+	PTP_Event_Message_Pdelay_Resp = 0x3,
+};
+
+/* net/dsa/tag_ksz.c */
+ktime_t ksz9477_tstamp_to_clock(struct ksz_device *ksz, u32 tstamp,
+		int offset_ns);
+
 #endif /* _NET_DSA_KSZ_COMMON_H_ */
diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c
index 4820dbcedfa2..9dbe2fde3db0 100644
--- a/net/dsa/tag_ksz.c
+++ b/net/dsa/tag_ksz.c
@@ -4,10 +4,14 @@ 
  * Copyright (c) 2017 Microchip Technology
  */
 
+#include <asm/unaligned.h>
+#include <linux/dsa/ksz_common.h>
 #include <linux/etherdevice.h>
 #include <linux/list.h>
+#include <linux/ptp_classify.h>
 #include <linux/slab.h>
 #include <net/dsa.h>
+#include <net/checksum.h>
 #include "dsa_priv.h"
 
 /* Typically only one byte is used for tail tag. */
@@ -87,26 +91,210 @@  MODULE_ALIAS_DSA_TAG_DRIVER(DSA_TAG_PROTO_KSZ8795);
 /*
  * For Ingress (Host -> KSZ9477), 2 bytes are added before FCS.
  * ---------------------------------------------------------------------------
- * DA(6bytes)|SA(6bytes)|....|Data(nbytes)|tag0(1byte)|tag1(1byte)|FCS(4bytes)
+ * DA(6bytes)|SA(6bytes)|....|Data(nbytes)|ts(4bytes)|tag0(1byte)|tag1(1byte)|FCS(4bytes)
  * ---------------------------------------------------------------------------
+ * ts   : time stamp (only present if PTP is enabled on the hardware).
  * tag0 : Prioritization (not used now)
  * tag1 : each bit represents port (eg, 0x01=port1, 0x02=port2, 0x10=port5)
  *
- * For Egress (KSZ9477 -> Host), 1 byte is added before FCS.
+ * For Egress (KSZ9477 -> Host), 1/4 bytes are added before FCS.
  * ---------------------------------------------------------------------------
- * DA(6bytes)|SA(6bytes)|....|Data(nbytes)|tag0(1byte)|FCS(4bytes)
+ * DA(6bytes)|SA(6bytes)|....|Data(nbytes)|ts(4bytes)|tag0(1byte)|FCS(4bytes)
  * ---------------------------------------------------------------------------
+ * ts   : time stamp (only present of bit 7 in tag0 is set
  * tag0 : zero-based value represents port
  *	  (eg, 0x00=port1, 0x02=port3, 0x06=port7)
  */
 
 #define KSZ9477_INGRESS_TAG_LEN		2
 #define KSZ9477_PTP_TAG_LEN		4
-#define KSZ9477_PTP_TAG_INDICATION	0x80
+#define KSZ9477_PTP_TAG_INDICATION	BIT(7)
 
 #define KSZ9477_TAIL_TAG_OVERRIDE	BIT(9)
 #define KSZ9477_TAIL_TAG_LOOKUP		BIT(10)
 
+#if IS_ENABLED(CONFIG_NET_DSA_MICROCHIP_KSZ9477_PTP)
+static inline __wsum ksz9477_check_diff8(__be64 old, __be64 new, __wsum oldsum)
+{
+	__be64 diff[2] = { ~old, new };
+
+	return csum_partial(diff, sizeof(diff), oldsum);
+}
+
+/* Replace content of PTP header's correction field and update UDP checksum */
+static void ksz9477_update_ptp_correction_field(struct sk_buff *skb, unsigned int type,
+						struct ptp_header *ptp_header, u64 value)
+{
+	u8 *ptr = skb_mac_header(skb);
+	struct udphdr *uhdr = NULL;
+	__be64 correction_old;
+
+	/* previous correction value is required for checksum update. */
+	memcpy(&correction_old,  &ptp_header->correction, sizeof(correction_old));
+
+	/* write new correction value */
+	put_unaligned_be64(value, &ptp_header->correction);
+
+	/* locate udp header */
+	if (type & PTP_CLASS_VLAN)
+		ptr += VLAN_HLEN;
+
+	ptr += ETH_HLEN;
+
+	switch (type & PTP_CLASS_PMASK) {
+	case PTP_CLASS_IPV4:
+		ptr += ((struct iphdr *)ptr)->ihl << 2;
+		uhdr = (struct udphdr *)ptr;
+		break;
+	case PTP_CLASS_IPV6:
+		ptr += IP6_HLEN;
+		uhdr = (struct udphdr *)ptr;
+		break;
+	}
+
+	if (uhdr) {
+		/* update checksum */
+		uhdr->check = csum_fold(ksz9477_check_diff8(correction_old, ptp_header->correction,
+							    ~csum_unfold(uhdr->check)));
+		if (!uhdr->check)
+			uhdr->check = CSUM_MANGLED_0;
+	}
+}
+
+/* Time stamp tag is only inserted if PTP is enabled in hardware. */
+static void ksz9477_xmit_timestamp(struct sk_buff *skb)
+{
+	enum ksz9477_ptp_event_messages msg_type;
+	struct ptp_header *ptp_hdr;
+	unsigned int ptp_type;
+	u32 tstamp_raw = 0;
+	u64 correction;
+
+	/* For PDelay_Resp messages we will likely have a negative value in the
+	 * correction field (see ksz9477_rcv()). The switch hardware cannot
+	 * correctly update such values, so it must be moved to the time stamp
+	 * field in the tail tag.
+	 */
+
+	/* Check PTP message type */
+	ptp_type = ptp_classify_raw(skb);
+
+	if (ptp_type == PTP_CLASS_NONE)
+		goto out_put_tag;
+
+	ptp_hdr = ptp_parse_header(skb, ptp_type);
+	if (!ptp_hdr)
+		goto out_put_tag;
+
+	msg_type = ptp_get_msgtype(ptp_hdr, ptp_type);
+	if (msg_type != PTP_Event_Message_Pdelay_Resp)
+		goto out_put_tag;
+
+	correction = get_unaligned_be64(&ptp_hdr->correction);
+	if ((s64)correction < 0) {
+		struct timespec64 ts;
+
+		/* Move ingress time stamp from PTP header's correction field to tail tag. */
+		ts = ns_to_timespec64(-((s64)correction));
+		tstamp_raw = ((ts.tv_sec & 3) << 30) | ts.tv_nsec;
+		correction = 0;
+		ksz9477_update_ptp_correction_field(skb, ptp_type, ptp_hdr, correction);
+	}
+
+out_put_tag:
+	put_unaligned_be32(tstamp_raw, skb_put(skb, KSZ9477_PTP_TAG_LEN));
+}
+
+ktime_t ksz9477_tstamp_to_clock(struct ksz_device *ksz, u32 tstamp, int offset_ns)
+{
+	struct timespec64 ptp_clock_time;
+	/* Split up time stamp, 2 bit seconds, 30 bit nano seconds */
+	struct timespec64 ts = {
+		.tv_sec  = tstamp >> 30,
+		.tv_nsec = tstamp & (BIT_MASK(30) - 1),
+	};
+	struct timespec64 diff;
+	unsigned long flags;
+	s64 ns;
+
+	spin_lock_irqsave(&ksz->ptp_clock_lock, flags);
+	ptp_clock_time = ksz->ptp_clock_time;
+	spin_unlock_irqrestore(&ksz->ptp_clock_lock, flags);
+
+	/* calculate full time from partial time stamp */
+	ts.tv_sec = (ptp_clock_time.tv_sec & ~3) | ts.tv_sec;
+
+	/* find nearest possible point in time */
+	diff = timespec64_sub(ts, ptp_clock_time);
+	if (diff.tv_sec > 2)
+		ts.tv_sec -= 4;
+	else if (diff.tv_sec < -2)
+		ts.tv_sec += 4;
+
+	/* Add/remove excess delay between wire and time stamp unit */
+	ns = timespec64_to_ns(&ts) + offset_ns;
+
+	return ns_to_ktime(ns);
+}
+EXPORT_SYMBOL(ksz9477_tstamp_to_clock);
+
+static void ksz9477_rcv_timestamp(struct sk_buff *skb, u8 *tag,
+				  struct net_device *dev, unsigned int port)
+{
+	struct skb_shared_hwtstamps *hwtstamps = skb_hwtstamps(skb);
+	enum ksz9477_ptp_event_messages msg_type;
+	struct dsa_switch *ds = dev->dsa_ptr->ds;
+	struct ksz_device *ksz = ds->priv;
+	struct ksz_port *prt = &ksz->ports[port];
+	u8 *tstamp = tag - KSZ9477_PTP_TAG_LEN;
+	struct ptp_header *ptp_hdr;
+	unsigned int ptp_type;
+	u64 correction;
+
+	/* convert time stamp and write to skb */
+	memset(hwtstamps, 0, sizeof(*hwtstamps));
+	hwtstamps->hwtstamp = ksz9477_tstamp_to_clock(ksz, get_unaligned_be32(tstamp),
+						      -prt->tstamp_rx_latency_ns);
+
+	/* For PDelay_Req messages, user space (ptp4l) expects that the hardware
+	 * subtracts the egress time stamp from the correction field. The
+	 * separate hw time stamp from the sk_buff struct will not be used in
+	 * this case.
+	 */
+	if (skb_headroom(skb) < ETH_HLEN)
+		return;
+
+	__skb_push(skb, ETH_HLEN);
+	ptp_type = ptp_classify_raw(skb);
+	__skb_pull(skb, ETH_HLEN);
+
+	if (ptp_type == PTP_CLASS_NONE)
+		return;
+
+	ptp_hdr = ptp_parse_header(skb, ptp_type);
+	if (!ptp_hdr)
+		return;
+
+	msg_type = ptp_get_msgtype(ptp_hdr, ptp_type);
+	if (msg_type != PTP_Event_Message_Pdelay_Req)
+		return;
+
+	correction = get_unaligned_be64(&ptp_hdr->correction);
+	correction -= hwtstamps->hwtstamp;
+	ksz9477_update_ptp_correction_field(skb, ptp_type, ptp_hdr, correction);
+}
+#else   /* IS_ENABLED(CONFIG_NET_DSA_MICROCHIP_KSZ9477_PTP) */
+static void ksz9477_xmit_timestamp(struct sk_buff *skb __maybe_unused)
+{
+}
+
+static void ksz9477_rcv_timestamp(struct sk_buff *skb __maybe_unused, u8 *tag __maybe_unused,
+				  struct net_device *dev __maybe_unused,
+				  unsigned int port __maybe_unused)
+{
+}
+#endif  /* IS_ENABLED(CONFIG_NET_DSA_MICROCHIP_KSZ9477_PTP) */
+
 static struct sk_buff *ksz9477_xmit(struct sk_buff *skb,
 				    struct net_device *dev)
 {
@@ -116,6 +304,7 @@  static struct sk_buff *ksz9477_xmit(struct sk_buff *skb,
 	u16 val;
 
 	/* Tag encoding */
+	ksz9477_xmit_timestamp(skb);
 	tag = skb_put(skb, KSZ9477_INGRESS_TAG_LEN);
 	addr = skb_mac_header(skb);
 
@@ -138,8 +327,10 @@  static struct sk_buff *ksz9477_rcv(struct sk_buff *skb, struct net_device *dev,
 	unsigned int len = KSZ_EGRESS_TAG_LEN;
 
 	/* Extra 4-bytes PTP timestamp */
-	if (tag[0] & KSZ9477_PTP_TAG_INDICATION)
+	if (tag[0] & KSZ9477_PTP_TAG_INDICATION) {
+		ksz9477_rcv_timestamp(skb, tag, dev, port);
 		len += KSZ9477_PTP_TAG_LEN;
+	}
 
 	return ksz_common_rcv(skb, dev, port, len);
 }
@@ -149,7 +340,7 @@  static const struct dsa_device_ops ksz9477_netdev_ops = {
 	.proto	= DSA_TAG_PROTO_KSZ9477,
 	.xmit	= ksz9477_xmit,
 	.rcv	= ksz9477_rcv,
-	.overhead = KSZ9477_INGRESS_TAG_LEN,
+	.overhead = KSZ9477_INGRESS_TAG_LEN + KSZ9477_PTP_TAG_LEN,
 	.tail_tag = true,
 };
 
@@ -167,6 +358,7 @@  static struct sk_buff *ksz9893_xmit(struct sk_buff *skb,
 	u8 *tag;
 
 	/* Tag encoding */
+	ksz9477_xmit_timestamp(skb);
 	tag = skb_put(skb, KSZ_INGRESS_TAG_LEN);
 	addr = skb_mac_header(skb);
 
@@ -183,7 +375,7 @@  static const struct dsa_device_ops ksz9893_netdev_ops = {
 	.proto	= DSA_TAG_PROTO_KSZ9893,
 	.xmit	= ksz9893_xmit,
 	.rcv	= ksz9477_rcv,
-	.overhead = KSZ_INGRESS_TAG_LEN,
+	.overhead = KSZ_INGRESS_TAG_LEN + KSZ9477_PTP_TAG_LEN,
 	.tail_tag = true,
 };