mbox series

[net-next,v3,00/10] ptp: support virtual clocks and timestamping

Message ID 20210615094517.48752-1-yangbo.lu@nxp.com
Headers show
Series ptp: support virtual clocks and timestamping | expand

Message

Y.b. Lu June 15, 2021, 9:45 a.m. UTC
Current PTP driver exposes one PTP device to user which binds network
interface/interfaces to provide timestamping. Actually we have a way
utilizing timecounter/cyclecounter to virtualize any number of PTP
clocks based on a same free running physical clock for using.
The purpose of having multiple PTP virtual clocks is for user space
to directly/easily use them for multiple domains synchronization.

user
space:     ^                                  ^
           | SO_TIMESTAMPING new flag:        | Packets with
           | SOF_TIMESTAMPING_BIND_PHC        | TX/RX HW timestamps
           v                                  v
         +--------------------------------------------+
sock:    |     sock (new member sk_bind_phc)          |
         +--------------------------------------------+
           ^                                  ^
           | ethtool_op_get_phc_vclocks       | Convert HW timestamps
           |                                  | to sk_bind_phc
           v                                  v
         +--------------+--------------+--------------+
vclock:  | ptp1         | ptp2         | ptpN         |
         +--------------+--------------+--------------+
pclock:  |             ptp0 free running              |
         +--------------------------------------------+

The block diagram may explain how it works. Besides the PTP virtual
clocks, the packet HW timestamp converting to the bound PHC is also
done in sock driver. For user space, PTP virtual clocks can be
created via sysfs, and extended SO_TIMESTAMPING API (new flag
SOF_TIMESTAMPING_BIND_PHC) can be used to bind one PTP virtual clock
for timestamping.

The test tool timestamping.c (together with linuxptp phc_ctl tool) can
be used to verify:

  # echo 4 > /sys/class/ptp/ptp0/n_vclocks
  [  129.399472] ptp ptp0: new virtual clock ptp2
  [  129.404234] ptp ptp0: new virtual clock ptp3
  [  129.409532] ptp ptp0: new virtual clock ptp4
  [  129.413942] ptp ptp0: new virtual clock ptp5
  [  129.418257] ptp ptp0: guarantee physical clock free running
  #
  # phc_ctl /dev/ptp2 set 10000
  # phc_ctl /dev/ptp3 set 20000
  #
  # timestamping eno0 2 SOF_TIMESTAMPING_TX_HARDWARE SOF_TIMESTAMPING_RAW_HARDWARE SOF_TIMESTAMPING_BIND_PHC
  # timestamping eno0 2 SOF_TIMESTAMPING_RX_HARDWARE SOF_TIMESTAMPING_RAW_HARDWARE SOF_TIMESTAMPING_BIND_PHC
  # timestamping eno0 3 SOF_TIMESTAMPING_TX_HARDWARE SOF_TIMESTAMPING_RAW_HARDWARE SOF_TIMESTAMPING_BIND_PHC
  # timestamping eno0 3 SOF_TIMESTAMPING_RX_HARDWARE SOF_TIMESTAMPING_RAW_HARDWARE SOF_TIMESTAMPING_BIND_PHC

Changes for v2:
	- Converted to num_vclocks for creating virtual clocks.
	- Guranteed physical clock free running when using virtual
	  clocks.
	- Fixed build warning.
	- Updated copyright.
Changes for v3:
	- Supported PTP virtual clock in default in PTP driver.
	- Protected concurrency of ptp->num_vclocks accessing.
	- Supported PHC vclocks query via ethtool.
	- Extended SO_TIMESTAMPING API for PHC binding.
	- Converted HW timestamps to PHC bound, instead of previous
	  binding domain value to PHC idea.
	- Other minor fixes.

Yangbo Lu (10):
  ptp: add ptp virtual clock driver framework
  ptp: support ptp physical/virtual clocks conversion
  ptp: track available ptp vclocks information
  ptp: add kernel API ptp_get_vclocks_index()
  ethtool: add a new command for getting PHC virtual clocks
  ptp: add kernel API ptp_convert_timestamp()
  net: sock: extend SO_TIMESTAMPING for PHC binding
  net: socket: support hardware timestamp conversion to PHC bound
  selftests/net: timestamping: support binding PHC
  MAINTAINERS: add entry for PTP virtual clock driver

 Documentation/ABI/testing/sysfs-ptp        |  13 ++
 MAINTAINERS                                |   7 +
 drivers/ptp/Makefile                       |   2 +-
 drivers/ptp/ptp_clock.c                    |  27 ++-
 drivers/ptp/ptp_private.h                  |  34 ++++
 drivers/ptp/ptp_sysfs.c                    |  95 +++++++++
 drivers/ptp/ptp_vclock.c                   | 212 +++++++++++++++++++++
 include/linux/ethtool.h                    |   2 +
 include/linux/ptp_clock_kernel.h           |  29 ++-
 include/net/sock.h                         |   8 +-
 include/uapi/linux/ethtool.h               |  14 ++
 include/uapi/linux/ethtool_netlink.h       |  15 ++
 include/uapi/linux/net_tstamp.h            |  17 +-
 include/uapi/linux/ptp_clock.h             |   5 +
 net/core/sock.c                            |  65 ++++++-
 net/ethtool/Makefile                       |   2 +-
 net/ethtool/common.c                       |  24 +++
 net/ethtool/common.h                       |   2 +
 net/ethtool/ioctl.c                        |  27 +++
 net/ethtool/netlink.c                      |  10 +
 net/ethtool/netlink.h                      |   2 +
 net/ethtool/phc_vclocks.c                  |  86 +++++++++
 net/mptcp/sockopt.c                        |  10 +-
 net/socket.c                               |  19 +-
 tools/testing/selftests/net/timestamping.c |  62 ++++--
 25 files changed, 750 insertions(+), 39 deletions(-)
 create mode 100644 drivers/ptp/ptp_vclock.c
 create mode 100644 net/ethtool/phc_vclocks.c


base-commit: 89212e160b81e778f829b89743570665810e3b13

Comments

Jakub Kicinski June 15, 2021, 7:49 p.m. UTC | #1
On Tue, 15 Jun 2021 17:45:12 +0800 Yangbo Lu wrote:
> Add an interface for getting PHC (PTP Hardware Clock)
> virtual clocks, which are based on PHC physical clock
> providing hardware timestamp to network packets.
> 
> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>

> diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
> index cfef6b08169a..0fb04f945767 100644
> --- a/include/uapi/linux/ethtool.h
> +++ b/include/uapi/linux/ethtool.h
> @@ -17,6 +17,7 @@
>  #include <linux/const.h>
>  #include <linux/types.h>
>  #include <linux/if_ether.h>
> +#include <linux/ptp_clock.h>
>  
>  #ifndef __KERNEL__
>  #include <limits.h> /* for INT_MAX */
> @@ -1341,6 +1342,18 @@ struct ethtool_ts_info {
>  	__u32	rx_reserved[3];
>  };
>  
> +/**
> + * struct ethtool_phc_vclocks - holds a device's PTP virtual clocks
> + * @cmd: command number = %ETHTOOL_GET_PHC_VCLOCKS
> + * @num: number of PTP vclocks
> + * @index: all index values of PTP vclocks
> + */
> +struct ethtool_phc_vclocks {
> +	__u32	cmd;
> +	__u8	num;
> +	__s32	index[PTP_MAX_VCLOCKS];
> +};
> +
>  /*
>   * %ETHTOOL_SFEATURES changes features present in features[].valid to the
>   * values of corresponding bits in features[].requested. Bits in .requested
> @@ -1552,6 +1565,7 @@ enum ethtool_fec_config_bits {
>  #define ETHTOOL_PHY_STUNABLE	0x0000004f /* Set PHY tunable configuration */
>  #define ETHTOOL_GFECPARAM	0x00000050 /* Get FEC settings */
>  #define ETHTOOL_SFECPARAM	0x00000051 /* Set FEC settings */
> +#define ETHTOOL_GET_PHC_VCLOCKS	0x00000052 /* Get PHC virtual clocks info */

We don't add new IOCTL commands, only netlink API is going to be
extended. Please remove the IOCTL interface & uAPI.

>  /* compatibility with older code */
>  #define SPARC_ETH_GSET		ETHTOOL_GSET

> +/* PHC VCLOCKS */
> +
> +enum {
> +	ETHTOOL_A_PHC_VCLOCKS_UNSPEC,
> +	ETHTOOL_A_PHC_VCLOCKS_HEADER,			/* nest - _A_HEADER_* */
> +	ETHTOOL_A_PHC_VCLOCKS_NUM,			/* u8 */

u32, no need to limit yourself, the netlink attribute is rounded up to
4B anyway.

> +	ETHTOOL_A_PHC_VCLOCKS_INDEX,			/* s32 */

This is an array, AFAICT, not a single s32.

> +
> +	/* add new constants above here */
> +	__ETHTOOL_A_PHC_VCLOCKS_CNT,
> +	ETHTOOL_A_PHC_VCLOCKS_MAX = (__ETHTOOL_A_PHC_VCLOCKS_CNT - 1)
> +};
> +
>  /* CABLE TEST */
>  
>  enum {

> +static int phc_vclocks_fill_reply(struct sk_buff *skb,
> +				  const struct ethnl_req_info *req_base,
> +				  const struct ethnl_reply_data *reply_base)
> +{
> +	const struct phc_vclocks_reply_data *data =
> +		PHC_VCLOCKS_REPDATA(reply_base);
> +	const struct ethtool_phc_vclocks *phc_vclocks = &data->phc_vclocks;
> +
> +	if (phc_vclocks->num <= 0)
> +		return 0;
> +
> +	if (nla_put_u32(skb, ETHTOOL_A_PHC_VCLOCKS_NUM, phc_vclocks->num) ||
> +	    nla_put(skb, ETHTOOL_A_PHC_VCLOCKS_INDEX,
> +		    sizeof(phc_vclocks->index), phc_vclocks->index))

Looks like you'll report the whole array, why not just num?

> +		return -EMSGSIZE;
> +
> +	return 0;
> +}
> +
> +const struct ethnl_request_ops ethnl_phc_vclocks_request_ops = {
> +	.request_cmd		= ETHTOOL_MSG_PHC_VCLOCKS_GET,
> +	.reply_cmd		= ETHTOOL_MSG_PHC_VCLOCKS_GET_REPLY,
> +	.hdr_attr		= ETHTOOL_A_PHC_VCLOCKS_HEADER,
> +	.req_info_size		= sizeof(struct phc_vclocks_req_info),
> +	.reply_data_size	= sizeof(struct phc_vclocks_reply_data),
> +
> +	.prepare_data		= phc_vclocks_prepare_data,
> +	.reply_size		= phc_vclocks_reply_size,
> +	.fill_reply		= phc_vclocks_fill_reply,
> +};
Mat Martineau June 16, 2021, 1 a.m. UTC | #2
On Tue, 15 Jun 2021, Yangbo Lu wrote:

> Since PTP virtual clock support is added, there can be
> several PTP virtual clocks based on one PTP physical
> clock for timestamping.
>
> This patch is to extend SO_TIMESTAMPING API to support
> PHC (PTP Hardware Clock) binding by adding a new flag
> SOF_TIMESTAMPING_BIND_PHC. When PTP virtual clocks are
> in use, user space can configure to bind one for
> timestamping, but PTP physical clock is not supported
> and not needed to bind.
>
> This patch is preparation for timestamp conversion from
> raw timestamp to a specific PTP virtual clock time in
> core net.
>
> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
> ---
> Changes for v3:
> 	- Added this patch.
> ---
> include/net/sock.h              |  8 +++-
> include/uapi/linux/net_tstamp.h | 17 ++++++++-
> net/core/sock.c                 | 65 +++++++++++++++++++++++++++++++--
> net/ethtool/common.c            |  1 +
> net/mptcp/sockopt.c             | 10 +++--
> 5 files changed, 91 insertions(+), 10 deletions(-)
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 9b341c2c924f..adb6c0edc867 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -316,7 +316,9 @@ struct bpf_local_storage;
>   *	@sk_timer: sock cleanup timer
>   *	@sk_stamp: time stamp of last packet received
>   *	@sk_stamp_seq: lock for accessing sk_stamp on 32 bit architectures only
> -  *	@sk_tsflags: SO_TIMESTAMPING socket options
> +  *	@sk_tsflags: SO_TIMESTAMPING flags
> +  *	@sk_bind_phc: SO_TIMESTAMPING bind PHC index of PTP virtual clock
> +  *	              for timestamping
>   *	@sk_tskey: counter to disambiguate concurrent tstamp requests
>   *	@sk_zckey: counter to order MSG_ZEROCOPY notifications
>   *	@sk_socket: Identd and reporting IO signals
> @@ -493,6 +495,7 @@ struct sock {
> 	seqlock_t		sk_stamp_seq;
> #endif
> 	u16			sk_tsflags;
> +	int			sk_bind_phc;
> 	u8			sk_shutdown;
> 	u32			sk_tskey;
> 	atomic_t		sk_zckey;
> @@ -2744,7 +2747,8 @@ void sock_def_readable(struct sock *sk);
>
> int sock_bindtoindex(struct sock *sk, int ifindex, bool lock_sk);
> void sock_set_timestamp(struct sock *sk, int optname, bool valbool);
> -int sock_set_timestamping(struct sock *sk, int optname, int val);
> +int sock_set_timestamping(struct sock *sk, int optname, int val,
> +			  sockptr_t optval, unsigned int optlen);
>
> void sock_enable_timestamps(struct sock *sk);
> void sock_no_linger(struct sock *sk);
> diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
> index 7ed0b3d1c00a..9d4cde4ef7e6 100644
> --- a/include/uapi/linux/net_tstamp.h
> +++ b/include/uapi/linux/net_tstamp.h
> @@ -13,7 +13,7 @@
> #include <linux/types.h>
> #include <linux/socket.h>   /* for SO_TIMESTAMPING */
>
> -/* SO_TIMESTAMPING gets an integer bit field comprised of these values */
> +/* SO_TIMESTAMPING flags */
> enum {
> 	SOF_TIMESTAMPING_TX_HARDWARE = (1<<0),
> 	SOF_TIMESTAMPING_TX_SOFTWARE = (1<<1),
> @@ -30,8 +30,9 @@ enum {
> 	SOF_TIMESTAMPING_OPT_STATS = (1<<12),
> 	SOF_TIMESTAMPING_OPT_PKTINFO = (1<<13),
> 	SOF_TIMESTAMPING_OPT_TX_SWHW = (1<<14),
> +	SOF_TIMESTAMPING_BIND_PHC = (1<<15),
>
> -	SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_TX_SWHW,
> +	SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_BIND_PHC,
> 	SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) |
> 				 SOF_TIMESTAMPING_LAST
> };
> @@ -46,6 +47,18 @@ enum {
> 					 SOF_TIMESTAMPING_TX_SCHED | \
> 					 SOF_TIMESTAMPING_TX_ACK)
>
> +/**
> + * struct so_timestamping - SO_TIMESTAMPING parameter
> + *
> + * @flags:	SO_TIMESTAMPING flags
> + * @bind_phc:	Index of PTP virtual clock bound to sock. This is available
> + *		if flag SOF_TIMESTAMPING_BIND_PHC is set.
> + */
> +struct so_timestamping {
> +	int flags;
> +	int bind_phc;
> +};
> +
> /**
>  * struct hwtstamp_config - %SIOCGHWTSTAMP and %SIOCSHWTSTAMP parameter
>  *
> diff --git a/net/core/sock.c b/net/core/sock.c
> index ddfa88082a2b..70d2c2322429 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -139,6 +139,8 @@
> #include <net/tcp.h>
> #include <net/busy_poll.h>
>
> +#include <linux/ethtool.h>
> +
> static DEFINE_MUTEX(proto_list_mutex);
> static LIST_HEAD(proto_list);
>
> @@ -794,8 +796,53 @@ void sock_set_timestamp(struct sock *sk, int optname, bool valbool)
> 	}
> }
>
> -int sock_set_timestamping(struct sock *sk, int optname, int val)
> +static int sock_timestamping_bind_phc(struct sock *sk, int phc_index)
> {
> +	struct ethtool_phc_vclocks phc_vclocks = { };
> +	struct net *net = sock_net(sk);
> +	struct net_device *dev = NULL;
> +	bool match = false;
> +	int i;
> +
> +	if (sk->sk_bound_dev_if)
> +		dev = dev_get_by_index(net, sk->sk_bound_dev_if);
> +
> +	if (!dev) {
> +		pr_err("%s: sock not bind to device\n", __func__);
> +		return -EOPNOTSUPP;
> +	}
> +
> +	ethtool_op_get_phc_vclocks(dev, &phc_vclocks);
> +
> +	for (i = 0; i < phc_vclocks.num; i++) {
> +		if (phc_vclocks.index[i] == phc_index) {
> +			match = true;
> +			break;
> +		}
> +	}
> +
> +	if (!match)
> +		return -EINVAL;
> +
> +	sk->sk_bind_phc = phc_index;
> +
> +	return 0;
> +}
> +
> +int sock_set_timestamping(struct sock *sk, int optname, int val,
> +			  sockptr_t optval, unsigned int optlen)
> +{
> +	struct so_timestamping timestamping;
> +	int ret;
> +
> +	if (optlen == sizeof(timestamping)) {
> +		if (copy_from_sockptr(&timestamping, optval,
> +				      sizeof(timestamping)))
> +			return -EFAULT;
> +
> +		val = timestamping.flags;
> +	}
> +
> 	if (val & ~SOF_TIMESTAMPING_MASK)
> 		return -EINVAL;
>
> @@ -816,6 +863,15 @@ int sock_set_timestamping(struct sock *sk, int optname, int val)
> 	    !(val & SOF_TIMESTAMPING_OPT_TSONLY))
> 		return -EINVAL;
>
> +	if (optlen == sizeof(timestamping) &&
> +	    val & SOF_TIMESTAMPING_BIND_PHC) {
> +		ret = sock_timestamping_bind_phc(sk, timestamping.bind_phc);
> +		if (ret)
> +			return ret;
> +	} else {
> +		sk->sk_bind_phc = -1;
> +	}
> +
> 	sk->sk_tsflags = val;
> 	sock_valbool_flag(sk, SOCK_TSTAMP_NEW, optname == SO_TIMESTAMPING_NEW);
>
> @@ -1057,7 +1113,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
>
> 	case SO_TIMESTAMPING_NEW:
> 	case SO_TIMESTAMPING_OLD:
> -		ret = sock_set_timestamping(sk, optname, val);
> +		ret = sock_set_timestamping(sk, optname, val, optval, optlen);
> 		break;
>
> 	case SO_RCVLOWAT:
> @@ -1332,6 +1388,7 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
> 		struct __kernel_old_timeval tm;
> 		struct  __kernel_sock_timeval stm;
> 		struct sock_txtime txtime;
> +		struct so_timestamping timestamping;
> 	} v;
>
> 	int lv = sizeof(int);
> @@ -1435,7 +1492,9 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
> 		break;
>
> 	case SO_TIMESTAMPING_OLD:
> -		v.val = sk->sk_tsflags;
> +		lv = sizeof(v.timestamping);
> +		v.timestamping.flags = sk->sk_tsflags;
> +		v.timestamping.bind_phc = sk->sk_bind_phc;
> 		break;
>
> 	case SO_RCVTIMEO_OLD:
> diff --git a/net/ethtool/common.c b/net/ethtool/common.c
> index 14035f2dc6d4..4bf1bd3a3db6 100644
> --- a/net/ethtool/common.c
> +++ b/net/ethtool/common.c
> @@ -398,6 +398,7 @@ const char sof_timestamping_names[][ETH_GSTRING_LEN] = {
> 	[const_ilog2(SOF_TIMESTAMPING_OPT_STATS)]    = "option-stats",
> 	[const_ilog2(SOF_TIMESTAMPING_OPT_PKTINFO)]  = "option-pktinfo",
> 	[const_ilog2(SOF_TIMESTAMPING_OPT_TX_SWHW)]  = "option-tx-swhw",
> +	[const_ilog2(SOF_TIMESTAMPING_BIND_PHC)]     = "bind-phc",
> };
> static_assert(ARRAY_SIZE(sof_timestamping_names) == __SOF_TIMESTAMPING_CNT);
>
> diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
> index 092d1f635d27..2ca90b230827 100644
> --- a/net/mptcp/sockopt.c
> +++ b/net/mptcp/sockopt.c
> @@ -140,7 +140,10 @@ static void mptcp_so_incoming_cpu(struct mptcp_sock *msk, int val)
> 	mptcp_sol_socket_sync_intval(msk, SO_INCOMING_CPU, val);
> }
>
> -static int mptcp_setsockopt_sol_socket_tstamp(struct mptcp_sock *msk, int optname, int val)
> +static int mptcp_setsockopt_sol_socket_tstamp(struct mptcp_sock *msk,
> +					      int optname, int val,
> +					      sockptr_t optval,
> +					      unsigned int optlen)
> {
> 	sockptr_t optval = KERNEL_SOCKPTR(&val);
> 	struct mptcp_subflow_context *subflow;
> @@ -166,7 +169,7 @@ static int mptcp_setsockopt_sol_socket_tstamp(struct mptcp_sock *msk, int optnam
> 			break;
> 		case SO_TIMESTAMPING_NEW:
> 		case SO_TIMESTAMPING_OLD:
> -			sock_set_timestamping(sk, optname, val);
> +			sock_set_timestamping(sk, optname, val, optval, optlen);

This is inside a loop, so in cases where optlen == sizeof(struct 
so_timestamping) this will end up re-copying the structure from userspace 
one extra time for each MPTCP subflow: once for the MPTCP socket, plus one 
time for each of the TCP subflows that are grouped under this MPTCP 
connection.

Given that the extra copies only happen when using the extended bind_phc 
option, it's not a huge cost. But sock_set_timestamping() was written to 
avoid the extra copies for 'int' sized options, and if that was worth the 
effort then the larger so_timestamping structure could be copied (once) 
before the sock_set_timestamping() call and passed in.

> 			break;
> 		}
>
> @@ -207,7 +210,8 @@ static int mptcp_setsockopt_sol_socket_int(struct mptcp_sock *msk, int optname,
> 	case SO_TIMESTAMPNS_NEW:
> 	case SO_TIMESTAMPING_OLD:
> 	case SO_TIMESTAMPING_NEW:
> -		return mptcp_setsockopt_sol_socket_tstamp(msk, optname, val);
> +		return mptcp_setsockopt_sol_socket_tstamp(msk, optname, val,
> +							  optval, optlen);

Rather than modifying mptcp_setsockopt_sol_socket_int(), I suggest adding 
a mptcp_setsockopt_sol_socket_timestamping() helper function that can 
handle the special copying for so_timestamping.

> 	}
>
> 	return -ENOPROTOOPT;
> -- 
> 2.25.1
>
>

--
Mat Martineau
Intel
Richard Cochran June 17, 2021, 5:32 p.m. UTC | #3
On Tue, Jun 15, 2021 at 05:45:08PM +0800, Yangbo Lu wrote:
> diff --git a/drivers/ptp/Makefile b/drivers/ptp/Makefile
> index 8673d1743faa..3c6a905760e2 100644
> --- a/drivers/ptp/Makefile
> +++ b/drivers/ptp/Makefile
> @@ -3,7 +3,7 @@
>  # Makefile for PTP 1588 clock support.
>  #
>  
> -ptp-y					:= ptp_clock.o ptp_chardev.o ptp_sysfs.o
> +ptp-y					:= ptp_clock.o ptp_vclock.o ptp_chardev.o ptp_sysfs.o

Nit: Please place ptp_vclock.o at the end of the list.

>  ptp_kvm-$(CONFIG_X86)			:= ptp_kvm_x86.o ptp_kvm_common.o
>  ptp_kvm-$(CONFIG_HAVE_ARM_SMCCC)	:= ptp_kvm_arm.o ptp_kvm_common.o
>  obj-$(CONFIG_PTP_1588_CLOCK)		+= ptp.o

> diff --git a/drivers/ptp/ptp_private.h b/drivers/ptp/ptp_private.h
> index 6b97155148f1..3f388d63904c 100644
> --- a/drivers/ptp/ptp_private.h
> +++ b/drivers/ptp/ptp_private.h
> @@ -48,6 +48,20 @@ struct ptp_clock {
>  	struct kthread_delayed_work aux_work;
>  };
>  
> +#define info_to_vclock(d) container_of((d), struct ptp_vclock, info)
> +#define cc_to_vclock(d) container_of((d), struct ptp_vclock, cc)
> +#define dw_to_vclock(d) container_of((d), struct ptp_vclock, refresh_work)
> +
> +struct ptp_vclock {
> +	struct ptp_clock *pclock;
> +	struct ptp_clock_info info;
> +	struct ptp_clock *clock;
> +	struct cyclecounter cc;
> +	struct timecounter tc;
> +	spinlock_t lock;	/* protects tc/cc */
> +	struct delayed_work refresh_work;

Can we please have a kthread worker here instead of work?

Experience shows that plain work can be delayed for a long, long time
on busy systems.

> +};
> +
>  /*
>   * The function queue_cnt() is safe for readers to call without
>   * holding q->lock. Readers use this function to verify that the queue
> @@ -89,4 +103,6 @@ extern const struct attribute_group *ptp_groups[];
>  int ptp_populate_pin_groups(struct ptp_clock *ptp);
>  void ptp_cleanup_pin_groups(struct ptp_clock *ptp);
>  
> +struct ptp_vclock *ptp_vclock_register(struct ptp_clock *pclock);
> +void ptp_vclock_unregister(struct ptp_vclock *vclock);
>  #endif

> diff --git a/drivers/ptp/ptp_vclock.c b/drivers/ptp/ptp_vclock.c
> new file mode 100644
> index 000000000000..b8f500677314
> --- /dev/null
> +++ b/drivers/ptp/ptp_vclock.c
> @@ -0,0 +1,154 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * PTP virtual clock driver
> + *
> + * Copyright 2021 NXP
> + */
> +#include <linux/slab.h>
> +#include "ptp_private.h"
> +
> +#define PTP_VCLOCK_CC_MULT		(1 << 31)
> +#define PTP_VCLOCK_CC_SHIFT		31

These two are okay, but ...

> +#define PTP_VCLOCK_CC_MULT_NUM		(1 << 9)
> +#define PTP_VCLOCK_CC_MULT_DEM		15625ULL

the similarity of naming is confusing for these two.  They are only
used in the .adjfine method.  How about this?

  PTP_VCLOCK_FADJ_NUMERATOR, or even PTP_VCLOCK_FADJ_SHIFT (see below)
  PTP_VCLOCK_FADJ_DENOMINATOR

> +#define PTP_VCLOCK_CC_REFRESH_INTERVAL	(HZ * 2)

Consider dropping CC from the name.
PTP_VCLOCK_REFRESH_INTERVAL sounds good to me.

> +static int ptp_vclock_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
> +{
> +	struct ptp_vclock *vclock = info_to_vclock(ptp);
> +	unsigned long flags;
> +	s64 adj;
> +
> +	adj = (s64)scaled_ppm * PTP_VCLOCK_CC_MULT_NUM;

Rather than

    scaled_ppm * (1 << 9)

I suggest

    scaled_ppm << 9

instead.  I suppose a good compiler would replace the multiplication
with a bit shift, but it never hurts to spell it out.

> +	adj = div_s64(adj, PTP_VCLOCK_CC_MULT_DEM);
> +
> +	spin_lock_irqsave(&vclock->lock, flags);
> +	timecounter_read(&vclock->tc);
> +	vclock->cc.mult = PTP_VCLOCK_CC_MULT + adj;
> +	spin_unlock_irqrestore(&vclock->lock, flags);
> +
> +	return 0;
> +}

Thanks,
Richard
Y.b. Lu June 22, 2021, 10:10 a.m. UTC | #4
Hi Jakub,

> -----Original Message-----

> From: Jakub Kicinski <kuba@kernel.org>

> Sent: 2021年6月16日 3:49

> To: Y.b. Lu <yangbo.lu@nxp.com>

> Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org;

> linux-kselftest@vger.kernel.org; mptcp@lists.linux.dev; Richard Cochran

> <richardcochran@gmail.com>; David S . Miller <davem@davemloft.net>; Mat

> Martineau <mathew.j.martineau@linux.intel.com>; Matthieu Baerts

> <matthieu.baerts@tessares.net>; Shuah Khan <shuah@kernel.org>; Michal

> Kubecek <mkubecek@suse.cz>; Florian Fainelli <f.fainelli@gmail.com>;

> Andrew Lunn <andrew@lunn.ch>; Rui Sousa <rui.sousa@nxp.com>; Sebastien

> Laveze <sebastien.laveze@nxp.com>

> Subject: Re: [net-next, v3, 05/10] ethtool: add a new command for getting PHC

> virtual clocks

> 

> On Tue, 15 Jun 2021 17:45:12 +0800 Yangbo Lu wrote:

> > Add an interface for getting PHC (PTP Hardware Clock) virtual clocks,

> > which are based on PHC physical clock providing hardware timestamp to

> > network packets.

> >

> > Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>

> 

> > diff --git a/include/uapi/linux/ethtool.h

> > b/include/uapi/linux/ethtool.h index cfef6b08169a..0fb04f945767 100644

> > --- a/include/uapi/linux/ethtool.h

> > +++ b/include/uapi/linux/ethtool.h

> > @@ -17,6 +17,7 @@

> >  #include <linux/const.h>

> >  #include <linux/types.h>

> >  #include <linux/if_ether.h>

> > +#include <linux/ptp_clock.h>

> >

> >  #ifndef __KERNEL__

> >  #include <limits.h> /* for INT_MAX */ @@ -1341,6 +1342,18 @@ struct

> > ethtool_ts_info {

> >  	__u32	rx_reserved[3];

> >  };

> >

> > +/**

> > + * struct ethtool_phc_vclocks - holds a device's PTP virtual clocks

> > + * @cmd: command number = %ETHTOOL_GET_PHC_VCLOCKS

> > + * @num: number of PTP vclocks

> > + * @index: all index values of PTP vclocks  */ struct

> > +ethtool_phc_vclocks {

> > +	__u32	cmd;

> > +	__u8	num;

> > +	__s32	index[PTP_MAX_VCLOCKS];

> > +};

> > +

> >  /*

> >   * %ETHTOOL_SFEATURES changes features present in features[].valid to

> the

> >   * values of corresponding bits in features[].requested. Bits in

> > .requested @@ -1552,6 +1565,7 @@ enum ethtool_fec_config_bits {

> >  #define ETHTOOL_PHY_STUNABLE	0x0000004f /* Set PHY tunable

> configuration */

> >  #define ETHTOOL_GFECPARAM	0x00000050 /* Get FEC settings */

> >  #define ETHTOOL_SFECPARAM	0x00000051 /* Set FEC settings */

> > +#define ETHTOOL_GET_PHC_VCLOCKS	0x00000052 /* Get PHC virtual

> clocks info */

> 

> We don't add new IOCTL commands, only netlink API is going to be extended.

> Please remove the IOCTL interface & uAPI.


Will remove. Thanks.

> 

> >  /* compatibility with older code */

> >  #define SPARC_ETH_GSET		ETHTOOL_GSET

> 

> > +/* PHC VCLOCKS */

> > +

> > +enum {

> > +	ETHTOOL_A_PHC_VCLOCKS_UNSPEC,

> > +	ETHTOOL_A_PHC_VCLOCKS_HEADER,			/* nest - _A_HEADER_*

> */

> > +	ETHTOOL_A_PHC_VCLOCKS_NUM,			/* u8 */

> 

> u32, no need to limit yourself, the netlink attribute is rounded up to 4B

> anyway.


Get it. Will use u32.

> 

> > +	ETHTOOL_A_PHC_VCLOCKS_INDEX,			/* s32 */

> 

> This is an array, AFAICT, not a single s32.


Will fix. Thanks.

> 

> > +

> > +	/* add new constants above here */

> > +	__ETHTOOL_A_PHC_VCLOCKS_CNT,

> > +	ETHTOOL_A_PHC_VCLOCKS_MAX = (__ETHTOOL_A_PHC_VCLOCKS_CNT -

> 1) };

> > +

> >  /* CABLE TEST */

> >

> >  enum {

> 

> > +static int phc_vclocks_fill_reply(struct sk_buff *skb,

> > +				  const struct ethnl_req_info *req_base,

> > +				  const struct ethnl_reply_data *reply_base) {

> > +	const struct phc_vclocks_reply_data *data =

> > +		PHC_VCLOCKS_REPDATA(reply_base);

> > +	const struct ethtool_phc_vclocks *phc_vclocks = &data->phc_vclocks;

> > +

> > +	if (phc_vclocks->num <= 0)

> > +		return 0;

> > +

> > +	if (nla_put_u32(skb, ETHTOOL_A_PHC_VCLOCKS_NUM, phc_vclocks->num)

> ||

> > +	    nla_put(skb, ETHTOOL_A_PHC_VCLOCKS_INDEX,

> > +		    sizeof(phc_vclocks->index), phc_vclocks->index))

> 

> Looks like you'll report the whole array, why not just num?


Will report just num. Thanks.

> 

> > +		return -EMSGSIZE;

> > +

> > +	return 0;

> > +}

> > +

> > +const struct ethnl_request_ops ethnl_phc_vclocks_request_ops = {

> > +	.request_cmd		= ETHTOOL_MSG_PHC_VCLOCKS_GET,

> > +	.reply_cmd		= ETHTOOL_MSG_PHC_VCLOCKS_GET_REPLY,

> > +	.hdr_attr		= ETHTOOL_A_PHC_VCLOCKS_HEADER,

> > +	.req_info_size		= sizeof(struct phc_vclocks_req_info),

> > +	.reply_data_size	= sizeof(struct phc_vclocks_reply_data),

> > +

> > +	.prepare_data		= phc_vclocks_prepare_data,

> > +	.reply_size		= phc_vclocks_reply_size,

> > +	.fill_reply		= phc_vclocks_fill_reply,

> > +};
Y.b. Lu June 22, 2021, 10:10 a.m. UTC | #5
Hi Michal,

> -----Original Message-----

> From: Michal Kubecek <mkubecek@suse.cz>

> Sent: 2021年6月16日 7:25

> To: Y.b. Lu <yangbo.lu@nxp.com>

> Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org;

> linux-kselftest@vger.kernel.org; mptcp@lists.linux.dev; Richard Cochran

> <richardcochran@gmail.com>; David S . Miller <davem@davemloft.net>;

> Jakub Kicinski <kuba@kernel.org>; Mat Martineau

> <mathew.j.martineau@linux.intel.com>; Matthieu Baerts

> <matthieu.baerts@tessares.net>; Shuah Khan <shuah@kernel.org>; Florian

> Fainelli <f.fainelli@gmail.com>; Andrew Lunn <andrew@lunn.ch>; Rui Sousa

> <rui.sousa@nxp.com>; Sebastien Laveze <sebastien.laveze@nxp.com>

> Subject: Re: [net-next, v3, 05/10] ethtool: add a new command for getting PHC

> virtual clocks

> 

> On Tue, Jun 15, 2021 at 05:45:12PM +0800, Yangbo Lu wrote:

> > Add an interface for getting PHC (PTP Hardware Clock) virtual clocks,

> > which are based on PHC physical clock providing hardware timestamp to

> > network packets.

> >

> > Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>

> > ---

> > Changes for v3:

> > 	- Added this patch.

> > ---

> >  include/linux/ethtool.h              |  2 +

> >  include/uapi/linux/ethtool.h         | 14 +++++

> >  include/uapi/linux/ethtool_netlink.h | 15 +++++

> >  net/ethtool/Makefile                 |  2 +-

> >  net/ethtool/common.c                 | 23 ++++++++

> >  net/ethtool/common.h                 |  2 +

> >  net/ethtool/ioctl.c                  | 27 +++++++++

> >  net/ethtool/netlink.c                | 10 ++++

> >  net/ethtool/netlink.h                |  2 +

> >  net/ethtool/phc_vclocks.c            | 86

> ++++++++++++++++++++++++++++

> >  10 files changed, 182 insertions(+), 1 deletion(-)  create mode

> > 100644 net/ethtool/phc_vclocks.c

> 

> When updating the ethtool netlink API, please update also its documentation

> in Documentation/networking/ethtool-netlink.rst


Will update doc. Thank you.

> 

> Michal
Y.b. Lu June 22, 2021, 10:14 a.m. UTC | #6
Hi Mat,

> -----Original Message-----

> From: Mat Martineau <mathew.j.martineau@linux.intel.com>

> Sent: 2021年6月16日 9:01

> To: Y.b. Lu <yangbo.lu@nxp.com>

> Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org;

> linux-kselftest@vger.kernel.org; mptcp@lists.linux.dev; Richard Cochran

> <richardcochran@gmail.com>; David S . Miller <davem@davemloft.net>;

> Jakub Kicinski <kuba@kernel.org>; Matthieu Baerts

> <matthieu.baerts@tessares.net>; Shuah Khan <shuah@kernel.org>; Michal

> Kubecek <mkubecek@suse.cz>; Florian Fainelli <f.fainelli@gmail.com>;

> Andrew Lunn <andrew@lunn.ch>; Rui Sousa <rui.sousa@nxp.com>; Sebastien

> Laveze <sebastien.laveze@nxp.com>; Florian Westphal <fw@strlen.de>

> Subject: Re: [net-next, v3, 07/10] net: sock: extend SO_TIMESTAMPING for

> PHC binding

> 

> On Tue, 15 Jun 2021, Yangbo Lu wrote:

> 

> > Since PTP virtual clock support is added, there can be several PTP

> > virtual clocks based on one PTP physical clock for timestamping.

> >

> > This patch is to extend SO_TIMESTAMPING API to support PHC (PTP

> > Hardware Clock) binding by adding a new flag

> > SOF_TIMESTAMPING_BIND_PHC. When PTP virtual clocks are in use, user

> > space can configure to bind one for timestamping, but PTP physical

> > clock is not supported and not needed to bind.

> >

> > This patch is preparation for timestamp conversion from raw timestamp

> > to a specific PTP virtual clock time in core net.

> >

> > Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>

> > ---

> > Changes for v3:

> > 	- Added this patch.

> > ---

> > include/net/sock.h              |  8 +++-

> > include/uapi/linux/net_tstamp.h | 17 ++++++++-

> > net/core/sock.c                 | 65

> +++++++++++++++++++++++++++++++--

> > net/ethtool/common.c            |  1 +

> > net/mptcp/sockopt.c             | 10 +++--

> > 5 files changed, 91 insertions(+), 10 deletions(-)

> >

[...]
> > @@ -166,7 +169,7 @@ static int

> mptcp_setsockopt_sol_socket_tstamp(struct mptcp_sock *msk, int optnam

> > 			break;

> > 		case SO_TIMESTAMPING_NEW:

> > 		case SO_TIMESTAMPING_OLD:

> > -			sock_set_timestamping(sk, optname, val);

> > +			sock_set_timestamping(sk, optname, val, optval, optlen);

> 

> This is inside a loop, so in cases where optlen == sizeof(struct

> so_timestamping) this will end up re-copying the structure from userspace

> one extra time for each MPTCP subflow: once for the MPTCP socket, plus one

> time for each of the TCP subflows that are grouped under this MPTCP

> connection.

> 

> Given that the extra copies only happen when using the extended bind_phc

> option, it's not a huge cost. But sock_set_timestamping() was written to

> avoid the extra copies for 'int' sized options, and if that was worth the

> effort then the larger so_timestamping structure could be copied (once)

> before the sock_set_timestamping() call and passed in.


I see now...
Let me pass so_timestamping structure in to avoid re-copying from userspace.

> 

> > 			break;

> > 		}

> >

> > @@ -207,7 +210,8 @@ static int mptcp_setsockopt_sol_socket_int(struct

> mptcp_sock *msk, int optname,

> > 	case SO_TIMESTAMPNS_NEW:

> > 	case SO_TIMESTAMPING_OLD:

> > 	case SO_TIMESTAMPING_NEW:

> > -		return mptcp_setsockopt_sol_socket_tstamp(msk, optname, val);

> > +		return mptcp_setsockopt_sol_socket_tstamp(msk, optname, val,

> > +							  optval, optlen);

> 

> Rather than modifying mptcp_setsockopt_sol_socket_int(), I suggest adding

> a mptcp_setsockopt_sol_socket_timestamping() helper function that can

> handle the special copying for so_timestamping.


Thank you. Let me do this in next version.

> 

> > 	}

> >

> > 	return -ENOPROTOOPT;

> > --

> > 2.25.1

> >

> >

> 

> --

> Mat Martineau

> Intel
Y.b. Lu June 22, 2021, 10:17 a.m. UTC | #7
Hi Richard,

> -----Original Message-----

> From: Richard Cochran <richardcochran@gmail.com>

> Sent: 2021年6月18日 1:33

> To: Y.b. Lu <yangbo.lu@nxp.com>

> Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org;

> linux-kselftest@vger.kernel.org; mptcp@lists.linux.dev; David S . Miller

> <davem@davemloft.net>; Jakub Kicinski <kuba@kernel.org>; Mat Martineau

> <mathew.j.martineau@linux.intel.com>; Matthieu Baerts

> <matthieu.baerts@tessares.net>; Shuah Khan <shuah@kernel.org>; Michal

> Kubecek <mkubecek@suse.cz>; Florian Fainelli <f.fainelli@gmail.com>;

> Andrew Lunn <andrew@lunn.ch>; Rui Sousa <rui.sousa@nxp.com>; Sebastien

> Laveze <sebastien.laveze@nxp.com>

> Subject: Re: [net-next, v3, 01/10] ptp: add ptp virtual clock driver framework

> 

> On Tue, Jun 15, 2021 at 05:45:08PM +0800, Yangbo Lu wrote:

> > diff --git a/drivers/ptp/Makefile b/drivers/ptp/Makefile index

> > 8673d1743faa..3c6a905760e2 100644

> > --- a/drivers/ptp/Makefile

> > +++ b/drivers/ptp/Makefile

> > @@ -3,7 +3,7 @@

> >  # Makefile for PTP 1588 clock support.

> >  #

> >

> > -ptp-y					:= ptp_clock.o ptp_chardev.o ptp_sysfs.o

> > +ptp-y					:= ptp_clock.o ptp_vclock.o ptp_chardev.o

> ptp_sysfs.o

> 

> Nit: Please place ptp_vclock.o at the end of the list.


Ok.

> 

> >  ptp_kvm-$(CONFIG_X86)			:= ptp_kvm_x86.o ptp_kvm_common.o

> >  ptp_kvm-$(CONFIG_HAVE_ARM_SMCCC)	:= ptp_kvm_arm.o

> ptp_kvm_common.o

> >  obj-$(CONFIG_PTP_1588_CLOCK)		+= ptp.o

> 

> > diff --git a/drivers/ptp/ptp_private.h b/drivers/ptp/ptp_private.h

> > index 6b97155148f1..3f388d63904c 100644

> > --- a/drivers/ptp/ptp_private.h

> > +++ b/drivers/ptp/ptp_private.h

> > @@ -48,6 +48,20 @@ struct ptp_clock {

> >  	struct kthread_delayed_work aux_work;  };

> >

> > +#define info_to_vclock(d) container_of((d), struct ptp_vclock, info)

> > +#define cc_to_vclock(d) container_of((d), struct ptp_vclock, cc)

> > +#define dw_to_vclock(d) container_of((d), struct ptp_vclock,

> > +refresh_work)

> > +

> > +struct ptp_vclock {

> > +	struct ptp_clock *pclock;

> > +	struct ptp_clock_info info;

> > +	struct ptp_clock *clock;

> > +	struct cyclecounter cc;

> > +	struct timecounter tc;

> > +	spinlock_t lock;	/* protects tc/cc */

> > +	struct delayed_work refresh_work;

> 

> Can we please have a kthread worker here instead of work?

> 

> Experience shows that plain work can be delayed for a long, long time on busy

> systems.

> 


I think do_aux_work callback could be utilized for ptp virtual clock, right?

> > +};

> > +

> >  /*

> >   * The function queue_cnt() is safe for readers to call without

> >   * holding q->lock. Readers use this function to verify that the

> > queue @@ -89,4 +103,6 @@ extern const struct attribute_group

> > *ptp_groups[];  int ptp_populate_pin_groups(struct ptp_clock *ptp);

> > void ptp_cleanup_pin_groups(struct ptp_clock *ptp);

> >

> > +struct ptp_vclock *ptp_vclock_register(struct ptp_clock *pclock);

> > +void ptp_vclock_unregister(struct ptp_vclock *vclock);

> >  #endif

> 

> > diff --git a/drivers/ptp/ptp_vclock.c b/drivers/ptp/ptp_vclock.c new

> > file mode 100644 index 000000000000..b8f500677314

> > --- /dev/null

> > +++ b/drivers/ptp/ptp_vclock.c

> > @@ -0,0 +1,154 @@

> > +// SPDX-License-Identifier: GPL-2.0-or-later

> > +/*

> > + * PTP virtual clock driver

> > + *

> > + * Copyright 2021 NXP

> > + */

> > +#include <linux/slab.h>

> > +#include "ptp_private.h"

> > +

> > +#define PTP_VCLOCK_CC_MULT		(1 << 31)

> > +#define PTP_VCLOCK_CC_SHIFT		31

> 

> These two are okay, but ...

> 

> > +#define PTP_VCLOCK_CC_MULT_NUM		(1 << 9)

> > +#define PTP_VCLOCK_CC_MULT_DEM		15625ULL

> 

> the similarity of naming is confusing for these two.  They are only used in

> the .adjfine method.  How about this?

> 

>   PTP_VCLOCK_FADJ_NUMERATOR, or even PTP_VCLOCK_FADJ_SHIFT (see

> below)

>   PTP_VCLOCK_FADJ_DENOMINATOR

> 

> > +#define PTP_VCLOCK_CC_REFRESH_INTERVAL	(HZ * 2)

> 

> Consider dropping CC from the name.

> PTP_VCLOCK_REFRESH_INTERVAL sounds good to me.

> 


Thanks. Will rename the MACROs per your suggestion.

> > +static int ptp_vclock_adjfine(struct ptp_clock_info *ptp, long

> > +scaled_ppm) {

> > +	struct ptp_vclock *vclock = info_to_vclock(ptp);

> > +	unsigned long flags;

> > +	s64 adj;

> > +

> > +	adj = (s64)scaled_ppm * PTP_VCLOCK_CC_MULT_NUM;

> 

> Rather than

> 

>     scaled_ppm * (1 << 9)

> 

> I suggest

> 

>     scaled_ppm << 9

> 

> instead.  I suppose a good compiler would replace the multiplication with a

> bit shift, but it never hurts to spell it out.


Ok.

> 

> > +	adj = div_s64(adj, PTP_VCLOCK_CC_MULT_DEM);

> > +

> > +	spin_lock_irqsave(&vclock->lock, flags);

> > +	timecounter_read(&vclock->tc);

> > +	vclock->cc.mult = PTP_VCLOCK_CC_MULT + adj;

> > +	spin_unlock_irqrestore(&vclock->lock, flags);

> > +

> > +	return 0;

> > +}

> 

> Thanks,

> Richard