[v3,00/11] Tegra234 Memory interconnect support

Message ID: 20230320182441.11904-1-sumitg@nvidia.com

Message

Sumit Gupta March 20, 2023, 6:24 p.m. UTC
Hi Krzysztof,

Thank you for the ACK on the memory patches [2-5 & 8].
I have rebased the patch series on the latest linux-next and added two
more 'memory' patches in v3. Please review and ACK if they look fine.
  [Patch v3 10/11] memory: tegra: handle no BWMGR MRQ support in BPMP
  [Patch v3 11/11] memory: tegra186-emc: fix interconnect registration


Hi All,
Requesting ACKs on the below remaining patches; please consider them
for merging in v6.4.

- Thierry:
   "Memory Interconnect base support" patches are dependent on the bpmp patch.
   [Patch v3 1/9] firmware: tegra: add function to get BPMP data

- Rafael & Viresh: For the CPUFREQ MC Client patches.  
   [Patch v3 06/11] arm64: tegra: Add cpu OPP tables and interconnects property
   [Patch v3 07/11] cpufreq: tegra194: add OPP support and set bandwidth

- Lorenzo, Bjorn & Krzysztof Wilczyński: For the PCIe MC client patch.
   [Patch v3 09/11] PCI: tegra194: add interconnect support in Tegra234

Thank you,
Sumit Gupta
============

This patch series adds memory interconnect support for the Tegra234 SoC.
It is used to dynamically scale the DRAM frequency based on bandwidth
requests from different Memory Controller (MC) clients.
MC clients use the ICC framework's icc_set_bw() API to dynamically
request DRAM bandwidth (BW). The request is routed along the ICC path
from the MC to the EMC driver. The MC driver passes the request info,
such as the client ID, type, and requested bandwidth, to the BPMP-FW,
which sets the final DRAM frequency considering all existing requests.

MC and EMC are the ICC providers. The nodes in the path for a request are:
     Client[1-n] -> MC -> EMC -> EMEM/DRAM
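
For illustration, a typical MC client would resolve its ICC path from
the DT "interconnects" property and request bandwidth roughly as in the
minimal sketch below. This is not code from the series; the path name
"dma-mem" and the bandwidth values are hypothetical.

  #include <linux/err.h>
  #include <linux/interconnect.h>

  /* Sketch: an MC client requesting DRAM bandwidth via the ICC
   * framework. Path name and bandwidth values are hypothetical. */
  static int example_request_dram_bw(struct device *dev)
  {
          struct icc_path *path;

          /* Resolve the client's path to DRAM from DT. */
          path = devm_of_icc_get(dev, "dma-mem");
          if (IS_ERR(path))
                  return PTR_ERR(path);

          /* Request 1 GB/s average and peak bandwidth. */
          return icc_set_bw(path, MBps_to_icc(1000), MBps_to_icc(1000));
  }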

The patch series also adds interconnect support to the below client
drivers (a sketch of the cpufreq flow follows this list):
1) CPUFREQ driver, for scaling bandwidth with CPU frequency. For that,
   per-cluster OPP tables are added; the CPUFREQ driver uses them to
   request the minimum BW corresponding to the given CPU frequency from
   the OPP table of the given cluster.
2) PCIe driver, to request the BW required for different operating modes.
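
A rough sketch of that cpufreq flow: the driver looks up the OPP for
the target frequency and applies it, and the OPP framework then sets
the bandwidth associated with that OPP on the CPU's interconnect path.
This is a simplified illustration, not the exact code from 'patch 7';
the function name is made up.

  #include <linux/err.h>
  #include <linux/pm_opp.h>

  static int example_set_cpu_opp(struct device *cpu_dev, unsigned long freq)
  {
          struct dev_pm_opp *opp;
          int err;

          /* Find the nearest OPP at or above the target frequency. */
          opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq);
          if (IS_ERR(opp))
                  return PTR_ERR(opp);

          /* Applying the OPP also requests the bandwidth tied to it
           * (opp-peak-kBps) on the CPU's interconnect path. */
          err = dev_pm_opp_set_opp(cpu_dev, opp);
          dev_pm_opp_put(opp);

          return err;
  }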

---
v2[2] -> v3:
- in 'patch 7', set 'icc_dram_bw_scaling' to false if the set_opp call
  fails, to avoid flooding the UART with 'Failed to set bw' messages.
- added 'patch 10' to handle the case where the BPMP-FW is old and does
  not support the BWMGR MRQ (a sketch of the check follows this list).
- added 'patch 11' to fix the interconnect registration race in
  tegra186-emc.
  Reference patch link in linux-next:
  [https://lore.kernel.org/all/20230306075651.2449-21-johan+linaro@kernel.org/]
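
The 'patch 10' check can be sketched as below. tegra_bpmp_mrq_is_supported()
is an existing BPMP API; the wrapper function and the way the BPMP handle
is obtained are illustrative, and MRQ_BWMGR_INT is assumed to come from
the BPMP ABI header.

  #include <soc/tegra/bpmp.h>
  #include <soc/tegra/bpmp-abi.h>

  /* Skip interconnect scaling when the BPMP firmware is too old to
   * implement the bandwidth manager MRQ. */
  static bool example_bwmgr_supported(struct tegra_bpmp *bpmp)
  {
          return tegra_bpmp_mrq_is_supported(bpmp, MRQ_BWMGR_INT);
  }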

v1[1] -> v2:
- moved BW setting to tegra234_mc_icc_set() from EMC driver.
- moved sw clients to the 'tegra_mc_clients' table.
- point 'node->data' to the entry within 'tegra_mc_clients'.
- removed 'struct tegra_icc_node' and get client info using 'node->data'.
- changed error handling in and around tegra_emc_interconnect_init().
- moved 'tegra-icc.h' from 'include/soc/tegra' to 'include/linux'.
- added interconnect support to PCIE driver in 'Patch 9'.
- merged 'Patch 9 & 10' from [1] to get num_channels and use.
- merged 'Patch 2 & 3' from [1] to add ISO and NISO clients.
- added 'Acked-by' of Krzysztof from 'Patch 05/10' of [1].
- removed 'Patch 7' from [1] as it is merged now.

Sumit Gupta (11):
  firmware: tegra: add function to get BPMP data
  memory: tegra: add interconnect support for DRAM scaling in Tegra234
  memory: tegra: add mc clients for Tegra234
  memory: tegra: add software mc clients in Tegra234
  dt-bindings: tegra: add icc ids for dummy MC clients
  arm64: tegra: Add cpu OPP tables and interconnects property
  cpufreq: tegra194: add OPP support and set bandwidth
  memory: tegra: make cpu cluster bw request a multiple of mc channels
  PCI: tegra194: add interconnect support in Tegra234
  memory: tegra: handle no BWMGR MRQ support in BPMP
  memory: tegra186-emc: fix interconnect registration race

 arch/arm64/boot/dts/nvidia/tegra234.dtsi   | 276 ++++++++++
 drivers/cpufreq/tegra194-cpufreq.c         | 156 +++++-
 drivers/firmware/tegra/bpmp.c              |  38 ++
 drivers/memory/tegra/mc.c                  |  24 +
 drivers/memory/tegra/mc.h                  |   1 +
 drivers/memory/tegra/tegra186-emc.c        | 118 ++++
 drivers/memory/tegra/tegra234.c            | 599 ++++++++++++++++++++-
 drivers/pci/controller/dwc/pcie-tegra194.c |  40 +-
 include/dt-bindings/memory/tegra234-mc.h   |   5 +
 include/linux/tegra-icc.h                  |  65 +++
 include/soc/tegra/bpmp.h                   |   5 +
 include/soc/tegra/mc.h                     |   8 +-
 12 files changed, 1312 insertions(+), 23 deletions(-)
 create mode 100644 include/linux/tegra-icc.h

[1] https://lore.kernel.org/lkml/20221220160240.27494-1-sumitg@nvidia.com/T/
[2] https://lore.kernel.org/linux-tegra/20230220140559.28289-1-sumitg@nvidia.com/

Comments

Thierry Reding March 23, 2023, 10:14 a.m. UTC
On Mon, Mar 20, 2023 at 11:54:32PM +0530, Sumit Gupta wrote:
[...]
> diff --git a/drivers/memory/tegra/tegra234.c b/drivers/memory/tegra/tegra234.c
[...]
> +static int tegra234_mc_icc_set(struct icc_node *src, struct icc_node *dst)
> +{
> +	struct tegra_mc *mc = icc_provider_to_tegra_mc(dst->provider);
> +	struct mrq_bwmgr_int_request bwmgr_req = { 0 };
> +	struct mrq_bwmgr_int_response bwmgr_resp = { 0 };
> +	const struct tegra_mc_client *pclient = src->data;
> +	struct tegra_bpmp_message msg;
> +	struct tegra_bpmp *bpmp;
> +	int ret;
> +
> +	/*
> +	 * Same Src and Dst node will happen during boot from icc_node_add().
> +	 * This can be used to pre-initialize and set bandwidth for all clients
> +	 * before their drivers are loaded. We are skipping this case as for us,
> +	 * the pre-initialization already happened in Bootloader(MB2) and BPMP-FW.
> +	 */
> +	if (src->id == dst->id)
> +		return 0;
> +
> +	bpmp = of_tegra_bpmp_get();
> +	if (IS_ERR(bpmp)) {
> +		ret = PTR_ERR(bpmp);
> +		return ret;
> +	}

Irrespective of whether we end up doing the BPMP lookup via
tegra_bpmp_get() or of_tegra_bpmp_get(), I think we should resolve it at
probe time and cache the result, since this function can get called
multiple times and the lookup is a rather heavy operation.

Thierry
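
A minimal sketch of the approach Thierry suggests: resolve the BPMP once
at probe and cache it, so tegra234_mc_icc_set() does not repeat the heavy
lookup. The 'bpmp' member on struct tegra_mc is hypothetical here.

  #include <linux/err.h>
  #include <soc/tegra/bpmp.h>
  #include <soc/tegra/mc.h>

  /* At probe time: look up the BPMP once and cache the handle. */
  static int example_mc_probe_bpmp(struct tegra_mc *mc)
  {
          mc->bpmp = of_tegra_bpmp_get();   /* hypothetical cached field */
          if (IS_ERR(mc->bpmp))
                  return PTR_ERR(mc->bpmp);

          return 0;
  }

  /* tegra234_mc_icc_set() would then use mc->bpmp directly instead of
   * calling of_tegra_bpmp_get() on every bandwidth request. */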