Message ID | 20220426172346.3508411-1-cezary.rojewski@intel.com |
---|---|
Headers | show |
Series | ASoC: Intel: avs: Driver core and PCM operations | expand |
On 2022-04-27 12:18 AM, Pierre-Louis Bossart wrote: > On 4/26/22 12:23, Cezary Rojewski wrote: >> To preserve power during sleep operations, handle suspend (S3), >> hibernation (S4) and runtime (RTD3) transitions. As flow for all of >> is shared, define common handlers to reduce code size. >> >> Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> >> Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com> >> --- >> sound/soc/intel/avs/core.c | 125 +++++++++++++++++++++++++++++++++++++ >> 1 file changed, 125 insertions(+) >> >> diff --git a/sound/soc/intel/avs/core.c b/sound/soc/intel/avs/core.c >> index 93180c22032d..c2f8fb87cfc2 100644 >> --- a/sound/soc/intel/avs/core.c >> +++ b/sound/soc/intel/avs/core.c >> @@ -536,6 +536,128 @@ static void avs_pci_remove(struct pci_dev *pci) >> pm_runtime_get_noresume(&pci->dev); >> } >> >> +static int __maybe_unused avs_suspend_common(struct avs_dev *adev, bool low_power) > > low_power is not used so.... Indeed, this argument should be part of the patch which introduces its usage. Removing in v2! >> +{ >> + struct hdac_bus *bus = &adev->base.core; >> + int ret; >> + >> + flush_work(&adev->probe_work); >> + >> + snd_hdac_ext_bus_link_power_down_all(bus); >> + >> + ret = avs_ipc_set_dx(adev, AVS_MAIN_CORE_MASK, false); >> + /* >> + * pm_runtime is blocked on DSP failure but system-wide suspend is not. >> + * Do not block entire system from suspending if that's the case. >> + */ >> + if (ret && ret != -EPERM) { >> + dev_err(adev->dev, "set dx failed: %d\n", ret); >> + return AVS_IPC_RET(ret); >> + } >> + >> + avs_dsp_op(adev, int_control, false); >> + snd_hdac_ext_bus_ppcap_int_enable(bus, false); >> + >> + ret = avs_dsp_core_disable(adev, AVS_MAIN_CORE_MASK); >> + if (ret < 0) { >> + dev_err(adev->dev, "core_mask %ld disable failed: %d\n", AVS_MAIN_CORE_MASK, ret); >> + return ret; >> + } >> + >> + snd_hdac_ext_bus_ppcap_enable(bus, false); >> + /* disable LP SRAM retention */ >> + avs_hda_power_gating_enable(adev, false); >> + snd_hdac_bus_stop_chip(bus); >> + /* disable CG when putting controller to reset */ >> + avs_hdac_clock_gating_enable(bus, false); >> + snd_hdac_bus_enter_link_reset(bus); >> + avs_hdac_clock_gating_enable(bus, true); >> + >> + snd_hdac_display_power(bus, HDA_CODEC_IDX_CONTROLLER, false); >> + >> + return 0; >> +} >> + >> +static int __maybe_unused avs_resume_common(struct avs_dev *adev, bool low_power, bool purge) >> +{ >> + struct hdac_bus *bus = &adev->base.core; >> + struct hdac_ext_link *hlink; >> + int ret; >> + >> + snd_hdac_display_power(bus, HDA_CODEC_IDX_CONTROLLER, true); >> + avs_hdac_bus_init_chip(bus, true); >> + >> + snd_hdac_ext_bus_ppcap_enable(bus, true); >> + snd_hdac_ext_bus_ppcap_int_enable(bus, true); >> + >> + ret = avs_dsp_boot_firmware(adev, purge); >> + if (ret < 0) { >> + dev_err(adev->dev, "firmware boot failed: %d\n", ret); >> + return ret; >> + } >> + >> + /* turn off the links that were off before suspend */ >> + list_for_each_entry(hlink, &bus->hlink_list, list) { >> + if (!hlink->ref_count) >> + snd_hdac_ext_bus_link_power_down(hlink); >> + } >> + >> + /* check dma status and clean up CORB/RIRB buffers */ >> + if (!bus->cmd_dma_state) >> + snd_hdac_bus_stop_cmd_io(bus); >> + >> + return 0; >> +} >> + >> +static int __maybe_unused avs_suspend(struct device *dev) >> +{ >> + return avs_suspend_common(to_avs_dev(dev), true); >> +} >> + >> +static int __maybe_unused avs_resume(struct device *dev) >> +{ >> + return avs_resume_common(to_avs_dev(dev), true, true); >> +} >> + >> +static int __maybe_unused avs_runtime_suspend(struct device *dev) >> +{ >> + return avs_suspend_common(to_avs_dev(dev), true); >> +} >> + >> +static int __maybe_unused avs_runtime_resume(struct device *dev) >> +{ >> + return avs_resume_common(to_avs_dev(dev), true, false); >> +} >> + >> +static int __maybe_unused avs_freeze(struct device *dev) >> +{ >> + return avs_suspend_common(to_avs_dev(dev), false); >> +} >> +static int __maybe_unused avs_thaw(struct device *dev) >> +{ >> + return avs_resume_common(to_avs_dev(dev), false, true); >> +} >> + >> +static int __maybe_unused avs_poweroff(struct device *dev) >> +{ >> + return avs_suspend_common(to_avs_dev(dev), false); >> +} >> + >> +static int __maybe_unused avs_restore(struct device *dev) >> +{ >> + return avs_resume_common(to_avs_dev(dev), false, true); >> +} >> + >> +static const struct dev_pm_ops avs_dev_pm = { >> + .suspend = avs_suspend, >> + .resume = avs_resume, > > ... all the 4 functions below seem unnecessary? If we exclude 'low_power' then this is true. Will move this to the 'low_power' patch as requested! >> + .freeze = avs_freeze, >> + .thaw = avs_thaw, >> + .poweroff = avs_poweroff, >> + .restore = avs_restore, > > we've historically never used those, what drives this new usage? > > on the SOF side, we've used prepare and complete, along with idle... No streams may survive S4 what is not the case for S3. That's the main difference. Yes, there are many opses vailable in dev_pm_ops, perhaps a different set of them is more appropriate. Let's have this talk once 'low_power' patch is here (PCM-power management patches are not part of this patchset but depend on it). >> + SET_RUNTIME_PM_OPS(avs_runtime_suspend, avs_runtime_resume, NULL) >> +}; >> + >> static const struct pci_device_id avs_ids[] = { >> { 0 } >> }; >> @@ -546,6 +668,9 @@ static struct pci_driver avs_pci_driver = { >> .id_table = avs_ids, >> .probe = avs_pci_probe, >> .remove = avs_pci_remove, >> + .driver = { >> + .pm = &avs_dev_pm, >> + }, >> }; >> module_pci_driver(avs_pci_driver); >>
On 2022-04-27 12:12 AM, Pierre-Louis Bossart wrote: > On 4/26/22 12:23, Cezary Rojewski wrote: >> AVS driver operates with granular audio card division in mind. >> Super-card approach (e.g.: I2S, DMIC and HDA DAIs combined) is >> deprecated in favour of individual cards - one per each device. This >> provides necessary dynamism, especially for configurations with number >> of codecs present and makes it easier to survive auxiliary devices >> failures - one card failing to probe does not prevent others from >> succeeding. >> >> All boards spawned by AVS are unregistered on ->remove(). This includes >> dummy codecs such as DMIC. >> >> As all machine boards found in sound/soc/intel/boards are irreversibly >> tied to 'super-card' approach, new boards are going to be introduced. >> This temporarily increases number of boards available under /intel >> directory until skylake-driver becomes deprecated and removed. > > I thought you wanted your own directory for cards, what's the point of adding new machine drivers in intel/boards if they ONLY work with your AVS driver? > > Also you can only remove the machine drivers that are NOT shared with SOF... Yes, if something is being actively used even once skylake-driver is removed, it will stay there. No worries. I recommend moving SOF-specific boards into /sof/intel/boards/ though - it feels logical to have driver-specific boards within driver-specific folder as very limited number of boards, if any, are "common" here. I've provided board-related patchset on the list simultaneously so people can see the full picture. >> +static struct snd_soc_acpi_mach *dmi_match_quirk(void *arg) >> +{ >> + struct snd_soc_acpi_mach *mach = arg; >> + const struct dmi_system_id *dmi_id; >> + struct dmi_system_id *dmi_table; >> + >> + if (mach->quirk_data == NULL) >> + return mach; >> + >> + dmi_table = (struct dmi_system_id *)mach->quirk_data; >> + >> + dmi_id = dmi_first_match(dmi_table); >> + if (!dmi_id) >> + return NULL; >> + >> + return mach; >> +} >> + >> +#define AVS_SSP(x) (BIT(x)) >> +#define AVS_SSP_RANGE(a, b) (GENMASK(b, a)) >> + >> +/* supported I2S board codec configurations */ >> +static struct snd_soc_acpi_mach avs_skl_i2s_machines[] = { >> + { >> + .id = "INT343A", >> + .drv_name = "avs_rt286", >> + .link_mask = AVS_SSP(0), > > I've told this before, 'link_mask' was introduced for *SoundWire*. Please do not overload existing concepts and use this instead: > > @i2s_link_mask: I2S/TDM links enabled on the board Noooo :( Sad panda is sad. 'link_mask' is such a wonderful name as it matches naming used in our specs - which call BE side 'LINK'. If it's a must then yes, we will resign from using 'link_mask'. >> + .tplg_filename = "skl-rt286-tplg.bin", >> + }, >> + { >> + .id = "10508825", >> + .drv_name = "avs_nau8825", >> + .link_mask = AVS_SSP(1), >> + .tplg_filename = "skl-nau8825-tplg.bin", >> + }, >> + { >> + .id = "INT343B", >> + .drv_name = "avs_ssm4567", >> + .link_mask = AVS_SSP(0), >> + .tplg_filename = "skl-ssm4567-tplg.bin", >> + }, >> + { >> + .id = "MX98357A", >> + .drv_name = "avs_max98357a", >> + .link_mask = AVS_SSP(0), >> + .tplg_filename = "skl-max98357a-tplg.bin", >> + }, >> + {}, >> +}; >> + >> +static struct snd_soc_acpi_mach avs_kbl_i2s_machines[] = { >> + { >> + .id = "INT343A", >> + .drv_name = "avs_rt286", >> + .link_mask = AVS_SSP(0), >> + .quirk_data = &kbl_dmi_table, >> + .machine_quirk = dmi_match_quirk, >> + .tplg_filename = "kbl-rt286-tplg.bin", >> + }, >> + { >> + .id = "INT343A", >> + .drv_name = "avs_rt298", >> + .link_mask = AVS_SSP(0), >> + .quirk_data = &kbl_r_dmi_table, >> + .machine_quirk = dmi_match_quirk, >> + .tplg_filename = "kblr-rt298-tplg.bin", >> + }, >> + { >> + .id = "MX98373", >> + .drv_name = "avs_max98373", >> + .link_mask = AVS_SSP(0), >> + .tplg_filename = "kbl-max98373-tplg.bin", >> + }, >> + { >> + .id = "DLGS7219", >> + .drv_name = "avs_da7219", >> + .link_mask = AVS_SSP(1), >> + .tplg_filename = "kbl-da7219-tplg.bin", >> + }, >> + {}, >> +}; >> + ... >> +struct avs_acpi_boards { >> + int id; >> + struct snd_soc_acpi_mach *machs; >> +}; >> + >> +#define AVS_MACH_ENTRY(_id, _mach) \ >> + { .id = (_id), .machs = (_mach), } >> + >> +/* supported I2S boards per platform */ >> +static const struct avs_acpi_boards i2s_boards[] = { >> + AVS_MACH_ENTRY(0x9d70, avs_skl_i2s_machines), /* SKL */ >> + AVS_MACH_ENTRY(0x9d71, avs_kbl_i2s_machines), /* KBL */ >> + AVS_MACH_ENTRY(0x5a98, avs_apl_i2s_machines), /* APL */ >> + AVS_MACH_ENTRY(0x3198, avs_gml_i2s_machines), /* GML */ >> + {}, > > you are not using the intel/commmon matching and ACPI tables so I would recommend you deal with machine drivers in your private space. And that's what we chose to do! I'm sorry if the message brought any confusion here. sound/soc/intel/avs/boards is the subdirectory for avs-driver boards. >> +static int avs_register_hda_board(struct avs_dev *adev, struct hda_codec *codec) >> +{ >> + struct snd_soc_acpi_mach mach = {{0}}; >> + struct platform_device *board; >> + struct hdac_device *hdev = &codec->core; >> + char *pname; >> + int ret, id; >> + >> + pname = devm_kasprintf(adev->dev, GFP_KERNEL, "%s-platform", dev_name(&hdev->dev)); >> + if (!pname) >> + return -ENOMEM; >> + >> + ret = avs_hda_platform_register(adev, pname); >> + if (ret < 0) >> + return ret; >> + >> + mach.pdata = codec; >> + mach.mach_params.platform = pname; >> + mach.tplg_filename = devm_kasprintf(adev->dev, GFP_KERNEL, "hda-%08x-tplg.bin", >> + hdev->vendor_id); > > this is surprising, how many topologies will you end-up supporting then? Topologies are typically NOT dependent on the HDaudio codec type or vendor and only deal with HDaudio link DMA configurations. Keen eye. I'm not providing all the patches simultaneously so the patchsets are easier to review. Note that avs/core.c (available on upstream) still provides 'NULL' for its hda_ext_ops. Separate patches for the point you brought here (and the completing the avs initialization for that matter) will be sent later. The default: be specific when choosing topology for specific board - allows for tailoring configuration-specific topology. However, if there is no such files, load "generic" topology instead. In our repo, most of the hda-XXXX-tplg.bin files are actually symlinks to few, real HD-Audio codec topologies. >> + if (!mach.tplg_filename) >> + return -ENOMEM; >> + >> + id = adev->base.core.idx * HDA_MAX_CODECS + hdev->addr; >> + board = platform_device_register_data(NULL, "avs_hdaudio", id, (const void *)&mach, >> + sizeof(mach)); >> + if (IS_ERR(board)) { >> + dev_err(adev->dev, "hda board register failed\n"); >> + return PTR_ERR(board); >> + } >> + >> + ret = devm_add_action(adev->dev, board_pdev_unregister, board); >> + if (ret < 0) { >> + platform_device_unregister(board); >> + return ret; >> + } >> + >> + return 0; >> +} >> +
On 2022-04-26 11:21 PM, Pierre-Louis Bossart wrote: > On 4/26/22 12:23, Cezary Rojewski wrote: ... >> diff --git a/sound/soc/intel/avs/loader.c b/sound/soc/intel/avs/loader.c >> index c47f85161d95..de98f4c3adf8 100644 >> --- a/sound/soc/intel/avs/loader.c >> +++ b/sound/soc/intel/avs/loader.c >> @@ -15,6 +15,7 @@ >> #include "cldma.h" >> #include "messages.h" >> #include "registers.h" >> +#include "topology.h" >> >> #define AVS_ROM_STS_MASK 0xFF >> #define AVS_ROM_INIT_DONE 0x1 >> @@ -466,6 +467,70 @@ int avs_hda_transfer_modules(struct avs_dev *adev, bool load, >> return 0; >> } >> >> +int avs_dsp_load_libraries(struct avs_dev *adev, struct avs_tplg_library *libs, u32 num_libs) >> +{ >> + int start, id, i = 0; >> + int ret; >> + >> + /* Calculate the id to assign for the next lib. */ >> + for (id = 0; id < adev->fw_cfg.max_libs_count; id++) >> + if (adev->lib_names[id][0] == '\0') >> + break; >> + if (id + num_libs >= adev->fw_cfg.max_libs_count) >> + return -EINVAL; > > use ida_alloc_max() ? After reading this one couple of times I'm keen to agree that IDA should have been used for library ID allocation and a at the same time, surprised it has't done that already. Till now we used IDA 'only' when allocating pipeline IDs and module instance IDs. Pipeline allocation is good comparison here - makes use of ida_alloc_max() already - library one should follow. This finding is much appreciated, Pierre. >> + >> + start = id; >> + while (i < num_libs) { >> + struct avs_fw_manifest *man; >> + const struct firmware *fw; >> + struct firmware stripped_fw; >> + char *filename; >> + int j; >> + >> + filename = kasprintf(GFP_KERNEL, "%s/%s/%s", AVS_ROOT_DIR, adev->spec->name, >> + libs[i].name); >> + if (!filename) >> + return -ENOMEM; >> + >> + ret = avs_request_firmware(adev, &fw, filename); >> + kfree(filename); >> + if (ret < 0) >> + return ret; >> + >> + stripped_fw = *fw; >> + ret = avs_fw_manifest_strip_verify(adev, &stripped_fw, NULL); >> + if (ret) { >> + dev_err(adev->dev, "invalid library data: %d\n", ret); >> + goto release_fw; >> + } >> + >> + ret = avs_fw_manifest_offset(&stripped_fw); >> + if (ret < 0) >> + goto release_fw; >> + man = (struct avs_fw_manifest *)(stripped_fw.data + ret); >> + >> + /* Don't load anything that's already in DSP memory. */ >> + for (j = 0; j < id; j++) >> + if (!strncmp(adev->lib_names[j], man->name, AVS_LIB_NAME_SIZE)) >> + goto next_lib; >> + >> + ret = avs_dsp_op(adev, load_lib, &stripped_fw, id); >> + if (ret) >> + goto release_fw; >> + >> + strncpy(adev->lib_names[id], man->name, AVS_LIB_NAME_SIZE); >> + id++; >> +next_lib: >> + i++; >> + } >> + >> + return start == id ? 1 : 0; >> + >> +release_fw: >> + avs_release_last_firmware(adev); > > why release only the last library and not all the ones that were previous loaded? > or why bother to even release the last since at this point you probably need to restart completely? Yes, avs_release_last_firmware() is used to clean 'locally' and indeed, failing to load a library will most likely end-up is complete restart. I'll provide an internal build with this part removed and have validation do some testing first as performing avs_release_last_firmware() keeps things sane i.e. no invalid entries in the ->fw_list, no unnecessarily occupied memory for the already invalid entry and such. In regard to *why is only last one released*, it's tied to how base firmware behaves. Assuming a scenario where two libraries need to be loaded, if the loading procedure for the second fails, base firmware will NOT unload/unmap allocated memory and resources for the first one. At the same time, there is no capability to unload library on demand. In order to rollback the transaction, DSP has to be shut down entirely, just like if we were talking about firmware exception or IPC timeout. So, by doing what we do here, driver reflects firmware side (MODULES_INFO) basically 1:1. Another good one, Pierre. >> + return ret; >> +}
On 2022-04-26 11:33 PM, Pierre-Louis Bossart wrote: >> +struct avs_dma_data { >> + struct avs_tplg_path_template *template; >> + struct avs_path *path; >> + /* >> + * link stream is stored within substream's runtime >> + * private_data to fulfill the needs of codec BE path >> + * >> + * host stream assigned > > not able to parse that comment, what are you trying to say? Sure. This actually stems from the legacy driver architecture. PCM operations for legacy HD-Audio are found in sound/pci/hda/hda_controller.c with respective types declared in sound/pci/hda/hda_controller.h. Operations found there make use of struct azx_dev and get_azx_dev(). So, to be able to re-use sound/pci/hda as much as possible plus declare virtually no custom HD-Audio related logic, we satisfy the needs of lower level code by assigning BE stream to substream->runtime->private_data. As HD-Audio legacy operates in coupled mode only, there is no need for it to differentiate between HOST and LINK side. It's not true for DSP configurations though. So, we declare separate pointer for HOST stream allowing PCM operations found here to have easy access to both, LINK and HOST streams whenever necessary. >> + */ >> + struct hdac_ext_stream *host_stream; >> +}; >> + >> +static ssize_t topology_name_read(struct file *file, char __user *user_buf, size_t count, >> + loff_t *ppos) >> +{ >> + struct snd_soc_component *component = file->private_data; >> + struct snd_soc_card *card = component->card; >> + struct snd_soc_acpi_mach *mach = dev_get_platdata(card->dev); >> + char buf[64]; >> + size_t len; >> + >> + len = snprintf(buf, sizeof(buf), "%s/%s\n", component->driver->topology_name_prefix, >> + mach->tplg_filename); >> + >> + return simple_read_from_buffer(user_buf, count, ppos, buf, len); >> +} >> + >> +static const struct file_operations topology_name_fops = { >> + .open = simple_open, >> + .read = topology_name_read, >> + .llseek = default_llseek, >> +}; > > can you clarify why this is needed? The usage of default_llseek() or the topology_name_fops as a whole? The latter is here to obtain name of the firmware file requested by given machine board easily from the debugfs. The former is probably just a copy-paste from other declarations of file_operation entries for READ-only files. >> + >> +static int avs_component_load_libraries(struct avs_soc_component *acomp) >> +{ >> + struct avs_tplg *tplg = acomp->tplg; >> + struct avs_dev *adev = to_avs_dev(acomp->base.dev); >> + int ret; >> + >> + if (!tplg->num_libs) >> + return 0; >> + >> + /* Parent device may be asleep and library loading involves IPCs. */ >> + ret = pm_runtime_get_sync(adev->dev); >> + if (ret < 0 && ret != -EACCES) { >> + pm_runtime_put_noidle(adev->dev); >> + return ret; >> + } > > Mark recommends the use of pm_runtime_resume_and_get(), see patches from today. Will definitely check this out, thanks for pointing this out. >> + >> + avs_hda_clock_gating_enable(adev, false); >> + avs_hda_l1sen_enable(adev, false); >> + >> + ret = avs_dsp_load_libraries(adev, tplg->libs, tplg->num_libs); >> + >> + avs_hda_l1sen_enable(adev, true); >> + avs_hda_clock_gating_enable(adev, true); >> + >> + if (!ret) >> + ret = avs_module_info_init(adev, false); >> + >> + pm_runtime_mark_last_busy(adev->dev); >> + pm_runtime_put_autosuspend(adev->dev); >> + >> + return ret; >> +} >> + >> +static int avs_component_probe(struct snd_soc_component *component) >> +{ >> + struct snd_soc_card *card = component->card; >> + struct snd_soc_acpi_mach *mach; >> + struct avs_soc_component *acomp; >> + struct avs_dev *adev; >> + char *filename; >> + int ret; >> + >> + dev_dbg(card->dev, "probing %s card %s\n", component->name, card->name); >> + mach = dev_get_platdata(card->dev); >> + acomp = to_avs_soc_component(component); >> + adev = to_avs_dev(component->dev); >> + >> + acomp->tplg = avs_tplg_new(component); >> + if (!acomp->tplg) >> + return -ENOMEM; >> + >> + if (!mach->tplg_filename) >> + goto finalize; >> + >> + /* Load specified topology and create sysfs for it. */ >> + filename = kasprintf(GFP_KERNEL, "%s/%s", component->driver->topology_name_prefix, >> + mach->tplg_filename); > > what is the link between topology and sysfs? This comment is misleading as debugfs entry is being created as seen in the code below. To answer the question assuming s/sysfs/debugfs/: Ability to allow for reading name of the file requested by given machine board from user space. AVS driver spawns multiple machine boards, each requesting a topology file. To make it easier for the validation and normal users to understand what has requested what, we provide a simple, read-only debugfs entry for each of the boards within the avs tree. >> + if (!filename) >> + return -ENOMEM; >> + >> + ret = avs_load_topology(component, filename); >> + kfree(filename); >> + if (ret < 0) >> + return ret; >> + >> + ret = avs_component_load_libraries(acomp); >> + if (ret < 0) { >> + dev_err(card->dev, "libraries loading failed: %d\n", ret); >> + goto err_load_libs; >> + } >> + >> +finalize: >> + debugfs_create_file("topology_name", 0444, component->debugfs_root, component, >> + &topology_name_fops); > > that's debugfs here, is this to make it possible to select an alternate topology file? If yes, a comment earlier wouldn't hurt. Nah, its purpose is very basic: get the exact name of the topology file loaded. Example: intel/avs/hda-8086280b-tplg.bin >> + >> + mutex_lock(&adev->comp_list_mutex); >> + list_add_tail(&acomp->node, &adev->comp_list); >> + mutex_unlock(&adev->comp_list_mutex); >> + >> + return 0; >> + >> +err_load_libs: >> + avs_remove_topology(component); >> + return ret; >> +} >> + > > >> +static const struct snd_soc_component_driver avs_component_driver = { >> + .name = "avs-pcm", >> + .probe = avs_component_probe, >> + .remove = avs_component_remove, >> + .open = avs_component_open, >> + .pointer = avs_component_pointer, >> + .mmap = avs_component_mmap, >> + .pcm_construct = avs_component_construct, >> + .module_get_upon_open = 1, /* increment refcount when a pcm is opened */ >> + .topology_name_prefix = "intel/avs", > > is this intentional that the firmware binaries and topologies will be stored in the same intel/avs directory? > >> + .non_legacy_dai_naming = true, > > is this needed? we've never used this for Intel? I'll recheck this but back in time when we drawn PCM ops for the first time there was a reason, certainly. Out of my head - impacts behavior of snd_soc_register_dais(). >> +}; >> +
On 2022-04-26 11:53 PM, Pierre-Louis Bossart wrote: > On 4/26/22 12:23, Cezary Rojewski wrote: >> In rare occassions, under stress conditions or hardware malfunction, DSP > > occasions Ack. >> firmware may fail. Software is notified about such situation with >> EXCEPTION_CAUGHT notification. IPC timeout is also counted as critical >> device failure. More often than not, driver can recover from such >> situations by performing full reset: killing and restarting ADSP. >> >> Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> >> Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com> >> --- >> sound/soc/intel/Kconfig | 1 + >> sound/soc/intel/avs/avs.h | 4 ++ >> sound/soc/intel/avs/ipc.c | 95 +++++++++++++++++++++++++++++++++- >> sound/soc/intel/avs/messages.h | 5 ++ >> 4 files changed, 103 insertions(+), 2 deletions(-) >> >> diff --git a/sound/soc/intel/Kconfig b/sound/soc/intel/Kconfig >> index c364ddf22267..05ad6bdecfc5 100644 >> --- a/sound/soc/intel/Kconfig >> +++ b/sound/soc/intel/Kconfig >> @@ -218,6 +218,7 @@ config SND_SOC_INTEL_AVS >> select SND_HDA_EXT_CORE >> select SND_HDA_DSP_LOADER >> select SND_INTEL_NHLT >> + select WANT_DEV_COREDUMP >> help >> Enable support for Intel(R) cAVS 1.5 platforms with DSP >> capabilities. This includes Skylake, Kabylake, Amberlake and >> diff --git a/sound/soc/intel/avs/avs.h b/sound/soc/intel/avs/avs.h >> index e628f78d1864..02c2aa1bcd5c 100644 >> --- a/sound/soc/intel/avs/avs.h >> +++ b/sound/soc/intel/avs/avs.h >> @@ -42,6 +42,7 @@ struct avs_dsp_ops { >> int (* const load_basefw)(struct avs_dev *, struct firmware *); >> int (* const load_lib)(struct avs_dev *, struct firmware *, u32); >> int (* const transfer_mods)(struct avs_dev *, bool, struct avs_module_entry *, u32); >> + int (* const coredump)(struct avs_dev *, union avs_notify_msg *); >> }; >> >> #define avs_dsp_op(adev, op, ...) \ >> @@ -164,12 +165,15 @@ struct avs_ipc { >> struct avs_ipc_msg rx; >> u32 default_timeout_ms; >> bool ready; >> + bool recovering; >> >> bool rx_completed; >> spinlock_t rx_lock; >> struct mutex msg_mutex; >> struct completion done_completion; >> struct completion busy_completion; >> + >> + struct work_struct recovery_work; >> }; >> >> #define AVS_EIPC EREMOTEIO >> diff --git a/sound/soc/intel/avs/ipc.c b/sound/soc/intel/avs/ipc.c >> index 68aaf01edbf2..84cb411c82fa 100644 >> --- a/sound/soc/intel/avs/ipc.c >> +++ b/sound/soc/intel/avs/ipc.c >> @@ -14,6 +14,87 @@ >> >> #define AVS_IPC_TIMEOUT_MS 300 >> >> +static void avs_dsp_recovery(struct avs_dev *adev) >> +{ >> + struct avs_soc_component *acomp; >> + unsigned int core_mask; >> + int ret; >> + >> + if (adev->ipc->recovering) >> + return; >> + adev->ipc->recovering = true; > > don't you need some sort of lock to test/clear this flag? Our stress tests do not confirm this. I'll not ignore this warning though, will recheck with my team next week. >> + >> + mutex_lock(&adev->comp_list_mutex); >> + /* disconnect all running streams */ >> + list_for_each_entry(acomp, &adev->comp_list, node) { >> + struct snd_soc_pcm_runtime *rtd; >> + struct snd_soc_card *card; >> + >> + card = acomp->base.card; >> + if (!card) >> + continue; >> + >> + for_each_card_rtds(card, rtd) { >> + struct snd_pcm *pcm; >> + int dir; >> + >> + pcm = rtd->pcm; >> + if (!pcm || rtd->dai_link->no_pcm) >> + continue; >> + >> + for_each_pcm_streams(dir) { >> + struct snd_pcm_substream *substream; >> + >> + substream = pcm->streams[dir].substream; >> + if (!substream || !substream->runtime) >> + continue; >> + >> + snd_pcm_stop(substream, SNDRV_PCM_STATE_DISCONNECTED); >> + } >> + } >> + } >> + mutex_unlock(&adev->comp_list_mutex); >> + >> + /* forcibly shutdown all cores */ >> + core_mask = GENMASK(adev->hw_cfg.dsp_cores - 1, 0); >> + avs_dsp_core_disable(adev, core_mask); >> + >> + /* attempt dsp reboot */ >> + ret = avs_dsp_boot_firmware(adev, true); >> + if (ret < 0) >> + dev_err(adev->dev, "dsp reboot failed: %d\n", ret); >> + >> + pm_runtime_mark_last_busy(adev->dev); >> + pm_runtime_enable(adev->dev); >> + pm_request_autosuspend(adev->dev); > > there are zero users of this routine in the entire sound/ tree, can you clarify why this is needed or what you are trying to do? Unsure which routine you question here. I'll assume it's pm_request_autosuspend(). pm_request_audiosuspend() is being used to queue suspend once recovery completes. Recovery takes time and during that time all communication attempts with DSP will yield -EPERM. PM is also blocked for the device with pm_runtime_disable(), performed before scheduling the recovery work. Once recovery completes we do not just unblock the PM as that would cause immediate suspend. Instead, we "refresh" the *last busy* status and queue the suspend operation. >> + >> + adev->ipc->recovering = false; >> +}
>>> + >>> + /* forcibly shutdown all cores */ >>> + core_mask = GENMASK(adev->hw_cfg.dsp_cores - 1, 0); >>> + avs_dsp_core_disable(adev, core_mask); >>> + >>> + /* attempt dsp reboot */ >>> + ret = avs_dsp_boot_firmware(adev, true); >>> + if (ret < 0) >>> + dev_err(adev->dev, "dsp reboot failed: %d\n", ret); >>> + >>> + pm_runtime_mark_last_busy(adev->dev); >>> + pm_runtime_enable(adev->dev); >>> + pm_request_autosuspend(adev->dev); >> >> there are zero users of this routine in the entire sound/ tree, can you clarify why this is needed or what you are trying to do? > > > Unsure which routine you question here. I'll assume it's pm_request_autosuspend(). > > pm_request_audiosuspend() is being used to queue suspend once recovery completes. Recovery takes time and during that time all communication attempts with DSP will yield -EPERM. PM is also blocked for the device with pm_runtime_disable(), performed before scheduling the recovery work. Once recovery completes we do not just unblock the PM as that would cause immediate suspend. Instead, we "refresh" the *last busy* status and queue the suspend operation. But since you already have autosuspend enabled, why would you need to explicitly queue a suspend operation? What happens if that last call is omitted, is there actually a functional difference? Not objecting if that's required, but since no one else used it so far I wonder if we missed something or if this is overkill.
On 2022-05-01 11:45, Cezary Rojewski wrote: > On 2022-04-26 11:21 PM, Pierre-Louis Bossart wrote: >> On 4/26/22 12:23, Cezary Rojewski wrote: > > ... > >>> diff --git a/sound/soc/intel/avs/loader.c >>> b/sound/soc/intel/avs/loader.c >>> index c47f85161d95..de98f4c3adf8 100644 >>> --- a/sound/soc/intel/avs/loader.c >>> +++ b/sound/soc/intel/avs/loader.c >>> @@ -15,6 +15,7 @@ >>> #include "cldma.h" >>> #include "messages.h" >>> #include "registers.h" >>> +#include "topology.h" >>> #define AVS_ROM_STS_MASK 0xFF >>> #define AVS_ROM_INIT_DONE 0x1 >>> @@ -466,6 +467,70 @@ int avs_hda_transfer_modules(struct avs_dev >>> *adev, bool load, >>> return 0; >>> } >>> +int avs_dsp_load_libraries(struct avs_dev *adev, struct >>> avs_tplg_library *libs, u32 num_libs) >>> +{ >>> + int start, id, i = 0; >>> + int ret; >>> + >>> + /* Calculate the id to assign for the next lib. */ >>> + for (id = 0; id < adev->fw_cfg.max_libs_count; id++) >>> + if (adev->lib_names[id][0] == '\0') >>> + break; >>> + if (id + num_libs >= adev->fw_cfg.max_libs_count) >>> + return -EINVAL; >> >> use ida_alloc_max() ? > > > After reading this one couple of times I'm keen to agree that IDA > should have been used for library ID allocation and a at the same > time, surprised it has't done that already. Till now we used IDA > 'only' when allocating pipeline IDs and module instance IDs. Pipeline > allocation is good comparison here - makes use of ida_alloc_max() > already - library one should follow. > > This finding is much appreciated, Pierre. I think that using ida here is a bit of an overkill. Ida works fine when there can be both id allocation and freeing and that's how it work with pipelines and modules IDs in avs. However there is no mechanism for unloading libraries in cAVS firmware, therefore ida would be used here only to increase the ID, so it needlessly complicates the code while not giving much of a benefit. Also our approach to check if we can load all libraries before the loop makes it problematic with ida because we would need to allocate an id at start and calculate if all libs would fit and then either free it instantly or complicate the loop to use id allocated before. Therefore I suggest to leave this code unchanged. I've synced with Cezary on this and provided explanation convinced him too. > >>> + >>> + start = id; >>> + while (i < num_libs) { >>> + struct avs_fw_manifest *man; >>> + const struct firmware *fw; >>> + struct firmware stripped_fw; >>> + char *filename; >>> + int j; >>> + >>> + filename = kasprintf(GFP_KERNEL, "%s/%s/%s", AVS_ROOT_DIR, >>> adev->spec->name, >>> + libs[i].name); >>> + if (!filename) >>> + return -ENOMEM; >>> + >>> + ret = avs_request_firmware(adev, &fw, filename); >>> + kfree(filename); >>> + if (ret < 0) >>> + return ret; >>> + >>> + stripped_fw = *fw; >>> + ret = avs_fw_manifest_strip_verify(adev, &stripped_fw, NULL); >>> + if (ret) { >>> + dev_err(adev->dev, "invalid library data: %d\n", ret); >>> + goto release_fw; >>> + } >>> + >>> + ret = avs_fw_manifest_offset(&stripped_fw); >>> + if (ret < 0) >>> + goto release_fw; >>> + man = (struct avs_fw_manifest *)(stripped_fw.data + ret); >>> + >>> + /* Don't load anything that's already in DSP memory. */ >>> + for (j = 0; j < id; j++) >>> + if (!strncmp(adev->lib_names[j], man->name, >>> AVS_LIB_NAME_SIZE)) >>> + goto next_lib; >>> + >>> + ret = avs_dsp_op(adev, load_lib, &stripped_fw, id); >>> + if (ret) >>> + goto release_fw; >>> + >>> + strncpy(adev->lib_names[id], man->name, AVS_LIB_NAME_SIZE); >>> + id++; >>> +next_lib: >>> + i++; >>> + } >>> + >>> + return start == id ? 1 : 0; >>> + >>> +release_fw: >>> + avs_release_last_firmware(adev); >> >> why release only the last library and not all the ones that were >> previous loaded? >> or why bother to even release the last since at this point you >> probably need to restart completely? > > > Yes, avs_release_last_firmware() is used to clean 'locally' and > indeed, failing to load a library will most likely end-up is complete > restart. > > I'll provide an internal build with this part removed and have > validation do some testing first as performing > avs_release_last_firmware() keeps things sane i.e. no invalid entries > in the ->fw_list, no unnecessarily occupied memory for the already > invalid entry and such. > > In regard to *why is only last one released*, it's tied to how base > firmware behaves. Assuming a scenario where two libraries need to be > loaded, if the loading procedure for the second fails, base firmware > will NOT unload/unmap allocated memory and resources for the first > one. At the same time, there is no capability to unload library on > demand. In order to rollback the transaction, DSP has to be shut down > entirely, just like if we were talking about firmware exception or IPC > timeout. > > So, by doing what we do here, driver reflects firmware side > (MODULES_INFO) basically 1:1. > > Another good one, Pierre. > >>> + return ret; >>> +}
>>>> +int avs_dsp_load_libraries(struct avs_dev *adev, struct >>>> avs_tplg_library *libs, u32 num_libs) >>>> +{ >>>> + int start, id, i = 0; >>>> + int ret; >>>> + >>>> + /* Calculate the id to assign for the next lib. */ >>>> + for (id = 0; id < adev->fw_cfg.max_libs_count; id++) >>>> + if (adev->lib_names[id][0] == '\0') >>>> + break; >>>> + if (id + num_libs >= adev->fw_cfg.max_libs_count) >>>> + return -EINVAL; >>> >>> use ida_alloc_max() ? >> >> >> After reading this one couple of times I'm keen to agree that IDA >> should have been used for library ID allocation and a at the same >> time, surprised it has't done that already. Till now we used IDA >> 'only' when allocating pipeline IDs and module instance IDs. Pipeline >> allocation is good comparison here - makes use of ida_alloc_max() >> already - library one should follow. >> >> This finding is much appreciated, Pierre. > > I think that using ida here is a bit of an overkill. Ida works fine when > there can be both id allocation and freeing and that's how it work with > pipelines and modules IDs in avs. However there is no mechanism for > unloading libraries in cAVS firmware, therefore ida would be used here > only to increase the ID, so it needlessly complicates the code while not > giving much of a benefit. Also our approach to check if we can load all > libraries before the loop makes it problematic with ida because we would > need to allocate an id at start and calculate if all libs would fit and > then either free it instantly or complicate the loop to use id allocated > before. Therefore I suggest to leave this code unchanged. I've synced > with Cezary on this and provided explanation convinced him too. That's fine, you should however capture this design decision with a comment or a clarification in the commit message. "libraries" mean different things to different people, and it's hard to review without context.