Quirks for MacroSilicon MS2100/MS2106

Message ID	795d8e1a-8fc7-2302-613e-ff1743de5c16@pelago.org.uk
State	New
Headers	show Return-Path: <alsa-devel-bounces@alsa-project.org> Message-ID: <795d8e1a-8fc7-2302-613e-ff1743de5c16@pelago.org.uk> Date: Wed, 22 Jun 2022 21:23:50 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 From: John Veness <john-linux@pelago.org.uk> Subject: Quirks for MacroSilicon MS2100/MS2106 To: Jaroslav Kysela <perex@perex.cz>, Takashi Iwai <tiwai@suse.com> Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: alsa-devel@alsa-project.org, Hector Martin <marcan@marcan.st> Precedence: list Errors-To: alsa-devel-bounces@alsa-project.org Sender: "Alsa-devel" <alsa-devel-bounces@alsa-project.org>
Series	Quirks for MacroSilicon MS2100/MS2106 \| expand Quirks for MacroSilicon MS2100/MS2106

John Veness June 22, 2022, 8:23 p.m. UTC

Hello,

This is my first ever kernel work, so I hope I am doing everything
correctly.

I have a USB analogue AV capture dongle with a MacroSilicon MS2100E chip
inside. PID 0x534d, VID 0x0021 (534d:0021). Apparently the MS2106 uses
the same PID/VID, so hopefully acts the same, but I don't have one to check.

For audio, the MS2100E has the same problems as the MS2109 USB HDMI
capture dongle, namely claiming 96kHz mono sound when it's actually
48kHz stereo, and with left/right channel problems.

Luckily, the fix is the exact same as that for the MS2109, as first
implemented by Marcan (Hector Martin) in July 2020. Below is a patch
which is basically a copy and paste of his, with a different VID.

I have tested this patch in 5.19.0 RC2. I haven't recruited any other
testers.

Even with this patch, there is a remaining problem, which is not present
in the MS2109. The sound sample values range from 0x0000 to 0x7fff, with
silence around 0x4000, i.e. 15-bit-ish audio. This actually sounds OK to
the ear (although half as loud as it should be), but looks odd when
looking at the waveform, and makes volume meters always think the sound
is very loud.

To convert to s16le, I can bitshift one bit left, and subtract 32768.
I'm told that this isn't something that can or should be done in the
kernel, but should be in userspace. Any more advice on how to fix this
remaining quirk would be very welcome.

Nonetheless, as I say, with the following kernel patch the captured
audio certainly sounds right, so is a big improvement and makes these
dongles usable:



Thanks for reading and considering this patch.

John Veness

Takashi Iwai June 23, 2022, 5:58 a.m. UTC | #1

On Wed, 22 Jun 2022 22:23:50 +0200,
John Veness wrote:
> 
> 
> Hello,
> 
> This is my first ever kernel work, so I hope I am doing everything
> correctly.
> 
> I have a USB analogue AV capture dongle with a MacroSilicon MS2100E chip
> inside. PID 0x534d, VID 0x0021 (534d:0021). Apparently the MS2106 uses
> the same PID/VID, so hopefully acts the same, but I don't have one to check.
> 
> For audio, the MS2100E has the same problems as the MS2109 USB HDMI
> capture dongle, namely claiming 96kHz mono sound when it's actually
> 48kHz stereo, and with left/right channel problems.
> 
> Luckily, the fix is the exact same as that for the MS2109, as first
> implemented by Marcan (Hector Martin) in July 2020. Below is a patch
> which is basically a copy and paste of his, with a different VID.
> 
> I have tested this patch in 5.19.0 RC2. I haven't recruited any other
> testers.

The suggested code change looks OK.  So far, so good...

> Even with this patch, there is a remaining problem, which is not present
> in the MS2109. The sound sample values range from 0x0000 to 0x7fff, with
> silence around 0x4000, i.e. 15-bit-ish audio. This actually sounds OK to
> the ear (although half as loud as it should be), but looks odd when
> looking at the waveform, and makes volume meters always think the sound
> is very loud.
> 
> To convert to s16le, I can bitshift one bit left, and subtract 32768.
> I'm told that this isn't something that can or should be done in the
> kernel, but should be in userspace. Any more advice on how to fix this
> remaining quirk would be very welcome.

Ouch, this is painful.  We haven't had any devices that require a
15 bit unsigned format, and maybe we don't want to add it to the
common standard format just for one funky device, either.  Such data
processing could be done in alsa-lib, but for the proper interaction
with the user-space, the kernel should provide some information so
that user-space can process the data accordingly.  However, we have no
proper way defined for it generically, so far.

Maybe an easy way would be to create an alsa-lib external plugin, and
apply it per device.  Jaroslav, could it be done via UCM?

> Nonetheless, as I say, with the following kernel patch the captured
> audio certainly sounds right, so is a big improvement and makes these
> dongles usable:

So as a first step, we can merge the patch as is.  The rest needs more
consideration.

In anyway, for merging the patch, it has to be submitted properly in a
right format, especially including your Signed-off-by tag.  Please
refer to Documentation/process/submitting-patches.rst for details.

thanks,

Takashi

Jaroslav Kysela June 23, 2022, 7:18 a.m. UTC | #2

On 23. 06. 22 7:58, Takashi Iwai wrote:

>> Even with this patch, there is a remaining problem, which is not present
>> in the MS2109. The sound sample values range from 0x0000 to 0x7fff, with
>> silence around 0x4000, i.e. 15-bit-ish audio. This actually sounds OK to
>> the ear (although half as loud as it should be), but looks odd when
>> looking at the waveform, and makes volume meters always think the sound
>> is very loud.
>>
>> To convert to s16le, I can bitshift one bit left, and subtract 32768.
>> I'm told that this isn't something that can or should be done in the
>> kernel, but should be in userspace. Any more advice on how to fix this
>> remaining quirk would be very welcome.
> 
> Ouch, this is painful.  We haven't had any devices that require a
> 15 bit unsigned format, and maybe we don't want to add it to the
> common standard format just for one funky device, either.  Such data
> processing could be done in alsa-lib, but for the proper interaction
> with the user-space, the kernel should provide some information so
> that user-space can process the data accordingly.  However, we have no
> proper way defined for it generically, so far.
> 
> Maybe an easy way would be to create an alsa-lib external plugin, and
> apply it per device.  Jaroslav, could it be done via UCM?

I agree that we may start with a special plugin for this format. The UCM can 
use any alsa-lib configuration now. So PA/PW should work with this very 
specific hardware when properly configured.

Note that we have SNDRV_PCM_FORMAT_SPECIAL for such cases. It will imply that 
the applications will fail when the special conversion plugin is not used. The 
minor issue may be with the silence routines (which is already with the 
improper format).

					Jaroslav

Takashi Sakamoto June 23, 2022, 8:24 a.m. UTC | #3

Hi,

On Thu, Jun 23, 2022, at 16:18, Jaroslav Kysela wrote:
> On 23. 06. 22 7:58, Takashi Iwai wrote:
>
>>> Even with this patch, there is a remaining problem, which is not present
>>> in the MS2109. The sound sample values range from 0x0000 to 0x7fff, with
>>> silence around 0x4000, i.e. 15-bit-ish audio. This actually sounds OK to
>>> the ear (although half as loud as it should be), but looks odd when
>>> looking at the waveform, and makes volume meters always think the sound
>>> is very loud.
>>>
>>> To convert to s16le, I can bitshift one bit left, and subtract 32768.
>>> I'm told that this isn't something that can or should be done in the
>>> kernel, but should be in userspace. Any more advice on how to fix this
>>> remaining quirk would be very welcome.
>> 
>> Ouch, this is painful.  We haven't had any devices that require a
>> 15 bit unsigned format, and maybe we don't want to add it to the
>> common standard format just for one funky device, either.  Such data
>> processing could be done in alsa-lib, but for the proper interaction
>> with the user-space, the kernel should provide some information so
>> that user-space can process the data accordingly.  However, we have no
>> proper way defined for it generically, so far.
>> 
>> Maybe an easy way would be to create an alsa-lib external plugin, and
>> apply it per device.  Jaroslav, could it be done via UCM?
>
> I agree that we may start with a special plugin for this format. The UCM can 
> use any alsa-lib configuration now. So PA/PW should work with this very 
> specific hardware when properly configured.
>
> Note that we have SNDRV_PCM_FORMAT_SPECIAL for such cases. It will imply that 
> the applications will fail when the special conversion plugin is not used. The 
> minor issue may be with the silence routines (which is already with the 
> improper format).

I think the combination of format/subformat is available in the case.

Actually the odd frame format is within 16 bit slot, so SNDRV_PCM_FORMAT_U16
or so is available. Then we can define new subformat to notify userspace; e.g.
SNDRV_PCM_SUBFORMAT_MODEL_SPECIFIC.

The cons is that we need to add new subformat, thus update asound.h. On the
other hand, the pros is that existent userspace stuffs still handle the odd format
and user can hear playback sound at least even if the loudness is not expected.


Regards

Takashi Sakamoto

Jaroslav Kysela June 23, 2022, 8:51 a.m. UTC | #4

On 23. 06. 22 10:24, Takashi Sakamoto wrote:
> Hi,
> 
> On Thu, Jun 23, 2022, at 16:18, Jaroslav Kysela wrote:
>> On 23. 06. 22 7:58, Takashi Iwai wrote:
>>
>>>> Even with this patch, there is a remaining problem, which is not present
>>>> in the MS2109. The sound sample values range from 0x0000 to 0x7fff, with
>>>> silence around 0x4000, i.e. 15-bit-ish audio. This actually sounds OK to
>>>> the ear (although half as loud as it should be), but looks odd when
>>>> looking at the waveform, and makes volume meters always think the sound
>>>> is very loud.
>>>>
>>>> To convert to s16le, I can bitshift one bit left, and subtract 32768.
>>>> I'm told that this isn't something that can or should be done in the
>>>> kernel, but should be in userspace. Any more advice on how to fix this
>>>> remaining quirk would be very welcome.
>>>
>>> Ouch, this is painful.  We haven't had any devices that require a
>>> 15 bit unsigned format, and maybe we don't want to add it to the
>>> common standard format just for one funky device, either.  Such data
>>> processing could be done in alsa-lib, but for the proper interaction
>>> with the user-space, the kernel should provide some information so
>>> that user-space can process the data accordingly.  However, we have no
>>> proper way defined for it generically, so far.
>>>
>>> Maybe an easy way would be to create an alsa-lib external plugin, and
>>> apply it per device.  Jaroslav, could it be done via UCM?
>>
>> I agree that we may start with a special plugin for this format. The UCM can
>> use any alsa-lib configuration now. So PA/PW should work with this very
>> specific hardware when properly configured.
>>
>> Note that we have SNDRV_PCM_FORMAT_SPECIAL for such cases. It will imply that
>> the applications will fail when the special conversion plugin is not used. The
>> minor issue may be with the silence routines (which is already with the
>> improper format).
> 
> I think the combination of format/subformat is available in the case.
> 
> Actually the odd frame format is within 16 bit slot, so SNDRV_PCM_FORMAT_U16
> or so is available. Then we can define new subformat to notify userspace; e.g.
> SNDRV_PCM_SUBFORMAT_MODEL_SPECIFIC.
> 
> The cons is that we need to add new subformat, thus update asound.h. On the
> other hand, the pros is that existent userspace stuffs still handle the odd format
> and user can hear playback sound at least even if the loudness is not expected.

It's a question if the cutted half-wave samples should be sent to D/A in this 
case. Also, if the high-bit (U16) is set, the resulting sample value for D/A 
is completely incorrect. It's not only about loudness. My opinion is to not 
use the U16 sample format for this and handle this hw as special one until the 
format is more common (which I do not expect) to extend the format list.

						Jaroslav

John Veness June 24, 2022, 2:11 p.m. UTC | #5

On 23/06/2022 06:58, Takashi Iwai wrote:
>> Nonetheless, as I say, with the following kernel patch the captured
>> audio certainly sounds right, so is a big improvement and makes these
>> dongles usable:
> 
> So as a first step, we can merge the patch as is.  The rest needs more
> consideration.
> 
> In anyway, for merging the patch, it has to be submitted properly in a
> right format, especially including your Signed-off-by tag.  Please
> refer to Documentation/process/submitting-patches.rst for details.
> 
> thanks,
> 
> Takashi

Thank you, or the others, for the encouraging comments. I have just 
submitted the patch in the proper format... I hope!

John Veness

Takashi Sakamoto June 25, 2022, 3:08 a.m. UTC | #6

On Thu, Jun 23, 2022 at 10:51:18AM +0200, Jaroslav Kysela wrote:
> On 23. 06. 22 10:24, Takashi Sakamoto wrote:
> > Hi,
> > 
> > On Thu, Jun 23, 2022, at 16:18, Jaroslav Kysela wrote:
> > > On 23. 06. 22 7:58, Takashi Iwai wrote:
> > > 
> > > > > Even with this patch, there is a remaining problem, which is not present
> > > > > in the MS2109. The sound sample values range from 0x0000 to 0x7fff, with
> > > > > silence around 0x4000, i.e. 15-bit-ish audio. This actually sounds OK to
> > > > > the ear (although half as loud as it should be), but looks odd when
> > > > > looking at the waveform, and makes volume meters always think the sound
> > > > > is very loud.
> > > > > 
> > > > > To convert to s16le, I can bitshift one bit left, and subtract 32768.
> > > > > I'm told that this isn't something that can or should be done in the
> > > > > kernel, but should be in userspace. Any more advice on how to fix this
> > > > > remaining quirk would be very welcome.
> > > > 
> > > > Ouch, this is painful.  We haven't had any devices that require a
> > > > 15 bit unsigned format, and maybe we don't want to add it to the
> > > > common standard format just for one funky device, either.  Such data
> > > > processing could be done in alsa-lib, but for the proper interaction
> > > > with the user-space, the kernel should provide some information so
> > > > that user-space can process the data accordingly.  However, we have no
> > > > proper way defined for it generically, so far.
> > > > 
> > > > Maybe an easy way would be to create an alsa-lib external plugin, and
> > > > apply it per device.  Jaroslav, could it be done via UCM?
> > > 
> > > I agree that we may start with a special plugin for this format. The UCM can
> > > use any alsa-lib configuration now. So PA/PW should work with this very
> > > specific hardware when properly configured.
> > > 
> > > Note that we have SNDRV_PCM_FORMAT_SPECIAL for such cases. It will imply that
> > > the applications will fail when the special conversion plugin is not used. The
> > > minor issue may be with the silence routines (which is already with the
> > > improper format).
> > 
> > I think the combination of format/subformat is available in the case.
> > 
> > Actually the odd frame format is within 16 bit slot, so SNDRV_PCM_FORMAT_U16
> > or so is available. Then we can define new subformat to notify userspace; e.g.
> > SNDRV_PCM_SUBFORMAT_MODEL_SPECIFIC.
> > 
> > The cons is that we need to add new subformat, thus update asound.h. On the
> > other hand, the pros is that existent userspace stuffs still handle the odd format
> > and user can hear playback sound at least even if the loudness is not expected.
> 
> It's a question if the cutted half-wave samples should be sent to D/A in
> this case. Also, if the high-bit (U16) is set, the resulting sample value
> for D/A is completely incorrect. It's not only about loudness.

Indeed. I overlooked in the report... Even if receiving 16 bit sample from
userspace, the device seems not to generate better sound. The conversion is
required for listening anyway.

> My opinion is to not use the U16 sample format for this and handle this
> hw as special one until the format is more common (which I do not expect)
> to extend the format list.

If it's left-justified or right-padding case, msbits parameter shall be
available, but the case is not. If avoiding new entry for format
definitions (the rest is 12 entries in kernel API), it seems to be
inevitable to use SPECIAL format since hardware parameter doesn't deliver
explicit information about effective bits (=width) inner sample slot
(=physical width) for left-padding case.


Regards

Takashi Sakamoto

John Veness June 25, 2022, 10:17 a.m. UTC | #7

Thanks for all the discussions so far. A lot of it is over my head, but 
I appreciate it!

> Even with this patch, there is a remaining problem, which is not present > in the MS2109. The sound sample values range from 0x0000 to 0x7fff, 
with> silence around 0x4000, i.e. 15-bit-ish audio. This actually sounds 
OK to> the ear (although half as loud as it should be), but looks odd 
when> looking at the waveform, and makes volume meters always think the 
sound> is very loud.
On further inspection, the last hex digit of the samples are always 0 or 
8. So the least significant three bits are always zero, and technically 
the values range from 0x0000 to 0x7ff8, not 0x7fff as I said before.

> To convert to s16le, I can bitshift one bit left, and subtract 32768.
> I'm told that this isn't something that can or should be done in the
> kernel, but should be in userspace. Any more advice on how to fix this
> remaining quirk would be very welcome.

I don't know if this makes things more complicated. It is still the case 
that the conversion described above to s16le makes the waveforms look 
correct (and the volume correct), even if the lowest four bits are zero.

John Veness

Quirks for MacroSilicon MS2100/MS2106

Commit Message

Comments

Patch