diff mbox series

hw/pci-host/grackle: Verify PIC link is properly set

Message ID 20201011190332.3189611-1-f4bug@amsat.org
State New
Headers show
Series hw/pci-host/grackle: Verify PIC link is properly set | expand

Commit Message

Philippe Mathieu-Daudé Oct. 11, 2020, 7:03 p.m. UTC
The Grackle PCI host model expects the interrupt controller
being set, but does not verify it is present. Add a check to
help developers using this model.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 hw/pci-host/grackle.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

David Gibson Oct. 11, 2020, 10:34 p.m. UTC | #1
On Sun, Oct 11, 2020 at 09:03:32PM +0200, Philippe Mathieu-Daudé wrote:
> The Grackle PCI host model expects the interrupt controller

> being set, but does not verify it is present. Add a check to

> help developers using this model.


I don't think thaqt's very likely, but, sure, applied to ppc-for-5.2

> 

> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

> ---

>  hw/pci-host/grackle.c | 4 ++++

>  1 file changed, 4 insertions(+)

> 

> diff --git a/hw/pci-host/grackle.c b/hw/pci-host/grackle.c

> index 57c29b20afb..20361d215ca 100644

> --- a/hw/pci-host/grackle.c

> +++ b/hw/pci-host/grackle.c

> @@ -76,6 +76,10 @@ static void grackle_realize(DeviceState *dev, Error **errp)

>      GrackleState *s = GRACKLE_PCI_HOST_BRIDGE(dev);

>      PCIHostState *phb = PCI_HOST_BRIDGE(dev);

>  

> +    if (!s->pic) {

> +        error_setg(errp, TYPE_GRACKLE_PCI_HOST_BRIDGE ": 'pic' link not set");

> +        return;

> +    }

>      phb->bus = pci_register_root_bus(dev, NULL,

>                                       pci_grackle_set_irq,

>                                       pci_grackle_map_irq,


-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
Philippe Mathieu-Daudé Oct. 12, 2020, 6:21 a.m. UTC | #2
On 10/12/20 12:34 AM, David Gibson wrote:
> On Sun, Oct 11, 2020 at 09:03:32PM +0200, Philippe Mathieu-Daudé wrote:

>> The Grackle PCI host model expects the interrupt controller

>> being set, but does not verify it is present. Add a check to

>> help developers using this model.

> 

> I don't think thaqt's very likely, but, sure, applied to ppc-for-5.2


Do you want I correct the description as:
"The Grackle PCI host model expects the interrupt controller
being set, but does not verify it is present.
A developer basing its implementation on the Grackle model
might hit this problem. Add a check to help future developers
using this model as reference."?

> 

>>

>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

>> ---

>>   hw/pci-host/grackle.c | 4 ++++

>>   1 file changed, 4 insertions(+)

>>

>> diff --git a/hw/pci-host/grackle.c b/hw/pci-host/grackle.c

>> index 57c29b20afb..20361d215ca 100644

>> --- a/hw/pci-host/grackle.c

>> +++ b/hw/pci-host/grackle.c

>> @@ -76,6 +76,10 @@ static void grackle_realize(DeviceState *dev, Error **errp)

>>       GrackleState *s = GRACKLE_PCI_HOST_BRIDGE(dev);

>>       PCIHostState *phb = PCI_HOST_BRIDGE(dev);

>>   

>> +    if (!s->pic) {

>> +        error_setg(errp, TYPE_GRACKLE_PCI_HOST_BRIDGE ": 'pic' link not set");

>> +        return;

>> +    }

>>       phb->bus = pci_register_root_bus(dev, NULL,

>>                                        pci_grackle_set_irq,

>>                                        pci_grackle_map_irq,

>
David Gibson Oct. 12, 2020, 6:54 a.m. UTC | #3
On Mon, Oct 12, 2020 at 08:21:41AM +0200, Philippe Mathieu-Daudé wrote:
> On 10/12/20 12:34 AM, David Gibson wrote:
> > On Sun, Oct 11, 2020 at 09:03:32PM +0200, Philippe Mathieu-Daudé wrote:
> > > The Grackle PCI host model expects the interrupt controller
> > > being set, but does not verify it is present. Add a check to
> > > help developers using this model.
> > 
> > I don't think thaqt's very likely, but, sure, applied to ppc-for-5.2
> 
> Do you want I correct the description as:
> "The Grackle PCI host model expects the interrupt controller
> being set, but does not verify it is present.
> A developer basing its implementation on the Grackle model
> might hit this problem. Add a check to help future developers
> using this model as reference."?

No, it's fine.  All I was saying is that the chances of anyone using
Grackle in future seem very low to me.


> 
> > 
> > > 
> > > Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> > > ---
> > >   hw/pci-host/grackle.c | 4 ++++
> > >   1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/hw/pci-host/grackle.c b/hw/pci-host/grackle.c
> > > index 57c29b20afb..20361d215ca 100644
> > > --- a/hw/pci-host/grackle.c
> > > +++ b/hw/pci-host/grackle.c
> > > @@ -76,6 +76,10 @@ static void grackle_realize(DeviceState *dev, Error **errp)
> > >       GrackleState *s = GRACKLE_PCI_HOST_BRIDGE(dev);
> > >       PCIHostState *phb = PCI_HOST_BRIDGE(dev);
> > > +    if (!s->pic) {
> > > +        error_setg(errp, TYPE_GRACKLE_PCI_HOST_BRIDGE ": 'pic' link not set");
> > > +        return;
> > > +    }
> > >       phb->bus = pci_register_root_bus(dev, NULL,
> > >                                        pci_grackle_set_irq,
> > >                                        pci_grackle_map_irq,
> > 
>
Mark Cave-Ayland Oct. 12, 2020, 9:23 a.m. UTC | #4
On 11/10/2020 20:03, Philippe Mathieu-Daudé wrote:

> The Grackle PCI host model expects the interrupt controller

> being set, but does not verify it is present. Add a check to

> help developers using this model.

> 

> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

> ---

>  hw/pci-host/grackle.c | 4 ++++

>  1 file changed, 4 insertions(+)

> 

> diff --git a/hw/pci-host/grackle.c b/hw/pci-host/grackle.c

> index 57c29b20afb..20361d215ca 100644

> --- a/hw/pci-host/grackle.c

> +++ b/hw/pci-host/grackle.c

> @@ -76,6 +76,10 @@ static void grackle_realize(DeviceState *dev, Error **errp)

>      GrackleState *s = GRACKLE_PCI_HOST_BRIDGE(dev);

>      PCIHostState *phb = PCI_HOST_BRIDGE(dev);

>  

> +    if (!s->pic) {

> +        error_setg(errp, TYPE_GRACKLE_PCI_HOST_BRIDGE ": 'pic' link not set");

> +        return;

> +    }

>      phb->bus = pci_register_root_bus(dev, NULL,

>                                       pci_grackle_set_irq,

>                                       pci_grackle_map_irq,


Reviewing this now, my feeling is that both grackle and uninorth are doing the wrong
thing here by passing in a link to the PIC - certainly if I had written this more
recently than I originally did, I would simply use qdev_init_gpio_out() for the IRQ
lines and do the wiring during machine init.

I've got a few other things to look at first, but I'll post a patchset later which I
think will improve this for both Mac machines.


ATB,

Mark.
Xingtao Yao (Fujitsu)" via Oct. 12, 2020, 11:50 a.m. UTC | #5
On Mon, 12 Oct 2020, David Gibson wrote:
> On Mon, Oct 12, 2020 at 08:21:41AM +0200, Philippe Mathieu-Daudé wrote:

>> On 10/12/20 12:34 AM, David Gibson wrote:

>>> On Sun, Oct 11, 2020 at 09:03:32PM +0200, Philippe Mathieu-Daudé wrote:

>>>> The Grackle PCI host model expects the interrupt controller

>>>> being set, but does not verify it is present. Add a check to

>>>> help developers using this model.

>>>

>>> I don't think thaqt's very likely, but, sure, applied to ppc-for-5.2

>>

>> Do you want I correct the description as:

>> "The Grackle PCI host model expects the interrupt controller

>> being set, but does not verify it is present.

>> A developer basing its implementation on the Grackle model

>> might hit this problem. Add a check to help future developers

>> using this model as reference."?

>

> No, it's fine.  All I was saying is that the chances of anyone using

> Grackle in future seem very low to me.


So maybe an assert instead of a user visible error would be enough?

Regards,
BALATON Zoltan

>

>

>>

>>>

>>>>

>>>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

>>>> ---

>>>>   hw/pci-host/grackle.c | 4 ++++

>>>>   1 file changed, 4 insertions(+)

>>>>

>>>> diff --git a/hw/pci-host/grackle.c b/hw/pci-host/grackle.c

>>>> index 57c29b20afb..20361d215ca 100644

>>>> --- a/hw/pci-host/grackle.c

>>>> +++ b/hw/pci-host/grackle.c

>>>> @@ -76,6 +76,10 @@ static void grackle_realize(DeviceState *dev, Error **errp)

>>>>       GrackleState *s = GRACKLE_PCI_HOST_BRIDGE(dev);

>>>>       PCIHostState *phb = PCI_HOST_BRIDGE(dev);

>>>> +    if (!s->pic) {

>>>> +        error_setg(errp, TYPE_GRACKLE_PCI_HOST_BRIDGE ": 'pic' link not set");

>>>> +        return;

>>>> +    }

>>>>       phb->bus = pci_register_root_bus(dev, NULL,

>>>>                                        pci_grackle_set_irq,

>>>>                                        pci_grackle_map_irq,

>>>

>>

>

>
Philippe Mathieu-Daudé Oct. 12, 2020, noon UTC | #6
On 10/12/20 1:50 PM, BALATON Zoltan via wrote:
> On Mon, 12 Oct 2020, David Gibson wrote:

>> On Mon, Oct 12, 2020 at 08:21:41AM +0200, Philippe Mathieu-Daudé 

>> wrote:

>>> On 10/12/20 12:34 AM, David Gibson wrote:

>>>> On Sun, Oct 11, 2020 at 09:03:32PM +0200, Philippe Mathieu-Daudé 

>>>> wrote:

>>>>> The Grackle PCI host model expects the interrupt controller

>>>>> being set, but does not verify it is present. Add a check to

>>>>> help developers using this model.

>>>>

>>>> I don't think thaqt's very likely, but, sure, applied to ppc-for-5.2

>>>

>>> Do you want I correct the description as:

>>> "The Grackle PCI host model expects the interrupt controller

>>> being set, but does not verify it is present.

>>> A developer basing its implementation on the Grackle model

>>> might hit this problem. Add a check to help future developers

>>> using this model as reference."?

>>

>> No, it's fine.  All I was saying is that the chances of anyone using

>> Grackle in future seem very low to me.

> 

> So maybe an assert instead of a user visible error would be enough?


My understanding is realize() shouldn't abort()
(the caller might choose to by using &error_abort).

> 

> Regards,

> BALATON Zoltan

> 

>>

>>

>>>

>>>>

>>>>>

>>>>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

>>>>> ---

>>>>>   hw/pci-host/grackle.c | 4 ++++

>>>>>   1 file changed, 4 insertions(+)

>>>>>

>>>>> diff --git a/hw/pci-host/grackle.c b/hw/pci-host/grackle.c

>>>>> index 57c29b20afb..20361d215ca 100644

>>>>> --- a/hw/pci-host/grackle.c

>>>>> +++ b/hw/pci-host/grackle.c

>>>>> @@ -76,6 +76,10 @@ static void grackle_realize(DeviceState *dev, 

>>>>> Error **errp)

>>>>>       GrackleState *s = GRACKLE_PCI_HOST_BRIDGE(dev);

>>>>>       PCIHostState *phb = PCI_HOST_BRIDGE(dev);

>>>>> +    if (!s->pic) {

>>>>> +        error_setg(errp, TYPE_GRACKLE_PCI_HOST_BRIDGE ": 'pic' 

>>>>> link not set");

>>>>> +        return;

>>>>> +    }

>>>>>       phb->bus = pci_register_root_bus(dev, NULL,

>>>>>                                        pci_grackle_set_irq,

>>>>>                                        pci_grackle_map_irq,

>>>>

>>>

>>

>>
Philippe Mathieu-Daudé Oct. 12, 2020, 12:01 p.m. UTC | #7
On 10/12/20 11:23 AM, Mark Cave-Ayland wrote:
> On 11/10/2020 20:03, Philippe Mathieu-Daudé wrote:
> 
>> The Grackle PCI host model expects the interrupt controller
>> being set, but does not verify it is present. Add a check to
>> help developers using this model.
>>
>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>> ---
>>   hw/pci-host/grackle.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/hw/pci-host/grackle.c b/hw/pci-host/grackle.c
>> index 57c29b20afb..20361d215ca 100644
>> --- a/hw/pci-host/grackle.c
>> +++ b/hw/pci-host/grackle.c
>> @@ -76,6 +76,10 @@ static void grackle_realize(DeviceState *dev, Error **errp)
>>       GrackleState *s = GRACKLE_PCI_HOST_BRIDGE(dev);
>>       PCIHostState *phb = PCI_HOST_BRIDGE(dev);
>>   
>> +    if (!s->pic) {
>> +        error_setg(errp, TYPE_GRACKLE_PCI_HOST_BRIDGE ": 'pic' link not set");
>> +        return;
>> +    }
>>       phb->bus = pci_register_root_bus(dev, NULL,
>>                                        pci_grackle_set_irq,
>>                                        pci_grackle_map_irq,
> 
> Reviewing this now, my feeling is that both grackle and uninorth are doing the wrong
> thing here by passing in a link to the PIC - certainly if I had written this more
> recently than I originally did, I would simply use qdev_init_gpio_out() for the IRQ
> lines and do the wiring during machine init.
> 
> I've got a few other things to look at first, but I'll post a patchset later which I
> think will improve this for both Mac machines.

Awesome, thanks!

> 
> 
> ATB,
> 
> Mark.
>
Markus Armbruster Oct. 19, 2020, 7:50 a.m. UTC | #8
Philippe Mathieu-Daudé <f4bug@amsat.org> writes:

> On 10/12/20 1:50 PM, BALATON Zoltan via wrote:
>> On Mon, 12 Oct 2020, David Gibson wrote:
>>> On Mon, Oct 12, 2020 at 08:21:41AM +0200, Philippe
>>> Mathieu-Daudé wrote:
>>>> On 10/12/20 12:34 AM, David Gibson wrote:
>>>>> On Sun, Oct 11, 2020 at 09:03:32PM +0200, Philippe
>>>>> Mathieu-Daudé wrote:
>>>>>> The Grackle PCI host model expects the interrupt controller
>>>>>> being set, but does not verify it is present. Add a check to
>>>>>> help developers using this model.
>>>>>
>>>>> I don't think thaqt's very likely, but, sure, applied to ppc-for-5.2
>>>>
>>>> Do you want I correct the description as:
>>>> "The Grackle PCI host model expects the interrupt controller
>>>> being set, but does not verify it is present.
>>>> A developer basing its implementation on the Grackle model
>>>> might hit this problem. Add a check to help future developers
>>>> using this model as reference."?
>>>
>>> No, it's fine.  All I was saying is that the chances of anyone using
>>> Grackle in future seem very low to me.
>> So maybe an assert instead of a user visible error would be enough?
>
> My understanding is realize() shouldn't abort()
> (the caller might choose to by using &error_abort).

assert() is for checking invariants.  A violated invariant is a
programming error: developers screwed up, safe recovery is impossible.

Abusing assert() to catch errors that are not programming errors is
wrong.

You may check invariants with assert() anywhere in the code.

You should not misuse assert() anywhere in the code.

Sometimes, an error condition that is *not* a programming error in the
function where it is detected *is* a programming error for certain
calls.  Having these calls pass &error_abort is a common solution for
this problem.

Hope this helps.
Xingtao Yao (Fujitsu)" via Oct. 19, 2020, 11:11 a.m. UTC | #9
On Mon, 19 Oct 2020, Markus Armbruster wrote:
> Philippe Mathieu-Daudé <f4bug@amsat.org> writes:

>> On 10/12/20 1:50 PM, BALATON Zoltan via wrote:

>>> On Mon, 12 Oct 2020, David Gibson wrote:

>>>> On Mon, Oct 12, 2020 at 08:21:41AM +0200, Philippe

>>>> Mathieu-Daudé wrote:

>>>>> On 10/12/20 12:34 AM, David Gibson wrote:

>>>>>> On Sun, Oct 11, 2020 at 09:03:32PM +0200, Philippe

>>>>>> Mathieu-Daudé wrote:

>>>>>>> The Grackle PCI host model expects the interrupt controller

>>>>>>> being set, but does not verify it is present. Add a check to

>>>>>>> help developers using this model.

>>>>>>

>>>>>> I don't think thaqt's very likely, but, sure, applied to ppc-for-5.2

>>>>>

>>>>> Do you want I correct the description as:

>>>>> "The Grackle PCI host model expects the interrupt controller

>>>>> being set, but does not verify it is present.

>>>>> A developer basing its implementation on the Grackle model

>>>>> might hit this problem. Add a check to help future developers

>>>>> using this model as reference."?

>>>>

>>>> No, it's fine.  All I was saying is that the chances of anyone using

>>>> Grackle in future seem very low to me.

>>> So maybe an assert instead of a user visible error would be enough?

>>

>> My understanding is realize() shouldn't abort()

>> (the caller might choose to by using &error_abort).

>

> assert() is for checking invariants.  A violated invariant is a

> programming error: developers screwed up, safe recovery is impossible.

>

> Abusing assert() to catch errors that are not programming errors is

> wrong.

>

> You may check invariants with assert() anywhere in the code.

>

> You should not misuse assert() anywhere in the code.

>

> Sometimes, an error condition that is *not* a programming error in the

> function where it is detected *is* a programming error for certain

> calls.  Having these calls pass &error_abort is a common solution for

> this problem.

>

> Hope this helps.


Helps just a bit but after reading this I'm still confused if this 
particular case should be assert or ser error.

I was suggesting assert and I think it's a programming error to use the 
grackle model without setting PIC link but not sure if users can also 
create this instance via command line (even if it does not make much 
sense) in which case it's probably better to return error. Having all 
devices user creatable via -device without a way to describe their 
dependencies is a nice way to make all sorts of errors possible. But maybe 
aborting with assert during creation of the machine is still OK. If people 
device_add a model later and that crashes then it's their problem. Unless 
we want to avoid that being used as DoS in enterprise setting. So maybe we 
should never abort then if there's a way to fail with an error instead.

I can look at this problem from different angles and all seem to be 
plausible resulting in different solutions.

Regards,
BALATON Zoltan
Markus Armbruster Oct. 19, 2020, 2 p.m. UTC | #10
BALATON Zoltan via <qemu-devel@nongnu.org> writes:

> On Mon, 19 Oct 2020, Markus Armbruster wrote:
>> Philippe Mathieu-Daudé <f4bug@amsat.org> writes:
>>> On 10/12/20 1:50 PM, BALATON Zoltan via wrote:
>>>> On Mon, 12 Oct 2020, David Gibson wrote:
>>>>> On Mon, Oct 12, 2020 at 08:21:41AM +0200, Philippe
>>>>> Mathieu-Daudé wrote:
>>>>>> On 10/12/20 12:34 AM, David Gibson wrote:
>>>>>>> On Sun, Oct 11, 2020 at 09:03:32PM +0200, Philippe
>>>>>>> Mathieu-Daudé wrote:
>>>>>>>> The Grackle PCI host model expects the interrupt controller
>>>>>>>> being set, but does not verify it is present. Add a check to
>>>>>>>> help developers using this model.
>>>>>>>
>>>>>>> I don't think thaqt's very likely, but, sure, applied to ppc-for-5.2
>>>>>>
>>>>>> Do you want I correct the description as:
>>>>>> "The Grackle PCI host model expects the interrupt controller
>>>>>> being set, but does not verify it is present.
>>>>>> A developer basing its implementation on the Grackle model
>>>>>> might hit this problem. Add a check to help future developers
>>>>>> using this model as reference."?
>>>>>
>>>>> No, it's fine.  All I was saying is that the chances of anyone using
>>>>> Grackle in future seem very low to me.
>>>> So maybe an assert instead of a user visible error would be enough?
>>>
>>> My understanding is realize() shouldn't abort()
>>> (the caller might choose to by using &error_abort).
>>
>> assert() is for checking invariants.  A violated invariant is a
>> programming error: developers screwed up, safe recovery is impossible.
>>
>> Abusing assert() to catch errors that are not programming errors is
>> wrong.
>>
>> You may check invariants with assert() anywhere in the code.
>>
>> You should not misuse assert() anywhere in the code.
>>
>> Sometimes, an error condition that is *not* a programming error in the
>> function where it is detected *is* a programming error for certain
>> calls.  Having these calls pass &error_abort is a common solution for
>> this problem.
>>
>> Hope this helps.
>
> Helps just a bit but after reading this I'm still confused if this
> particular case should be assert or ser error.
>
> I was suggesting assert and I think it's a programming error to use
> the grackle model without setting PIC link but not sure if users can
> also create this instance via command line (even if it does not make
> much sense) in which case it's probably better to return error.

They can't: "info qdm" shows

    name "grackle-pcihost", bus System, no-user
                                        ~~~~~~~

>                                                                 Having
> all devices user creatable via -device without a way to describe their 
> dependencies is a nice way to make all sorts of errors possible. But
> maybe aborting with assert during creation of the machine is still
> OK. If people device_add a model later and that crashes then it's
> their problem. Unless we want to avoid that being used as DoS in
> enterprise setting. So maybe we should never abort then if there's a
> way to fail with an error instead.
>
> I can look at this problem from different angles and all seem to be
> plausible resulting in different solutions.

As long as the device is no-user, asserting that properties have sane
values feels reasonable enough to me.

Setting an error instead is not wrong, of course.
Mark Cave-Ayland Oct. 19, 2020, 2:38 p.m. UTC | #11
On 19/10/2020 15:00, Markus Armbruster wrote:

> BALATON Zoltan via <qemu-devel@nongnu.org> writes:

> 

>> On Mon, 19 Oct 2020, Markus Armbruster wrote:

>>> Philippe Mathieu-Daudé <f4bug@amsat.org> writes:

>>>> On 10/12/20 1:50 PM, BALATON Zoltan via wrote:

>>>>> On Mon, 12 Oct 2020, David Gibson wrote:

>>>>>> On Mon, Oct 12, 2020 at 08:21:41AM +0200, Philippe

>>>>>> Mathieu-Daudé wrote:

>>>>>>> On 10/12/20 12:34 AM, David Gibson wrote:

>>>>>>>> On Sun, Oct 11, 2020 at 09:03:32PM +0200, Philippe

>>>>>>>> Mathieu-Daudé wrote:

>>>>>>>>> The Grackle PCI host model expects the interrupt controller

>>>>>>>>> being set, but does not verify it is present. Add a check to

>>>>>>>>> help developers using this model.

>>>>>>>>

>>>>>>>> I don't think thaqt's very likely, but, sure, applied to ppc-for-5.2

>>>>>>>

>>>>>>> Do you want I correct the description as:

>>>>>>> "The Grackle PCI host model expects the interrupt controller

>>>>>>> being set, but does not verify it is present.

>>>>>>> A developer basing its implementation on the Grackle model

>>>>>>> might hit this problem. Add a check to help future developers

>>>>>>> using this model as reference."?

>>>>>>

>>>>>> No, it's fine.  All I was saying is that the chances of anyone using

>>>>>> Grackle in future seem very low to me.

>>>>> So maybe an assert instead of a user visible error would be enough?

>>>>

>>>> My understanding is realize() shouldn't abort()

>>>> (the caller might choose to by using &error_abort).

>>>

>>> assert() is for checking invariants.  A violated invariant is a

>>> programming error: developers screwed up, safe recovery is impossible.

>>>

>>> Abusing assert() to catch errors that are not programming errors is

>>> wrong.

>>>

>>> You may check invariants with assert() anywhere in the code.

>>>

>>> You should not misuse assert() anywhere in the code.

>>>

>>> Sometimes, an error condition that is *not* a programming error in the

>>> function where it is detected *is* a programming error for certain

>>> calls.  Having these calls pass &error_abort is a common solution for

>>> this problem.

>>>

>>> Hope this helps.

>>

>> Helps just a bit but after reading this I'm still confused if this

>> particular case should be assert or ser error.

>>

>> I was suggesting assert and I think it's a programming error to use

>> the grackle model without setting PIC link but not sure if users can

>> also create this instance via command line (even if it does not make

>> much sense) in which case it's probably better to return error.

> 

> They can't: "info qdm" shows

> 

>      name "grackle-pcihost", bus System, no-user

>                                          ~~~~~~~

> 

>>                                                                  Having

>> all devices user creatable via -device without a way to describe their

>> dependencies is a nice way to make all sorts of errors possible. But

>> maybe aborting with assert during creation of the machine is still

>> OK. If people device_add a model later and that crashes then it's

>> their problem. Unless we want to avoid that being used as DoS in

>> enterprise setting. So maybe we should never abort then if there's a

>> way to fail with an error instead.

>>

>> I can look at this problem from different angles and all seem to be

>> plausible resulting in different solutions.

> 

> As long as the device is no-user, asserting that properties have sane

> values feels reasonable enough to me.

> 

> Setting an error instead is not wrong, of course.


One thing I have thought about is being able to mark a link property as mandatory so 
if a value hasn't been set before realize then you receive a fatal error. This would 
be for cases like this where 2 internal devices are connected together without any 
formal interface, i.e. in cases where -device wouldn't work anyway.

I should also add that this patch has been dropped, and its replacement is now in git 
master - the wiring of the PIC has been moved from the grackle PCI host bridge up to 
the machine level where it really belongs.


ATB,

Mark.
Xingtao Yao (Fujitsu)" via Oct. 19, 2020, 4:17 p.m. UTC | #12
On Mon, 19 Oct 2020, Mark Cave-Ayland wrote:
> On 19/10/2020 15:00, Markus Armbruster wrote:
>> BALATON Zoltan via <qemu-devel@nongnu.org> writes:
>>> On Mon, 19 Oct 2020, Markus Armbruster wrote:
>>>> Philippe Mathieu-Daudé <f4bug@amsat.org> writes:
>>>>> On 10/12/20 1:50 PM, BALATON Zoltan via wrote:
>>>>>> On Mon, 12 Oct 2020, David Gibson wrote:
>>>>>>> On Mon, Oct 12, 2020 at 08:21:41AM +0200, Philippe
>>>>>>> Mathieu-Daudé wrote:
>>>>>>>> On 10/12/20 12:34 AM, David Gibson wrote:
>>>>>>>>> On Sun, Oct 11, 2020 at 09:03:32PM +0200, Philippe
>>>>>>>>> Mathieu-Daudé wrote:
>>>>>>>>>> The Grackle PCI host model expects the interrupt controller
>>>>>>>>>> being set, but does not verify it is present. Add a check to
>>>>>>>>>> help developers using this model.
>>>>>>>>> 
>>>>>>>>> I don't think thaqt's very likely, but, sure, applied to ppc-for-5.2
>>>>>>>> 
>>>>>>>> Do you want I correct the description as:
>>>>>>>> "The Grackle PCI host model expects the interrupt controller
>>>>>>>> being set, but does not verify it is present.
>>>>>>>> A developer basing its implementation on the Grackle model
>>>>>>>> might hit this problem. Add a check to help future developers
>>>>>>>> using this model as reference."?
>>>>>>> 
>>>>>>> No, it's fine.  All I was saying is that the chances of anyone using
>>>>>>> Grackle in future seem very low to me.
>>>>>> So maybe an assert instead of a user visible error would be enough?
>>>>> 
>>>>> My understanding is realize() shouldn't abort()
>>>>> (the caller might choose to by using &error_abort).
>>>> 
>>>> assert() is for checking invariants.  A violated invariant is a
>>>> programming error: developers screwed up, safe recovery is impossible.
>>>> 
>>>> Abusing assert() to catch errors that are not programming errors is
>>>> wrong.
>>>> 
>>>> You may check invariants with assert() anywhere in the code.
>>>> 
>>>> You should not misuse assert() anywhere in the code.
>>>> 
>>>> Sometimes, an error condition that is *not* a programming error in the
>>>> function where it is detected *is* a programming error for certain
>>>> calls.  Having these calls pass &error_abort is a common solution for
>>>> this problem.
>>>> 
>>>> Hope this helps.
>>> 
>>> Helps just a bit but after reading this I'm still confused if this
>>> particular case should be assert or ser error.
>>> 
>>> I was suggesting assert and I think it's a programming error to use
>>> the grackle model without setting PIC link but not sure if users can
>>> also create this instance via command line (even if it does not make
>>> much sense) in which case it's probably better to return error.
>> 
>> They can't: "info qdm" shows
>>
>>      name "grackle-pcihost", bus System, no-user
>>                                          ~~~~~~~
>>
>>>                                                                  Having
>>> all devices user creatable via -device without a way to describe their
>>> dependencies is a nice way to make all sorts of errors possible. But
>>> maybe aborting with assert during creation of the machine is still
>>> OK. If people device_add a model later and that crashes then it's
>>> their problem. Unless we want to avoid that being used as DoS in
>>> enterprise setting. So maybe we should never abort then if there's a
>>> way to fail with an error instead.
>>> 
>>> I can look at this problem from different angles and all seem to be
>>> plausible resulting in different solutions.
>> 
>> As long as the device is no-user, asserting that properties have sane
>> values feels reasonable enough to me.
>> 
>> Setting an error instead is not wrong, of course.
>
> One thing I have thought about is being able to mark a link property as 
> mandatory so if a value hasn't been set before realize then you receive a 
> fatal error. This would be for cases like this where 2 internal devices are 
> connected together without any formal interface, i.e. in cases where -device 
> wouldn't work anyway.

In that case we should have no-user anyway and the device has to be 
instantiated by board code so then assert is enough to avoid error in 
board code. More complex way to describe dependencies may only be needed 
if we want to allow dynamic creation of new boards via the command line 
which was the original idea but that's quite a long way away if possible 
at all so I would not worry about that at this point.

Regards,
BALATON Zoltan
Markus Armbruster Oct. 20, 2020, 5:30 a.m. UTC | #13
Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> writes:

> One thing I have thought about is being able to mark a link property
> as mandatory so if a value hasn't been set before realize then you

A non-null value, I presume.

> receive a fatal error. This would be for cases like this where 2
> internal devices are connected together without any formal interface,
> i.e. in cases where -device wouldn't work anyway.

Moves the check from code one step closer to data: from the realize
method to the object_property_add_link() call.

I like doing things in data, because data is easier to reason about than
code.

[...]
Xingtao Yao (Fujitsu)" via Oct. 20, 2020, 11:37 a.m. UTC | #14
On Tue, 20 Oct 2020, Markus Armbruster wrote:
> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> writes:
>
>> One thing I have thought about is being able to mark a link property
>> as mandatory so if a value hasn't been set before realize then you
>
> A non-null value, I presume.

Do you mean something like distinguish between NULL and INVALID_VALUE 
where setting the latter as initial value means property is mandatory?

>> receive a fatal error. This would be for cases like this where 2
>> internal devices are connected together without any formal interface,
>> i.e. in cases where -device wouldn't work anyway.
>
> Moves the check from code one step closer to data: from the realize
> method to the object_property_add_link() call.
>
> I like doing things in data, because data is easier to reason about than
> code.

Except when object initialisation is scattered around in boiler plate code 
as in QEMU where it may not be obvious why a realize method fails due to 
something set/documented in a class_init method elsewhere, whereas an 
assert in the realize method is quite self evident and would document the 
same requirements.

Regards.
BALATON Zoltan
Markus Armbruster Oct. 21, 2020, 3:31 a.m. UTC | #15
BALATON Zoltan via <qemu-devel@nongnu.org> writes:

> On Tue, 20 Oct 2020, Markus Armbruster wrote:

>> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> writes:

>>

>>> One thing I have thought about is being able to mark a link property

>>> as mandatory so if a value hasn't been set before realize then you

>>

>> A non-null value, I presume.

>

> Do you mean something like distinguish between NULL and INVALID_VALUE

> where setting the latter as initial value means property is mandatory?


I doubt "somebody must have set some value (which could be null)" is
useful here.  I believe Mark was thinking about "somebody must have
connected the link (i.e. set a non-null value)".

>>> receive a fatal error. This would be for cases like this where 2

>>> internal devices are connected together without any formal interface,

>>> i.e. in cases where -device wouldn't work anyway.

>>

>> Moves the check from code one step closer to data: from the realize

>> method to the object_property_add_link() call.

>>

>> I like doing things in data, because data is easier to reason about than

>> code.

>

> Except when object initialisation is scattered around in boiler plate

> code as in QEMU where it may not be obvious why a realize method fails

> due to something set/documented in a class_init method elsewhere,

> whereas an assert in the realize method is quite self evident and

> would document the same requirements.


Two aspects.

One, making sense of an assertion failure.  Whether some ad hoc

    assert(s->some_other_object);

in the device realize method crashes, or a generic

    if (!object_property_get_link(s, prop->name, &error_abort)) {
        error_setg(errp, "link %s must be connected", prop->name);
    }

crashes in the error_setg() is all the same to me.  Some developers may
find the latter's crash message superior (I don't care).

Two, making sense of the code.  You have to consider property definition
(which might be code in .class_init(), code in .instance_init(), or data
passed to .device_class_set_props()), property use (in the code that
sets up the device), and property value checking (generic checks in
qom/object.c, specific checks in .realize()).  Moving one check from
"specific" to "generic" won't make much of a difference.
Xingtao Yao (Fujitsu)" via Oct. 21, 2020, 10:21 a.m. UTC | #16
On Wed, 21 Oct 2020, Markus Armbruster wrote:
> BALATON Zoltan via <qemu-devel@nongnu.org> writes:
>> On Tue, 20 Oct 2020, Markus Armbruster wrote:
>>> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> writes:
>>>
>>>> One thing I have thought about is being able to mark a link property
>>>> as mandatory so if a value hasn't been set before realize then you
>>>
>>> A non-null value, I presume.
>>
>> Do you mean something like distinguish between NULL and INVALID_VALUE
>> where setting the latter as initial value means property is mandatory?
>
> I doubt "somebody must have set some value (which could be null)" is
> useful here.  I believe Mark was thinking about "somebody must have
> connected the link (i.e. set a non-null value)".

This is all theoretical, as in this case we have no_user anyway so the 
problem can only happen through programmer error but if an object has 
several properties some of which can be NULL while others are mandatory 
how do you tell which one is mandataory? That's what I thouhgt having 
initial value some INVALID_VALUE could be used to say these are mandatory 
while those properties that are initially NULL don't need to have a valie. 
That way your generic check can work, otherwise it will need another way 
to describe mandatory properties which is additional complexity and more 
boilerplate or will make every link property mandatory which may not 
always work. Anyway, I was just trying to understand your suggestion and 
since we concluded that we don't really need it for this case we can just 
say "YAGNI" as I've learned from you and leave it at that.

Regards,
BALATON Zoltan
diff mbox series

Patch

diff --git a/hw/pci-host/grackle.c b/hw/pci-host/grackle.c
index 57c29b20afb..20361d215ca 100644
--- a/hw/pci-host/grackle.c
+++ b/hw/pci-host/grackle.c
@@ -76,6 +76,10 @@  static void grackle_realize(DeviceState *dev, Error **errp)
     GrackleState *s = GRACKLE_PCI_HOST_BRIDGE(dev);
     PCIHostState *phb = PCI_HOST_BRIDGE(dev);
 
+    if (!s->pic) {
+        error_setg(errp, TYPE_GRACKLE_PCI_HOST_BRIDGE ": 'pic' link not set");
+        return;
+    }
     phb->bus = pci_register_root_bus(dev, NULL,
                                      pci_grackle_set_irq,
                                      pci_grackle_map_irq,