diff mbox

[v10,07/10] OF: Introduce helper function for getting PCI domain_nr

Message ID 1410184472-17630-8-git-send-email-Liviu.Dudau@arm.com
State New
Headers show

Commit Message

Liviu Dudau Sept. 8, 2014, 1:54 p.m. UTC
Add of_pci_get_domain_nr() to retrieve the PCI domain number
of a given device from DT. If the information is not present,
the function can be requested to allocate a new domain number.

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
---
 drivers/of/of_pci.c    | 34 ++++++++++++++++++++++++++++++++++
 include/linux/of_pci.h |  7 +++++++
 2 files changed, 41 insertions(+)

Comments

Rob Herring Sept. 8, 2014, 2:27 p.m. UTC | #1
On Mon, Sep 8, 2014 at 8:54 AM, Liviu Dudau <Liviu.Dudau@arm.com> wrote:
> Add of_pci_get_domain_nr() to retrieve the PCI domain number
> of a given device from DT. If the information is not present,
> the function can be requested to allocate a new domain number.
>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Grant Likely <grant.likely@linaro.org>
> Cc: Rob Herring <robh+dt@kernel.org>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
> ---
>  drivers/of/of_pci.c    | 34 ++++++++++++++++++++++++++++++++++
>  include/linux/of_pci.h |  7 +++++++
>  2 files changed, 41 insertions(+)
>
> diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> index 8481996..a107edb 100644
> --- a/drivers/of/of_pci.c
> +++ b/drivers/of/of_pci.c
> @@ -89,6 +89,40 @@ int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
>  }
>  EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
>
> +static atomic_t of_domain_nr = ATOMIC_INIT(-1);
> +
> +/**
> + * This function will try to obtain the host bridge domain number by
> + * using of_alias_get_id() call with "pci-domain" as a stem. If that
> + * fails, a local allocator will be used. The local allocator can
> + * be requested to return a new domain_nr if the information is missing
> + * from the device tree.
> + *
> + * @node: device tree node with the domain information
> + * @allocate_if_missing: if DT lacks information about the domain nr,
> + * allocate a new number.
> + *
> + * Returns the associated domain number from DT, or a new domain number
> + * if DT information is missing and @allocate_if_missing is true. If
> + * @allocate_if_missing is false then the last allocated domain number
> + * will be returned.
> + */
> +int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
> +{
> +       int domain;
> +
> +       domain = of_alias_get_id(node, "pci-domain");
> +       if (domain == -ENODEV) {
> +               if (allocate_if_missing)
> +                       domain = atomic_inc_return(&of_domain_nr);
> +               else
> +                       domain = atomic_read(&of_domain_nr);

This function seems a bit broken to me. It is overloaded with too many
different outcomes. Think about how this would work if you have
multiple PCI buses and a mixture of having pci-domain aliases or not.
Aren't domain numbers global? Allocation should then start outside of
the range of alias ids.

Rob
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liviu Dudau Sept. 8, 2014, 2:54 p.m. UTC | #2
On Mon, Sep 08, 2014 at 03:27:56PM +0100, Rob Herring wrote:
> On Mon, Sep 8, 2014 at 8:54 AM, Liviu Dudau <Liviu.Dudau@arm.com> wrote:
> > Add of_pci_get_domain_nr() to retrieve the PCI domain number
> > of a given device from DT. If the information is not present,
> > the function can be requested to allocate a new domain number.
> >
> > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: Grant Likely <grant.likely@linaro.org>
> > Cc: Rob Herring <robh+dt@kernel.org>
> > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> > Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
> > ---
> >  drivers/of/of_pci.c    | 34 ++++++++++++++++++++++++++++++++++
> >  include/linux/of_pci.h |  7 +++++++
> >  2 files changed, 41 insertions(+)
> >
> > diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> > index 8481996..a107edb 100644
> > --- a/drivers/of/of_pci.c
> > +++ b/drivers/of/of_pci.c
> > @@ -89,6 +89,40 @@ int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
> >  }
> >  EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
> >
> > +static atomic_t of_domain_nr = ATOMIC_INIT(-1);
> > +
> > +/**
> > + * This function will try to obtain the host bridge domain number by
> > + * using of_alias_get_id() call with "pci-domain" as a stem. If that
> > + * fails, a local allocator will be used. The local allocator can
> > + * be requested to return a new domain_nr if the information is missing
> > + * from the device tree.
> > + *
> > + * @node: device tree node with the domain information
> > + * @allocate_if_missing: if DT lacks information about the domain nr,
> > + * allocate a new number.
> > + *
> > + * Returns the associated domain number from DT, or a new domain number
> > + * if DT information is missing and @allocate_if_missing is true. If
> > + * @allocate_if_missing is false then the last allocated domain number
> > + * will be returned.
> > + */
> > +int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
> > +{
> > +       int domain;
> > +
> > +       domain = of_alias_get_id(node, "pci-domain");
> > +       if (domain == -ENODEV) {
> > +               if (allocate_if_missing)
> > +                       domain = atomic_inc_return(&of_domain_nr);
> > +               else
> > +                       domain = atomic_read(&of_domain_nr);
> 
> This function seems a bit broken to me. It is overloaded with too many
> different outcomes. Think about how this would work if you have
> multiple PCI buses and a mixture of having pci-domain aliases or not.
> Aren't domain numbers global? Allocation should then start outside of
> the range of alias ids.
> 
> Rob
> 

Rob,

Would this version make more sense?

int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
{
       int domain;

       domain = of_alias_get_id(node, "pci-domain");
       if (domain == -ENODEV) {
               if (allocate_if_missing)
                       domain = atomic_inc_return(&of_domain_nr);
               else
                       domain = atomic_read(&of_domain_nr);
       } else {
               /* remember the largest value seen */
               int d = atomic_read(&of_domain_nr);
               atomic_set(&of_domain_nr, max(domain, d));
       }

       return domain;
}
EXPORT_SYMBOL_GPL(of_pci_get_domain_nr);

It would still create gaps and possible duplicates, but this is just a number
and trying to create a new root bus in an existing domain should fail. I have
no clue on how to generate unique values without parsing the DT and filling
a sparse array with values found there and then checking for allocated values
on new requests. This function gets called quite a lot and I'm trying not to
make it too heavy weight.


Best regards,
Liviu
Rob Herring Sept. 8, 2014, 3:27 p.m. UTC | #3
On Mon, Sep 8, 2014 at 9:54 AM, Liviu Dudau <Liviu.Dudau@arm.com> wrote:
> On Mon, Sep 08, 2014 at 03:27:56PM +0100, Rob Herring wrote:
>> On Mon, Sep 8, 2014 at 8:54 AM, Liviu Dudau <Liviu.Dudau@arm.com> wrote:
>> > Add of_pci_get_domain_nr() to retrieve the PCI domain number
>> > of a given device from DT. If the information is not present,
>> > the function can be requested to allocate a new domain number.
>> >
>> > Cc: Bjorn Helgaas <bhelgaas@google.com>
>> > Cc: Arnd Bergmann <arnd@arndb.de>
>> > Cc: Grant Likely <grant.likely@linaro.org>
>> > Cc: Rob Herring <robh+dt@kernel.org>
>> > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>> > Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
>> > ---
>> >  drivers/of/of_pci.c    | 34 ++++++++++++++++++++++++++++++++++
>> >  include/linux/of_pci.h |  7 +++++++
>> >  2 files changed, 41 insertions(+)
>> >
>> > diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
>> > index 8481996..a107edb 100644
>> > --- a/drivers/of/of_pci.c
>> > +++ b/drivers/of/of_pci.c
>> > @@ -89,6 +89,40 @@ int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
>> >  }
>> >  EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
>> >
>> > +static atomic_t of_domain_nr = ATOMIC_INIT(-1);
>> > +
>> > +/**
>> > + * This function will try to obtain the host bridge domain number by
>> > + * using of_alias_get_id() call with "pci-domain" as a stem. If that
>> > + * fails, a local allocator will be used. The local allocator can
>> > + * be requested to return a new domain_nr if the information is missing
>> > + * from the device tree.
>> > + *
>> > + * @node: device tree node with the domain information
>> > + * @allocate_if_missing: if DT lacks information about the domain nr,
>> > + * allocate a new number.
>> > + *
>> > + * Returns the associated domain number from DT, or a new domain number
>> > + * if DT information is missing and @allocate_if_missing is true. If
>> > + * @allocate_if_missing is false then the last allocated domain number
>> > + * will be returned.
>> > + */
>> > +int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
>> > +{
>> > +       int domain;
>> > +
>> > +       domain = of_alias_get_id(node, "pci-domain");
>> > +       if (domain == -ENODEV) {
>> > +               if (allocate_if_missing)
>> > +                       domain = atomic_inc_return(&of_domain_nr);
>> > +               else
>> > +                       domain = atomic_read(&of_domain_nr);
>>
>> This function seems a bit broken to me. It is overloaded with too many
>> different outcomes. Think about how this would work if you have
>> multiple PCI buses and a mixture of having pci-domain aliases or not.
>> Aren't domain numbers global? Allocation should then start outside of
>> the range of alias ids.
>>
>> Rob
>>
>
> Rob,
>
> Would this version make more sense?

No.

> int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
> {
>        int domain;
>
>        domain = of_alias_get_id(node, "pci-domain");
>        if (domain == -ENODEV) {
>                if (allocate_if_missing)
>                        domain = atomic_inc_return(&of_domain_nr);
>                else
>                        domain = atomic_read(&of_domain_nr);
>        } else {
>                /* remember the largest value seen */
>                int d = atomic_read(&of_domain_nr);
>                atomic_set(&of_domain_nr, max(domain, d));
>        }
>
>        return domain;
> }
> EXPORT_SYMBOL_GPL(of_pci_get_domain_nr);
>
> It would still create gaps and possible duplicates, but this is just a number
> and trying to create a new root bus in an existing domain should fail. I have

Is failure okay in that case?

> no clue on how to generate unique values without parsing the DT and filling
> a sparse array with values found there and then checking for allocated values

You really only need to know the maximum value and then start the
non-DT allocations from there.

> on new requests. This function gets called quite a lot and I'm trying not to
> make it too heavy weight.

Generally, nothing should be accessing the same DT value frequently.
It should get cached somewhere.

I don't really understand how domains are used so it's hard to provide
a recommendation here. Do domains even belong in the DT? This function
is just a weird mixture of data retrieval and allocation. I think you
need to separate it into 2 functions.

Rob
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liviu Dudau Sept. 8, 2014, 3:59 p.m. UTC | #4
On Mon, Sep 08, 2014 at 04:27:36PM +0100, Rob Herring wrote:
> On Mon, Sep 8, 2014 at 9:54 AM, Liviu Dudau <Liviu.Dudau@arm.com> wrote:
> > On Mon, Sep 08, 2014 at 03:27:56PM +0100, Rob Herring wrote:
> >> On Mon, Sep 8, 2014 at 8:54 AM, Liviu Dudau <Liviu.Dudau@arm.com> wrote:
> >> > Add of_pci_get_domain_nr() to retrieve the PCI domain number
> >> > of a given device from DT. If the information is not present,
> >> > the function can be requested to allocate a new domain number.
> >> >
> >> > Cc: Bjorn Helgaas <bhelgaas@google.com>
> >> > Cc: Arnd Bergmann <arnd@arndb.de>
> >> > Cc: Grant Likely <grant.likely@linaro.org>
> >> > Cc: Rob Herring <robh+dt@kernel.org>
> >> > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> >> > Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
> >> > ---
> >> >  drivers/of/of_pci.c    | 34 ++++++++++++++++++++++++++++++++++
> >> >  include/linux/of_pci.h |  7 +++++++
> >> >  2 files changed, 41 insertions(+)
> >> >
> >> > diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> >> > index 8481996..a107edb 100644
> >> > --- a/drivers/of/of_pci.c
> >> > +++ b/drivers/of/of_pci.c
> >> > @@ -89,6 +89,40 @@ int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
> >> >  }
> >> >  EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
> >> >
> >> > +static atomic_t of_domain_nr = ATOMIC_INIT(-1);
> >> > +
> >> > +/**
> >> > + * This function will try to obtain the host bridge domain number by
> >> > + * using of_alias_get_id() call with "pci-domain" as a stem. If that
> >> > + * fails, a local allocator will be used. The local allocator can
> >> > + * be requested to return a new domain_nr if the information is missing
> >> > + * from the device tree.
> >> > + *
> >> > + * @node: device tree node with the domain information
> >> > + * @allocate_if_missing: if DT lacks information about the domain nr,
> >> > + * allocate a new number.
> >> > + *
> >> > + * Returns the associated domain number from DT, or a new domain number
> >> > + * if DT information is missing and @allocate_if_missing is true. If
> >> > + * @allocate_if_missing is false then the last allocated domain number
> >> > + * will be returned.
> >> > + */
> >> > +int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
> >> > +{
> >> > +       int domain;
> >> > +
> >> > +       domain = of_alias_get_id(node, "pci-domain");
> >> > +       if (domain == -ENODEV) {
> >> > +               if (allocate_if_missing)
> >> > +                       domain = atomic_inc_return(&of_domain_nr);
> >> > +               else
> >> > +                       domain = atomic_read(&of_domain_nr);
> >>
> >> This function seems a bit broken to me. It is overloaded with too many
> >> different outcomes. Think about how this would work if you have
> >> multiple PCI buses and a mixture of having pci-domain aliases or not.
> >> Aren't domain numbers global? Allocation should then start outside of
> >> the range of alias ids.
> >>
> >> Rob
> >>
> >
> > Rob,
> >
> > Would this version make more sense?
> 
> No.
> 
> > int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
> > {
> >        int domain;
> >
> >        domain = of_alias_get_id(node, "pci-domain");
> >        if (domain == -ENODEV) {
> >                if (allocate_if_missing)
> >                        domain = atomic_inc_return(&of_domain_nr);
> >                else
> >                        domain = atomic_read(&of_domain_nr);
> >        } else {
> >                /* remember the largest value seen */
> >                int d = atomic_read(&of_domain_nr);
> >                atomic_set(&of_domain_nr, max(domain, d));
> >        }
> >
> >        return domain;
> > }
> > EXPORT_SYMBOL_GPL(of_pci_get_domain_nr);
> >
> > It would still create gaps and possible duplicates, but this is just a number
> > and trying to create a new root bus in an existing domain should fail. I have
> 
> Is failure okay in that case?

Well, you would get the failure at development time and hopefully fix the DT to
eliminate sparseness of domain numbers.

> 
> > no clue on how to generate unique values without parsing the DT and filling
> > a sparse array with values found there and then checking for allocated values
> 
> You really only need to know the maximum value and then start the
> non-DT allocations from there.
> 
> > on new requests. This function gets called quite a lot and I'm trying not to
> > make it too heavy weight.
> 
> Generally, nothing should be accessing the same DT value frequently.
> It should get cached somewhere.
> 

The problem appears for DTs that don't have the pci-domain info. Then the cached
value is left at the default non-valid value and attempts to rescan the DT will
be made every time the function is called.

> I don't really understand how domains are used so it's hard to provide
> a recommendation here. Do domains even belong in the DT?

ACPI calls them segments and the way Bjorn explained it to me at some moment was
that it was an attempt to split up a bus in different groups (or alternatively,
merge a few busses together). To be honest I haven't seen systems where the domain
is anything other than zero, but JasonG (or maybe Benjamin) were floating an
idea of using the domain number to identify physical slots.

> This function
> is just a weird mixture of data retrieval and allocation. I think you
> need to separate it into 2 functions.

It is meant to do allocation with the retrieval being a short-cut (or fine
control if you want).

I need to think a bit more for a better solution.

Best regards,
Liviu

> 
> Rob
>
Jason Gunthorpe Sept. 8, 2014, 4:39 p.m. UTC | #5
On Mon, Sep 08, 2014 at 04:59:31PM +0100, Liviu Dudau wrote:

> > I don't really understand how domains are used so it's hard to provide
> > a recommendation here. Do domains even belong in the DT?
> 
> ACPI calls them segments and the way Bjorn explained it to me at some moment was
> that it was an attempt to split up a bus in different groups (or alternatively,
> merge a few busses together). To be honest I haven't seen systems where the domain
> is anything other than zero, but JasonG (or maybe Benjamin) were floating an
> idea of using the domain number to identify physical slots.

A PCI domain qualifies the bus:device.function addressing so that we
can have duplicate BDFs in the system.

So, yes, they belong in DT - each 'top level' PCI node naturally
represents a unique set of BDF addressing below it, and could alias
other top level nodes. The first step to resolve a BDF to a DT node is
to find the correct 'top level' by mapping the domain number.

In an ideal world no small scale system should ever have domains. They
were primarily introduced for large scale NUMA system that actually
have more than 256 PCI busses.

However, from a DT prespective, we've been saying that if the SOC
presents a PCI-E root port that shares nothing with other root ports
(ie aperture, driver instance, config space) then those ports must be
represented by unique 'top level' DT nodes, and each DT node must be
assigned to a Linux PCI domain.

IMHO, designing such a SOC ignores the API guidelines provides by the
PCI-E spec itself, and I hope such a thing won't be considered SBSA
compatible..

Jason
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
wangyijing Sept. 9, 2014, 5:54 a.m. UTC | #6
>>> on new requests. This function gets called quite a lot and I'm trying not to
>>> make it too heavy weight.
>>
>> Generally, nothing should be accessing the same DT value frequently.
>> It should get cached somewhere.
>>
> 
> The problem appears for DTs that don't have the pci-domain info. Then the cached
> value is left at the default non-valid value and attempts to rescan the DT will
> be made every time the function is called.
> 
>> I don't really understand how domains are used so it's hard to provide
>> a recommendation here. Do domains even belong in the DT?
> 
> ACPI calls them segments and the way Bjorn explained it to me at some moment was
> that it was an attempt to split up a bus in different groups (or alternatively,
> merge a few busses together). To be honest I haven't seen systems where the domain
> is anything other than zero, but JasonG (or maybe Benjamin) were floating an
> idea of using the domain number to identify physical slots.

PCI domain(or named segment) is provided by firmware, in ACPI system, we evaluated it
by method "_SEG". in IA64 with ACPI, PCI hostbridge driver retrieves the domain from ACPI,
if it's absent, the default domain is zero. So I wonder why in DTS, if it's absent, we get
a auto increment domain value.

PCI get domain by ACPI "_SEG" in IA64(drivers/acpi/pci_root.c)
......
	status = acpi_evaluate_integer(handle, METHOD_NAME__SEG, NULL,
				       &segment);
	if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) {
		dev_err(&device->dev,  "can't evaluate _SEG\n");
		result = -ENODEV;
		goto end;
	}
.......

Thanks!
Yijing.

> 
>> This function
>> is just a weird mixture of data retrieval and allocation. I think you
>> need to separate it into 2 functions.
> 
> It is meant to do allocation with the retrieval being a short-cut (or fine
> control if you want).
> 
> I need to think a bit more for a better solution.
> 
> Best regards,
> Liviu
> 
>>
>> Rob
>>
>
Liviu Dudau Sept. 9, 2014, 8:46 a.m. UTC | #7
On Tue, Sep 09, 2014 at 06:54:21AM +0100, Yijing Wang wrote:
> >>> on new requests. This function gets called quite a lot and I'm trying not to
> >>> make it too heavy weight.
> >>
> >> Generally, nothing should be accessing the same DT value frequently.
> >> It should get cached somewhere.
> >>
> > 
> > The problem appears for DTs that don't have the pci-domain info. Then the cached
> > value is left at the default non-valid value and attempts to rescan the DT will
> > be made every time the function is called.
> > 
> >> I don't really understand how domains are used so it's hard to provide
> >> a recommendation here. Do domains even belong in the DT?
> > 
> > ACPI calls them segments and the way Bjorn explained it to me at some moment was
> > that it was an attempt to split up a bus in different groups (or alternatively,
> > merge a few busses together). To be honest I haven't seen systems where the domain
> > is anything other than zero, but JasonG (or maybe Benjamin) were floating an
> > idea of using the domain number to identify physical slots.
> 
> PCI domain(or named segment) is provided by firmware, in ACPI system, we evaluated it
> by method "_SEG". in IA64 with ACPI, PCI hostbridge driver retrieves the domain from ACPI,
> if it's absent, the default domain is zero. So I wonder why in DTS, if it's absent, we get
> a auto increment domain value.

Because you can have more than one hostbridge (rare, but not impossible) and unless you
want to join the two segments, you might want to give it a different domain.

Best regards,
Liviu

> 
> PCI get domain by ACPI "_SEG" in IA64(drivers/acpi/pci_root.c)
> ......
> 	status = acpi_evaluate_integer(handle, METHOD_NAME__SEG, NULL,
> 				       &segment);
> 	if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) {
> 		dev_err(&device->dev,  "can't evaluate _SEG\n");
> 		result = -ENODEV;
> 		goto end;
> 	}
> .......
> 
> Thanks!
> Yijing.
> 
> > 
> >> This function
> >> is just a weird mixture of data retrieval and allocation. I think you
> >> need to separate it into 2 functions.
> > 
> > It is meant to do allocation with the retrieval being a short-cut (or fine
> > control if you want).
> > 
> > I need to think a bit more for a better solution.
> > 
> > Best regards,
> > Liviu
> > 
> >>
> >> Rob
> >>
> > 
> 
> 
> -- 
> Thanks!
> Yijing
> 
>
Arnd Bergmann Sept. 9, 2014, 9:16 a.m. UTC | #8
On Tuesday 09 September 2014 09:46:21 Liviu Dudau wrote:
> On Tue, Sep 09, 2014 at 06:54:21AM +0100, Yijing Wang wrote:
> > >>> on new requests. This function gets called quite a lot and I'm trying not to
> > >>> make it too heavy weight.
> > >>
> > >> Generally, nothing should be accessing the same DT value frequently.
> > >> It should get cached somewhere.
> > >>
> > > 
> > > The problem appears for DTs that don't have the pci-domain info. Then the cached
> > > value is left at the default non-valid value and attempts to rescan the DT will
> > > be made every time the function is called.
> > > 
> > >> I don't really understand how domains are used so it's hard to provide
> > >> a recommendation here. Do domains even belong in the DT?
> > > 
> > > ACPI calls them segments and the way Bjorn explained it to me at some moment was
> > > that it was an attempt to split up a bus in different groups (or alternatively,
> > > merge a few busses together). To be honest I haven't seen systems where the domain
> > > is anything other than zero, but JasonG (or maybe Benjamin) were floating an
> > > idea of using the domain number to identify physical slots.
> > 
> > PCI domain(or named segment) is provided by firmware, in ACPI system, we evaluated it
> > by method "_SEG". in IA64 with ACPI, PCI hostbridge driver retrieves the domain from ACPI,
> > if it's absent, the default domain is zero. So I wonder why in DTS, if it's absent, we get
> > a auto increment domain value.
> 
> Because you can have more than one hostbridge (rare, but not impossible) and unless you
> want to join the two segments, you might want to give it a different domain.

I think you misunderstood the question. The difference is that in ACPI you
are required to specify the domain, while in DT it is optional with your
implementation.

I think in general it would be nice if we could mandate that in DT you also
have to always provide a domain number, however the problem is that we can't
change the existing DTB files that people are using that do not specify a
domain.

We could possibly make this an architecture specific setting though and
mandate that all ARM64 platforms have to set it, while ARM32 does not need
it.

	Arnd

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
wangyijing Sept. 9, 2014, 9:30 a.m. UTC | #9
On 2014/9/9 16:46, Liviu Dudau wrote:
> On Tue, Sep 09, 2014 at 06:54:21AM +0100, Yijing Wang wrote:
>>>>> on new requests. This function gets called quite a lot and I'm trying not to
>>>>> make it too heavy weight.
>>>>
>>>> Generally, nothing should be accessing the same DT value frequently.
>>>> It should get cached somewhere.
>>>>
>>>
>>> The problem appears for DTs that don't have the pci-domain info. Then the cached
>>> value is left at the default non-valid value and attempts to rescan the DT will
>>> be made every time the function is called.
>>>
>>>> I don't really understand how domains are used so it's hard to provide
>>>> a recommendation here. Do domains even belong in the DT?
>>>
>>> ACPI calls them segments and the way Bjorn explained it to me at some moment was
>>> that it was an attempt to split up a bus in different groups (or alternatively,
>>> merge a few busses together). To be honest I haven't seen systems where the domain
>>> is anything other than zero, but JasonG (or maybe Benjamin) were floating an
>>> idea of using the domain number to identify physical slots.
>>
>> PCI domain(or named segment) is provided by firmware, in ACPI system, we evaluated it
>> by method "_SEG". in IA64 with ACPI, PCI hostbridge driver retrieves the domain from ACPI,
>> if it's absent, the default domain is zero. So I wonder why in DTS, if it's absent, we get
>> a auto increment domain value.
> 
> Because you can have more than one hostbridge (rare, but not impossible) and unless you
> want to join the two segments, you might want to give it a different domain.

OK. Sorry, I have one last question, because domain will be used to calculate the address used to
access PCI hardware config registers. So if DTS file doesn't report the domain, how do we know
we would access the right registers when we use the auto increment domain vaule ?
Has there a mechanism to make sure system can access the correct registers by the domain ?

Thanks!
Yijing.

> 
> Best regards,
> Liviu
> 
>>
>> PCI get domain by ACPI "_SEG" in IA64(drivers/acpi/pci_root.c)
>> ......
>> 	status = acpi_evaluate_integer(handle, METHOD_NAME__SEG, NULL,
>> 				       &segment);
>> 	if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) {
>> 		dev_err(&device->dev,  "can't evaluate _SEG\n");
>> 		result = -ENODEV;
>> 		goto end;
>> 	}
>> .......
>>
>> Thanks!
>> Yijing.
>>
>>>
>>>> This function
>>>> is just a weird mixture of data retrieval and allocation. I think you
>>>> need to separate it into 2 functions.
>>>
>>> It is meant to do allocation with the retrieval being a short-cut (or fine
>>> control if you want).
>>>
>>> I need to think a bit more for a better solution.
>>>
>>> Best regards,
>>> Liviu
>>>
>>>>
>>>> Rob
>>>>
>>>
>>
>>
>> -- 
>> Thanks!
>> Yijing
>>
>>
>
Catalin Marinas Sept. 9, 2014, 11:20 a.m. UTC | #10
On Tue, Sep 09, 2014 at 10:16:04AM +0100, Arnd Bergmann wrote:
> On Tuesday 09 September 2014 09:46:21 Liviu Dudau wrote:
> > On Tue, Sep 09, 2014 at 06:54:21AM +0100, Yijing Wang wrote:
> > > >>> on new requests. This function gets called quite a lot and I'm trying not to
> > > >>> make it too heavy weight.
> > > >>
> > > >> Generally, nothing should be accessing the same DT value frequently.
> > > >> It should get cached somewhere.
> > > >>
> > > > 
> > > > The problem appears for DTs that don't have the pci-domain info. Then the cached
> > > > value is left at the default non-valid value and attempts to rescan the DT will
> > > > be made every time the function is called.
> > > > 
> > > >> I don't really understand how domains are used so it's hard to provide
> > > >> a recommendation here. Do domains even belong in the DT?
> > > > 
> > > > ACPI calls them segments and the way Bjorn explained it to me at some moment was
> > > > that it was an attempt to split up a bus in different groups (or alternatively,
> > > > merge a few busses together). To be honest I haven't seen systems where the domain
> > > > is anything other than zero, but JasonG (or maybe Benjamin) were floating an
> > > > idea of using the domain number to identify physical slots.
> > > 
> > > PCI domain(or named segment) is provided by firmware, in ACPI system, we evaluated it
> > > by method "_SEG". in IA64 with ACPI, PCI hostbridge driver retrieves the domain from ACPI,
> > > if it's absent, the default domain is zero. So I wonder why in DTS, if it's absent, we get
> > > a auto increment domain value.
> > 
> > Because you can have more than one hostbridge (rare, but not impossible) and unless you
> > want to join the two segments, you might want to give it a different domain.
> 
> I think you misunderstood the question. The difference is that in ACPI you
> are required to specify the domain, while in DT it is optional with your
> implementation.
> 
> I think in general it would be nice if we could mandate that in DT you also
> have to always provide a domain number, however the problem is that we can't
> change the existing DTB files that people are using that do not specify a
> domain.
> 
> We could possibly make this an architecture specific setting though and
> mandate that all ARM64 platforms have to set it, while ARM32 does not need
> it.

We can assume that if a domain is not specified and there is a single
top level PCIe node, the domain defaults to 0. Are there any arm32
platforms that require multiple domains (and do not specify a number in
the DT)?
Liviu Dudau Sept. 9, 2014, 2:11 p.m. UTC | #11
On Tue, Sep 09, 2014 at 10:30:51AM +0100, Yijing Wang wrote:
> On 2014/9/9 16:46, Liviu Dudau wrote:
> > On Tue, Sep 09, 2014 at 06:54:21AM +0100, Yijing Wang wrote:
> >>>>> on new requests. This function gets called quite a lot and I'm trying not to
> >>>>> make it too heavy weight.
> >>>>
> >>>> Generally, nothing should be accessing the same DT value frequently.
> >>>> It should get cached somewhere.
> >>>>
> >>>
> >>> The problem appears for DTs that don't have the pci-domain info. Then the cached
> >>> value is left at the default non-valid value and attempts to rescan the DT will
> >>> be made every time the function is called.
> >>>
> >>>> I don't really understand how domains are used so it's hard to provide
> >>>> a recommendation here. Do domains even belong in the DT?
> >>>
> >>> ACPI calls them segments and the way Bjorn explained it to me at some moment was
> >>> that it was an attempt to split up a bus in different groups (or alternatively,
> >>> merge a few busses together). To be honest I haven't seen systems where the domain
> >>> is anything other than zero, but JasonG (or maybe Benjamin) were floating an
> >>> idea of using the domain number to identify physical slots.
> >>
> >> PCI domain(or named segment) is provided by firmware, in ACPI system, we evaluated it
> >> by method "_SEG". in IA64 with ACPI, PCI hostbridge driver retrieves the domain from ACPI,
> >> if it's absent, the default domain is zero. So I wonder why in DTS, if it's absent, we get
> >> a auto increment domain value.
> > 
> > Because you can have more than one hostbridge (rare, but not impossible) and unless you
> > want to join the two segments, you might want to give it a different domain.
> 
> OK. Sorry, I have one last question, because domain will be used to calculate the address used to
> access PCI hardware config registers. So if DTS file doesn't report the domain, how do we know
> we would access the right registers when we use the auto increment domain vaule ?

That's a good question and sides with Arnd's suggestion to try to mandate the presence of the PCI
domain in the DTS. However, by grepping through the source code, it looks like the architectures
that use the domain number for reading config registers (x86-based) are non-DT architectures,
while DT-aware arches seem to ignore the domain number except when printing out messages. Is that
another confirmation that most DT-aware architectures have only run with domain_nr = 0?


> Has there a mechanism to make sure system can access the correct registers by the domain ?

Not as such if you look with x86 glasses. With the exception of powerpc all other architecures
seem to happily assume domain_nr = 0 and ignore it in the computation of configuration registers
offsets.

Best regards,
Liviu

> 
> Thanks!
> Yijing.
> 
> > 
> > Best regards,
> > Liviu
> > 
> >>
> >> PCI get domain by ACPI "_SEG" in IA64(drivers/acpi/pci_root.c)
> >> ......
> >> 	status = acpi_evaluate_integer(handle, METHOD_NAME__SEG, NULL,
> >> 				       &segment);
> >> 	if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) {
> >> 		dev_err(&device->dev,  "can't evaluate _SEG\n");
> >> 		result = -ENODEV;
> >> 		goto end;
> >> 	}
> >> .......
> >>
> >> Thanks!
> >> Yijing.
> >>
> >>>
> >>>> This function
> >>>> is just a weird mixture of data retrieval and allocation. I think you
> >>>> need to separate it into 2 functions.
> >>>
> >>> It is meant to do allocation with the retrieval being a short-cut (or fine
> >>> control if you want).
> >>>
> >>> I need to think a bit more for a better solution.
> >>>
> >>> Best regards,
> >>> Liviu
> >>>
> >>>>
> >>>> Rob
> >>>>
> >>>
> >>
> >>
> >> -- 
> >> Thanks!
> >> Yijing
> >>
> >>
> > 
> 
> 
> -- 
> Thanks!
> Yijing
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
Bjorn Helgaas Sept. 9, 2014, 2:17 p.m. UTC | #12
On Tue, Sep 9, 2014 at 3:16 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Tuesday 09 September 2014 09:46:21 Liviu Dudau wrote:
>> On Tue, Sep 09, 2014 at 06:54:21AM +0100, Yijing Wang wrote:
>> > PCI domain(or named segment) is provided by firmware, in ACPI system, we evaluated it
>> > by method "_SEG". in IA64 with ACPI, PCI hostbridge driver retrieves the domain from ACPI,
>> > if it's absent, the default domain is zero. So I wonder why in DTS, if it's absent, we get
>> > a auto increment domain value.
>>
>> Because you can have more than one hostbridge (rare, but not impossible) and unless you
>> want to join the two segments, you might want to give it a different domain.
>
> I think you misunderstood the question. The difference is that in ACPI you
> are required to specify the domain, while in DT it is optional with your
> implementation.

_SEG is actually optional, but the spec (ACPI r5.0, sec 6.5.6) says to
assume domain 0 if _SEG is absent.

Yijing mentioned ia64, which is relevant because PCI config accesses
on ia64 are done via a firmware (SAL) interface, and the domain is
part of that interface, so the kernel is actually required to supply
the correct domain number (0 or _SEG value) for each host bridge.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Bjorn Helgaas Sept. 9, 2014, 2:26 p.m. UTC | #13
On Tue, Sep 9, 2014 at 3:30 AM, Yijing Wang <wangyijing@huawei.com> wrote:
> On 2014/9/9 16:46, Liviu Dudau wrote:
>> On Tue, Sep 09, 2014 at 06:54:21AM +0100, Yijing Wang wrote:
>>>>>> on new requests. This function gets called quite a lot and I'm trying not to
>>>>>> make it too heavy weight.
>>>>>
>>>>> Generally, nothing should be accessing the same DT value frequently.
>>>>> It should get cached somewhere.
>>>>>
>>>>
>>>> The problem appears for DTs that don't have the pci-domain info. Then the cached
>>>> value is left at the default non-valid value and attempts to rescan the DT will
>>>> be made every time the function is called.
>>>>
>>>>> I don't really understand how domains are used so it's hard to provide
>>>>> a recommendation here. Do domains even belong in the DT?
>>>>
>>>> ACPI calls them segments and the way Bjorn explained it to me at some moment was
>>>> that it was an attempt to split up a bus in different groups (or alternatively,
>>>> merge a few busses together). To be honest I haven't seen systems where the domain
>>>> is anything other than zero, but JasonG (or maybe Benjamin) were floating an
>>>> idea of using the domain number to identify physical slots.
>>>
>>> PCI domain(or named segment) is provided by firmware, in ACPI system, we evaluated it
>>> by method "_SEG". in IA64 with ACPI, PCI hostbridge driver retrieves the domain from ACPI,
>>> if it's absent, the default domain is zero. So I wonder why in DTS, if it's absent, we get
>>> a auto increment domain value.
>>
>> Because you can have more than one hostbridge (rare, but not impossible) and unless you
>> want to join the two segments, you might want to give it a different domain.
>
> OK. Sorry, I have one last question, because domain will be used to calculate the address used to
> access PCI hardware config registers. So if DTS file doesn't report the domain, how do we know
> we would access the right registers when we use the auto increment domain vaule ?
> Has there a mechanism to make sure system can access the correct registers by the domain ?

I think you're referring to config access via ECAM (PCIe r3.0, sec
7.2.2).  In that case, I don't think the domain should be used to
compute the memory-mapped configuration address.  Each host bridge is
in exactly one domain, and each host bridge has an associated ECAM
area base.  The address calculation uses the bus number, device
number, function number, and register number to compute an offset into
the ECAM area.

So as long as the DT tells you the ECAM information for each host
bridge, that should be sufficient.  The domain number is then just a
Linux convenience and is not tied to the platform as it is on ia64.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Sept. 9, 2014, 3:41 p.m. UTC | #14
On Tue, Sep 09, 2014 at 08:26:19AM -0600, Bjorn Helgaas wrote:

> So as long as the DT tells you the ECAM information for each host
> bridge, that should be sufficient.  The domain number is then just a
> Linux convenience and is not tied to the platform as it is on ia64.

I think this is right for DT systems - the domain is purely internal
to the kernel and userspace, it is used to locate the proper host
bridge driver instance, which contains the proper config accessor (and
register bases, etc).

AFAIK the main reason to have a DT alias to learn the domain number is
to make it stable so things like udev/etc can reliably match on the
PCI location.

This is similar to i2c, etc that use the alias scheme, so IMHO
whatever they do to assign ID's to drivers should be copied for domain
numbers.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
wangyijing Sept. 10, 2014, 1:44 a.m. UTC | #15
>> OK. Sorry, I have one last question, because domain will be used to calculate the address used to
>> access PCI hardware config registers. So if DTS file doesn't report the domain, how do we know
>> we would access the right registers when we use the auto increment domain vaule ?
> 
> That's a good question and sides with Arnd's suggestion to try to mandate the presence of the PCI
> domain in the DTS. However, by grepping through the source code, it looks like the architectures
> that use the domain number for reading config registers (x86-based) are non-DT architectures,
> while DT-aware arches seem to ignore the domain number except when printing out messages. Is that
> another confirmation that most DT-aware architectures have only run with domain_nr = 0?
> 

Arnd's suggestion is make sense to me, thanks for Bjorn's detailed explanation, now I know domain_nr
is purely internal to kernel in DT-aware platform, it's not needed when access PCI config space.

Thanks!
Yijing.

> 
>> Has there a mechanism to make sure system can access the correct registers by the domain ?
> 
> Not as such if you look with x86 glasses. With the exception of powerpc all other architecures
> seem to happily assume domain_nr = 0 and ignore it in the computation of configuration registers
> offsets.
> 
> Best regards,
> Liviu
> 
>>
>> Thanks!
>> Yijing.
>>
>>>
>>> Best regards,
>>> Liviu
>>>
>>>>
>>>> PCI get domain by ACPI "_SEG" in IA64(drivers/acpi/pci_root.c)
>>>> ......
>>>> 	status = acpi_evaluate_integer(handle, METHOD_NAME__SEG, NULL,
>>>> 				       &segment);
>>>> 	if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) {
>>>> 		dev_err(&device->dev,  "can't evaluate _SEG\n");
>>>> 		result = -ENODEV;
>>>> 		goto end;
>>>> 	}
>>>> .......
>>>>
>>>> Thanks!
>>>> Yijing.
>>>>
>>>>>
>>>>>> This function
>>>>>> is just a weird mixture of data retrieval and allocation. I think you
>>>>>> need to separate it into 2 functions.
>>>>>
>>>>> It is meant to do allocation with the retrieval being a short-cut (or fine
>>>>> control if you want).
>>>>>
>>>>> I need to think a bit more for a better solution.
>>>>>
>>>>> Best regards,
>>>>> Liviu
>>>>>
>>>>>>
>>>>>> Rob
>>>>>>
>>>>>
>>>>
>>>>
>>>> -- 
>>>> Thanks!
>>>> Yijing
>>>>
>>>>
>>>
>>
>>
>> -- 
>> Thanks!
>> Yijing
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
Rob Herring Sept. 10, 2014, 2:44 a.m. UTC | #16
On Tue, Sep 9, 2014 at 10:41 AM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Tue, Sep 09, 2014 at 08:26:19AM -0600, Bjorn Helgaas wrote:
>
>> So as long as the DT tells you the ECAM information for each host
>> bridge, that should be sufficient.  The domain number is then just a
>> Linux convenience and is not tied to the platform as it is on ia64.
>
> I think this is right for DT systems - the domain is purely internal
> to the kernel and userspace, it is used to locate the proper host
> bridge driver instance, which contains the proper config accessor (and
> register bases, etc).
>
> AFAIK the main reason to have a DT alias to learn the domain number is
> to make it stable so things like udev/etc can reliably match on the
> PCI location.

For what purpose?

> This is similar to i2c, etc that use the alias scheme, so IMHO
> whatever they do to assign ID's to drivers should be copied for domain
> numbers.

IMO they should not. We really want to move away from aliases, not
expand their use. They are used for serial because there was no good
way to not break things like "console=ttyS0". I2C I think was more
internal, but may have been for i2c-dev. What are we going to break if
we don't have consistent domain numbering? If the domain goes into the
DT, I'd rather see it as part of the PCI root node. But I'm not
convinced it is needed.

It doesn't really sound like we have any actual need to solve this for
DT ATM. It's not clear to me if all buses should be domain 0 or a
simple incrementing index for each bus in absence of any firmware set
value.

Rob
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Sept. 10, 2014, 4:32 p.m. UTC | #17
On Tue, Sep 09, 2014 at 09:44:56PM -0500, Rob Herring wrote:
> On Tue, Sep 9, 2014 at 10:41 AM, Jason Gunthorpe
> <jgunthorpe@obsidianresearch.com> wrote:
> > On Tue, Sep 09, 2014 at 08:26:19AM -0600, Bjorn Helgaas wrote:
> >
> >> So as long as the DT tells you the ECAM information for each host
> >> bridge, that should be sufficient.  The domain number is then just a
> >> Linux convenience and is not tied to the platform as it is on ia64.
> >
> > I think this is right for DT systems - the domain is purely internal
> > to the kernel and userspace, it is used to locate the proper host
> > bridge driver instance, which contains the proper config accessor (and
> > register bases, etc).
> >
> > AFAIK the main reason to have a DT alias to learn the domain number is
> > to make it stable so things like udev/etc can reliably match on the
> > PCI location.
> 
> For what purpose?

udev places PCI D:B:D.F's all over the place, eg:

$ ls /dev/serial/by-path/
pci-0000:00:1a.0-usb-0:1.4.2.5:1.0@
pci-0000:00:1a.0-usb-0:1.4.2.6:1.0@
$ ls /dev/disk/by-path/
pci-0000:03:00.0-sas-0x5fcfffff00000001-lun-0@
pci-0000:03:00.0-sas-0x5fcfffff00000001-lun-0-part1@
pci-0000:03:00.0-sas-0x5fcfffff00000001-lun-0-part2@
pci-0000:03:00.0-sas-0x5fcfffff00000001-lun-0-part3@
    ^^^^
domain number

It is part of the stable naming scheme udev uses - and that scheme is
predicated on the PCI location being stable.

> IMO they should not. We really want to move away from aliases, not
> expand their use. They are used for serial because there was no good
> way to not break things like "console=ttyS0". I2C I think was more
> internal, but may have been for i2c-dev. What are we going to break if
> we don't have consistent domain numbering? 

Well, DT needs some kind of solution to produce stable names for
things. It is no good if the names unpredictably randomize.

Lets use I2C as an example. My embedded systems have multiple I2C
busses, with multiple drivers (so load order is not easily ensured).
I am relying on DT aliases to force the bus numbers to stable values,
but if I don't do that - then how do I find things in user space?

$ ls -l /sys/bus/i2c/devices
lrwxrwxrwx    1 root     root            0 Sep 10 16:12 i2c-2 -> ../../../devices/pci0000:00/0000:00:01.0/i2c_qsfp.4/i2c-2

Oh, I have to search based on the HW path, which includes the domain
number.

So that *must* be really stable, or user space has no hope of mapping
real hardware back to an I2C bus number.

The same is true for pretty much every small IDR scheme in the kernel
- they rely on the HW path for stable names.

As an aside, I think for embedded being able to directly specify
things like the bus number for I2C, ethX for ethernet, etc is very
valuable, I would be sad to see that go away.

> DT, I'd rather see it as part of the PCI root node. But I'm not
> convinced it is needed.
 
> It doesn't really sound like we have any actual need to solve this for
> DT ATM. It's not clear to me if all buses should be domain 0 or a
> simple incrementing index for each bus in absence of any firmware set
> value.

There are already ARM DTs in the kernel that require multiple domains,
so that ship has sailed.

I don't think it really matters where the number comes from, so long
as it doesn't change after a kernel upgrade, or with a different
module load order, or if the DT is recompiled, or whatever. It needs
to be stable.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Arnd Bergmann Sept. 10, 2014, 6:19 p.m. UTC | #18
On Tuesday 09 September 2014 12:20:54 Catalin Marinas wrote:
> 
> We can assume that if a domain is not specified and there is a single
> top level PCIe node, the domain defaults to 0. Are there any arm32
> platforms that require multiple domains (and do not specify a number in
> the DT)?

In theory, I think all of them could work with a single domain, but then
you need to partition the bus number space between the host controllers,
so you have the exact same situation that you either need to make up
random bus numbers or put them in DT.

Using multiple domains is way cleaner for this, even if we have to
make up the numbers.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Phil Edworthy Sept. 11, 2014, 2:11 p.m. UTC | #19
Hi,

On 10 September 2014 19:20, Arnd wrote:
> On Tuesday 09 September 2014 12:20:54 Catalin Marinas wrote:
> >
> > We can assume that if a domain is not specified and there is a single
> > top level PCIe node, the domain defaults to 0. Are there any arm32
> > platforms that require multiple domains (and do not specify a number in
> > the DT)?
> 
> In theory, I think all of them could work with a single domain, but then
> you need to partition the bus number space between the host controllers,
> so you have the exact same situation that you either need to make up
> random bus numbers or put them in DT.
> 
> Using multiple domains is way cleaner for this, even if we have to
> make up the numbers.

Maybe this is a stupid question, but why would you want to specify the domain
in the DT at all? Doesn't every instance of a driver imply a separate domain?

Thanks
Phil 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Arnd Bergmann Sept. 11, 2014, 2:49 p.m. UTC | #20
On Thursday 11 September 2014 14:11:05 Phil Edworthy wrote:
> On 10 September 2014 19:20, Arnd wrote:
> > On Tuesday 09 September 2014 12:20:54 Catalin Marinas wrote:
> > >
> > > We can assume that if a domain is not specified and there is a single
> > > top level PCIe node, the domain defaults to 0. Are there any arm32
> > > platforms that require multiple domains (and do not specify a number in
> > > the DT)?
> > 
> > In theory, I think all of them could work with a single domain, but then
> > you need to partition the bus number space between the host controllers,
> > so you have the exact same situation that you either need to make up
> > random bus numbers or put them in DT.
> > 
> > Using multiple domains is way cleaner for this, even if we have to
> > make up the numbers.
> 
> Maybe this is a stupid question, but why would you want to specify the domain
> in the DT at all? Doesn't every instance of a driver imply a separate domain?

See Jason Gunthorpe's latest reply on the topic. In short, the domain is
visible to user space and we want it to be stable across boots when
user configuration refers to devices by domain:bus:device:function
numbers.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
diff mbox

Patch

diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 8481996..a107edb 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -89,6 +89,40 @@  int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
 }
 EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
 
+static atomic_t of_domain_nr = ATOMIC_INIT(-1);
+
+/**
+ * This function will try to obtain the host bridge domain number by
+ * using of_alias_get_id() call with "pci-domain" as a stem. If that
+ * fails, a local allocator will be used. The local allocator can
+ * be requested to return a new domain_nr if the information is missing
+ * from the device tree.
+ *
+ * @node: device tree node with the domain information
+ * @allocate_if_missing: if DT lacks information about the domain nr,
+ * allocate a new number.
+ *
+ * Returns the associated domain number from DT, or a new domain number
+ * if DT information is missing and @allocate_if_missing is true. If
+ * @allocate_if_missing is false then the last allocated domain number
+ * will be returned.
+ */
+int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
+{
+	int domain;
+
+	domain = of_alias_get_id(node, "pci-domain");
+	if (domain == -ENODEV) {
+		if (allocate_if_missing)
+			domain = atomic_inc_return(&of_domain_nr);
+		else
+			domain = atomic_read(&of_domain_nr);
+	}
+
+	return domain;
+}
+EXPORT_SYMBOL_GPL(of_pci_get_domain_nr);
+
 #ifdef CONFIG_PCI_MSI
 
 static LIST_HEAD(of_pci_msi_chip_list);
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index dde3a4a..3a3824c 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -15,6 +15,7 @@  struct device_node *of_pci_find_child_device(struct device_node *parent,
 int of_pci_get_devfn(struct device_node *np);
 int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
 int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
+int of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing);
 #else
 static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq)
 {
@@ -43,6 +44,12 @@  of_pci_parse_bus_range(struct device_node *node, struct resource *res)
 {
 	return -EINVAL;
 }
+
+static inline int
+of_pci_get_domain_nr(struct device_node *node, bool allocate_if_missing)
+{
+	return -1;
+}
 #endif
 
 #if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)