diff mbox

[edk2,v2,2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup

Message ID 20161124205652.12113-3-lersek@redhat.com
State New
Headers show

Commit Message

Laszlo Ersek Nov. 24, 2016, 8:56 p.m. UTC
Sometimes a platform knows exactly how many CPUs it has at boot. It should
be able to
- set PcdCpuMaxLogicalProcessorNumber dynamically to this number,
- set PcdCpuApInitTimeOutInMicroSeconds to infinity (marked with 0),
- and expect that MpInitLib wait exactly as long as necessary for all APs
  to report in.

Other platforms should be able to continue setting a reasonably large
upper bound on supported CPUs, and waiting for a reasonable, fixed amount
of time for all APs to report in.

Add this functionality. The TimedWaitForApFinish() function will return
when all APs have reported in, or the timeout has expired -- whichever
happens first.

(Accessing these PCDs dynamically is safe. The PEI and DXE phase instances
of this library are restricted to PEIM and DXE_DRIVER client modules, thus
the PCD accesses cannot be linked into runtime code.)

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=116
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>

---
 UefiCpuPkg/UefiCpuPkg.dec            |  3 +-
 UefiCpuPkg/Library/MpInitLib/MpLib.c | 67 +++++++++++++++++++-
 2 files changed, 68 insertions(+), 2 deletions(-)

-- 
2.9.2


_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Comments

Laszlo Ersek Nov. 28, 2016, 10:15 a.m. UTC | #1
On 11/28/16 06:37, Fan, Jeff wrote:
> Laszlo,

> 

> Setting PcdCpuApInitTimeOutInMicroSeconds to infinity (marked with 0) is behavior changing. 

> 

> Even there is no any multi-processor supporting platform set this PCD to zero, this PCD maybe set to 0 on single processor platform.


On a uniprocessor platform, PcdCpuApInitTimeOutInMicroSeconds makes no
difference, since the code that uses the PCD is not reached, due to
commit 14e8137c8223 ("UefiCpuPkg/MpInitLib: Do not wakeup AP if only one
processor supported").

> I prefer to setting PcdCpuApInitTimeOutInMicroSeconds to 0xFFFFFFFF (about 1.1 hour) for such purpose.


In practice that is good enough for OVMF too, I guess.

However, I'm unsure how you would like me to update the patch. The patch
does not add any specific code for handling the 0 timeout as infinity.
Instead, it simply documents the existing behavior of CalculateTimeout()
and CheckTimeout(). Those helper functions are just right for
implementing the feature at hand, and they already handle a zero timeout
as infinity.

In fact, my original approach to implement TimedWaitForApFinish() was to
add brand new code for the timer loop, using the performance counter.
However then I noticed CalculateTimeout() and CheckTimeout(), and
thought that you would ask me to implement TimedWaitForApFinish() with
those two functions. This way the code reuse is good, and a zero timeout
happens to mean infinity (which is just right for OVMF, is not used by
any physical multiprocessor platform, and is irrelevant for uniprocessor).

... Hm, wait, do you mean that a uniprocessor platform might leave
PcdCpuMaxLogicalProcessorNumber > 1 (for example, the default 64), and
set *only* the timeout to zero (because they "know" there aren't more CPUs)?

If that's what you mean, I consider it a platform bug: if they know
there's only one CPU, they should set PcdCpuMaxLogicalProcessorNumber to
1 -- for saving memory.

Anyway, if you prefer, I can modify TimedWaitForApFinish() like this:
add an explicit check for TimeLimit==0 in the beginning, and return
immediately if that condition evaluates to true. (Plus, update the
comments, and modify patch #3 to use MAX_UINT32 as TimeLinit.)

What do you think?

Thanks
Laszlo

> 

> Thanks!

> Jeff

> 

> -----Original Message-----

> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek

> Sent: Friday, November 25, 2016 4:57 AM

> To: edk2-devel-01

> Cc: Kinney, Michael D; Fan, Jeff; Justen, Jordan L

> Subject: [edk2] [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup

> 

> Sometimes a platform knows exactly how many CPUs it has at boot. It should be able to

> - set PcdCpuMaxLogicalProcessorNumber dynamically to this number,

> - set PcdCpuApInitTimeOutInMicroSeconds to infinity (marked with 0),

> - and expect that MpInitLib wait exactly as long as necessary for all APs

>   to report in.

> 

> Other platforms should be able to continue setting a reasonably large upper bound on supported CPUs, and waiting for a reasonable, fixed amount of time for all APs to report in.

> 

> Add this functionality. The TimedWaitForApFinish() function will return when all APs have reported in, or the timeout has expired -- whichever happens first.

> 

> (Accessing these PCDs dynamically is safe. The PEI and DXE phase instances of this library are restricted to PEIM and DXE_DRIVER client modules, thus the PCD accesses cannot be linked into runtime code.)

> 

> Cc: Igor Mammedov <imammedo@redhat.com>

> Cc: Jeff Fan <jeff.fan@intel.com>

> Cc: Jordan Justen <jordan.l.justen@intel.com>

> Cc: Michael Kinney <michael.d.kinney@intel.com>

> Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=116

> Contributed-under: TianoCore Contribution Agreement 1.0

> Signed-off-by: Laszlo Ersek <lersek@redhat.com>

> ---

>  UefiCpuPkg/UefiCpuPkg.dec            |  3 +-

>  UefiCpuPkg/Library/MpInitLib/MpLib.c | 67 +++++++++++++++++++-

>  2 files changed, 68 insertions(+), 2 deletions(-)

> 

> diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec index ca560398bbef..8607ae8b7ae8 100644

> --- a/UefiCpuPkg/UefiCpuPkg.dec

> +++ b/UefiCpuPkg/UefiCpuPkg.dec

> @@ -174,7 +174,8 @@ [PcdsFixedAtBuild, PcdsPatchableInModule, PcdsDynamic, PcdsDynamicEx]

>    # @Prompt Configure max supported number of Logical Processors

>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64|UINT32|0x00000002

>    ## Specifies timeout value in microseconds for the BSP to detect all APs for the first time.

> -  # @Prompt Timeout for the BSP to detect all APs for the first time.

> +  #  When set to zero, the BSP will wait forever for all (PcdCpuMaxLogicalProcessorNumber-1) APs to report in.

> +  # @Prompt Timeout for the BSP to detect all APs for the first time (zero means await maximum supported AP count).

>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000|UINT32|0x00000004

>    ## Specifies the base address of the first microcode Patch in the microcode Region.

>    # @Prompt Microcode Region base address.

> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c

> index 15dbfa1e7d6c..76de92fa552a 100644

> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c

> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c

> @@ -684,6 +684,22 @@ FillExchangeInfoData (  }

>  

>  /**

> +  Helper function that waits until the finished AP count reaches the 

> + specified  limit, or the specified timeout elapses (whichever comes first).

> +

> +  @param[in] CpuMpData        Pointer to CPU MP Data.

> +  @param[in] FinishedApLimit  The number of finished APs to wait for.

> +  @param[in] TimeLimit        The number of microseconds to wait for. Zero

> +                              means infinity.

> +**/

> +VOID

> +TimedWaitForApFinish (

> +  IN CPU_MP_DATA               *CpuMpData,

> +  IN UINT32                    FinishedApLimit,

> +  IN UINT32                    TimeLimit

> +  );

> +

> +/**

>    This function will be called by BSP to wakeup AP.

>  

>    @param[in] CpuMpData          Pointer to CPU MP Data

> @@ -748,7 +764,11 @@ WakeUpAP (

>        //

>        // Wait for all potential APs waken up in one specified period

>        //

> -      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));

> +      TimedWaitForApFinish (

> +        CpuMpData,

> +        PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,

> +        PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)

> +        );

>      } else {

>        //

>        // Wait all APs waken up if this is not the 1st broadcast of SIPI @@ -895,6 +915,51 @@ CheckTimeout (  }

>  

>  /**

> +  Helper function that waits until the finished AP count reaches the 

> + specified  limit, or the specified timeout elapses (whichever comes first).

> +

> +  @param[in] CpuMpData        Pointer to CPU MP Data.

> +  @param[in] FinishedApLimit  The number of finished APs to wait for.

> +  @param[in] TimeLimit        The number of microseconds to wait for. Zero

> +                              means infinity.

> +**/

> +VOID

> +TimedWaitForApFinish (

> +  IN CPU_MP_DATA               *CpuMpData,

> +  IN UINT32                    FinishedApLimit,

> +  IN UINT32                    TimeLimit

> +  )

> +{

> +  CpuMpData->TotalTime = 0;

> +  CpuMpData->ExpectedTime = CalculateTimeout (

> +                              TimeLimit,

> +                              &CpuMpData->CurrentTime

> +                              );

> +  while (CpuMpData->FinishedCount < FinishedApLimit &&

> +         !CheckTimeout (

> +            &CpuMpData->CurrentTime,

> +            &CpuMpData->TotalTime,

> +            CpuMpData->ExpectedTime

> +            )) {

> +    CpuPause ();

> +  }

> +

> +  if (CpuMpData->FinishedCount >= FinishedApLimit) {

> +    DEBUG ((

> +      DEBUG_VERBOSE,

> +      "%a: reached FinishedApLimit=%u in %Lu microseconds\n",

> +      __FUNCTION__,

> +      FinishedApLimit,

> +      DivU64x64Remainder (

> +        MultU64x32 (CpuMpData->TotalTime, 1000000),

> +        GetPerformanceCounterProperties (NULL, NULL),

> +        NULL

> +        )

> +      ));

> +  }

> +}

> +

> +/**

>    Reset an AP to Idle state.

>  

>    Any task being executed by the AP will be aborted and the AP

> --

> 2.9.2

> 

> 

> _______________________________________________

> edk2-devel mailing list

> edk2-devel@lists.01.org

> https://lists.01.org/mailman/listinfo/edk2-devel

> 


_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Igor Mammedov Nov. 28, 2016, 11:58 a.m. UTC | #2
On Mon, 28 Nov 2016 11:15:15 +0100
Laszlo Ersek <lersek@redhat.com> wrote:

> On 11/28/16 06:37, Fan, Jeff wrote:

> > Laszlo,

> > 

> > Setting PcdCpuApInitTimeOutInMicroSeconds to infinity (marked with 0) is behavior changing. 

> > 

> > Even there is no any multi-processor supporting platform set this PCD to zero, this PCD maybe set to 0 on single processor platform.  

> 

> On a uniprocessor platform, PcdCpuApInitTimeOutInMicroSeconds makes no

> difference, since the code that uses the PCD is not reached, due to

> commit 14e8137c8223 ("UefiCpuPkg/MpInitLib: Do not wakeup AP if only one

> processor supported").

> 

> > I prefer to setting PcdCpuApInitTimeOutInMicroSeconds to 0xFFFFFFFF (about 1.1 hour) for such purpose.  

> 

> In practice that is good enough for OVMF too, I guess.

> 

> However, I'm unsure how you would like me to update the patch. The patch

> does not add any specific code for handling the 0 timeout as infinity.

> Instead, it simply documents the existing behavior of CalculateTimeout()

> and CheckTimeout(). Those helper functions are just right for

> implementing the feature at hand, and they already handle a zero timeout

> as infinity.

> 

> In fact, my original approach to implement TimedWaitForApFinish() was to

> add brand new code for the timer loop, using the performance counter.

> However then I noticed CalculateTimeout() and CheckTimeout(), and

> thought that you would ask me to implement TimedWaitForApFinish() with

> those two functions. This way the code reuse is good, and a zero timeout

> happens to mean infinity (which is just right for OVMF, is not used by

> any physical multiprocessor platform, and is irrelevant for uniprocessor).

> 

> ... Hm, wait, do you mean that a uniprocessor platform might leave

> PcdCpuMaxLogicalProcessorNumber > 1 (for example, the default 64), and

> set *only* the timeout to zero (because they "know" there aren't more CPUs)?

> 

> If that's what you mean, I consider it a platform bug: if they know

> there's only one CPU, they should set PcdCpuMaxLogicalProcessorNumber to

> 1 -- for saving memory.

In my understanding PcdCpuMaxLogicalProcessorNumber is max possible CPUs
number and not currently present CPUs which platform might not know about
in advance, so it sends broadcast INIT/SIPI to find out how many CPUs are
present upto PcdCpuMaxLogicalProcessorNumber limit.

It's just we are cannibalizing PcdCpuMaxLogicalProcessorNumber for present
CPUs number as it were suggested in v1 comments to make some progress
and fix OVMF crash when booting with more than 64 CPU
and optimize memory usage/boot and resume time.

I'd prefer v1 where PDCs were used the way they've been meant to but
this route also sufficient as intermediate step as it allows to boot
with more than 64 CPUs.

> 

> Anyway, if you prefer, I can modify TimedWaitForApFinish() like this:

> add an explicit check for TimeLimit==0 in the beginning, and return

> immediately if that condition evaluates to true. (Plus, update the

> comments, and modify patch #3 to use MAX_UINT32 as TimeLinit.)

> 

> What do you think?

> 

> Thanks

> Laszlo

> 

> > 

> > Thanks!

> > Jeff

> > 

> > -----Original Message-----

> > From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek

> > Sent: Friday, November 25, 2016 4:57 AM

> > To: edk2-devel-01

> > Cc: Kinney, Michael D; Fan, Jeff; Justen, Jordan L

> > Subject: [edk2] [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup

> > 

> > Sometimes a platform knows exactly how many CPUs it has at boot. It should be able to

> > - set PcdCpuMaxLogicalProcessorNumber dynamically to this number,

> > - set PcdCpuApInitTimeOutInMicroSeconds to infinity (marked with 0),

> > - and expect that MpInitLib wait exactly as long as necessary for all APs

> >   to report in.

> > 

> > Other platforms should be able to continue setting a reasonably large upper bound on supported CPUs, and waiting for a reasonable, fixed amount of time for all APs to report in.

> > 

> > Add this functionality. The TimedWaitForApFinish() function will return when all APs have reported in, or the timeout has expired -- whichever happens first.

> > 

> > (Accessing these PCDs dynamically is safe. The PEI and DXE phase instances of this library are restricted to PEIM and DXE_DRIVER client modules, thus the PCD accesses cannot be linked into runtime code.)

> > 

> > Cc: Igor Mammedov <imammedo@redhat.com>

> > Cc: Jeff Fan <jeff.fan@intel.com>

> > Cc: Jordan Justen <jordan.l.justen@intel.com>

> > Cc: Michael Kinney <michael.d.kinney@intel.com>

> > Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=116

> > Contributed-under: TianoCore Contribution Agreement 1.0

> > Signed-off-by: Laszlo Ersek <lersek@redhat.com>

> > ---

> >  UefiCpuPkg/UefiCpuPkg.dec            |  3 +-

> >  UefiCpuPkg/Library/MpInitLib/MpLib.c | 67 +++++++++++++++++++-

> >  2 files changed, 68 insertions(+), 2 deletions(-)

> > 

> > diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec index ca560398bbef..8607ae8b7ae8 100644

> > --- a/UefiCpuPkg/UefiCpuPkg.dec

> > +++ b/UefiCpuPkg/UefiCpuPkg.dec

> > @@ -174,7 +174,8 @@ [PcdsFixedAtBuild, PcdsPatchableInModule, PcdsDynamic, PcdsDynamicEx]

> >    # @Prompt Configure max supported number of Logical Processors

> >    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64|UINT32|0x00000002

> >    ## Specifies timeout value in microseconds for the BSP to detect all APs for the first time.

> > -  # @Prompt Timeout for the BSP to detect all APs for the first time.

> > +  #  When set to zero, the BSP will wait forever for all (PcdCpuMaxLogicalProcessorNumber-1) APs to report in.

> > +  # @Prompt Timeout for the BSP to detect all APs for the first time (zero means await maximum supported AP count).

> >    gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000|UINT32|0x00000004

> >    ## Specifies the base address of the first microcode Patch in the microcode Region.

> >    # @Prompt Microcode Region base address.

> > diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c

> > index 15dbfa1e7d6c..76de92fa552a 100644

> > --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c

> > +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c

> > @@ -684,6 +684,22 @@ FillExchangeInfoData (  }

> >  

> >  /**

> > +  Helper function that waits until the finished AP count reaches the 

> > + specified  limit, or the specified timeout elapses (whichever comes first).

> > +

> > +  @param[in] CpuMpData        Pointer to CPU MP Data.

> > +  @param[in] FinishedApLimit  The number of finished APs to wait for.

> > +  @param[in] TimeLimit        The number of microseconds to wait for. Zero

> > +                              means infinity.

> > +**/

> > +VOID

> > +TimedWaitForApFinish (

> > +  IN CPU_MP_DATA               *CpuMpData,

> > +  IN UINT32                    FinishedApLimit,

> > +  IN UINT32                    TimeLimit

> > +  );

> > +

> > +/**

> >    This function will be called by BSP to wakeup AP.

> >  

> >    @param[in] CpuMpData          Pointer to CPU MP Data

> > @@ -748,7 +764,11 @@ WakeUpAP (

> >        //

> >        // Wait for all potential APs waken up in one specified period

> >        //

> > -      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));

> > +      TimedWaitForApFinish (

> > +        CpuMpData,

> > +        PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,

> > +        PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)

> > +        );

> >      } else {

> >        //

> >        // Wait all APs waken up if this is not the 1st broadcast of SIPI @@ -895,6 +915,51 @@ CheckTimeout (  }

> >  

> >  /**

> > +  Helper function that waits until the finished AP count reaches the 

> > + specified  limit, or the specified timeout elapses (whichever comes first).

> > +

> > +  @param[in] CpuMpData        Pointer to CPU MP Data.

> > +  @param[in] FinishedApLimit  The number of finished APs to wait for.

> > +  @param[in] TimeLimit        The number of microseconds to wait for. Zero

> > +                              means infinity.

> > +**/

> > +VOID

> > +TimedWaitForApFinish (

> > +  IN CPU_MP_DATA               *CpuMpData,

> > +  IN UINT32                    FinishedApLimit,

> > +  IN UINT32                    TimeLimit

> > +  )

> > +{

> > +  CpuMpData->TotalTime = 0;

> > +  CpuMpData->ExpectedTime = CalculateTimeout (

> > +                              TimeLimit,

> > +                              &CpuMpData->CurrentTime

> > +                              );

> > +  while (CpuMpData->FinishedCount < FinishedApLimit &&

> > +         !CheckTimeout (

> > +            &CpuMpData->CurrentTime,

> > +            &CpuMpData->TotalTime,

> > +            CpuMpData->ExpectedTime

> > +            )) {

> > +    CpuPause ();

> > +  }

> > +

> > +  if (CpuMpData->FinishedCount >= FinishedApLimit) {

> > +    DEBUG ((

> > +      DEBUG_VERBOSE,

> > +      "%a: reached FinishedApLimit=%u in %Lu microseconds\n",

> > +      __FUNCTION__,

> > +      FinishedApLimit,

> > +      DivU64x64Remainder (

> > +        MultU64x32 (CpuMpData->TotalTime, 1000000),

> > +        GetPerformanceCounterProperties (NULL, NULL),

> > +        NULL

> > +        )

> > +      ));

> > +  }

> > +}

> > +

> > +/**

> >    Reset an AP to Idle state.

> >  

> >    Any task being executed by the AP will be aborted and the AP

> > --

> > 2.9.2

> > 

> > 

> > _______________________________________________

> > edk2-devel mailing list

> > edk2-devel@lists.01.org

> > https://lists.01.org/mailman/listinfo/edk2-devel

> >   

> 

> _______________________________________________

> edk2-devel mailing list

> edk2-devel@lists.01.org

> https://lists.01.org/mailman/listinfo/edk2-devel


_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
diff mbox

Patch

diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec
index ca560398bbef..8607ae8b7ae8 100644
--- a/UefiCpuPkg/UefiCpuPkg.dec
+++ b/UefiCpuPkg/UefiCpuPkg.dec
@@ -174,7 +174,8 @@  [PcdsFixedAtBuild, PcdsPatchableInModule, PcdsDynamic, PcdsDynamicEx]
   # @Prompt Configure max supported number of Logical Processors
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64|UINT32|0x00000002
   ## Specifies timeout value in microseconds for the BSP to detect all APs for the first time.
-  # @Prompt Timeout for the BSP to detect all APs for the first time.
+  #  When set to zero, the BSP will wait forever for all (PcdCpuMaxLogicalProcessorNumber-1) APs to report in.
+  # @Prompt Timeout for the BSP to detect all APs for the first time (zero means await maximum supported AP count).
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000|UINT32|0x00000004
   ## Specifies the base address of the first microcode Patch in the microcode Region.
   # @Prompt Microcode Region base address.
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index 15dbfa1e7d6c..76de92fa552a 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -684,6 +684,22 @@  FillExchangeInfoData (
 }
 
 /**
+  Helper function that waits until the finished AP count reaches the specified
+  limit, or the specified timeout elapses (whichever comes first).
+
+  @param[in] CpuMpData        Pointer to CPU MP Data.
+  @param[in] FinishedApLimit  The number of finished APs to wait for.
+  @param[in] TimeLimit        The number of microseconds to wait for. Zero
+                              means infinity.
+**/
+VOID
+TimedWaitForApFinish (
+  IN CPU_MP_DATA               *CpuMpData,
+  IN UINT32                    FinishedApLimit,
+  IN UINT32                    TimeLimit
+  );
+
+/**
   This function will be called by BSP to wakeup AP.
 
   @param[in] CpuMpData          Pointer to CPU MP Data
@@ -748,7 +764,11 @@  WakeUpAP (
       //
       // Wait for all potential APs waken up in one specified period
       //
-      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
+      TimedWaitForApFinish (
+        CpuMpData,
+        PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
+        PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
+        );
     } else {
       //
       // Wait all APs waken up if this is not the 1st broadcast of SIPI
@@ -895,6 +915,51 @@  CheckTimeout (
 }
 
 /**
+  Helper function that waits until the finished AP count reaches the specified
+  limit, or the specified timeout elapses (whichever comes first).
+
+  @param[in] CpuMpData        Pointer to CPU MP Data.
+  @param[in] FinishedApLimit  The number of finished APs to wait for.
+  @param[in] TimeLimit        The number of microseconds to wait for. Zero
+                              means infinity.
+**/
+VOID
+TimedWaitForApFinish (
+  IN CPU_MP_DATA               *CpuMpData,
+  IN UINT32                    FinishedApLimit,
+  IN UINT32                    TimeLimit
+  )
+{
+  CpuMpData->TotalTime = 0;
+  CpuMpData->ExpectedTime = CalculateTimeout (
+                              TimeLimit,
+                              &CpuMpData->CurrentTime
+                              );
+  while (CpuMpData->FinishedCount < FinishedApLimit &&
+         !CheckTimeout (
+            &CpuMpData->CurrentTime,
+            &CpuMpData->TotalTime,
+            CpuMpData->ExpectedTime
+            )) {
+    CpuPause ();
+  }
+
+  if (CpuMpData->FinishedCount >= FinishedApLimit) {
+    DEBUG ((
+      DEBUG_VERBOSE,
+      "%a: reached FinishedApLimit=%u in %Lu microseconds\n",
+      __FUNCTION__,
+      FinishedApLimit,
+      DivU64x64Remainder (
+        MultU64x32 (CpuMpData->TotalTime, 1000000),
+        GetPerformanceCounterProperties (NULL, NULL),
+        NULL
+        )
+      ));
+  }
+}
+
+/**
   Reset an AP to Idle state.
 
   Any task being executed by the AP will be aborted and the AP