[00/40] iscsi lock and refcount fix ups

Message ID 20210403232333.212927-1-michael.christie@oracle.com

Message

Mike Christie April 3, 2021, 11:22 p.m. UTC
The following patches apply over Linus's tree or Martin's staging branch.
They fix up the locking and refcount handling in the iscsi code so that,
for software iscsi, we no longer need a lock when going from queuecommand
to the xmit thread and no longer need a common iscsi level lock between
the xmit thread and completion paths.
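
As a rough illustration of the general idea only, and not the code in
these patches, a per-task refcount lets the xmit and completion paths each
drop their own reference without serializing on a shared lock. The sketch
below is userspace C using C11 atomics rather than the kernel's refcount_t
helpers, and the names demo_task, demo_task_get and demo_task_put are
hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-command task; not the libiscsi struct. */
struct demo_task {
	atomic_int refcount;	/* one reference per path still using the task */
	bool freed;		/* set when the last reference is dropped */
};

static void demo_task_init(struct demo_task *t)
{
	atomic_init(&t->refcount, 1);	/* reference owned by the submitter */
	t->freed = false;
}

/* Take an extra reference before handing the task to another path. */
static void demo_task_get(struct demo_task *t)
{
	int old = atomic_fetch_add_explicit(&t->refcount, 1,
					    memory_order_relaxed);
	assert(old > 0);	/* caller must already hold a reference */
	(void)old;
}

/* Drop a reference; returns true if this call released the last one. */
static bool demo_task_put(struct demo_task *t)
{
	int old = atomic_fetch_sub_explicit(&t->refcount, 1,
					    memory_order_acq_rel);
	assert(old > 0);
	if (old == 1) {
		t->freed = true;	/* a real driver would recycle the task here */
		return true;
	}
	return false;
}

/* Single-threaded walk-through: submit, hand to xmit, both paths drop. */
static int demo(void)
{
	struct demo_task t;
	bool first, last;

	demo_task_init(&t);
	demo_task_get(&t);		/* xmit path takes its own reference */
	first = demo_task_put(&t);	/* completion path drops its reference */
	last = demo_task_put(&t);	/* xmit path drops the final reference */
	return (!first && last && t.freed) ? 1 : 0;
}
```

Whichever path calls demo_task_put() last is the one that releases the
task, so neither path needs to know what the other is doing.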

For simple throughput workloads like

fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=256k \
--ioengine=libaio --iodepth=128 --numjobs=1 --time_based \
--group_reporting --name=throughput --runtime=120

I'm able to get throughput to go from 24 Gb/s to 28 Gb/s, at which point
I hit a bottleneck on the target side.

IOPs might increase by around 10% in some cases with:

fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k \
--ioengine=libaio --iodepth=128 --numjobs=1 --time_based \
--group_reporting --name=throughput --runtime=120

I'm still debugging some target side issues.

A bigger advantage I'm seeing with the patches is that, for setups where
you have software iscsi sharing CPUs with other subsystems like vhost,
IOPs can increase by up to 20%.

Notes:
- I've tested iscsi_tcp, ib_iser, be2iscsi and qedi. I don't have cxgbi
or bnx2i hardware, but the cxgbi changes were API only.

- Lee, the first 2 patches are new bug fixes. The first half is then
similar to what you saw before. I was not sure how far through them you
were. The second half, which removes the back lock and frwd lock from
iscsi_queuecommand, is new.

Comments

Mike Christie April 8, 2021, 4:34 p.m. UTC | #1
Lee and Manish and others,

Don't review this patchset.

I'm hitting some issues in the existing code that predate my patchset. It
will be easier to test/review if I fix them first.

Lee, I'll send patches for the ep_disconnect/iscsi_conn_stop issue.

Manish, I found some bugs in qedi that we might be hitting:

- it shouldn't use iscsi_block_session in the tmf path
- libiscsi and qedi can get out of sync in the tmf paths and clean up the
wrong cmds.

