Message ID | 20200928043329.606781-1-mark.mielke@gmail.com |
---|---|
State | New |
Headers | show |
Series | iscsi: iscsi_tcp: Avoid holding spinlock while calling getpeername | expand |
On 9/27/20 11:33 PM, Mark Mielke wrote: > Kernel may fail to boot or devices may fail to come up when > initializing iscsi_tcp devices starting with Linux 5.8. > > Marc Dionne identified the cause in RHBZ#1877345. > > Commit a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param > libiscsi function") introduced getpeername() within the session spinlock. > > Commit 1b66d253610c ("bpf: Add get{peer, sock}name attach types for > sock_addr") introduced BPF_CGROUP_RUN_SA_PROG_LOCK() within getpeername, > which acquires a mutex and when used from iscsi_tcp devices can now lead > to "BUG: scheduling while atomic:" and subsequent damage. > > This commit ensures that the spinlock is released before calling > getpeername() or getsockname(). sock_hold() and sock_put() are > used to ensure that the socket reference is preserved until after > the getpeername() or getsockname() complete. > > Reported-by: Marc Dionne <marc.c.dionne@gmail.com> > Link: https://urldefense.com/v3/__https://bugzilla.redhat.com/show_bug.cgi?id=1877345__;!!GqivPVa7Brio!IykqCqCEtE_EyrhXerYzj_cIlmkenkaAVddyoEOw9T6n4nExSaGFHkn0ZQrLBVSUtxQ_$ > Link: https://urldefense.com/v3/__https://lkml.org/lkml/2020/7/28/1085__;!!GqivPVa7Brio!IykqCqCEtE_EyrhXerYzj_cIlmkenkaAVddyoEOw9T6n4nExSaGFHkn0ZQrLBRT9NL69$ > Link: https://urldefense.com/v3/__https://lkml.org/lkml/2020/8/31/459__;!!GqivPVa7Brio!IykqCqCEtE_EyrhXerYzj_cIlmkenkaAVddyoEOw9T6n4nExSaGFHkn0ZQrLBfxZYLKs$ > Fixes: a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param libiscsi function") > Fixes: 1b66d253610c ("bpf: Add get{peer, sock}name attach types for sock_addr") > Signed-off-by: Mark Mielke <mark.mielke@gmail.com> > Cc: stable@vger.kernel.org > --- > drivers/scsi/iscsi_tcp.c | 22 +++++++++++++++------- > 1 file changed, 15 insertions(+), 7 deletions(-) > > diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c > index b5dd1caae5e9..d10efb66cf19 100644 > --- a/drivers/scsi/iscsi_tcp.c > +++ b/drivers/scsi/iscsi_tcp.c > @@ -736,6 +736,7 @@ static int iscsi_sw_tcp_conn_get_param(struct iscsi_cls_conn *cls_conn, > struct iscsi_tcp_conn *tcp_conn = conn->dd_data; > struct iscsi_sw_tcp_conn *tcp_sw_conn = tcp_conn->dd_data; > struct sockaddr_in6 addr; > + struct socket *sock; > int rc; > > switch(param) { > @@ -747,13 +748,17 @@ static int iscsi_sw_tcp_conn_get_param(struct iscsi_cls_conn *cls_conn, > spin_unlock_bh(&conn->session->frwd_lock); > return -ENOTCONN; > } > + sock = tcp_sw_conn->sock; > + sock_hold(sock->sk); > + spin_unlock_bh(&conn->session->frwd_lock); > + > if (param == ISCSI_PARAM_LOCAL_PORT) > - rc = kernel_getsockname(tcp_sw_conn->sock, > + rc = kernel_getsockname(sock, > (struct sockaddr *)&addr); > else > - rc = kernel_getpeername(tcp_sw_conn->sock, > + rc = kernel_getpeername(sock, > (struct sockaddr *)&addr); > - spin_unlock_bh(&conn->session->frwd_lock); > + sock_put(sock->sk); > if (rc < 0) > return rc; > > @@ -775,6 +780,7 @@ static int iscsi_sw_tcp_host_get_param(struct Scsi_Host *shost, > struct iscsi_tcp_conn *tcp_conn; > struct iscsi_sw_tcp_conn *tcp_sw_conn; > struct sockaddr_in6 addr; > + struct socket *sock; > int rc; > > switch (param) { > @@ -789,16 +795,18 @@ static int iscsi_sw_tcp_host_get_param(struct Scsi_Host *shost, > return -ENOTCONN; > } > tcp_conn = conn->dd_data; > - > tcp_sw_conn = tcp_conn->dd_data; > - if (!tcp_sw_conn->sock) { > + sock = tcp_sw_conn->sock; > + if (!sock) { > spin_unlock_bh(&session->frwd_lock); > return -ENOTCONN; > } > + sock_hold(sock->sk); > + spin_unlock_bh(&session->frwd_lock); > > - rc = kernel_getsockname(tcp_sw_conn->sock, > + rc = kernel_getsockname(sock, > (struct sockaddr *)&addr); > - spin_unlock_bh(&session->frwd_lock); > + sock_put(sock->sk); > if (rc < 0) > return rc; > > Reviewed-by: Mike Christie <michael.christie@oracle.com>
On Mon, Sep 28, 2020 at 1:34 AM Mark Mielke <mark.mielke@gmail.com> wrote: > > Kernel may fail to boot or devices may fail to come up when > initializing iscsi_tcp devices starting with Linux 5.8. > > Marc Dionne identified the cause in RHBZ#1877345. > > Commit a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param > libiscsi function") introduced getpeername() within the session spinlock. > > Commit 1b66d253610c ("bpf: Add get{peer, sock}name attach types for > sock_addr") introduced BPF_CGROUP_RUN_SA_PROG_LOCK() within getpeername, > which acquires a mutex and when used from iscsi_tcp devices can now lead > to "BUG: scheduling while atomic:" and subsequent damage. > > This commit ensures that the spinlock is released before calling > getpeername() or getsockname(). sock_hold() and sock_put() are > used to ensure that the socket reference is preserved until after > the getpeername() or getsockname() complete. > > Reported-by: Marc Dionne <marc.c.dionne@gmail.com> > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1877345 > Link: https://lkml.org/lkml/2020/7/28/1085 > Link: https://lkml.org/lkml/2020/8/31/459 > Fixes: a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param libiscsi function") > Fixes: 1b66d253610c ("bpf: Add get{peer, sock}name attach types for sock_addr") > Signed-off-by: Mark Mielke <mark.mielke@gmail.com> > Cc: stable@vger.kernel.org > --- > drivers/scsi/iscsi_tcp.c | 22 +++++++++++++++------- > 1 file changed, 15 insertions(+), 7 deletions(-) > > diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c > index b5dd1caae5e9..d10efb66cf19 100644 > --- a/drivers/scsi/iscsi_tcp.c > +++ b/drivers/scsi/iscsi_tcp.c > @@ -736,6 +736,7 @@ static int iscsi_sw_tcp_conn_get_param(struct iscsi_cls_conn *cls_conn, > struct iscsi_tcp_conn *tcp_conn = conn->dd_data; > struct iscsi_sw_tcp_conn *tcp_sw_conn = tcp_conn->dd_data; > struct sockaddr_in6 addr; > + struct socket *sock; > int rc; > > switch(param) { > @@ -747,13 +748,17 @@ static int iscsi_sw_tcp_conn_get_param(struct iscsi_cls_conn *cls_conn, > spin_unlock_bh(&conn->session->frwd_lock); > return -ENOTCONN; > } > + sock = tcp_sw_conn->sock; > + sock_hold(sock->sk); > + spin_unlock_bh(&conn->session->frwd_lock); > + > if (param == ISCSI_PARAM_LOCAL_PORT) > - rc = kernel_getsockname(tcp_sw_conn->sock, > + rc = kernel_getsockname(sock, > (struct sockaddr *)&addr); > else > - rc = kernel_getpeername(tcp_sw_conn->sock, > + rc = kernel_getpeername(sock, > (struct sockaddr *)&addr); > - spin_unlock_bh(&conn->session->frwd_lock); > + sock_put(sock->sk); > if (rc < 0) > return rc; > > @@ -775,6 +780,7 @@ static int iscsi_sw_tcp_host_get_param(struct Scsi_Host *shost, > struct iscsi_tcp_conn *tcp_conn; > struct iscsi_sw_tcp_conn *tcp_sw_conn; > struct sockaddr_in6 addr; > + struct socket *sock; > int rc; > > switch (param) { > @@ -789,16 +795,18 @@ static int iscsi_sw_tcp_host_get_param(struct Scsi_Host *shost, > return -ENOTCONN; > } > tcp_conn = conn->dd_data; > - > tcp_sw_conn = tcp_conn->dd_data; > - if (!tcp_sw_conn->sock) { > + sock = tcp_sw_conn->sock; > + if (!sock) { > spin_unlock_bh(&session->frwd_lock); > return -ENOTCONN; > } > + sock_hold(sock->sk); > + spin_unlock_bh(&session->frwd_lock); > > - rc = kernel_getsockname(tcp_sw_conn->sock, > + rc = kernel_getsockname(sock, > (struct sockaddr *)&addr); > - spin_unlock_bh(&session->frwd_lock); > + sock_put(sock->sk); > if (rc < 0) > return rc; > > -- > 2.28.0 > Works for me and prevents the iscsid crash. Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
On Mon, 28 Sep 2020 00:33:29 -0400, Mark Mielke wrote: > Kernel may fail to boot or devices may fail to come up when > initializing iscsi_tcp devices starting with Linux 5.8. > > Marc Dionne identified the cause in RHBZ#1877345. > > Commit a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param > libiscsi function") introduced getpeername() within the session spinlock. > > [...] Applied to 5.9/scsi-fixes, thanks! [1/1] scsi: iscsi: iscsi_tcp: Avoid holding spinlock while calling getpeername() https://git.kernel.org/mkp/scsi/c/bcf3a2953d36
diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c index b5dd1caae5e9..d10efb66cf19 100644 --- a/drivers/scsi/iscsi_tcp.c +++ b/drivers/scsi/iscsi_tcp.c @@ -736,6 +736,7 @@ static int iscsi_sw_tcp_conn_get_param(struct iscsi_cls_conn *cls_conn, struct iscsi_tcp_conn *tcp_conn = conn->dd_data; struct iscsi_sw_tcp_conn *tcp_sw_conn = tcp_conn->dd_data; struct sockaddr_in6 addr; + struct socket *sock; int rc; switch(param) { @@ -747,13 +748,17 @@ static int iscsi_sw_tcp_conn_get_param(struct iscsi_cls_conn *cls_conn, spin_unlock_bh(&conn->session->frwd_lock); return -ENOTCONN; } + sock = tcp_sw_conn->sock; + sock_hold(sock->sk); + spin_unlock_bh(&conn->session->frwd_lock); + if (param == ISCSI_PARAM_LOCAL_PORT) - rc = kernel_getsockname(tcp_sw_conn->sock, + rc = kernel_getsockname(sock, (struct sockaddr *)&addr); else - rc = kernel_getpeername(tcp_sw_conn->sock, + rc = kernel_getpeername(sock, (struct sockaddr *)&addr); - spin_unlock_bh(&conn->session->frwd_lock); + sock_put(sock->sk); if (rc < 0) return rc; @@ -775,6 +780,7 @@ static int iscsi_sw_tcp_host_get_param(struct Scsi_Host *shost, struct iscsi_tcp_conn *tcp_conn; struct iscsi_sw_tcp_conn *tcp_sw_conn; struct sockaddr_in6 addr; + struct socket *sock; int rc; switch (param) { @@ -789,16 +795,18 @@ static int iscsi_sw_tcp_host_get_param(struct Scsi_Host *shost, return -ENOTCONN; } tcp_conn = conn->dd_data; - tcp_sw_conn = tcp_conn->dd_data; - if (!tcp_sw_conn->sock) { + sock = tcp_sw_conn->sock; + if (!sock) { spin_unlock_bh(&session->frwd_lock); return -ENOTCONN; } + sock_hold(sock->sk); + spin_unlock_bh(&session->frwd_lock); - rc = kernel_getsockname(tcp_sw_conn->sock, + rc = kernel_getsockname(sock, (struct sockaddr *)&addr); - spin_unlock_bh(&session->frwd_lock); + sock_put(sock->sk); if (rc < 0) return rc;
Kernel may fail to boot or devices may fail to come up when initializing iscsi_tcp devices starting with Linux 5.8. Marc Dionne identified the cause in RHBZ#1877345. Commit a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param libiscsi function") introduced getpeername() within the session spinlock. Commit 1b66d253610c ("bpf: Add get{peer, sock}name attach types for sock_addr") introduced BPF_CGROUP_RUN_SA_PROG_LOCK() within getpeername, which acquires a mutex and when used from iscsi_tcp devices can now lead to "BUG: scheduling while atomic:" and subsequent damage. This commit ensures that the spinlock is released before calling getpeername() or getsockname(). sock_hold() and sock_put() are used to ensure that the socket reference is preserved until after the getpeername() or getsockname() complete. Reported-by: Marc Dionne <marc.c.dionne@gmail.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1877345 Link: https://lkml.org/lkml/2020/7/28/1085 Link: https://lkml.org/lkml/2020/8/31/459 Fixes: a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param libiscsi function") Fixes: 1b66d253610c ("bpf: Add get{peer, sock}name attach types for sock_addr") Signed-off-by: Mark Mielke <mark.mielke@gmail.com> Cc: stable@vger.kernel.org --- drivers/scsi/iscsi_tcp.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-)