diff mbox series

[v3,10/14] vt: pad double-width code points with a zero-width space

Message ID 20250417184849.475581-11-nico@fluxnic.net
State New
Headers show
Series vt: implement proper Unicode handling | expand

Commit Message

Nicolas Pitre April 17, 2025, 6:45 p.m. UTC
From: Nicolas Pitre <npitre@baylibre.com>

In the Unicode screen buffer, we follow double-width code points with a
space to maintain proper column alignment. This, however, creates
semantic problems when e.g. using cut and paste.

Let's use a better code point for the column padding's purpose i.e. a
zero-width space rather than a full space. This way the combination
retains a width of 2.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
---
 drivers/tty/vt/vt.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)
diff mbox series

Patch

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 76554c2040..1bd1878094 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -2923,6 +2923,7 @@  static void vc_con_rewind(struct vc_data *vc)
 	vc->vc_need_wrap = 0;
 }
 
+#define UCS_ZWS		0x200b	/* Zero Width Space */
 #define UCS_VS16	0xfe0f	/* Variation Selector 16 */
 
 static int vc_process_ucs(struct vc_data *vc, int *c, int *tc)
@@ -2941,8 +2942,8 @@  static int vc_process_ucs(struct vc_data *vc, int *c, int *tc)
 		/*
 		 * Let's merge this zero-width code point with the preceding
 		 * double-width code point by replacing the existing
-		 * whitespace padding. To do so we rewind one column and
-		 * pretend this has a width of 1.
+		 * zero-width space padding. To do so we rewind one column
+		 * and pretend this has a width of 1.
 		 * We give the legacy display the same initial space padding.
 		 */
 		vc_con_rewind(vc);
@@ -3065,7 +3066,11 @@  static int vc_con_write_normal(struct vc_data *vc, int tc, int c,
 		tc = conv_uni_to_pc(vc, ' ');
 		if (tc < 0)
 			tc = ' ';
-		next_c = ' ';
+		/*
+		 * Store a zero-width space in the Unicode screen given that
+		 * the previous code point is semantically double width.
+		 */
+		next_c = UCS_ZWS;
 	}
 
 out: