feat: count chars instead of bytes in str_width

Author: 1098502132_027279 | Source: Internet | 2023-06-25 01:17

While `str_width` doesn’t attempt to handle all Unicode characters, it can handle some characters by counting `char`s instead of bytes.

This allows it to handle characters like “æøå” (Danish), “äöü” (German), and other 1-column characters with a precomposed representation in Unicode. These characters take up multiple bytes in the UTF-8 encoding, but each still decodes to a single `char`. Wikipedia of course has a handy list:

https://en.wikipedia.org/wiki/List_of_precomposed_Latin_characters_in_Unicode
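
To make the difference concrete, here is a minimal sketch comparing the two ways of measuring a string. The function names `width_by_bytes` and `width_by_chars` are hypothetical and chosen for illustration; they are not the actual clap code.

```rust
/// Hypothetical stand-in for a byte-based width: the length of the
/// UTF-8 encoding in bytes.
fn width_by_bytes(s: &str) -> usize {
    s.len()
}

/// Hypothetical stand-in for a char-based width: the number of Unicode
/// scalar values (`char`s) in the string.
fn width_by_chars(s: &str) -> usize {
    s.chars().count()
}
```

For plain ASCII such as "abc" both functions return 3. For "æøå" or "äöü" each letter is encoded as two bytes in UTF-8, so the byte count is 6 while the `char` count is 3, which matches the three columns a terminal uses.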

Emojis are also encoded as multiple bytes in UTF-8. With this change, they will also be counted as having a width of 1 column, whereas they’re often displayed using 2 columns.

The existing code can thus be said to take a conservative approach since it will only ever over-estimate the width of a string.
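
The trade-off can be illustrated with a small sketch. The helper `bytes_and_chars` is a hypothetical name used only here, and the column counts mentioned afterwards are the typical terminal behaviour described above, not something the code measures.

```rust
/// Hypothetical helper: returns (byte count, char count) for a string.
fn bytes_and_chars(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// A precomposed "é" (U+00E9): one `char`, two bytes in UTF-8.
const PRECOMPOSED_E_ACUTE: &str = "\u{e9}";

/// A decomposed "é": "e" followed by the combining acute accent U+0301.
/// Two `char`s, three bytes in UTF-8.
const DECOMPOSED_E_ACUTE: &str = "e\u{301}";

/// An emoji (U+1F600): one `char`, four bytes in UTF-8.
const GRINNING_FACE: &str = "\u{1f600}";
```

Calling `bytes_and_chars` on these gives `(2, 1)`, `(3, 2)`, and `(4, 1)` respectively. The byte count is never smaller than the `char` count, which is why the byte-based approach errs on the high side. The `char` count is too low for the emoji (often shown in 2 columns) and too high for the decomposed "é" (shown in 1 column), which is why the change only targets precomposed characters.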

This question comes from the open-source project: clap-rs/clap

Okay, I've played around with this, and it is doable to pick a cut-off like the one suggested above. This is now implemented in textwrap as the `textwrap::core::display_width` function. With such a simple function, Latin-1 plus emojis seem to be handled quite well. I am less sure about the coverage for East Asian languages like Chinese or Japanese (but they were also not the target of this hack).
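
A cut-off approach of this kind can be sketched as follows. This is an illustration only, not textwrap's actual implementation: the constant `ASSUMED_CUTOFF`, its value U+1100, and the function name `sketch_display_width` are all assumptions made for the example.

```rust
/// Assumed cut-off for illustration: code points below it are counted as
/// 1 column, everything at or above it as 2 columns. The value U+1100 is
/// an assumption, not taken from the discussion above.
const ASSUMED_CUTOFF: char = '\u{1100}';

/// Hypothetical width estimate: sums a per-`char` width of 1 or 2
/// depending on which side of the cut-off the character falls.
fn sketch_display_width(s: &str) -> usize {
    s.chars()
        .map(|ch| if ch < ASSUMED_CUTOFF { 1usize } else { 2 })
        .sum()
}
```

Under these assumptions "æøå" is measured as 3 columns and a single emoji such as U+1F600 as 2 columns. Combining marks such as U+0301 fall below the cut-off and are counted as 1 column, so decomposed characters are still over-estimated by this simple rule.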

I will make a new release of textwrap soon (1-2 weeks perhaps), and if there is interest, I could update this PR to use the `display_width` function. This would remove the direct dependency on unicode-width from clap and, so to speak, export this responsibility to textwrap.
