When using integer values in my own code, I always try to consider the signedness, asking myself if the integer should be signed or unsigned.
When I'm sure the value will never need to be negative, I then use an unsigned integer.
And I have to say this happens most of the time.
When reading other people's code, I rarely see unsigned integers, even if the represented value can't be negative.
So I asked myself: «is there a good reason for this, or do people just use signed integers because they don't care»?
I've searched on the subject, here and in other places, and I have to say I can't find a good reason not to use unsigned integers where they apply.
I came across those questions: «Default int type: Signed or Unsigned?», and «Should you always use 'int' for numbers in C, even if they are non-negative?» which both present the following example:
for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}
To me, this is just bad design. Of course, it may result in an infinite loop, with unsigned integers.
But is it so hard to check if foo.Length() is 0, before the loop?
So I personally don't think this is a good reason for using signed integers all the way.
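As a rough sketch of that point (the helper names sum_reverse and last_index_of are made up for illustration, and size_t stands in for whatever foo.Length() returns), a down-counting loop over an unsigned index can be written so that it never wraps around, even for an empty container:

```c
#include <stddef.h>

/* Sum the elements of a[0..n-1], visiting them from last to first.
 * The test `i-- > 0` compares before decrementing, so the index is
 * never used after wrapping around, and n == 0 skips the loop body. */
long sum_reverse(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = n; i-- > 0; ) {
        total += a[i];
    }
    return total;
}

/* Index of the last element equal to `value`, or n if there is none. */
size_t last_index_of(const int *a, size_t n, int value)
{
    for (size_t i = n; i-- > 0; ) {
        if (a[i] == value) {
            return i;
        }
    }
    return n;
}
```

With n equal to 0 the loop body is simply skipped, so no separate length check is needed in this form.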
Some people may also say that signed integers may be useful, even for non-negative values, to provide an error flag, usually -1.
OK, it's good to have a specific value that means «error».
But then, what's wrong with something like UINT_MAX for that specific value?
I'm actually asking this question because it may lead to some huge problems, usually when using third-party libraries.
In such a case, you often have to deal with signed and unsigned values.
Most of the time, people just don't care about the signedness, and just assign, for instance, an unsigned int to a signed int, without checking the range.
I have to say I'm a bit paranoid about compiler warning flags, so with my setup, such an implicit conversion will result in a compiler error.
For that kind of stuff, I usually use a function or macro to check the range, and then assign using an explicit cast, raising an error if needed.
This just seems logical to me.
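A minimal sketch of such a range check, assuming plain int and unsigned int (the function names are hypothetical, not from any library):

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: convert an unsigned int to int only if it fits.
 * Returns true and stores the converted value in *out on success;
 * returns false and leaves *out untouched if value > INT_MAX. */
bool uint_to_int_checked(unsigned int value, int *out)
{
    if (value > (unsigned int)INT_MAX) {
        return false;
    }
    *out = (int)value;  /* explicit cast, known to be in range */
    return true;
}

/* The opposite direction: negative values have no unsigned counterpart. */
bool int_to_uint_checked(int value, unsigned int *out)
{
    if (value < 0) {
        return false;
    }
    *out = (unsigned int)value;
    return true;
}
```

The caller decides how to raise the error when the function returns false; the point is only that the cast happens after the range is known to be safe.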
As a last example, as I'm also an Objective-C developer (note that this question is not related to Objective-C only):
- ( NSInteger )tableView: ( UITableView * )tableView numberOfRowsInSection: ( NSInteger )section;
For those not fluent in Objective-C, NSInteger is a signed integer.
This method actually retrieves the number of rows in a table view, for a specific section.
The result will never be a negative value (and neither will the section number, by the way).
So why use a signed integer for this?
I really don't understand.
This is just an example, but I just always see that kind of stuff, with C, C++ or Objective-C.
So again, I'm just wondering if people just don't care about that kind of problem, or if there is, after all, a good and valid reason not to use unsigned integers for such cases.
Looking forward to hearing your answers : )
5
A signed return value might yield more information (think error numbers: 0 is sometimes a valid answer, -1 indicates an error, see man read) ... which might be relevant especially for developers of libraries.
If you are worrying about the one extra bit you gain when using unsigned instead of signed, then you are probably using the wrong type anyway (also a kind of "premature optimization" argument).
Languages like Python, Ruby, JScript etc. are doing just fine without signed vs unsigned. That might be an indicator ...
2
There is one heavyweight argument against using unsigned integers widely:
Premature optimization is the root of all evil.
We have all been bitten by unsigned integers on at least one occasion. Sometimes as in your loop, sometimes in other contexts. Unsigned integers add a hazard, even though a small one, to your program. And you are introducing this hazard to change the meaning of one bit. One little, tiny, insignificant-but-for-its-sign-meaning bit. On the other hand, the integers we work with in bread-and-butter applications are often far below the range of integers, more on the order of 10^1 than 10^7. Thus, the different range of unsigned integers is not needed in the vast majority of cases. And when it is needed, it is quite likely that this extra bit won't cut it (when 31 is too little, 32 is rarely enough) and you'll need a wider or an arbitrary-width integer anyway. The pragmatic approach in these cases is to just use the signed integer and spare yourself the occasional underflow bug. Your time as a programmer can be put to much better use.
1
When using integer values in my own code, I always try to consider the signedness, asking myself if the integer should be signed or unsigned.
When I'm sure the value will never need to be negative, I then use an unsigned integer. And I have to say this happens most of the time.
Carefully considering which type is most suitable each time you declare a variable is very good practice! This means you are careful and professional. You should not only consider signedness, but also the potential max value that you expect this type to have.
The reason why you shouldn't use signed types when they aren't needed has nothing to do with performance, but with type safety. There are lots of potential, subtle bugs that can be caused by signed types:
The various forms of implicit promotions that exist in C can cause your type to change signedness in unexpected and possibly dangerous ways. Examples are the integer promotion rule that is part of the usual arithmetic conversions, the lvalue conversion upon assignment, the default argument promotions used by for example VA lists, and so on.
When using any form of bitwise operators or similar hardware-related programming, signed types are dangerous and can easily cause various forms of undefined behavior.
By declaring your integers unsigned, you automatically skip past a whole lot of the above dangers. Similarly, by declaring them as large as unsigned int or larger, you get rid of lots of dangers caused by the integer promotions.
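To illustrate one such promotion pitfall, here is a small sketch (the function names are hypothetical; it relies only on stdint.h). A uint8_t operand is promoted to signed int before the ~ operator is applied, which changes the outcome of the comparison:

```c
#include <stdbool.h>
#include <stdint.h>

/* x is promoted to (signed) int before ~ is applied, so ~x is a
 * negative int rather than an 8-bit value. y is promoted to a
 * non-negative int, so the two sides can never be equal. */
bool naive_is_complement(uint8_t x, uint8_t y)
{
    return ~x == y;
}

/* Casting the result back to uint8_t discards the promoted high bits,
 * which gives the 8-bit complement that was intended. */
bool is_complement(uint8_t x, uint8_t y)
{
    return (uint8_t)~x == y;
}
```

naive_is_complement(0x0F, 0xF0) yields false even though 0xF0 is the 8-bit complement of 0x0F; the version with the cast yields true.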
Both size and signedness are important when it comes to writing rugged, portable and safe code. This is the reason why you should always use the types from stdint.h and not the native, so-called "primitive data types" of C.
So I asked myself: «is there a good reason for this, or do people just use signed integers because they don't care»?
I don't really think it is because they don't care, nor because they are lazy, even though declaring everything int is sometimes referred to as "sloppy typing" - which means sloppily picked type more than it means too lazy to type.
I rather believe it is because they lack deeper knowledge of the various things I mentioned above. There's a frightening number of seasoned C programmers who don't know how implicit type promotions work in C, nor how signed types can cause poorly-defined behavior when used together with certain operators.
This is actually a very frequent source of subtle bugs. Many programmers find themselves staring at a compiler warning or a peculiar bug, which they can make go away by adding a cast. But they don't understand why; they simply add the cast and move on.
for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}
To me, this is just bad design
Indeed it is.
Once upon a time, down-counting loops would yield more efficient code, because the compiler could pick a "branch if zero" instruction instead of a "branch if larger/smaller/equal" instruction - the former is faster. But this was at a time when compilers were really dumb and I don't believe such micro-optimizations are relevant any longer.
So there is rarely ever a reason to have a down-counting loop. Whoever made the argument probably just couldn't think outside the box. The example could have been rewritten as:
for(unsigned int i=0; i<foo.Length(); i++) {}
This code should not have any impact on performance, but the loop itself became a whole lot easier to read, while at the same time fixing the bug that your example had.
As far as performance is concerned nowadays, one should probably spend the time pondering which form of data access is most ideal in terms of data cache use, rather than anything else.
Some people may also say that signed integers may be useful, even for non-negative values, to provide an error flag, usually -1.
That's a poor argument. Good API design uses a dedicated error type for error reporting, such as an enum.
Instead of having some hobbyist-level API like
int do_stuff (int a, int b); // returns -1 if a or b were invalid, otherwise the result
you should have something like:
err_t do_stuff (int32_t a, int32_t b, int32_t* result);
// returns ERR_A if a is invalid, ERR_B if b is invalid, ERR_XXX if... and so on
// the result is stored in [result], which is allocated by the caller
// upon errors the contents of [result] remain untouched
The API would then consistently reserve the return of every function for this error type.
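A possible sketch of such an API follows. The enumerators and the division performed by do_stuff are assumptions made up for the example; only the overall shape (error code as the return value, result through a pointer) comes from the description above:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error type; the enumerator names are illustrative only. */
typedef enum {
    ERR_OK = 0,
    ERR_A,     /* a is invalid */
    ERR_B,     /* b is invalid */
    ERR_NULL   /* result pointer is NULL */
} err_t;

/* Example operation: integer division a / b, where a must be
 * non-negative and b must be positive. On any error the contents of
 * *result are left untouched. */
err_t do_stuff(int32_t a, int32_t b, int32_t *result)
{
    if (result == NULL) {
        return ERR_NULL;
    }
    if (a < 0) {
        return ERR_A;
    }
    if (b <= 0) {
        return ERR_B;
    }
    *result = a / b;
    return ERR_OK;
}
```

Because the error code never shares a channel with the result, the result type is free to be signed or unsigned as the data requires.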
(And yes, many of the standard library functions abuse return types for error handling. This is because the standard library contains lots of ancient functions from a time before good programming practice was invented, and they have been preserved the way they are for backwards-compatibility reasons. So just because you find a poorly-written function in the standard library, you shouldn't run off to write an equally poor function yourself.)
Overall, it sounds like you know what you are doing and are giving signedness some thought. That probably means that knowledge-wise, you are actually already ahead of the people who wrote those posts and guides you are referring to.
The Google style guide, for example, is questionable. The same could be said about lots of other such coding standards that use "proof by authority". Just because it says Google, NASA or Linux kernel, people blindly swallow them no matter the quality of the actual contents. There are good things in those standards, but they also contain subjective opinions, speculations or blatant errors.
I would recommend referring to real professional coding standards instead, such as MISRA-C. It enforces lots of thought and care for things like signedness, type promotion and type size, where less detailed/less serious documents just skip past it.
There is also CERT C, which isn't as detailed and careful as MISRA, but is at least a sound, professional document (and more focused on desktop/hosted development).
1
From the C FAQ:
The first question in the C FAQ is: which integer type should we decide to use?
If you might need large values (above 32,767 or below -32,767), use long. Otherwise, if space is very important (i.e. if there are large arrays or many structures), use short. Otherwise, use int. If well-defined overflow characteristics are important and negative values are not, or if you want to steer clear of sign-extension problems when manipulating bits or bytes, use one of the corresponding unsigned types.
Another question concerns type conversions:
If an operation involves both signed and unsigned integers, the situation is a bit more complicated. If the unsigned operand is smaller (perhaps we're operating on unsigned int and long int), such that the larger, signed type could represent all values of the smaller, unsigned type, then the unsigned value is converted to the larger, signed type, and the result has the larger, signed type. Otherwise (that is, if the signed type can not represent all values of the unsigned type), both values are converted to a common unsigned type, and the result has that unsigned type.
You can find it here. So basically, using unsigned integers can complicate matters, mostly because of the arithmetic conversions: you either have to make all your integers unsigned, or risk confusing the compiler and yourself. As long as you know what you are doing, this is not really a risk per se. However, it could introduce simple bugs.
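The conversion rule quoted from the FAQ can be sketched like this, using the fixed-width types and assuming int is 32 bits wide (the function names are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* int32_t vs uint32_t: assuming int is 32 bits wide, the signed type
 * cannot represent every uint32_t value, so both operands are
 * converted to the unsigned type and -1 becomes UINT32_MAX. */
bool less_same_width(int32_t s, uint32_t u)
{
    return s < u;
}

/* int64_t vs uint32_t: the wider signed type can represent every
 * uint32_t value, so the unsigned operand is converted to int64_t
 * and the comparison gives the mathematically expected answer. */
bool less_wider_signed(int64_t s, uint32_t u)
{
    return s < u;
}
```

Under that assumption, less_same_width(-1, 1u) is false while less_wider_signed(-1, 1u) is true, which is exactly the kind of simple bug mixed signedness can introduce.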
And when is it good to use unsigned integers? One situation is when using bitwise operations:
The << operator shifts its first operand left by a number of bits given by its second operand, filling in new 0 bits at the right. Similarly, the >> operator shifts its first operand right. If the first operand is unsigned, >> fills in 0 bits from the left, but if the first operand is signed, >> might fill in 1 bits if the high-order bit was already 1. (Uncertainty like this is one reason why it's usually a good idea to use all unsigned operands when working with the bitwise operators.)
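A short sketch of why unsigned operands are the safer choice for shifts (the helper names are hypothetical, and pos is assumed to be less than 32):

```c
#include <stdint.h>

/* Extract bit `pos` (0 = least significant) of a 32-bit word.
 * Because `word` is unsigned, >> always shifts in 0 bits from the
 * left, so the result is 0 or 1 even when the top bit is set. */
uint32_t get_bit(uint32_t word, unsigned int pos)
{
    return (word >> pos) & 1u;
}

/* Top byte of a 32-bit word. With a signed operand and the sign bit
 * set, the vacated bits might be filled with 1s instead, and the
 * result would not be a plain byte value. */
uint32_t top_byte(uint32_t word)
{
    return word >> 24;
}
```

Here top_byte(0xFF000000u) is 0xFF on every conforming implementation, which would not be guaranteed if word were a signed type.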
Taken from here. And I've seen this somewhere:
If it was best to use unsigned integers for values that are never negative, we would have started by using unsigned int in the main function int main(int argc, char* argv[]). One thing is sure, argc is never negative.
EDIT:
As mentioned in the comments, the signature of main is due to historical reasons, and apparently it predates the existence of the unsigned keyword.
0
Unsigned integers are an artifact from the past. They date from a time when processors could do unsigned arithmetic a little bit faster.
This is a case of premature optimization, which is considered evil.
Actually, in 2003 when AMD introduced x86_64 (or AMD64, as it was then called), the 64-bit architecture for x86, they brought the ghosts of the past back: if a signed integer is used as an index and the compiler cannot prove that it is never negative, it has to insert a 32-to-64-bit sign extension instruction, because the default 32-to-64-bit extension is unsigned (the upper half of a 64-bit register gets cleared if you move a 32-bit value into it).
But I would recommend against using unsigned in any arithmetic at all, be it pointer arithmetic or just simple numbers.
for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}
Any recent compiler will warn about such a construct, with "condition is always true" or similar. By using a signed variable you avoid such pitfalls altogether. Use ptrdiff_t instead.
A problem might be the C++ library: it often uses an unsigned type for size_t, which is required because of some rare corner cases with very large sizes (between 2^31 and 2^32) on 32-bit systems with certain boot switches (/3GB on Windows).
There are many more pitfalls. Comparisons between signed and unsigned come to mind, where the signed value automagically gets converted to unsigned and thus becomes a huge positive number when it was a small negative one before.
One exception for using unsigned exists: for bit fields, flags and masks it is quite common. Usually it doesn't make sense at all to interpret the value of these variables as a magnitude, and the reader may deduce from the type that this variable is to be interpreted as bits.
The result will never be a negative value (as the section number, by the way). So why use a signed integer for this?
Because you might want to compare the return value to a signed value that is actually negative. The comparison should return true in that case, but the C standard specifies that the signed operand gets converted to unsigned in that case, and you will get false instead. I don't know about Objective-C, though.